
Relevance Scores for Triples from Type-Like Relations

Authors:
Hannah Bast, University of Freiburg, Freiburg im Breisgau, Germany
Björn Buchhold, University of Freiburg, Freiburg im Breisgau, Germany
Elmar Haussmann, University of Freiburg, Freiburg im Breisgau, Germany

SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, August 2015, Pages 243–252. https://doi.org/10.1145/2766462.2767734

Published: 09 August 2015

ABSTRACT

We compute and evaluate relevance scores for knowledge-base triples from type-like relations. Such a score measures the degree to which an entity "belongs" to a type. For example, Quentin Tarantino has various professions, including Film Director, Screenwriter, and Actor. The first two would get a high score in our setting, because those are his main professions. The third would get a low score, because he mostly had cameo appearances in his own movies. Such scores are essential in the ranking for entity queries, e.g. "American actors" or "Quentin Tarantino professions". These scores are different from scores for "correctness" or "accuracy" (all three professions above are correct and accurate). We propose a variety of algorithms to compute these scores. For our evaluation we designed a new benchmark, which includes a ground truth based on about 14K human judgments obtained via crowdsourcing. Inter-judge agreement is slightly over 90%. Existing approaches from the literature give results far from the optimum. Our best algorithms achieve an agreement of about 80% with the ground truth.
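
To make the ranking use case concrete, here is a minimal sketch of how per-triple relevance scores could order the results of the two entity queries mentioned above. It does not implement any of the paper's scoring algorithms: the score values, the second entity, and the helper names `types_of` and `entities_of` are all hypothetical placeholders chosen for illustration.

```python
# Hypothetical (entity, relation, type, score) tuples. The scores are
# illustrative placeholders, NOT values computed or reported in the paper.
TRIPLES = [
    ("Quentin Tarantino", "profession", "Film Director", 0.9),
    ("Quentin Tarantino", "profession", "Screenwriter", 0.8),
    ("Quentin Tarantino", "profession", "Actor", 0.2),
    ("Hypothetical Lead Actor", "profession", "Actor", 0.9),
]


def types_of(entity, relation="profession"):
    """Return the types of one entity, highest relevance score first."""
    matches = [(t, s) for (e, r, t, s) in TRIPLES if e == entity and r == relation]
    return [t for t, _ in sorted(matches, key=lambda pair: pair[1], reverse=True)]


def entities_of(type_name, relation="profession"):
    """Return the entities of one type, highest relevance score first."""
    matches = [(e, s) for (e, r, t, s) in TRIPLES if t == type_name and r == relation]
    return [e for e, _ in sorted(matches, key=lambda pair: pair[1], reverse=True)]


if __name__ == "__main__":
    print(types_of("Quentin Tarantino"))
    print(entities_of("Actor"))
```

Under these assumed scores, all three profession triples remain correct, but Actor is listed last for Quentin Tarantino, and he is listed after the entity whose main profession is acting.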

Index Terms

Relevance Scores for Triples from Type-Like Relations
1. Information systems
  1. Information retrieval
    1. Document representation

Published in
SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2015
1198 pages
ISBN: 9781450336215
DOI: 10.1145/2766462
General Chair: Ricardo Baeza-Yates (Yahoo Labs, USA)
Program Chairs: Mounia Lalmas (Yahoo Labs, UK), Alistair Moffat (University of Melbourne, Australia), Berthier Ribeiro-Neto (Google, Brazil, and UFMG, Brazil)
Copyright © 2015 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
semantic ranking
triple scores
Qualifiers
- research-article
Acceptance Rates
SIGIR '15 Paper Acceptance Rate: 70 of 351 submissions, 20%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%