Are Semantically Related Links More Effective for Retrieval?

  • Conference paper
Advances in Information Retrieval (ECIR 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6611)

Abstract

Why do links work? Link-based ranking algorithms rest on the often implicit assumption that linked documents are semantically related to each other, and that link information is therefore useful for retrieval. Although the benefits of link information are well researched, this underlying assumption about why link evidence works remains untested, and the main aim of this paper is to test it. Specifically, we use Wikipedia because it combines a dense link structure with a large category structure, which allows for an independent measurement of the semantic relatedness of linked documents. Our main findings are that: 1) global, query-independent link evidence is not affected by the semantic nature of the links, and 2) for local, query-dependent link evidence, the effectiveness of links increases as their semantic distance decreases. That is, we directly observe that links between semantically related pages are more effective for ad hoc retrieval than links between unrelated ones. These findings confirm and quantify the underlying assumption of existing link-based methods and shed further light on the nature of link evidence. Such deeper understanding is instrumental for the development of novel link-based methods.
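
To make the idea of measuring semantic distance through a category structure concrete, the following is a minimal sketch and not the method used in the paper, whose exact distance measure is not given in this abstract. It assumes that semantic distance is the shortest undirected path between the categories of two pages, and that local link evidence is the in-degree of a page within the retrieved set. All names (`category_distance`, `local_indegree`) and the toy data are hypothetical.

```python
from collections import defaultdict, deque


def build_undirected(parents):
    """Turn a child -> parent-categories mapping into an undirected adjacency map."""
    adj = defaultdict(set)
    for child, parent_list in parents.items():
        for parent in parent_list:
            adj[child].add(parent)
            adj[parent].add(child)
    return adj


def category_distance(cats_a, cats_b, adj):
    """Fewest category-graph edges between any category of page A and any of page B.

    Returns 0 if the pages share a category and None if no path exists.
    This is an assumed measure, chosen only for illustration.
    """
    targets = set(cats_b)
    seen = set(cats_a)
    queue = deque((cat, 0) for cat in seen)
    while queue:
        node, depth = queue.popleft()
        if node in targets:
            return depth
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def local_indegree(links, retrieved, page_cats, adj, max_distance):
    """Count in-links inside the retrieved set, keeping only links whose
    endpoints are at most max_distance apart in the category graph."""
    counts = {page: 0 for page in retrieved}
    for src, dst in links:
        if src in retrieved and dst in retrieved and src != dst:
            dist = category_distance(page_cats[src], page_cats[dst], adj)
            if dist is not None and dist <= max_distance:
                counts[dst] += 1
    return counts


# Hypothetical toy data: a tiny category hierarchy, three pages, three links.
parents = {
    "Search engines": ["Information retrieval"],
    "Information retrieval": ["Computer science"],
    "Football": ["Sports"],
}
page_cats = {
    "PageRank": ["Search engines"],
    "HITS": ["Information retrieval"],
    "Goal": ["Football"],
}
links = [("PageRank", "HITS"), ("Goal", "HITS"), ("HITS", "PageRank")]
retrieved = {"PageRank", "HITS", "Goal"}

adj = build_undirected(parents)
counts = local_indegree(links, retrieved, page_cats, adj, max_distance=1)
```

In this toy setup the link from "Goal" to "HITS" is discarded because their categories are unconnected, so only links between semantically close pages contribute to the local in-degree. Varying `max_distance` is one way to compare the effect of near and distant links on a ranking.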

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Koolen, M., Kamps, J. (2011). Are Semantically Related Links More Effective for Retrieval?. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_11

  • DOI: https://doi.org/10.1007/978-3-642-20161-5_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20160-8

  • Online ISBN: 978-3-642-20161-5

  • eBook Packages: Computer Science, Computer Science (R0)
