DOI: 10.1145/967900.968118

The effectiveness of combining information retrieval strategies for European languages

Published: 14 March 2004

ABSTRACT

Building an effective information retrieval system requires various design choices, ranging from the weighting scheme to the type of morphological normalization. Combining runs has become a standard technique for reaping the benefits of different run types. Until now, systematic studies of the effectiveness of combination strategies have been carried out only for English. This paper provides an exploratory overview of the effectiveness of combination methods in nine European languages. We demonstrate that combining effective information retrieval strategies can lead to significant improvements in retrieval effectiveness. Furthermore, we analyze the relative impact of retrieving more relevant documents and of improving the ranking of relevant documents. The experimental evidence is obtained using the 2003 test suite of the Cross-Language Evaluation Forum (CLEF).
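The abstract does not say which combination methods the paper evaluates. As a rough illustration of what "combining runs" can mean, the sketch below implements two fusion rules that are widely used in the literature, CombSUM and CombMNZ (Fox and Shaw, TREC-2), over min-max normalized scores. The function names, the run representation (a mapping from document id to score), and the choice of min-max normalization are assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale the scores of one run (doc_id -> score) into [0, 1]."""
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # All scores are equal; treat every document as equally good.
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def _rank(scores):
    """Sort by descending score, breaking ties on document id."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def comb_sum(runs):
    """CombSUM: sum the normalized scores a document gets across runs."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            fused[doc] += score
    return _rank(fused)


def comb_mnz(runs):
    """CombMNZ: CombSUM score times the number of runs retrieving the doc."""
    hits = defaultdict(int)
    for run in runs:
        for doc in run:
            hits[doc] += 1
    fused = {doc: score * hits[doc] for doc, score in comb_sum(runs)}
    return _rank(fused)


if __name__ == "__main__":
    # Two hypothetical runs for one query, e.g. a stemmed run and an n-gram run.
    stemmed_run = {"d1": 3.0, "d2": 2.0, "d3": 1.0}
    ngram_run = {"d2": 9.0, "d4": 5.0, "d1": 1.0}
    print(comb_sum([stemmed_run, ngram_run]))
    print(comb_mnz([stemmed_run, ngram_run]))
```

The example shows both effects the abstract distinguishes: the fused list contains documents that only one run retrieved (here `d3` and `d4`), and documents retrieved by both runs can move up in the ranking.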


  • Published in

    SAC '04: Proceedings of the 2004 ACM symposium on Applied computing
    March 2004
    1733 pages
    ISBN:1581138121
    DOI:10.1145/967900

    Copyright © 2004 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
