Abstract
Pseudo-relevance feedback (PRF) can improve average retrieval effectiveness over a sufficiently large set of queries. However, PRF often introduces a drift away from the original information need, hurting the effectiveness of individual queries. While applying PRF selectively can alleviate this issue, previous approaches have largely relied on unsupervised or feature-based learning to decide whether a query should be expanded. In contrast, we revisit the problem of selective PRF from a deep learning perspective, presenting a model that is entirely data-driven and trained end-to-end, built on a transformer-based bi-encoder architecture. Additionally, to further improve retrieval effectiveness with this selective PRF approach, we use the model's confidence estimates to combine information from the original and the expanded query. In our experiments, we apply this selective feedback to a number of different combinations of ranking and feedback models, and show that our proposed approach consistently improves retrieval effectiveness for both sparse and dense ranking models, with feedback models that are sparse, dense, or generative.
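The confidence-based combination described above can be illustrated with a minimal sketch. All names below (`fuse_scores`, the toy score dictionaries, the linear interpolation) are illustrative assumptions, not the paper's actual method or API: we assume the model emits a confidence `c` in [0, 1] that expansion helps the query, and interpolate per-document scores accordingly.

```python
# Hypothetical sketch: confidence-weighted fusion of document scores from
# the original query and the PRF-expanded query. The interpolation scheme
# is an assumption for illustration, not the paper's exact formulation.

def fuse_scores(orig_scores, prf_scores, confidence):
    """Interpolate scores from the original and expanded queries.

    confidence: model's estimate (in [0, 1]) that PRF expansion
    benefits this query; 0 keeps only the original ranking,
    1 keeps only the expanded one.
    """
    docs = set(orig_scores) | set(prf_scores)
    return {
        d: confidence * prf_scores.get(d, 0.0)
           + (1.0 - confidence) * orig_scores.get(d, 0.0)
        for d in docs
    }

# Toy example: low confidence, so the original ranking dominates.
orig = {"d1": 0.9, "d2": 0.4}
prf = {"d1": 0.5, "d3": 0.8}
fused = fuse_scores(orig, prf, confidence=0.25)
```

A selective-PRF decision then falls out as the special case `confidence ∈ {0, 1}`; the soft combination generalises the hard select-or-not choice.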
Notes
1. Implementation available at: https://github.com/suchanadatta/AdaptiveRLM.git.
Acknowledgement
The first and the fourth authors were partially supported by Science Foundation Ireland (SFI) grant number SFI/12/RC/2289_P2.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Datta, S., Ganguly, D., MacAvaney, S., Greene, D. (2024). A Deep Learning Approach for Selective Relevance Feedback. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14609. Springer, Cham. https://doi.org/10.1007/978-3-031-56060-6_13
Print ISBN: 978-3-031-56059-0
Online ISBN: 978-3-031-56060-6
eBook Packages: Computer Science (R0)