ABSTRACT
Considering the multimodal signals of search items is beneficial for retrieval effectiveness. Especially in web table retrieval (WTR) experiments, accounting for the multimodal properties of tables boosts effectiveness. However, it remains an open question how the individual modalities affect the user experience in particular. Previous work analyzed WTR performance in ad-hoc retrieval benchmarks, a setting that neglects interactive search behavior and limits the conclusions that can be drawn about real-world user environments.
To this end, this work presents an in-depth evaluation of simulated interactive WTR search sessions as a more cost-efficient and reproducible alternative to real user studies. As a first of its kind, we introduce interactive query reformulation strategies based on Doc2Query that incorporate cognitive states of simulated user knowledge. Our evaluations cover two perspectives on user effectiveness by considering different cost paradigms, namely query-wise and time-oriented measures of effort. This multi-perspective evaluation scheme reveals new insights about query strategies, the impact of modalities, and different user types in simulated WTR search sessions.
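To make the simulation idea concrete, the following is a minimal, hypothetical sketch of a knowledge-conditioned reformulation loop: a pool of Doc2Query-style candidate queries is filtered by the simulated user's current knowledge state, and the state grows as results reveal new terms. The function names (`reformulate`, `simulate_session`) and the term-subset knowledge model are illustrative assumptions, not the authors' implementation.

```python
import random

def reformulate(candidate_queries, known_terms, rng):
    # A simulated user only issues a candidate query whose terms are
    # already part of its knowledge state (illustrative assumption).
    feasible = [q for q in candidate_queries
                if set(q.split()) <= known_terms]
    return rng.choice(feasible) if feasible else None

def simulate_session(candidate_queries, known_terms, learn, max_queries=3):
    # Issue up to max_queries reformulations; after each query, the
    # knowledge state grows by the terms the results expose (learn).
    known = set(known_terms)
    pool = list(candidate_queries)
    issued = []
    rng = random.Random(42)
    for _ in range(max_queries):
        q = reformulate(pool, known, rng)
        if q is None:          # no candidate is expressible yet: stop
            break
        issued.append(q)
        pool.remove(q)
        known |= learn(q)      # e.g., terms observed in result snippets
    return issued
```

A usage sketch: starting from `known_terms = {"world", "cup", "2018"}` and a `learn` callback that exposes `{"fifa", "winners"}`, the simulated user can eventually issue a candidate such as `"fifa world cup winners"` that was out of reach at session start.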
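The two cost paradigms can be illustrated with standard session measures; the sketch below is not necessarily the paper's exact instrumentation. Query-wise effort is captured by session-based DCG (Järvelin et al. 2008), which discounts gains by query position as well as rank; time-oriented effort by time-biased gain (Smucker & Clarke 2012), which decays gains exponentially in elapsed seconds. The parameter defaults (`bq=4`, `half_life=224`) follow those papers.

```python
import math

def session_dcg(session, bq=4.0):
    # session: one list of graded relevance labels (in rank order) per
    # query. Gains get the standard log2 rank discount plus a query-
    # position discount, so later reformulations contribute less.
    total = 0.0
    for q, ranking in enumerate(session, start=1):
        q_disc = 1.0 / (1.0 + math.log(q, bq))   # 1.0 for the first query
        dcg = sum(rel / math.log2(r + 1)
                  for r, rel in enumerate(ranking, start=1))
        total += q_disc * dcg
    return total

def time_biased_gain(events, half_life=224.0):
    # events: (seconds_elapsed, gain) pairs for relevant documents the
    # user reaches; gain halves every half_life seconds.
    return sum(g * 2.0 ** (-t / half_life) for t, g in events)
```

For example, a second-query hit is worth only two thirds of a first-query hit under `session_dcg`, while under `time_biased_gain` a document reached after 224 seconds is worth half of one reached immediately.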
Index Terms
- Simulating Users in Interactive Web Table Retrieval