The Information Retrieval Experiment Platform

Fröbe, Maik; Reimer, Jan Heinrich; MacAvaney, Sean; Deckers, Niklas; Reich, Simon; Bevendorff, Janek; Stein, Benno; Hagen, Matthias; Potthast, Martin

doi:10.1145/3539618.3591888

arXiv:2305.18932 (cs)

[Submitted on 30 May 2023]

Abstract: We integrate ir_datasets, ir_measures, and PyTerrier with TIRA in the Information Retrieval Experiment Platform (TIREx) to promote more standardized, reproducible, scalable, and even blinded retrieval experiments. Standardization is achieved when a retrieval approach implements PyTerrier's interfaces and the input and output of an experiment are compatible with ir_datasets and ir_measures. However, none of this is a must for reproducibility and scalability, as TIRA can run any dockerized software locally or remotely in a cloud-native execution environment. Version control and caching ensure efficient (re)execution. TIRA allows for blind evaluation when an experiment runs on a remote server or cloud not under the control of the experimenter. The test data and ground truth are then hidden from public access, and the retrieval software has to process them in a sandbox that prevents data leaks.
We currently host an instance of TIREx with 15 corpora (1.9 billion documents) on which 32 shared retrieval tasks are based. Using Docker images of 50 standard retrieval approaches, we automatically evaluated all approaches on all tasks (50 · 32 = 1,600 runs) in less than a week on a midsize cluster (1,620 CPU cores and 24 GPUs). This instance of TIREx is open for submissions and will be integrated with the IR Anthology, as well as released open source.
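
The run count and the role of caching in (re)execution can be illustrated with a minimal sketch. This is not TIRA's implementation: the function `run_grid`, the placeholder `fake_execute`, the approach and task names, and the in-memory dictionary standing in for a cache are all hypothetical, and a real run would execute a Docker image rather than return a string.

```python
from itertools import product


def run_grid(approaches, tasks, execute, cache):
    """Run every approach on every task, skipping pairs already in the cache.

    Returns the number of runs that actually had to be executed.
    """
    executed = 0
    for approach, task in product(approaches, tasks):
        key = (approach, task)
        if key not in cache:
            # Hypothetical stand-in for launching a dockerized approach on a task.
            cache[key] = execute(approach, task)
            executed += 1
    return executed


def fake_execute(approach, task):
    """Placeholder for a real run; returns a label instead of retrieval output."""
    return f"run of {approach} on {task}"


# Assumed placeholder names for 50 approaches and 32 tasks, as in the abstract.
approaches = [f"approach-{i}" for i in range(50)]
tasks = [f"task-{j}" for j in range(32)]

cache = {}
first_pass = run_grid(approaches, tasks, fake_execute, cache)
second_pass = run_grid(approaches, tasks, fake_execute, cache)
print(first_pass, second_pass)
```

In this sketch the first pass executes all 50 · 32 pairs, while the second pass finds every pair in the cache and executes nothing.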

Comments:	11 pages. To be published in the proceedings of SIGIR 2023
Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2305.18932 [cs.IR]
	(or arXiv:2305.18932v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2305.18932
Related DOI:	https://doi.org/10.1145/3539618.3591888

Submission history

From: Maik Fröbe [view email]
[v1] Tue, 30 May 2023 10:48:50 UTC (473 KB)
