Document Distance Metric Learning in an Interactive Exploration Process

Author:
Marco Wrzalik

RheinMain University of Applied Sciences, Wiesbaden, Germany

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, July 2019, Page 1452. https://doi.org/10.1145/3331184.3331420

Published: 18 July 2019

ABSTRACT

Visualization of inter-document similarities is widely used for exploring document collections and for interactive retrieval. However, similarity relationships between documents are multifaceted, and distances measured by a given metric often do not match the similarity perceived by humans. Furthermore, the user's notion of similarity can change drastically with the exploration objective or task at hand. This research therefore proposes to investigate online adjustments to the similarity model using feedback generated during exploration or exploratory search. To this end, rich visualizations and interactions will help users give valuable feedback. Based on this feedback, metric learning methods will be applied to adjust the similarity model and thereby improve the exploration experience. At the same time, the trained models are considered valuable outcomes in their own right, and their benefits for similarity-based tasks such as query-by-example retrieval or classification will be tested.
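
The abstract does not specify a concrete learning procedure. As a purely illustrative sketch of what an online adjustment of a similarity model from user feedback could look like, the following assumes that documents are represented as fixed-length feature vectors, that the similarity model is a per-feature weighted squared Euclidean distance, and that feedback arrives as triplets (an anchor document, a document the user considers similar, and one the user considers dissimilar). The function names, the margin, and the learning rate are hypothetical choices, not taken from the paper.

```python
import numpy as np

def weighted_distance(w, x, y):
    """Squared Euclidean distance with non-negative per-feature weights w."""
    diff = x - y
    return float(np.dot(w, diff * diff))

def update_weights(w, anchor, similar, dissimilar, margin=1.0, lr=0.1):
    """One online gradient step on a triplet hinge loss.

    loss = max(0, margin + d(anchor, similar) - d(anchor, dissimilar))
    The weights are clipped at zero so the result stays a valid weighting.
    """
    d_pos = weighted_distance(w, anchor, similar)
    d_neg = weighted_distance(w, anchor, dissimilar)
    if margin + d_pos - d_neg <= 0:
        return w  # feedback is already satisfied, no update needed
    # Gradient of the active hinge term with respect to w.
    grad = (anchor - similar) ** 2 - (anchor - dissimilar) ** 2
    return np.maximum(w - lr * grad, 0.0)

# Toy feedback event (hypothetical data): the user marks `similar` as close
# to `anchor` although they differ in feature 1, and `dissimilar` as far
# although it differs only in feature 0.
anchor = np.array([1.0, 0.0])
similar = np.array([1.0, 1.0])
dissimilar = np.array([0.0, 0.0])

w = np.ones(2)
w_new = update_weights(w, anchor, similar, dissimilar)
```

Before the update both documents are equally far from the anchor. After one step, feature 0 is weighted up and feature 1 is weighted down, so the document marked as similar becomes the closer one. A full metric (for example a learned projection) would follow the same pattern with a matrix in place of the weight vector.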

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning settings
      1. Online learning settings
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Similarity measures
    2. Users and interactive retrieval

Published in
SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2019
1512 pages
ISBN: 9781450361729
DOI: 10.1145/3331184
General Chairs: Benjamin Piwowarski (CNRS - Sorbonne Universite, France), Max Chevalier (Universite de Toulouse, CNRS, France), Eric Gaussier (Universite Grenoble Alpes, CNRS, France)
Program Chairs: Yoelle Maarek (Amazon Research, Israel), Jian-Yun Nie (University of Montreal, Canada), Falk Scholer (RMIT University, Australia)
Copyright © 2019 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher: Association for Computing Machinery, New York, NY, United States
Qualifiers: abstract

Acceptance Rates
SIGIR'19 Paper Acceptance Rate: 84 of 426 submissions, 20%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%