ABSTRACT
Visualizing inter-document similarities is a widely used aid for exploring document collections and for interactive retrieval. However, similarity relationships between documents are multifaceted, and the distances measured by a given metric often do not match the similarity perceived by humans. Furthermore, a user's notion of similarity can change drastically with the exploration objective or the task at hand. This research therefore proposes to investigate online adjustments to the similarity model based on feedback generated during exploration or exploratory search. To this end, rich visualizations and interactions will help users give valuable feedback. Based on this feedback, metric learning methods will be applied to adjust the similarity model and thereby improve the exploration experience. At the same time, the trained models are considered valuable outcomes in their own right, and their benefits for similarity-based tasks such as query-by-example retrieval or classification will be tested.
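To make the idea of adjusting a similarity model online from user feedback more concrete, the following is a minimal sketch and not the method proposed in this work. It assumes that documents are already represented as numeric feature vectors, that feedback arrives as triplets (an anchor document, one the user considers similar, one the user considers dissimilar), and that the similarity model is a per-feature weighted squared Euclidean distance. The function names, the learning rate, and the margin are all hypothetical choices made for illustration.

```python
def weighted_distance(w, x, y):
    """Squared Euclidean distance with one non-negative weight per feature."""
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))


def triplet_update(w, anchor, similar, dissimilar, lr=0.1, margin=1.0):
    """One online gradient step on the hinge loss
    max(0, margin + d(anchor, similar) - d(anchor, dissimilar)).

    Returns a new weight list; weights are clipped at zero so the
    result stays a valid (pseudo-)distance.
    """
    d_pos = weighted_distance(w, anchor, similar)
    d_neg = weighted_distance(w, anchor, dissimilar)
    if margin + d_pos - d_neg <= 0:
        return list(w)  # this piece of feedback is already satisfied
    return [
        max(wi - lr * ((a - s) ** 2 - (a - d) ** 2), 0.0)
        for wi, a, s, d in zip(w, anchor, similar, dissimilar)
    ]


# Toy feedback (assumed data): the user says `similar` belongs with `anchor`
# although they differ in feature 1, and `dissimilar` does not although it
# differs only in feature 0.
anchor = [1.0, 0.0]
similar = [1.0, 1.0]
dissimilar = [0.0, 0.0]

weights = [1.0, 1.0]
for _ in range(5):
    weights = triplet_update(weights, anchor, similar, dissimilar)
```

After a few updates the weight of feature 0 grows and the weight of feature 1 shrinks, so the adjusted distance places the anchor closer to the document the user marked as similar. A full system would presumably learn a richer model than a diagonal weighting and would derive the triplets from interactions in the visualization.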
Index Terms
- Document Distance Metric Learning in an Interactive Exploration Process