A Comparative Study of Training Objectives for Clarification Facet Generation

Authors:
Shiyu Ni
CAS Key Lab of Network Data Science and Technology, ICT, CAS, China and University of Chinese Academy of Sciences, China
ORCID: 0009-0001-7965-7771

Keping Bi
CAS Key Lab of Network Data Science and Technology, ICT, CAS, China and University of Chinese Academy of Sciences, China
ORCID: 0000-0001-5123-4999

Jiafeng Guo
CAS Key Lab of Network Data Science and Technology, ICT, CAS, China and University of Chinese Academy of Sciences, China
ORCID: 0000-0002-9509-8674

Xueqi Cheng
CAS Key Lab of Network Data Science and Technology, ICT, CAS, China and University of Chinese Academy of Sciences, China
ORCID: 0000-0002-5201-8195

SIGIR-AP '23: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, November 2023, Pages 1–10
https://doi.org/10.1145/3624918.3625332

Published: 26 November 2023

ABSTRACT

Because user queries are often ambiguous or vague, identifying query facets is essential for clarifying user intent. Existing work on query facet generation has achieved compelling performance by using pre-trained language generation models such as BART to predict the next facet sequentially, conditioned on the previously generated facets. Given a query, there are two main types of training objectives for guiding facet generation models. One is to generate the ground-truth facets in their default order; the other is to enumerate all permutations of the ground-truth facets and use the sequence with the minimum loss for model updates. The second is permutation-invariant, while the first is not. In this paper, we conduct a systematic comparative study of training objectives that differ in three properties: whether the objective is permutation-invariant, whether it performs sequential prediction, and whether it can control the number of output facets. To this end, we propose three additional training objectives that differ in these properties. For a comprehensive comparison, besides the commonly used evaluation that measures matching with ground-truth facets, we introduce two diversity metrics to measure the diversity of the generated facets. On MIMICS, an open-domain query facet dataset, we conduct extensive analyses and show the pros and cons of each method, which could shed light on model training for clarification facet generation. The code can be found at https://github.com/ShiyuNee/Facet-Generation.
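
The two existing objectives described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_step_prob` is a hypothetical stand-in for a sequence model such as BART, the facet names are invented for illustration, and the function names `default_order_loss` and `min_permutation_loss` are assumptions introduced here. The sketch only shows why the minimum-over-permutations loss is permutation-invariant while the default-order loss is not.

```python
import math
from itertools import permutations


def sequence_nll(step_prob, facets):
    """Negative log-likelihood of generating `facets` in exactly this order."""
    nll = 0.0
    for i, facet in enumerate(facets):
        prefix = tuple(facets[:i])  # facets generated so far
        nll -= math.log(step_prob(prefix, facet))
    return nll


def default_order_loss(step_prob, facets):
    """Objective 1: score the ground-truth facets in their default order."""
    return sequence_nll(step_prob, facets)


def min_permutation_loss(step_prob, facets):
    """Objective 2: score every ordering and keep the minimum loss."""
    return min(sequence_nll(step_prob, order) for order in permutations(facets))


def toy_step_prob(prefix, facet):
    """Hypothetical toy model (an assumption, not a trained network).

    It assigns probability 0.6 to its preferred facet at each position
    and 0.2 to each of the other two facets.
    """
    preferred = ("price", "reviews", "specs")
    return 0.6 if preferred[len(prefix)] == facet else 0.2


# Invented ground-truth facets whose default order the toy model dislikes.
ground_truth = ["specs", "price", "reviews"]
loss_default = default_order_loss(toy_step_prob, ground_truth)
loss_min_perm = min_permutation_loss(toy_step_prob, ground_truth)
print(f"default order: {loss_default:.3f}, min over permutations: {loss_min_perm:.3f}")
```

Reordering `ground_truth` changes `loss_default` but leaves `loss_min_perm` unchanged, which is the permutation-invariance property. Note that enumerating all orderings grows factorially with the number of facets, so this brute-force form is only practical for short facet lists.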

Index Terms

1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
      1. Query intent

Published in
SIGIR-AP '23: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region
November 2023
324 pages
ISBN: 9798400704086
DOI: 10.1145/3624918
Editors:
Qingyao Ai, Tsinghua University, China
Yiqin Liu, Tsinghua University, China
Alistair Moffat, The University of Melbourne, Australia
Xuanjing Huang, Fudan University, China
Tetsuya Sakai, Waseda University, Japan
Justin Zobel, The University of Melbourne, Australia
Copyright © 2023 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Facet Generation
Query Facet
Search Clarification
Qualifiers
- research-article
- Research
- Refereed limited