ABSTRACT
Due to the ambiguity and vagueness of a user query, it is essential to identify the query facets for the clarification of user intents. Existing work on query facet generation has achieved compelling performance by sequentially predicting the next facet given previously generated facets based on pre-trained language generation models such as BART. Given a query, there are mainly two types of training objectives to guide the facet generation models. One is to generate the default sequence of ground-truth facets, and the other is to enumerate all the permutations of ground-truth facets and use the sequence that has the minimum loss for model updates. The second is permutation-invariant while the first is not. In this paper, we aim to conduct a systematic comparative study of various types of training objectives, with different properties of not only whether it is permutation-invariant but also whether it conducts sequential prediction and whether it can control the count of output facets. To this end, we propose another three training objectives of different aforementioned properties. For comprehensive comparisons, besides the commonly used evaluation that measures the matching with ground-truth facets, we also introduce two diversity metrics to measure the diversity of the generated facets. Based on an open-domain query facet dataset, i.e., MIMICS, we conduct extensive analyses and show the pros and cons of each method, which could shed light on model training for clarification facet generation. The code can be found at https://github.com/ShiyuNee/Facet-Generation.
- Mohammad Aliannejadi, Julia Kiseleva, Aleksandr Chuklin, Jeff Dalton, and Mikhail Burtsev. 2020. ConvAI3: Generating clarifying questions for open-domain dialogue systems (ClariQ). arXiv preprint arXiv:2009.11352 (2020).Google Scholar
- Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, and W Bruce Croft. 2019. Asking clarifying questions in open-domain information-seeking conversations. In Proceedings of the 42nd international acm sigir conference on research and development in information retrieval. 475–484.Google ScholarDigital Library
- Keping Bi, Qingyao Ai, and W Bruce Croft. 2021. Asking clarifying questions based on negative feedback in conversational search. In Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval. 157–166.Google ScholarDigital Library
- Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020), 1877–1901.Google Scholar
- Wisam Dakka and Panagiotis G Ipeirotis. 2008. Automatic extraction of useful facet hierarchies from text databases. In 2008 IEEE 24th International Conference on Data Engineering. IEEE, 466–475.Google ScholarDigital Library
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4171–4186.Google Scholar
- Kaustubh D Dhole. 2020. Resolving intent ambiguities by retrieving discriminative clarifying questions. arXiv preprint arXiv:2008.07559 (2020).Google Scholar
- Zhicheng Dou, Sha Hu, Yulong Luo, Ruihua Song, and Ji-Rong Wen. 2011. Finding dimensions for queries. In Proceedings of the 20th ACM international conference on Information and knowledge management. 1311–1320.Google ScholarDigital Library
- Helia Hashemi, Hamed Zamani, and W Bruce Croft. 2020. Guided transformer: Leveraging multiple external sources for representation learning in conversational search. In Proceedings of the 43rd international acm sigir conference on research and development in information retrieval. 1131–1140.Google ScholarDigital Library
- Helia Hashemi, Hamed Zamani, and W Bruce Croft. 2021. Learning multiple intent representations for search queries. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management. 669–679.Google ScholarDigital Library
- Helia Hashemi, Hamed Zamani, and W Bruce Croft. 2022. Stochastic Optimization of Text Set Generation for Learning Multiple Query Intent Representations. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 4003–4008.Google ScholarDigital Library
- Kimiya Keyvan and Jimmy Xiangji Huang. 2022. How to approach ambiguous queries in conversational search: A survey of techniques, approaches, tools, and challenges. Comput. Surveys 55, 6 (2022), 1–40.Google ScholarDigital Library
- Christian Kohlschütter, Paul-Alexandru Chirita, and Wolfgang Nejdl. 2006. Using link analysis to identify aspects in faceted web search. In SIGIR’2006 Faceted Search Workshop. Citeseer, 55–59.Google Scholar
- Weize Kong and James Allan. 2013. Extracting query facets from search results. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. 93–102.Google ScholarDigital Library
- Weize Kong and James Allan. 2014. Extending faceted search to the general web. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management. 839–848.Google ScholarDigital Library
- Weize Kong and James Allan. 2016. Precision-oriented query facet extraction. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. 1433–1442.Google ScholarDigital Library
- Harold W Kuhn. 1955. The Hungarian method for the assignment problem. Naval research logistics quarterly 2, 1-2 (1955), 83–97.Google Scholar
- K. Latha, K. Rathna Veni, and R. Rajaram. 2010. AFGF: An Automatic Facet Generation Framework for Document Retrieval. In 2010 International Conference on Advances in Computer Engineering. 110–114.Google Scholar
- Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2020. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 7871–7880.Google ScholarCross Ref
- Chengkai Li, Ning Yan, Senjuti B Roy, Lekhendro Lisham, and Gautam Das. 2010. Facetedpedia: dynamic generation of query-dependent faceted interfaces for wikipedia. In Proceedings of the 19th international conference on World wide web. 651–660.Google ScholarDigital Library
- Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Yixing Fan, and Xueqi Cheng. 2023. Topic-Oriented Adversarial Attacks against Black-Box Neural Ranking Models. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1700–1709.Google ScholarDigital Library
- Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311–318.Google Scholar
- Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, 2019. Language models are unsupervised multitask learners. OpenAI blog 1, 8 (2019), 9.Google Scholar
- Sudha Rao and Hal Daumé III. 2018. Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information. (2018), 2737–2746.Google Scholar
- Sudha Rao and Hal Daumé III. 2019. Answer-based adversarial training for generating clarification questions. arXiv preprint arXiv:1904.02281 (2019).Google Scholar
- Chris Samarinas, Arkin Dharawat, and Hamed Zamani. 2022. Revisiting Open Domain Query Facet Extraction and Generation. In Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval. 43–50.Google ScholarDigital Library
- Ivan Sekulić, Mohammad Aliannejadi, and Fabio Crestani. 2021. Towards facet-driven generation of clarifying questions for conversational search. In Proceedings of the 2021 ACM SIGIR international conference on theory of information retrieval. 167–175.Google ScholarDigital Library
- Emilia Stoica, Marti A Hearst, and Megan Richardson. 2007. Automating creation of hierarchical faceted metadata structures. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference. 244–251.Google Scholar
- Ashwin Vijayakumar, Michael Cogswell, Ramprasaath Selvaraju, Qing Sun, Stefan Lee, David Crandall, and Dhruv Batra. 2018. Diverse beam search for improved description of complex scenes. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.Google ScholarCross Ref
- Jian Wang and Wenjie Li. 2021. Template-guided clarifying question generation for web search clarification. In Proceedings of the 30th ACM international conference on information & knowledge management. 3468–3472.Google ScholarDigital Library
- Zhenduo Wang, Yuancheng Tu, Corby Rosset, Nick Craswell, Ming Wu, and Qingyao Ai. 2023. Zero-shot Clarifying Question Generation for Conversational Search. In Proceedings of the ACM Web Conference 2023. 3288–3298.Google ScholarDigital Library
- Hamed Zamani, Susan Dumais, Nick Craswell, Paul Bennett, and Gord Lueck. 2020. Generating clarifying questions for information retrieval. In Proceedings of the web conference 2020. 418–428.Google ScholarDigital Library
- Hamed Zamani, Gord Lueck, Everest Chen, Rodolfo Quispe, Flint Luu, and Nick Craswell. 2020. Mimics: A large-scale data collection for search clarification. In Proceedings of the 29th ACM international conference on information & knowledge management. 3189–3196.Google ScholarDigital Library
- Tianyi Zhang*, Varsha Kishore*, Felix Wu*, Kilian Q. Weinberger, and Yoav Artzi. 2020. BERTScore: Evaluating Text Generation with BERT. In International Conference on Learning Representations.Google Scholar
- Ziliang Zhao, Zhicheng Dou, Jiaxin Mao, and Ji-Rong Wen. 2022. Generating clarifying questions with web search results. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 234–244.Google ScholarDigital Library
Index Terms
- A Comparative Study of Training Objectives for Clarification Facet Generation
Recommendations
Revisiting Open Domain Query Facet Extraction and Generation
ICTIR '22: Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information RetrievalWeb search queries can often be characterized by various facets. Extracting and generating query facets has various real-world applications, such as displaying facets to users in a search interface, search result diversification, clarifying question ...
Lifting theorems and facet characterization for a class of clique partitioning inequalities
In this paper we prove two lifting theorems for the clique partitioning polytope, which provide sufficient conditions for a valid inequality to be facet-defining. In particular, if a valid inequality defines a facet of the polytope corresponding to the ...
Search Clarification Selection via Query-Intent-Clarification Graph Attention
Advances in Information RetrievalAbstractProactively asking clarifications in response to search queries is a useful technique for revealing the intent of the query. Search clarification is important for both web and conversational search. This paper focuses on the clarification ...
Comments