DOI: 10.1145/3624918.3625332
research-article
Open Access

A Comparative Study of Training Objectives for Clarification Facet Generation

Published: 26 November 2023

ABSTRACT

Because user queries are often ambiguous or vague, identifying query facets is essential for clarifying user intent. Existing work on query facet generation has achieved compelling performance by using pre-trained language generation models such as BART to sequentially predict the next facet given the previously generated facets. Given a query, two main types of training objectives guide facet generation models. One is to generate the default sequence of ground-truth facets; the other is to enumerate all permutations of the ground-truth facets and use the sequence with the minimum loss for model updates. The second is permutation-invariant, while the first is not. In this paper, we conduct a systematic comparative study of training objectives that differ in three properties: whether the objective is permutation-invariant, whether it performs sequential prediction, and whether it can control the number of output facets. To this end, we propose three additional training objectives that vary in these properties. For a comprehensive comparison, besides the commonly used evaluation that measures matching with the ground-truth facets, we introduce two metrics that measure the diversity of the generated facets. Using MIMICS, an open-domain query facet dataset, we conduct extensive analyses and show the pros and cons of each method, which could shed light on model training for clarification facet generation. The code is available at https://github.com/ShiyuNee/Facet-Generation.
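The two existing objectives described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names and the toy loss below are hypothetical stand-ins. In actual training the sequence loss would presumably be the model's negative log-likelihood of the facets in the given order; here a simple count of out-of-order adjacent pairs plays that role so that the example runs on its own.

```python
from itertools import permutations
from typing import Callable, Sequence, Tuple

# Hypothetical type: maps an ordered facet sequence to a scalar loss.
SeqLoss = Callable[[Tuple[str, ...]], float]


def default_order_loss(facets: Sequence[str], seq_loss: SeqLoss) -> float:
    """Objective 1: score the ground-truth facets in their given order."""
    return seq_loss(tuple(facets))


def min_permutation_loss(
    facets: Sequence[str], seq_loss: SeqLoss
) -> Tuple[Tuple[str, ...], float]:
    """Objective 2: score every ordering and keep the minimum-loss one.

    The result depends only on the set of facets, not on their input
    order, which is what makes this objective permutation-invariant.
    The cost grows factorially with the number of facets.
    """
    best_order: Tuple[str, ...] = tuple(facets)
    best_loss = float("inf")
    for order in permutations(facets):
        loss = seq_loss(order)
        if loss < best_loss:
            best_order, best_loss = order, loss
    return best_order, best_loss


def toy_seq_loss(order: Tuple[str, ...]) -> float:
    """Assumed stand-in loss: adjacent pairs that are not in alphabetical order."""
    return float(sum(1 for a, b in zip(order, order[1:]) if a > b))


facets = ["windows", "linux", "mac"]
print(default_order_loss(facets, toy_seq_loss))
print(min_permutation_loss(facets, toy_seq_loss))
```

With the toy loss, the default order is penalized for the "windows" before "linux" pair, while the permutation search finds the alphabetical ordering regardless of how the input list is arranged.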


Published in

SIGIR-AP '23: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region
November 2023
324 pages
ISBN: 9798400704086
DOI: 10.1145/3624918

Copyright © 2023 Owner/Author

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited