DOI: 10.1145/3589334.3645406
A Knowledge-Injected Curriculum Pretraining Framework for Question Answering

Published: 13 May 2024

ABSTRACT

Knowledge-based question answering (KBQA) is a key task in natural language processing research and an important means of accessing web data and knowledge, since it requires reasoning over knowledge graphs (KGs). One promising line of KBQA work incorporates KGs into pretrained language models (LMs) by generating KG-centered pretraining corpora, and has shown superior results. However, such methods often depend on specific techniques and resources that are not always available, which restricts their application. Moreover, they focus mainly on improving language understanding with KGs, while neglecting the more important human-like complex reasoning. To this end, we propose a general Knowledge-Injected Curriculum Pretraining framework (KICP) that achieves comprehensive KG learning and exploitation for KBQA, composed of knowledge injection (KI), knowledge adaptation (KA), and curriculum reasoning (CR). Specifically, the KI module injects knowledge into the LM by generating a KG-centered pretraining corpus, and generalizes the process into three key steps that can work with different implementations for flexible application. Next, the KA module learns knowledge from the generated corpus with an LM equipped with an adapter, while preserving the LM's original natural language understanding ability to reduce the negative impact of the gap between the generated and natural corpora. Last, to equip the LM with complex reasoning ability, the CR module follows human reasoning patterns to construct three corpora of increasing reasoning difficulty, and trains the LM from easy to hard in a curriculum manner to promote learning. We provide an implementation of the general framework and evaluate KICP on four real-world datasets. The results demonstrate that our framework achieves higher performance and generalizes well to other QA tasks.
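
To make the KI module concrete, here is a minimal Python sketch of KG-centered corpus generation. The abstract names three key steps without detailing them, so the sketch assumes they are roughly triple selection, template verbalization, and corruption into a masked-prediction objective; the template and function names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of KI-style corpus generation; the template verbalizer
# and MLM-style masking are assumptions, not the paper's exact method.

def verbalize(head: str, relation: str, tail: str) -> str:
    """Convert a selected KG triple into a pretraining sentence."""
    return f"{head} {relation} {tail}."

def mask_entity(sentence: str, entity: str, mask_token: str = "[MASK]") -> str:
    """Corrupt the sentence so pretraining must recover the injected fact."""
    return sentence.replace(entity, mask_token)

sentence = verbalize("Paris", "is the capital of", "France")
print(mask_entity(sentence, "France"))  # Paris is the capital of [MASK].
```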
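For the KA module, the abstract describes an LM equipped with an adapter so that knowledge from the generated corpus is absorbed without overwriting the LM's natural language understanding. Below is a minimal PyTorch sketch of one common realization, a residual bottleneck adapter in the spirit of adapter-tuning work such as K-Adapter; the layer sizes and placement are assumptions, not the paper's stated design.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small trainable module attached to a frozen LM layer; the residual
    connection lets the base model's original behavior pass through."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Only the adapter trains on the generated corpus; keeping the LM
        # frozen limits the impact of the generated/natural corpus gap.
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

adapter = BottleneckAdapter()
h = torch.randn(2, 16, 768)          # (batch, sequence, hidden)
assert adapter(h).shape == h.shape   # drops into a transformer layer unchanged
```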
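Finally, the CR module builds three corpora of increasing reasoning difficulty and trains on them from easy to hard. The abstract does not specify what the three difficulty levels contain, so the sketch below assumes difficulty grows with the number of KG hops composed into a single passage; this stand-in is an assumption, not the paper's construction.

```python
import random

def sample_chain(kg: dict, hops: int, rng: random.Random) -> str:
    """Compose up to `hops` connected facts into one multi-hop passage."""
    head = rng.choice(list(kg))
    parts = []
    for _ in range(hops):
        if head not in kg:
            break
        relation, tail = rng.choice(kg[head])
        parts.append(f"{head} {relation} {tail}.")
        head = tail
    return " ".join(parts)

rng = random.Random(0)
kg = {
    "Paris": [("is the capital of", "France")],
    "France": [("is located in", "Europe")],
    "Europe": [("is a", "continent")],
}

# Three corpora of increasing reasoning difficulty, consumed in curriculum order.
curriculum = {n: [sample_chain(kg, n, rng) for _ in range(4)] for n in (1, 2, 3)}
for hops in sorted(curriculum):              # easy -> hard
    for passage in curriculum[hops]:
        pass                                 # placeholder for one LM training step
    print(f"stage {hops}: {len(curriculum[hops])} passages")
```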

Supplemental Material

rfp0526.mp4 — supplemental video (mp4, 26.8 MB)


Published in

WWW '24: Proceedings of the ACM Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334
Copyright © 2024 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)