Commonsense Knowledge Transfer for Pre-trained Language Models

Zhou, Wangchunshu; Le Bras, Ronan; Choi, Yejin

arXiv:2306.02388 (cs)

[Submitted on 4 Jun 2023]

Abstract: Despite serving as the foundation models for a wide range of NLP benchmarks, pre-trained language models have shown limited capability to acquire implicit commonsense knowledge from self-supervision alone, compared to learning linguistic and factual knowledge that appears more explicitly in the surface patterns of text. In this work, we introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model. It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model and then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction, which align human language with the underlying commonsense knowledge. Empirical results show that our approach consistently improves the model's performance on downstream tasks that require commonsense reasoning. Moreover, we find that the improvement is more significant in the few-shot setting. This suggests that our approach helps language models better transfer to downstream tasks without extensive supervision by injecting commonsense knowledge into their parameters.
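
The two-stage pipeline described in the abstract (query a commonsense knowledge model with general text, then build training examples for commonsense mask infilling and commonsense relation prediction) can be illustrated with a toy sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the relation labels, the verbalization templates, the `<mask>` token, the choice to mask the tail, and the dictionary that stands in for the neural commonsense knowledge model.

```python
from dataclasses import dataclass

# Illustrative relation labels and verbalizations (assumptions, not from the paper).
TEMPLATES = {
    "intent": "because they wanted",
    "effect": "as a result, they",
}
RELATIONS = sorted(TEMPLATES)  # fixed label order: ["effect", "intent"]
MASK = "<mask>"  # hypothetical mask token

# Toy stand-in for a neural commonsense knowledge model: a lookup table.
TOY_KNOWLEDGE = {
    ("Alex studied all night", "intent"): "to pass the exam",
    ("Alex studied all night", "effect"): "felt tired",
}


@dataclass(frozen=True)
class Triple:
    head: str      # event taken from general text
    relation: str  # commonsense relation used as the query
    tail: str      # inference returned by the knowledge model


def query_knowledge_model(event, relation):
    """Return the toy model's inference, or None if it has no answer."""
    return TOY_KNOWLEDGE.get((event, relation))


def extract_triples(event):
    """Stage 1: form one query per relation and collect the answers."""
    triples = []
    for relation in RELATIONS:
        tail = query_knowledge_model(event, relation)
        if tail is not None:
            triples.append(Triple(event, relation, tail))
    return triples


def mask_infilling_example(triple):
    """Stage 2a: verbalize the triple and hide the tail behind a mask."""
    source = f"{triple.head}, {TEMPLATES[triple.relation]} {MASK}."
    return source, triple.tail


def relation_prediction_example(triple):
    """Stage 2b: pair head and tail; the label is the relation's index."""
    return (triple.head, triple.tail), RELATIONS.index(triple.relation)


if __name__ == "__main__":
    for t in extract_triples("Alex studied all night"):
        print(mask_infilling_example(t))
        print(relation_prediction_example(t))
```

In an actual setup the lookup table would be replaced by a trained commonsense knowledge model and the generated pairs would feed a language model's training loop; this sketch only shows the shape of the data each objective might consume.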

Comments:	ACL 2023 Findings
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2306.02388 [cs.CL]
	(or arXiv:2306.02388v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.02388

Submission history

From: Wangchunshu Zhou
[v1] Sun, 4 Jun 2023 15:44:51 UTC (10,036 KB)
