Human-Instruction-Free LLM Self-Alignment with Limited Samples

Guo, Hongyi; Yao, Yuanshun; Shen, Wei; Wei, Jiaheng; Zhang, Xiaoying; Wang, Zhaoran; Liu, Yang

arXiv:2401.06785 (cs)

[Submitted on 6 Jan 2024]

Abstract: Aligning large language models (LLMs) with human values is a vital task for LLM practitioners. Current alignment techniques have several limitations: (1) they require a large amount of annotated data; (2) they demand heavy human involvement; (3) they lack a systematic mechanism for continuous improvement. In this work, we study aligning LLMs to a new domain with limited samples (e.g., fewer than 100). We propose an algorithm that self-aligns LLMs iteratively without active human involvement. Unlike existing works, our algorithm relies on neither human-crafted instructions nor labeled rewards, which significantly reduces human involvement. In addition, our algorithm can continuously self-improve the alignment. The key idea is to first retrieve high-quality samples related to the target domain and use them as in-context learning (ICL) examples to generate more samples. We then use the self-generated samples to finetune the LLM iteratively. We show that our method can unlock the LLMs' self-generalization ability to perform alignment with near-zero human supervision. We test our algorithm on three benchmarks covering safety, truthfulness, and instruction-following, and show good performance in alignment, domain adaptability, and scalability.
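
The loop described in the abstract (retrieve samples related to the target domain, use them as in-context examples to generate more samples, then finetune on the generated samples and repeat) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `jaccard`, `retrieve`, and `self_align` are hypothetical, word-overlap similarity stands in for whatever retriever the paper uses, and `generate` and `finetune` are caller-supplied placeholders for the LLM sampling and training steps, which the abstract does not specify.

```python
from typing import Callable, List


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity; an assumed stand-in for a real retriever."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def retrieve(pool: List[str], domain_samples: List[str], k: int) -> List[str]:
    """Return the k pool entries most similar to any target-domain sample."""
    def score(candidate: str) -> float:
        return max((jaccard(candidate, s) for s in domain_samples), default=0.0)
    # sorted() is stable, so ties keep their original pool order.
    return sorted(pool, key=score, reverse=True)[:k]


def self_align(
    domain_samples: List[str],
    pool: List[str],
    generate: Callable[[str], List[str]],   # hypothetical: ICL prompt -> new samples
    finetune: Callable[[List[str]], None],  # hypothetical: one finetuning pass
    rounds: int = 2,
    k: int = 2,
) -> List[str]:
    """Iterate retrieve -> generate with ICL examples -> finetune."""
    collected: List[str] = []
    for _ in range(rounds):
        # Earlier self-generated samples become retrieval candidates too.
        examples = retrieve(pool + collected, domain_samples, k)
        prompt = "\n\n".join(examples)  # retrieved samples act as ICL examples
        new_samples = generate(prompt)
        finetune(new_samples)           # update the model on its own outputs
        collected.extend(new_samples)
    return collected
```

In practice `generate` would wrap sampling from the current model and `finetune` would wrap a supervised finetuning step. Both are left abstract here because only the overall loop, not its details, is given in the abstract.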

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2401.06785 [cs.CL]
	(or arXiv:2401.06785v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.06785

Submission history

From: Hongyi Guo
[v1] Sat, 6 Jan 2024 14:00:12 UTC (187 KB)
