Reliable Data Generation and Selection for Low-Resource Relation Extraction

Junjie Yu; Xing Wang; Wenliang Chen

doi:10.1609/aaai.v38i17.29915

Authors

Junjie Yu, Soochow University
Xing Wang, Tencent AI Lab
Wenliang Chen, Soochow University

DOI:

https://doi.org/10.1609/aaai.v38i17.29915

Keywords:

NLP: Information Extraction, NLP: Generation

Abstract

Automated construction of annotated data is important in Relation Extraction (RE) tasks because human annotation is difficult and costly. In this work, we propose Self-RDGS, a method for Self-supervised Reliable Data Generation and Selection in low-resource RE tasks. First, we use the knowledge in triplets as prompts to generate sentences with Large Language Models (LLMs). Since the auto-generated data contains noise, we then propose a ranking-based data selection method to select reliable sentences. Finally, we integrate data selection and RE model training within a self-supervised iterative framework. Experiments on three datasets in low-resource settings demonstrate the effectiveness of our approach in constructing annotated data, with noteworthy improvements over multiple baselines. Code, data and models are available at https://github.com/jjyunlp/GenerationRE.
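
The three steps named in the abstract (triplet-based prompting, ranking-based selection, and iterative training) can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the function names `build_prompt`, `select_top_k`, and `self_training_selection`, the `train` callback, and the top-k cutoff are all assumptions made for illustration, since the abstract does not specify them. The LLM call itself is omitted; the sketch assumes candidate sentences have already been generated.

```python
from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (head entity, relation, tail entity)
Scorer = Callable[[str], float]


def build_prompt(triplet: Triplet) -> str:
    """Turn a triplet into a generation prompt (hypothetical wording)."""
    head, relation, tail = triplet
    return (
        f'Write one sentence expressing the relation "{relation}" '
        f'between "{head}" and "{tail}".'
    )


def select_top_k(sentences: List[str], score: Scorer, k: int) -> List[str]:
    """Rank candidates by score (higher means more reliable) and keep the top k."""
    ranked = sorted(sentences, key=score, reverse=True)
    return ranked[:k]


def self_training_selection(
    seed: List[str],
    candidates: List[str],
    train: Callable[[List[str]], Scorer],
    k: int,
    rounds: int,
) -> List[str]:
    """Alternate between training a scorer and re-selecting generated data.

    `train` is an assumed callback: it takes the current training sentences
    and returns a scoring function standing in for the RE model's confidence.
    """
    selected: List[str] = []
    for _ in range(rounds):
        scorer = train(seed + selected)  # retrain on seed plus current selection
        selected = select_top_k(candidates, scorer, k)  # re-rank all candidates
    return selected
```

In this sketch every round re-ranks the full candidate pool rather than only adding to the previous selection, so a sentence chosen early can be dropped later if the retrained scorer ranks it lower. Whether Self-RDGS does this is not stated in the abstract.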
