Three Heads Are Better than One: Improving Cross-Domain NER with Progressive Decomposed Network

Authors

  • Xuming Hu AI Thrust, Hong Kong University of Science and Technology (Guangzhou); School of Software, Tsinghua University
  • Zhaochen Hong School of Software, Tsinghua University
  • Yong Jiang DAMO Academy, Alibaba Group
  • Zhichao Lin DAMO Academy, Alibaba Group
  • Xiaobin Wang DAMO Academy, Alibaba Group
  • Pengjun Xie DAMO Academy, Alibaba Group
  • Philip S. Yu Department of Computer Science, University of Illinois Chicago

DOI:

https://doi.org/10.1609/aaai.v38i16.29785

Keywords:

NLP: Information Extraction, ML: Transfer, Domain Adaptation, Multi-Task Learning, ML: Life-Long and Continual Learning

Abstract

Cross-domain named entity recognition (NER) tasks encourage NER models to transfer knowledge from data-rich source domains to sparsely labeled target domains. Previous works adopt the paradigm of pre-training on the source domain followed by fine-tuning on the target domain. However, these works overlook the fact that general labeled NER source domain data can be easily retrieved in the real world, and that soliciting more source domains could bring more benefits. Unfortunately, previous paradigms cannot efficiently transfer knowledge from multiple source domains. In this work, to transfer knowledge from multiple source domains, we decouple the NER task into the pipeline tasks of mention detection and entity typing, where mention detection unifies the training objective across domains and thus provides entity typing with higher-quality entity mentions. Additionally, we have multiple general source domain models explicitly suggest potential named entities for sentences in the target domain, and transfer their knowledge to the target domain models implicitly through progressive networks. Furthermore, we propose two methods to analyze in which source domain knowledge transfer occurs, helping us judge which source domain brings the greatest benefit. In our experiments, we develop a Chinese cross-domain NER dataset. Our model improves the F1 score by an average of 12.50% across 8 Chinese and English datasets compared to models without source domain data.
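To make the pipeline decomposition concrete, below is a minimal Python sketch of the two-stage design the abstract describes: a domain-agnostic mention detector proposes entity spans, and an entity typer labels them. The class and function names (MentionDetector, EntityTyper, ner_pipeline) and the placeholder heuristics inside them are illustrative assumptions, not the authors' actual models.

```python
# Hypothetical sketch of the decomposed NER pipeline: stage 1 detects
# mentions with an objective shared across domains; stage 2 types them.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Mention:
    start: int  # token index where the span begins
    end: int    # token index one past the span end

class MentionDetector:
    """Stage 1: finds candidate entity spans; its binary span/no-span
    objective is what the abstract says can be unified across domains."""
    def detect(self, tokens: List[str]) -> List[Mention]:
        # Placeholder heuristic: treat runs of capitalized tokens as
        # mentions; a real detector would be a trained span classifier.
        mentions, i = [], 0
        while i < len(tokens):
            if tokens[i][:1].isupper():
                j = i
                while j < len(tokens) and tokens[j][:1].isupper():
                    j += 1
                mentions.append(Mention(i, j))
                i = j
            else:
                i += 1
        return mentions

class EntityTyper:
    """Stage 2: assigns a type to each detected span, benefiting from
    the higher-quality mentions produced by stage 1."""
    def type_of(self, tokens: List[str], m: Mention) -> str:
        # Placeholder rule; a real typer would classify the span in context.
        return "PER" if m.end - m.start == 2 else "ORG"

def ner_pipeline(tokens: List[str]) -> List[Tuple[Mention, str]]:
    detector, typer = MentionDetector(), EntityTyper()
    return [(m, typer.type_of(tokens, m)) for m in detector.detect(tokens)]

print(ner_pipeline("Philip Yu works at University of Illinois Chicago".split()))
```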

Published

2024-03-24

How to Cite

Hu, X., Hong, Z., Jiang, Y., Lin, Z., Wang, X., Xie, P., & Yu, P. S. (2024). Three Heads Are Better than One: Improving Cross-Domain NER with Progressive Decomposed Network. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16), 18261-18269. https://doi.org/10.1609/aaai.v38i16.29785

Section

AAAI Technical Track on Natural Language Processing I