Three Heads Are Better than One: Improving Cross-Domain NER with Progressive Decomposed Network

Authors

  • Xuming Hu AI Thrust, Hong Kong University of Science and Technology (Guangzhou); School of Software, Tsinghua University
  • Zhaochen Hong School of Software, Tsinghua University
  • Yong Jiang DAMO Academy, Alibaba Group
  • Zhichao Lin DAMO Academy, Alibaba Group
  • Xiaobin Wang DAMO Academy, Alibaba Group
  • Pengjun Xie DAMO Academy, Alibaba Group
  • Philip S. Yu Department of Computer Science, University of Illinois Chicago

DOI:

https://doi.org/10.1609/aaai.v38i16.29785

Keywords:

NLP: Information Extraction, ML: Transfer, Domain Adaptation, Multi-Task Learning, ML: Life-Long and Continual Learning

Abstract

Cross-domain named entity recognition (NER) tasks encourage NER models to transfer knowledge from data-rich source domains to sparsely labeled target domains. Previous works adopt the paradigm of pre-training on the source domain followed by fine-tuning on the target domain. However, these works overlook the fact that general labeled NER source domain data can be easily retrieved in the real world, and that soliciting more source domains could bring more benefits. Unfortunately, previous paradigms cannot efficiently transfer knowledge from multiple source domains. In this work, to transfer knowledge from multiple source domains, we decouple the NER task into the pipeline tasks of mention detection and entity typing, where mention detection unifies the training objective across domains and thus provides entity typing with higher-quality entity mentions. Additionally, we have multiple general source domain models explicitly suggest potential named entities for sentences in the target domain, and transfer their knowledge to the target domain models implicitly through progressive networks. Furthermore, we propose two methods to analyze in which source domain knowledge transfer occurs, helping us judge which source domain brings the greatest benefit. In our experiments, we develop a Chinese cross-domain NER dataset. Our model improves the F1 score by an average of 12.50% across 8 Chinese and English datasets compared to models without source domain data.
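To make the pipeline decomposition concrete, below is a minimal Python sketch of the two-stage design the abstract describes: a domain-agnostic mention detector proposes entity spans, and an entity typer labels them. The class and function names (MentionDetector, EntityTyper, ner_pipeline) and the placeholder heuristics inside them are illustrative assumptions, not the authors' actual models.

```python
# Hypothetical sketch of the decomposed NER pipeline: stage 1 detects
# mentions with an objective shared across domains; stage 2 types them.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Mention:
    start: int  # token index where the span begins
    end: int    # token index one past the span end

class MentionDetector:
    """Stage 1: finds candidate entity spans; its binary span/no-span
    objective is what the abstract says can be unified across domains."""
    def detect(self, tokens: List[str]) -> List[Mention]:
        # Placeholder heuristic: treat runs of capitalized tokens as
        # mentions; a real detector would be a trained span classifier.
        mentions, i = [], 0
        while i < len(tokens):
            if tokens[i][:1].isupper():
                j = i
                while j < len(tokens) and tokens[j][:1].isupper():
                    j += 1
                mentions.append(Mention(i, j))
                i = j
            else:
                i += 1
        return mentions

class EntityTyper:
    """Stage 2: assigns a type to each detected span, benefiting from
    the higher-quality mentions produced by stage 1."""
    def type_of(self, tokens: List[str], m: Mention) -> str:
        # Placeholder rule; a real typer would classify the span in context.
        return "PER" if m.end - m.start == 2 else "ORG"

def ner_pipeline(tokens: List[str]) -> List[Tuple[Mention, str]]:
    detector, typer = MentionDetector(), EntityTyper()
    return [(m, typer.type_of(tokens, m)) for m in detector.detect(tokens)]

print(ner_pipeline("Philip Yu works at University of Illinois Chicago".split()))
```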

Published

2024-03-24

How to Cite

Hu, X., Hong, Z., Jiang, Y., Lin, Z., Wang, X., Xie, P., & Yu, P. S. (2024). Three Heads Are Better than One: Improving Cross-Domain NER with Progressive Decomposed Network. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16), 18261-18269. https://doi.org/10.1609/aaai.v38i16.29785

Section

AAAI Technical Track on Natural Language Processing I