Computer Science ›› 2024, Vol. 51 ›› Issue (3): 198-204. doi: 10.11896/jsjkx.230200114

• Artificial Intelligence •

Chinese Named Entity Recognition Based on Label Information Fusion and Multi-task Learning

LIAO Meng1, JIA Zhen1, LI Tianrui1,2,3

  1 School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
  2 Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, Chengdu 611756, China
  3 National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Chengdu 611756, China
  • Received: 2023-02-17  Revised: 2023-06-06  Online: 2024-03-15  Published: 2024-03-13
  • Corresponding author: LI Tianrui (trli@swjtu.edu.cn)
  • About author: LIAO Meng (liaomeng28@163.com), born in 1997, postgraduate. His main research interests include information extraction and natural language processing. LI Tianrui, born in 1969, Ph.D., professor, Ph.D. supervisor, is a distinguished member of CCF (No. 05237D). His main research interests include big data intelligence, urban computing, rough sets and granular computing.
  • Supported by:
    National Natural Science Foundation of China, General Program (62176221).

Abstract: Most existing Chinese named entity recognition (NER) models enrich feature representations by integrating lexical or glyph information but ignore label information. This paper therefore proposes a Chinese NER model that integrates label information. First, character embeddings are obtained from the pre-trained model BERT-wwm, and the labels are represented as vectors. A Transformer decoder structure then lets the character and label representations interact, capturing the dependencies between characters and labels and enriching the character features. To promote the learning of label information, a sentence-level supervision signal is constructed, a multi-label text classification task is added, and the model is trained by multi-task learning. The NER task is decoded with a conditional random field, and the multi-label text classification task is decoded with a biaffine mechanism. The two tasks share all parameters except the decoding layers, so that the different supervision signals are fed back to each subtask. Comparative experiments on the public datasets MSRA, Weibo, and Resume yield F1 scores of 95.75%, 72.17%, and 96.23%, respectively. Compared with several baseline models, the proposed model improves results to some extent, which supports its effectiveness and feasibility.

Key words: Named entity recognition, Label information, Attention mechanism, Biaffine mechanism, Pre-trained model
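
Two ideas in the abstract can be illustrated with a small sketch: the sentence-level multi-label supervision signal, and the interaction between character and label representations. The code below is not the authors' implementation. It assumes that the sentence-level target is a multi-hot vector of the entity types present in a tagged sentence, and it stands in for the Transformer decoder with a single-head scaled dot-product cross-attention without learned projections. The function names `sentence_label_targets` and `label_cross_attention` are hypothetical.

```python
import math


def sentence_label_targets(tags, entity_types):
    """Multi-hot vector marking which entity types occur in a tagged sentence.

    Assumption: tags look like "B-PER" / "I-LOC" / "O" (prefix, hyphen, type).
    """
    present = {tag.split("-", 1)[1] for tag in tags if "-" in tag}
    return [1 if entity_type in present else 0 for entity_type in entity_types]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_cross_attention(char_vecs, label_vecs):
    """Each character vector attends over all label vectors.

    Queries are characters, keys and values are labels. The attended label
    mixture is added back to the character vector (a residual connection).
    Assumption: no learned projections, one head, equal vector dimensions.
    """
    dim = len(label_vecs[0])
    scale = math.sqrt(dim)
    fused = []
    for char in char_vecs:
        scores = [
            sum(c * l for c, l in zip(char, label)) / scale
            for label in label_vecs
        ]
        weights = softmax(scores)
        context = [
            sum(w * label[d] for w, label in zip(weights, label_vecs))
            for d in range(dim)
        ]
        fused.append([c + x for c, x in zip(char, context)])
    return fused
```

For example, `sentence_label_targets(["B-PER", "I-PER", "O", "B-LOC"], ["PER", "LOC", "ORG"])` marks PER and LOC as present and ORG as absent. In a trainable model the attention would use learned query, key, and value projections and multiple heads.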

CLC Number: TP391