
Learning to select pseudo labels: a semi-supervised method for named entity recognition

Frontiers of Information Technology & Electronic Engineering

Abstract

Deep learning models have achieved state-of-the-art performance in named entity recognition (NER); this good performance, however, relies heavily on substantial amounts of labeled data. In some specific areas such as the medical, financial, and military domains, labeled data is very scarce, while unlabeled data is readily available. Previous studies have used unlabeled data to enrich word representations, but a large amount of entity information in unlabeled data is neglected, which may be beneficial to the NER task. In this study, we propose a semi-supervised method for NER tasks, which learns to create high-quality labeled data by applying a pre-trained module to filter out erroneous pseudo labels. Pseudo labels are automatically generated for unlabeled data and used as if they were true labels. Our semi-supervised framework includes three steps: constructing an optimal single neural model for a specific NER task, learning a module that evaluates pseudo labels, and creating new labeled data and improving the NER model iteratively. Experimental results on two English NER tasks and one Chinese clinical NER task demonstrate that our method further improves the performance of the best single neural model. Even when we use only pre-trained static word embeddings and do not rely on any external knowledge, our method achieves performance comparable to state-of-the-art models on the CoNLL-2003 and OntoNotes 5.0 English NER tasks.
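The three-step framework in the abstract can be sketched as a self-training loop. This is a minimal, hypothetical illustration, not the paper's implementation: the names (`NerModel`, `LabelSelector`, `self_train`) are invented for this sketch, the NER model is reduced to a toy lexicon tagger, and the paper's learned, pre-trained selection module is approximated by a fixed confidence threshold.

```python
from typing import List, Tuple

Sentence = List[str]
Labels = List[str]


class NerModel:
    """Toy stand-in for the best single neural NER model (step 1)."""

    def __init__(self):
        # A real model would be a neural tagger; here, a simple lexicon.
        self.lexicon = {"Paris": "LOC", "IBM": "ORG"}

    def predict(self, sent: Sentence) -> Tuple[Labels, float]:
        # Tag each token, and report a crude confidence score:
        # the fraction of tokens the model "knows".
        labels = [self.lexicon.get(tok, "O") for tok in sent]
        known = sum(1 for tok in sent if tok in self.lexicon)
        confidence = known / len(sent) if sent else 0.0
        return labels, confidence

    def train(self, data: List[Tuple[Sentence, Labels]]) -> None:
        # "Training" here just memorizes entity tokens.
        for sent, labels in data:
            for tok, lab in zip(sent, labels):
                if lab != "O":
                    self.lexicon[tok] = lab


class LabelSelector:
    """Step 2: filter out erroneous pseudo labels.

    The paper learns this module from data; this sketch substitutes
    a fixed confidence threshold.
    """

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def accept(self, confidence: float) -> bool:
        return confidence >= self.threshold


def self_train(model, selector, labeled, unlabeled, rounds=2):
    """Step 3: iteratively pseudo-label, filter, and retrain."""
    data = list(labeled)
    for _ in range(rounds):
        model.train(data)
        kept, remaining = [], []
        for sent in unlabeled:
            labels, conf = model.predict(sent)
            # Accepted pseudo labels are used as if they were true labels.
            (kept if selector.accept(conf) else remaining).append((sent, labels))
        data.extend(kept)
        unlabeled = [s for s, _ in remaining]
    model.train(data)
    return model


# Usage with a tiny labeled seed set and two unlabeled sentences.
labeled = [(["IBM", "hired", "Ann"], ["ORG", "O", "PER"])]
unlabeled = [["Ann", "visited", "Paris"], ["Bob", "slept"]]
model = self_train(NerModel(), LabelSelector(0.5), labeled, unlabeled)
```

After the loop, the high-confidence sentence about "Ann" and "Paris" has been absorbed as pseudo-labeled training data, while the low-confidence sentence about "Bob" is held back rather than risk injecting erroneous labels.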



Author information

Corresponding author

Correspondence to Dong-sheng Li.

Additional information

Compliance with ethics guidelines

Zhen-zhen LI, Da-wei FENG, Dong-sheng LI, and Xicheng LU declare that they have no conflict of interest.

Project supported by the National Key Research and Development Program of China (No. 2016YFB0201305) and the National Natural Science Foundation of China (No. 61872376)


About this article


Cite this article

Li, Zz., Feng, Dw., Li, Ds. et al. Learning to select pseudo labels: a semi-supervised method for named entity recognition. Front Inform Technol Electron Eng 21, 903–916 (2020). https://doi.org/10.1631/FITEE.1800743

