
Using Pseudo-Labelled Data for Zero-Shot Text Classification

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2022)

Abstract

Existing Zero-Shot Learning (ZSL) techniques for text classification typically assign a label to a piece of text by building a matching model to capture the semantic similarity between the text and the label descriptor. This is expensive at inference time, as the text must be paired with every label and passed forward through the matching model. Existing approaches to alleviating this issue rely on exact word matching between the label surface names and an unlabelled target-domain corpus to obtain pseudo-labelled data for model training, making them difficult to generalise to ZS classification across multiple domains. In this paper, we propose an approach called P-ZSC that leverages pseudo-labelled data for zero-shot text classification. Our approach generates the pseudo-labelled data through a matching algorithm between the unlabelled target-domain corpus and label vocabularies that consist of in-domain relevant phrases obtained by expansion from the label names. We evaluate our approach on several benchmark datasets from a variety of domains, and the results show that our system substantially outperforms the baseline systems, especially on datasets whose classes are imbalanced.
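
To make the matching step concrete, the sketch below shows one way such pseudo-labelling could look in code: documents and label-vocabulary phrases are embedded with a sentence encoder, and each document receives the label of its most similar vocabulary phrase above a similarity threshold. This is a minimal sketch, not the authors' implementation: the hand-written vocabulary phrases, the 0.5 threshold, and the pseudo_label helper are illustrative assumptions; only the encoder checkpoint (deepset/sentence_bert, see note 2) comes from the paper.

# A minimal sketch (not the authors' code) of pseudo-labelling by
# embedding similarity, using the sentence-transformers library [18, 24].
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("deepset/sentence_bert")  # checkpoint named in note 2

# Label vocabularies: label names expanded with in-domain related phrases.
# The phrases here are hand-picked for illustration; P-ZSC derives them
# automatically from the unlabelled target-domain corpus.
label_vocab = {
    "sports": ["sports", "football match", "championship game"],
    "politics": ["politics", "election campaign", "government policy"],
}

def pseudo_label(docs, label_vocab, threshold=0.5):  # threshold value assumed
    """Assign each document its best-matching label, or None below threshold."""
    doc_emb = encoder.encode(docs, convert_to_tensor=True)
    per_label_scores = []
    for label, phrases in label_vocab.items():
        phrase_emb = encoder.encode(phrases, convert_to_tensor=True)
        # Score each document by its maximum similarity to any vocabulary phrase.
        scores = util.cos_sim(doc_emb, phrase_emb).max(dim=1).values
        per_label_scores.append((label, scores))
    results = []
    for i, doc in enumerate(docs):
        best_label, best_score = None, threshold
        for label, scores in per_label_scores:
            if scores[i].item() > best_score:
                best_label, best_score = label, scores[i].item()
        results.append((doc, best_label))
    return results

print(pseudo_label(["The team won the league final last night."], label_vocab))

The resulting (document, label) pairs can then train a conventional classifier, avoiding the per-label forward passes that matching-based ZSL requires at inference time.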


Notes

  1. Since the label surface names are available at testing time with no need for human supervision, we describe this setting as classifier-based ZSL. In addition, no task-specific labelled data is used, thus meeting the definition of label-fully-unseen ZSL in [28].

  2. We use the deepset/sentence_bert checkpoint from the Hugging Face model hub [18, 24]; see the loading sketch following these notes.

  3. The similarity threshold was chosen through preliminary experimentation on another dataset.

  4. Emotion labels like “joy” and “sadness” are more abstract than topic labels like “sports” and “politics & government”.
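
For reference, a minimal loading sketch for the checkpoint in note 2, assuming the sentence-transformers library [18, 24]; the example sentence is ours, and the authors' exact loading code is not given in the paper.

# Illustrative only: load the deepset/sentence_bert checkpoint (note 2)
# from the Hugging Face model hub and embed a sentence.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("deepset/sentence_bert")
embedding = encoder.encode("an example sentence to embed")
print(embedding.shape)  # (768,) for this BERT-base checkpoint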

References

  1. Alam, F., Qazi, U., Imran, M., Ofli, F.: HumAID: human-annotated disaster incidents data from Twitter with deep learning benchmarks. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 15, pp. 933–942 (2021)


  2. Davison, J.: Zero-shot classifier distillation (2021). https://github.com/huggingface/transformers/tree/master/examples/research_projects/zero-shot-distillation

  3. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis (June 2019). https://doi.org/10.18653/v1/N19-1423

  4. Karamanolakis, G., Mukherjee, S., Zheng, G., Hassan, A.: Self-training with weak supervision. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 845–863 (2021)


  5. Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 1746–1751. Association for Computational Linguistics, Doha (October 2014). https://doi.org/10.3115/v1/D14-1181

  6. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference for Learning Representations, San Diego (2015)


  7. Klinger, R., et al.: An analysis of annotated corpora for emotion classification in text. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 2104–2119 (2018)


  8. Lee, D.H., et al.: Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks. In: Workshop on Challenges in Representation Learning, ICML (2013)


  9. Levy, O., Seo, M., Choi, E., Zettlemoyer, L.: Zero-shot relation extraction via reading comprehension. In: Proceedings of the 21st Conference on Computational Natural Language Learning, CoNLL 2017, pp. 333–342. Association for Computational Linguistics, Vancouver (August 2017). https://doi.org/10.18653/v1/K17-1034

  10. Mekala, D., Shang, J.: Contextualized weak supervision for text classification. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 323–333 (2020)


  11. Mekala, D., Zhang, X., Shang, J.: Meta: metadata-empowered weak supervision for text classification. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 8351–8361 (2020)


  12. Meng, Y., Shen, J., Zhang, C., Han, J.: Weakly-supervised neural text classification. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018 (2018)


  13. Meng, Y., et al.: Text classification using label names only: a language model self-training approach. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 9006–9017. Association for Computational Linguistics (November 2020). https://doi.org/10.18653/v1/2020.emnlp-main.724

  14. Müller, T., Pérez-Torró, G., Franco-Salvador, M.: Few-shot learning with siamese networks and label tuning (2022). arXiv preprint, arXiv:2203.14655

  15. Obamuyide, A., Vlachos, A.: Zero-shot relation classification as textual entailment. In: Proceedings of the First Workshop on Fact Extraction and VERification, FEVER, pp. 72–78 (2018)


  16. Puri, R., Catanzaro, B.: Zero-shot text classification with generative language models (2019). CoRR, abs/1912.10165


  17. Pushp, P.K., Srivastava, M.M.: Train once, test anywhere: zero-shot learning for text classification (2017). CoRR, abs/1712.05972


  18. Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP, pp. 3973–3983 (2019)


  19. Saravia, E., Liu, H.C.T., Huang, Y.H., Wu, J., Chen, Y.S.: Carer: contextualized affect representations for emotion recognition. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3687–3697 (2018)


  20. Veeranna, S.P., Nam, J., Mencía, E.L., Fürnkranz, J.: Using semantic similarity for multi-label zero-shot classification of text documents. In: Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, pp. 423–428. Elsevier, Bruges (2016)


  21. Wang, C., Lillis, D.: A comparative study on word embeddings in deep learning for text classification. In: Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, NLPIR 2020, Seoul, South Korea (December 2020). https://doi.org/10.1145/3443279.3443304

  22. Wang, C., Nulty, P., Lillis, D.: Transformer-based multi-task learning for disaster tweet categorisation. In: Adrot, A., Grace, R., Moore, K., Zobel, C.W. (eds.) ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management, pp. 705–718. Virginia Tech., Blacksburg (2021)


  23. Wang, Z., Mekala, D., Shang, J.: X-class: text classification with extremely weak supervision. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 3043–3053 (2021)


  24. Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45. Association for Computational Linguistics (October 2020). https://doi.org/10.18653/v1/2020.emnlp-demos.6

  25. Xia, C., Zhang, C., Yan, X., Chang, Y., Philip, S.Y.: Zero-shot user intent detection via capsule neural networks. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3090–3099 (2018)


  26. Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: International Conference on Machine Learning, PMLR, pp. 478–487 (2016)


  27. Ye, Z., et al.: Zero-shot text classification via reinforced self-training. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3014–3024 (2020)


  28. Yin, W., Hay, J., Roth, D.: Benchmarking zero-shot text classification: datasets, evaluation and entailment approach. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP, pp. 3905–3914 (2019)


  29. Zhang, J., Lertvittayakumjorn, P., Guo, Y.: Integrating semantic knowledge to tackle zero-shot text classification. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long and Short Papers), pp. 1031–1040 (2019)


  30. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, vol. 1, pp. 649–657 (2015)



Author information


Corresponding author

Correspondence to David Lillis.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, C., Nulty, P., Lillis, D. (2022). Using Pseudo-Labelled Data for Zero-Shot Text Classification. In: Rosso, P., Basile, V., Martínez, R., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2022. Lecture Notes in Computer Science, vol 13286. Springer, Cham. https://doi.org/10.1007/978-3-031-08473-7_4


  • DOI: https://doi.org/10.1007/978-3-031-08473-7_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08472-0

  • Online ISBN: 978-3-031-08473-7

  • eBook Packages: Computer Science, Computer Science (R0)
