
Using Pseudo-Labelled Data for Zero-Shot Text Classification

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2022)

Abstract

Existing Zero-Shot Learning (ZSL) techniques for text classification typically assign a label to a piece of text by building a matching model to capture the semantic similarity between the text and the label descriptor. This is expensive at inference time, as the text must be paired with every label and passed forward through the matching model. Existing approaches to alleviating this issue rely on exact word matching between the label surface names and an unlabelled target-domain corpus to obtain pseudo-labelled data for model training, making them difficult to generalise to ZS classification across multiple domains. In this paper, we propose an approach called P-ZSC that leverages pseudo-labelled data for zero-shot text classification. Our approach generates the pseudo-labelled data through a matching algorithm between the unlabelled target-domain corpus and label vocabularies that consist of in-domain relevant phrases obtained by expansion from the label names. We evaluate our approach on several benchmark datasets from a variety of domains, and the results show that our system substantially outperforms the baseline systems, especially on datasets whose classes are imbalanced.
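
To make the matching step concrete, the sketch below shows one way such pseudo-labelling could look in code: documents and label-vocabulary phrases are embedded with a sentence encoder, and each document receives the label of its most similar vocabulary phrase above a similarity threshold. This is a minimal sketch, not the authors' implementation: the hand-written vocabulary phrases, the 0.5 threshold, and the pseudo_label helper are illustrative assumptions; only the encoder checkpoint (deepset/sentence_bert, see note 2) comes from the paper.

# A minimal sketch (not the authors' code) of pseudo-labelling by
# embedding similarity, using the sentence-transformers library [18, 24].
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("deepset/sentence_bert")  # checkpoint named in note 2

# Label vocabularies: label names expanded with in-domain related phrases.
# The phrases here are hand-picked for illustration; P-ZSC derives them
# automatically from the unlabelled target-domain corpus.
label_vocab = {
    "sports": ["sports", "football match", "championship game"],
    "politics": ["politics", "election campaign", "government policy"],
}

def pseudo_label(docs, label_vocab, threshold=0.5):  # threshold value assumed
    """Assign each document its best-matching label, or None below threshold."""
    doc_emb = encoder.encode(docs, convert_to_tensor=True)
    per_label_scores = []
    for label, phrases in label_vocab.items():
        phrase_emb = encoder.encode(phrases, convert_to_tensor=True)
        # Score each document by its maximum similarity to any vocabulary phrase.
        scores = util.cos_sim(doc_emb, phrase_emb).max(dim=1).values
        per_label_scores.append((label, scores))
    results = []
    for i, doc in enumerate(docs):
        best_label, best_score = None, threshold
        for label, scores in per_label_scores:
            if scores[i].item() > best_score:
                best_label, best_score = label, scores[i].item()
        results.append((doc, best_label))
    return results

print(pseudo_label(["The team won the league final last night."], label_vocab))

The resulting (document, label) pairs can then train a conventional classifier, avoiding the per-label forward passes that matching-based ZSL requires at inference time.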


Notes

  1. Since the label surface names are available at testing time with no need for human supervision, we describe this setting as classifier-based ZSL. In addition, no task-specific labelled data is used, thus meeting the definition of label-fully-unseen ZSL in [28].

  2. We use the deepset/sentence_bert checkpoint from the Hugging Face model hub [18, 24]; see the loading sketch following these notes.

  3. The similarity threshold was chosen through preliminary experimentation on another dataset.

  4. Emotion labels like “joy” and “sadness” are more abstract than topic labels like “sports” and “politics & government”.
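
For reference, a minimal loading sketch for the checkpoint in note 2, assuming the sentence-transformers library [18, 24]; the example sentence is ours, and the authors' exact loading code is not given in the paper.

# Illustrative only: load the deepset/sentence_bert checkpoint (note 2)
# from the Hugging Face model hub and embed a sentence.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("deepset/sentence_bert")
embedding = encoder.encode("an example sentence to embed")
print(embedding.shape)  # (768,) for this BERT-base checkpoint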

References

  1. Alam, F., Qazi, U., Imran, M., Ofli, F.: HumAID: human-annotated disaster incidents data from Twitter with deep learning benchmarks. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 15, pp. 933–942 (2021)


  2. Davison, J.: Zero-shot classifier distillation (2021). https://github.com/huggingface/transformers/tree/master/examples/research_projects/zero-shot-distillation

  3. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis (June 2019). https://doi.org/10.18653/v1/N19-1423

  4. Karamanolakis, G., Mukherjee, S., Zheng, G., Hassan, A.: Self-training with weak supervision. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 845–863 (2021)


  5. Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 1746–1751. Association for Computational Linguistics, Doha (October 2014). https://doi.org/10.3115/v1/D14-1181

  6. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference for Learning Representations, San Diego (2015)


  7. Klinger, R., et al.: An analysis of annotated corpora for emotion classification in text. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 2104–2119 (2018)


  8. Lee, D.H., et al.: Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks. In: Workshop on Challenges in Representation Learning, ICML (2013)


  9. Levy, O., Seo, M., Choi, E., Zettlemoyer, L.: Zero-shot relation extraction via reading comprehension. In: Proceedings of the 21st Conference on Computational Natural Language Learning, CoNLL 2017, pp. 333–342. Association for Computational Linguistics, Vancouver (August 2017). https://doi.org/10.18653/v1/K17-1034

  10. Mekala, D., Shang, J.: Contextualized weak supervision for text classification. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 323–333 (2020)


  11. Mekala, D., Zhang, X., Shang, J.: Meta: metadata-empowered weak supervision for text classification. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 8351–8361 (2020)


  12. Meng, Y., Shen, J., Zhang, C., Han, J.: Weakly-supervised neural text classification. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018 (2018)


  13. Meng, Y., et al.: Text classification using label names only: a language model self-training approach. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 9006–9017. Association for Computational Linguistics (November 2020). https://doi.org/10.18653/v1/2020.emnlp-main.724

  14. Müller, T., Pérez-Torró, G., Franco-Salvador, M.: Few-shot learning with siamese networks and label tuning (2022). arXiv preprint, arXiv:2203.14655

  15. Obamuyide, A., Vlachos, A.: Zero-shot relation classification as textual entailment. In: Proceedings of the First Workshop on Fact Extraction and VERification, FEVER, pp. 72–78 (2018)


  16. Puri, R., Catanzaro, B.: Zero-shot text classification with generative language models (2019). CoRR, abs/1912.10165


  17. Pushp, P.K., Srivastava, M.M.: Train once, test anywhere: zero-shot learning for text classification (2017). CoRR, abs/1712.05972


  18. Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP, pp. 3973–3983 (2019)


  19. Saravia, E., Liu, H.C.T., Huang, Y.H., Wu, J., Chen, Y.S.: Carer: contextualized affect representations for emotion recognition. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3687–3697 (2018)


  20. Veeranna, S.P., Nam, J., Mencía, E.L., Fürnkranz, J.: Using semantic similarity for multi-label zero-shot classification of text documents. In: Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, pp. 423–428. Elsevier, Bruges (2016)


  21. Wang, C., Lillis, D.: A comparative study on word embeddings in deep learning for text classification. In: Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, NLPIR 2020, Seoul, South Korea (December 2020). https://doi.org/10.1145/3443279.3443304

  22. Wang, C., Nulty, P., Lillis, D.: Transformer-based multi-task learning for disaster tweet categorisation. In: Adrot, A., Grace, R., Moore, K., Zobel, C.W. (eds.) ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management, pp. 705–718. Virginia Tech., Blacksburg (2021)


  23. Wang, Z., Mekala, D., Shang, J.: X-class: text classification with extremely weak supervision. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 3043–3053 (2021)


  24. Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45. Association for Computational Linguistics (October 2020). https://doi.org/10.18653/v1/2020.emnlp-demos.6

  25. Xia, C., Zhang, C., Yan, X., Chang, Y., Philip, S.Y.: Zero-shot user intent detection via capsule neural networks. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3090–3099 (2018)


  26. Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: International Conference on Machine Learning, PMLR, pp. 478–487 (2016)


  27. Ye, Z., et al.: Zero-shot text classification via reinforced self-training. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3014–3024 (2020)


  28. Yin, W., Hay, J., Roth, D.: Benchmarking zero-shot text classification: datasets, evaluation and entailment approach. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP, pp. 3905–3914 (2019)


  29. Zhang, J., Lertvittayakumjorn, P., Guo, Y.: Integrating semantic knowledge to tackle zero-shot text classification. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long and Short Papers), pp. 1031–1040 (2019)


  30. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, vol. 1, pp. 649–657 (2015)



Author information


Corresponding author

Correspondence to David Lillis.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, C., Nulty, P., Lillis, D. (2022). Using Pseudo-Labelled Data for Zero-Shot Text Classification. In: Rosso, P., Basile, V., Martínez, R., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2022. Lecture Notes in Computer Science, vol 13286. Springer, Cham. https://doi.org/10.1007/978-3-031-08473-7_4


  • DOI: https://doi.org/10.1007/978-3-031-08473-7_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08472-0

  • Online ISBN: 978-3-031-08473-7

  • eBook Packages: Computer Science, Computer Science (R0)
