
Adaptive semi-supervised learning from stronger augmentation transformations of discrete text information

  • Regular Paper
  • Published in Knowledge and Information Systems

Abstract

Semi-supervised learning is a promising approach to the problem of insufficient labeled data. Recent methods built on the paradigms of consistency regularization and pseudo-labeling achieve outstanding performance on image data but yield only limited improvements on text, because they neglect the discrete nature of textual information and lack high-quality text augmentation transformations. In this paper, we propose the novel SeqMatch method. SeqMatch automatically perceives abnormal model states caused by anomalous data produced by text augmentations, reduces their interference, and instead leverages normal states to improve the effectiveness of consistency regularization. It also generates hard artificial pseudo-labels so that the model can be efficiently updated and optimized toward low entropy. In addition, we design several stronger, well-organized text augmentation pipelines that increase the divergence between the two views of an unlabeled discrete text sequence, enabling the model to learn more knowledge from their alignment. Extensive comparative experiments show that SeqMatch significantly outperforms previous methods on three widely used benchmarks. In particular, SeqMatch achieves a maximum performance improvement of 16.4% over purely supervised training when provided with a minimal number of labeled examples.
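To make the training objective concrete, the sketch below shows, under stated assumptions, how a FixMatch-style combination of hard pseudo-labeling and consistency regularization can be adapted to discrete text, with a toy composed augmentation pipeline. It is a minimal PyTorch illustration, not the authors' SeqMatch implementation; the threshold TAU, the weak_augment and strong_augment transformations, and the unlabeled_loss helper are all illustrative assumptions.

# Minimal sketch (not the authors' implementation): hard pseudo-labels from a
# weakly augmented view drive a consistency loss on a strongly augmented view,
# and low-confidence (likely anomalous) augmented examples are filtered out.
import random

import torch
import torch.nn.functional as F

TAU = 0.95  # assumed confidence threshold for accepting a pseudo-label


def weak_augment(tokens):
    """Weak view: light random word dropout."""
    kept = [t for t in tokens if random.random() > 0.1]
    return kept or tokens


def strong_augment(tokens):
    """Stronger pipeline: compose several discrete transformations.

    Here a random swap is followed by aggressive random deletion; a real
    pipeline could also chain synonym replacement or back-translation to
    further increase the divergence between the two views."""
    tokens = list(tokens)
    if len(tokens) > 1:  # random swap of two positions
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    kept = [t for t in tokens if random.random() > 0.3]
    return kept or tokens


def unlabeled_loss(model, encode, batch_tokens):
    """Consistency loss over an unlabeled batch.

    `model` maps an encoded text to class logits; `encode` turns a token
    list into a model input. A hard pseudo-label is taken from the weak
    view and kept only when the model is confident, so anomalous augmented
    examples are ignored and predictions are pushed toward low entropy."""
    losses = []
    for tokens in batch_tokens:
        with torch.no_grad():
            probs = F.softmax(model(encode(weak_augment(tokens))), dim=-1)
            conf, pseudo = probs.max(dim=-1)
        if conf.item() < TAU:  # skip likely-anomalous augmented examples
            continue
        logits = model(encode(strong_augment(tokens)))
        losses.append(F.cross_entropy(logits.unsqueeze(0), pseudo.unsqueeze(0)))
    return torch.stack(losses).mean() if losses else torch.tensor(0.0)


if __name__ == "__main__":
    # Toy demo: a hashing bag-of-words encoder and a linear classifier
    # stand in for a real pretrained text encoder.
    dim, num_classes = 64, 2
    model = torch.nn.Linear(dim, num_classes)

    def encode(tokens):
        x = torch.zeros(dim)
        for t in tokens:
            x[hash(t) % dim] += 1.0
        return x

    batch = [["the", "movie", "was", "great"],
             ["terrible", "plot", "and", "acting"]]
    print(unlabeled_loss(model, encode, batch))

The confidence gate plays the role of the adaptive filtering described above: augmented examples on which the model's state looks abnormal (low confidence) contribute nothing, while confident predictions become hard targets that drive the model toward low entropy.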


Notes

  1. https://ai.stanford.edu/~amaas/data/sentiment/.

  2. https://www.kaggle.com/datasets/yelp-dataset/yelp-dataset.

  3. https://emilhvitfeldt.github.io/textdata/reference/dataset_dbpedia.html.


Author information

Correspondence to Xuemiao Zhang or Junfei Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, X., Tan, Z., Lu, F. et al. Adaptive semi-supervised learning from stronger augmentation transformations of discrete text information. Knowl Inf Syst (2024). https://doi.org/10.1007/s10115-024-02100-y

