SGBGAN: minority class image generation for class-imbalanced datasets

  • Original Paper
Machine Vision and Applications

Abstract

Class imbalance frequently arises in image classification. Conventional generative adversarial networks (GANs) tend to produce samples from the majority class when trained on class-imbalanced datasets. The Balancing GAN with gradient penalty (BAGAN-GP) was proposed to address this issue, but its outputs may still be biased toward the majority classes when images from different classes are highly similar. In this study, we introduce the Pre-trained Gated Variational Autoencoder with Self-attention for Balancing Generative Adversarial Network (SGBGAN), an image augmentation technique for generating high-quality images. The method uses a Gated Variational Autoencoder with Self-attention (SA-GVAE) to initialize the GAN by transferring the pre-trained SA-GVAE weights to it. Experimental results on Fashion-MNIST, CIFAR-10, and a highly imbalanced medical image dataset show that SGBGAN outperforms other state-of-the-art methods. Results for the Fréchet inception distance (FID) and the structural similarity index measure (SSIM) show that our model overcomes the instability problems that exist in other GANs. In particular, on the Cells dataset, the FID of a minority class improves by up to 23.09% compared with the latest BAGAN-GP, and the SSIM of a minority class increases by up to 10.81%. These results indicate that SGBGAN overcomes the class imbalance restriction and generates high-quality minority class images.
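For readers unfamiliar with the FID mentioned above, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors, each summarized by a Gaussian (mean and covariance). It is not the authors' evaluation code. The function name `frechet_distance` and the random stand-in features are assumptions made for illustration; in the standard FID the features come from a pre-trained Inception network, which is omitted here.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are arrays of shape (n_samples, n_features) with
    n_features >= 2. Hypothetical helper, for illustration only.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)

    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative for
    # positive semi-definite covariances. Clip small numerical noise.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)


# Stand-in features (assumption): random vectors instead of Inception features.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shifted = real + np.array([1.0, 2.0, 0.0, 0.0])
print(frechet_distance(real, real), frechet_distance(real, shifted))
```

A lower value means the two feature distributions are closer, so a reduction in FID for a minority class corresponds to generated images whose features sit nearer to those of the real images.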

Graphical abstract

The diagram gives an overview of the technical approach used in this paper. To address class imbalance in the dataset, a Gated Variational Autoencoder with Self-attention (SA-GVAE) is proposed. The SA-GVAE is used to initialize the Generative Adversarial Network (GAN) by transferring its pre-trained weights to the GAN. The result is the Pre-trained Gated Variational Autoencoder with Self-attention for Balancing GAN (SGBGAN), which serves as an image augmentation tool for generating high-quality images. Finally, the generated minority-class samples are used to restore class balance within the dataset.
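The two bookkeeping steps in this pipeline, transferring pre-trained weights and deciding how many minority samples to generate, can be sketched as follows. This is a hedged illustration, not the authors' implementation: the helper names `transfer_matching_weights` and `samples_needed`, the parameter names such as `dec.w`, and the rule of copying only arrays whose name and shape match are all assumptions, and the networks themselves are replaced by plain dictionaries of NumPy arrays.

```python
from collections import Counter

import numpy as np


def transfer_matching_weights(pretrained, target):
    """Return a copy of `target` with arrays taken from `pretrained`
    wherever the parameter name and shape agree (assumed matching rule)."""
    merged = dict(target)
    copied = []
    for name, value in pretrained.items():
        if name in merged and merged[name].shape == value.shape:
            merged[name] = value.copy()
            copied.append(name)
    return merged, copied


def samples_needed(labels):
    """Number of synthetic images per class needed to match the
    largest class (assumed balancing target)."""
    counts = Counter(labels)
    majority = max(counts.values())
    return {cls: majority - n for cls, n in counts.items()}


# Hypothetical parameter dictionaries standing in for the autoencoder and GAN.
autoencoder = {"dec.w": np.ones((2, 3)), "dec.b": np.ones(3)}
generator = {"dec.w": np.zeros((2, 3)), "dec.b": np.zeros(4), "out.w": np.zeros((3, 1))}

generator, copied = transfer_matching_weights(autoencoder, generator)
print(copied)

# Toy label list: class 0 is the majority, classes 1 and 2 are minorities.
print(samples_needed([0] * 5 + [1] * 2 + [2]))
```

Parameters that do not match by name and shape keep their original initialization, which is one simple way to handle layers that exist only in the GAN.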

Acknowledgements

The research is supported by the National Natural Science Foundation of China under Grant No. 62072468.

Author information

Corresponding author

Correspondence to Yanjiang Wang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Appendix A Nomenclature and abbreviations

Table 5 List of abbreviations
Table 6 List of Symbols

All acronyms used in the paper are listed with their full names in Table 5, and the symbols not explained in the text are listed with their meanings in Table 6.

About this article

Cite this article

Wan, Q., Guo, W. & Wang, Y. SGBGAN: minority class image generation for class-imbalanced datasets. Machine Vision and Applications 35, 22 (2024). https://doi.org/10.1007/s00138-023-01506-y

  • DOI: https://doi.org/10.1007/s00138-023-01506-y
