Oversampling Versus Variational Autoencoders: Employing Synthetic Data for Detection of Heracleum Sosnowskyi in Satellite Images

  • Conference paper
Information Science and Applications

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 621)

Abstract

Detecting the growth areas of hazardous invasive plants, such as Heracleum Sosnowskyi (HS), in satellite images is an important application of machine learning and computer vision methods. There is extensive literature on the problem of crop classification in images. However, the hardest part is sometimes obtaining high-quality labels and gathering enough data to train models. This difficulty is especially pronounced in the analysis of satellite images, where labeling the data manually is hard. In this work, we faced the same problem (lack of data) when trying to build a classification model for HS detection in images. The lack of data can be addressed by generating synthetic data, using methods that range from simple ones such as oversampling (OS) to more complex techniques such as Variational Autoencoders (VAE) and Generative Adversarial Networks. To the best of our knowledge, no existing work has compared OS with VAE for generating synthetic pixels to overcome data deficiency in the task of HS classification in images. Accordingly, in this work, we perform this comparison and present the evaluation results on our dataset.
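As an illustration of the simpler of the two approaches named in the abstract, the following is a minimal sketch of random oversampling applied to pixel feature vectors. It is not the authors' implementation: the function name `random_oversample`, the four-band toy pixels, and the 95/5 class split are all assumptions made for the example.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many samples as the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        if count < target:
            idx = np.flatnonzero(y == cls)
            # Sample existing rows of this class with replacement.
            extra = rng.choice(idx, size=target - count, replace=True)
            X_parts.append(X[extra])
            y_parts.append(y[extra])
    return np.concatenate(X_parts), np.concatenate(y_parts)

# Hypothetical toy data: 4 spectral bands per pixel,
# 95 background pixels (label 0) and 5 HS pixels (label 1).
rng = np.random.default_rng(42)
X = rng.random((100, 4))
y = np.array([0] * 95 + [1] * 5)

X_bal, y_bal = random_oversample(X, y)
print(X_bal.shape, np.bincount(y_bal))
```

Oversampling of this kind only repeats pixels that are already labeled, whereas a generative model such as a VAE draws new samples from a learned distribution; the comparison described in the abstract is between these two ways of producing additional training pixels.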


Corresponding author

Correspondence to Adil Khan.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Turénko, D., Khan, A., Hussain, R., Imran Ali, S. (2020). Oversampling Versus Variational Autoencoders: Employing Synthetic Data for Detection of Heracleum Sosnowskyi in Satellite Images. In: Kim, K., Kim, HY. (eds) Information Science and Applications. Lecture Notes in Electrical Engineering, vol 621. Springer, Singapore. https://doi.org/10.1007/978-981-15-1465-4_40

  • DOI: https://doi.org/10.1007/978-981-15-1465-4_40

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1464-7

  • Online ISBN: 978-981-15-1465-4

  • eBook Packages: Engineering, Engineering (R0)
