Abstract
Recent studies have highlighted the vulnerability and low robustness of deep learning models against adversarial examples. This issue limits their deployment in security-critical applications such as driverless systems, unmanned aerial vehicles, and intrusion detection. In this paper, we propose the latent encodings transferring attack (LET-attack), which generates targeted natural adversarial examples to fool well-trained classifiers. To enable perturbation in latent space, we train WGAN variants on various datasets that perform well at feature extraction, image reconstruction, and discrimination of counterfeit images. Thanks to our two-stage mapping transformation, the adversary applies precise, semantic perturbations to source data with reference to target data in latent space. By exploiting the critic of the WGAN variant together with the well-trained classifier, the adversary crafts more realistic and effective adversarial examples. Experimental results on MNIST, FashionMNIST, CIFAR-10, and LSUN show that LET-attack yields a distinct set of adversarial examples through partial targeted transfer on the data manifold, and attains comparable attack performance against state-of-the-art models in different attack scenarios. Furthermore, we evaluate the transferability of LET-attack across different classifiers on MNIST and CIFAR-10, and find that the adversarial examples transfer easily with high confidence.
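The paper's two-stage mapping transformation is not reproduced on this page, but the following minimal PyTorch sketch illustrates the general latent-encoding transfer idea the abstract describes: encode source and target images, move the source encoding toward the target encoding in latent space, and optimize so that the decoded image fools the classifier while the WGAN critic keeps it near the natural-image manifold. The encoder E, generator G, critic C, classifier f, and the interpolation scheme below are illustrative assumptions, not the authors' exact method.

```python
import torch
import torch.nn.functional as F

def let_attack_sketch(E, G, C, f, x_src, x_tgt, y_target,
                      steps=100, lr=0.01, lam=0.1):
    """Hypothetical latent-space transfer attack (sketch, not the paper's algorithm).

    E: encoder (images -> latent codes), G: generator (latent codes -> images),
    C: WGAN critic scoring realism, f: classifier to fool,
    y_target: labels the adversary wants f to predict.
    """
    # Encode source and target images into the latent space.
    z_src = E(x_src).detach()
    z_tgt = E(x_tgt).detach()
    # Learn per-dimension interpolation weights in (0, 1) that move the
    # source encoding toward the target encoding.
    alpha = torch.zeros_like(z_src, requires_grad=True)
    opt = torch.optim.Adam([alpha], lr=lr)
    for _ in range(steps):
        w = torch.sigmoid(alpha)
        z_adv = (1 - w) * z_src + w * z_tgt
        x_adv = G(z_adv)
        # Classification loss pushes f toward the adversary's target label.
        cls_loss = F.cross_entropy(f(x_adv), y_target)
        # The critic term penalizes off-manifold (unrealistic) images.
        realism_loss = -C(x_adv).mean()
        loss = cls_loss + lam * realism_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    w = torch.sigmoid(alpha)
    return G((1 - w) * z_src + w * z_tgt).detach()
```

Here the critic term plays the role the abstract assigns it: it keeps the decoded adversarial example realistic while the classification loss drives targeted misclassification.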
Acknowledgment
This work was supported by the National Key R&D Program of China (2018YFB1500902) and NUPTSF (Grant No. NY219122).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Zhang, J., Zhang, Z. (2019). LET-Attack: Latent Encodings of Normal-Data Manifold Transferring to Adversarial Examples. In: Liu, F., Xu, J., Xu, S., Yung, M. (eds) Science of Cyber Security. SciSec 2019. Lecture Notes in Computer Science, vol. 11933. Springer, Cham. https://doi.org/10.1007/978-3-030-34637-9_10
DOI: https://doi.org/10.1007/978-3-030-34637-9_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34636-2
Online ISBN: 978-3-030-34637-9