Boosting Adversarial Transferability Through Intermediate Feature

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2023 (ICANN 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14258)

Included in the following conference series:

  • International Conference on Artificial Neural Networks (ICANN)

Abstract

Deep neural networks are well known to be vulnerable to adversarial samples in the white-box setting. As research has progressed, however, it has become clear that adversarial samples can also mount black-box attacks: samples generated on a source model can cause models with different architectures to misclassify. Many methods have recently been proposed to improve the transferability of adversarial samples, but most still achieve limited transferability. In this paper, we propose an intermediate feature-based attack algorithm that further improves the transferability of adversarial samples. Rather than generating adversarial samples directly from the original samples, we continue to optimize existing adversarial samples. First, we use the existing adversarial samples to compute the feature importance of the original samples. We then analyze which features are most likely to yield adversarial samples with high transferability. Finally, we optimize those features to strengthen the attack's transferability. Moreover, instead of the model's logit output, we generate adversarial samples from the model's intermediate-layer output. Extensive experiments on the standard ImageNet dataset show that our method improves transferability and outperforms state-of-the-art methods.
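The abstract describes a three-step refinement pipeline: use an existing adversarial sample to estimate which intermediate features of the original sample matter, identify the features most likely to transfer, and then continue optimizing the sample so that the model's intermediate-layer output, rather than its logits, moves along those directions. As a rough illustration, here is a minimal PyTorch sketch of such a loop; the sign-of-feature-shift importance proxy, the layer choice, and all function names (capture_features, feature_importance, refine_adversarial) are our own assumptions, not the authors' published algorithm.

```python
import torch

# Illustrative sketch only: the importance proxy and update rule below
# are assumptions, not the authors' published method.

def capture_features(model, layer, x):
    """Forward x through the model and return the output of `layer`."""
    store = []
    handle = layer.register_forward_hook(lambda m, i, o: store.append(o))
    model(x)
    handle.remove()
    return store[0]

def feature_importance(model, layer, x_orig, x_adv):
    """Step 1: estimate per-feature importance from an existing adversarial
    sample, here taken as the sign of the feature shift it already induced."""
    with torch.no_grad():
        f_orig = capture_features(model, layer, x_orig)
        f_adv = capture_features(model, layer, x_adv)
    return (f_adv - f_orig).sign()

def refine_adversarial(model, layer, x_orig, x_adv,
                       eps=16 / 255, alpha=2 / 255, steps=10):
    """Steps 2-3: keep optimizing the existing adversarial sample so its
    intermediate features move further along the important directions."""
    w = feature_importance(model, layer, x_orig, x_adv)
    x = x_adv.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        f = capture_features(model, layer, x)
        # Attack the intermediate layer rather than the logits: push the
        # importance-weighted feature response up along the sign of the
        # input gradient.
        loss = (w * f).sum()
        grad = torch.autograd.grad(loss, x)[0]
        with torch.no_grad():
            x = x + alpha * grad.sign()
            x = x_orig + (x - x_orig).clamp(-eps, eps)  # stay in the eps-ball
            x = x.clamp(0, 1).detach()                  # stay a valid image
    return x
```

For a torchvision ResNet-50, for example, one might pass layer=model.layer3 with an MI-FGSM-generated x_adv as the starting point, then measure how often the refined sample also fools independently trained target models.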



Acknowledgements

This work is supported by the National Key R&D Program of China (No. 2021YFB3100600).

Author information

Corresponding authors

Correspondence to Gang Xiong or Xuan Li.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

He, C. et al. (2023). Boosting Adversarial Transferability Through Intermediate Feature. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14258. Springer, Cham. https://doi.org/10.1007/978-3-031-44192-9_3


  • DOI: https://doi.org/10.1007/978-3-031-44192-9_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44191-2

  • Online ISBN: 978-3-031-44192-9

  • eBook Packages: Computer Science, Computer Science (R0)
