Abstract
Deep neural networks (DNNs) have recently been shown to be susceptible to a particular type of attack that exploits synthetic inputs known as adversarial samples. These samples are constructed by subtly perturbing real examples from the training data distribution so as to "fool" the original neural model, causing it to misclassify inputs it previously classified correctly. Addressing this weakness is of utmost importance if DNNs are to be applied to critical applications, such as those in cybersecurity. In this paper, we analyze this fundamental flaw, which lurks in all neural architectures, to uncover the limitations of previously proposed defense mechanisms. More importantly, we present a unifying framework for protecting deep neural models using non-invertible data transformations, developing two adversary-resistant DNNs that employ linear and nonlinear dimensionality reduction, respectively. Empirical results indicate that our framework provides better robustness than state-of-the-art solutions while incurring negligible degradation in generalization accuracy.
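To make the idea concrete, the following is a minimal sketch of the linear variant of such a defense: inputs are projected onto their top-k principal components before being fed to the classifier. Because the projection discards the remaining components, it cannot be exactly inverted, which disrupts perturbations crafted in the original input space. The dataset, the choice of k, and the MLP classifier below are illustrative assumptions for this sketch, not the architectures or configuration reported in the paper.

```python
# Sketch of a dimensionality-reduction "front end" defense (assumptions:
# digits dataset, k=40 components, small MLP; not the paper's setup).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)  # 8x8 digit images, 64 features
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Linear, non-invertible transform: keep only the top-k principal
# components, discarding the rest of the input space.
pca = PCA(n_components=40).fit(X_train)

# Train the classifier on the reduced representation rather than the
# raw inputs, so any attack must survive the projection.
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500, random_state=0)
clf.fit(pca.transform(X_train), y_train)

print("test accuracy:", clf.score(pca.transform(X_test), y_test))
```

Note that the projection is applied at both training and test time; an adversary with gradient access to the end-to-end pipeline still faces a lossy map whose discarded directions cannot be recovered from the reduced representation.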