Empirical Analysis of Benchmark Generation for the Verification of Neural Network Image Classifiers

  • Conference paper
Bridging the Gap Between AI and Reality (AISoLA 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14380))


Abstract

The success of deep learning in a wide range of applications, such as image recognition and natural language processing, has led to its increasing adoption in many domains, including safety-critical applications such as autonomous cars and medicine. The use of models such as neural networks in safety-critical applications demands a thorough evaluation at both the component and the system level. In these domains, formal methods can guarantee the correct operation of these components. Despite great efforts in the formal verification of neural networks over the past decade, several challenges remain. One of these challenges is training neural networks so that they are easier to verify. In this work, we present an empirical analysis, designed as a Latin Hypercube experiment, in which we evaluate how regularization and initialization methods, across different random seeds and two datasets, affect the verification of neural networks via a reachability analysis technique. We show that certain training routines simplify the formal verification task. Lastly, we discuss how these training approaches impact the robustness verification and reachability computation of the method used.
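The Latin Hypercube design mentioned in the abstract can be illustrated with a small sketch. The factor names and ranges below are hypothetical stand-ins, not the paper's actual training settings: a Latin Hypercube splits each factor's range into as many strata as there are samples and uses each stratum exactly once per factor, so even a handful of trained networks covers every factor's range evenly.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Latin Hypercube sample: each factor's range is split into
    n_samples equal strata, and each stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random order of strata for this factor
        # one point jittered uniformly inside each stratum
        col = [lo + (s + rng.random()) / n_samples * (hi - lo) for s in strata]
        columns.append(col)
    # transpose: one row per training configuration
    return [list(point) for point in zip(*columns)]

# hypothetical factors: L2 weight decay, dropout rate, weight-init std
bounds = [(1e-6, 1e-2), (0.0, 0.5), (0.005, 0.05)]
designs = latin_hypercube(8, bounds)
```

Each row of `designs` would then parameterize one training run, with the random seed varied separately.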


Notes

  1. vnnlib: https://www.vnnlib.org.

  2. Available at https://github.com/Project-MONAI/MONAI/.

  3. Weights are independently sampled from a normal distribution with mean 0 and standard deviation 0.01.

  4. The model architecture is depicted in Fig. 9 in the Appendix.

  5. Code is available at: https://github.com/verivital/nnv/tree/master/code/nnv/examples/Submission/AISOLA2023.
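The initialization scheme described in footnote 3 can be sketched as follows. The layer shape is illustrative, and plain Python stands in for whatever framework the authors used:

```python
import random

def normal_init(shape, std=0.01, seed=0):
    """Draw each weight independently from N(0, std), with std = 0.01
    as in footnote 3. The (rows, cols) shape is illustrative."""
    rng = random.Random(seed)
    rows, cols = shape
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

# e.g. a hypothetical fully connected layer from a flattened 28x28 input
w = normal_init((784, 128))
```

This fixed-scale scheme contrasts with the data-dependent Glorot and He initializations the experiment compares against, whose variance depends on the layer's fan-in.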


Acknowledgements

The material presented in this paper is based upon work supported by the National Science Foundation (NSF) through grant numbers 2028001, 2220426 and 2220401, the Defense Advanced Research Projects Agency (DARPA) under contract number FA8750-23-C-0518, and the Air Force Office of Scientific Research (AFOSR) under contract number FA9550-22-1-0019 and FA9550-23-1-0135. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of AFOSR, DARPA, or NSF.

Author information

Correspondence to Diego Manzanas Lopez.

A Appendix

Fig. 9. Neural network architecture of the models trained on MNIST. The MedNIST models are created with the same parameters, differing only in input size: MNIST inputs are 28 × 28, while MedNIST images are 64 × 64, leading to small differences such as the number of weights in some layers.

Table 1. MedNIST summary results
Table 2. MNIST summary results


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Manzanas Lopez, D., Johnson, T.T. (2024). Empirical Analysis of Benchmark Generation for the Verification of Neural Network Image Classifiers. In: Steffen, B. (eds) Bridging the Gap Between AI and Reality. AISoLA 2023. Lecture Notes in Computer Science, vol 14380. Springer, Cham. https://doi.org/10.1007/978-3-031-46002-9_21

  • DOI: https://doi.org/10.1007/978-3-031-46002-9_21
  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46001-2

  • Online ISBN: 978-3-031-46002-9
