Abstract
GPUs play an important role in training deep neural networks. However, when convolutional neural network models are computed on a GPU, different combinations of kernel configuration parameters yield different performance. This paper therefore proposes BAGF, a Bayesian auto-tuning framework for GPU kernels, which parameterizes the factors affecting GPU program performance and uses Bayesian optimization to search the resulting parameter space for the best configuration. Compared with other optimization algorithms, BAGF finds excellent configurations in fewer iterations. We analyze the performance of BAGF on four benchmarks, compare it with other common optimization algorithms, and examine the performance improvement contributed by each parameter. Finally, BAGF is evaluated on the convolution layers of AlexNet, and the results are analyzed with the Roofline model. Compared with the original parameter configuration, BAGF improves speed by 50.09%.
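The core loop the abstract describes (parameterize kernel configuration factors, then let Bayesian optimization pick which configuration to measure next) can be sketched as follows. This is an illustrative toy, not BAGF itself: the search space of thread-block shapes and the `kernel_time` function are synthetic stand-ins for real kernel measurements, and the Gaussian-process surrogate with expected-improvement acquisition is one common choice of Bayesian optimizer.

```python
import math
import random
import numpy as np

random.seed(0)

# Search space: candidate thread-block dimensions (one factor a tuner parameterizes).
space = [(bx, by) for bx in (4, 8, 16, 32) for by in (4, 8, 16, 32)]

def kernel_time(cfg):
    """Stand-in for timing one kernel launch; a real tuner runs the kernel."""
    bx, by = cfg
    threads = bx * by
    # Synthetic cost: best near 256 threads/block, penalize skewed shapes.
    return abs(threads - 256) / 256.0 + 0.1 * abs(math.log2(bx / by))

def rbf(a, b, ls=16.0):
    """Squared-exponential kernel between two point sets."""
    d = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-d / (2 * ls ** 2))

def propose(X, y, cands):
    """Fit a GP to observed (config, time) pairs; return the candidate index
    maximizing expected improvement (we minimize time, so improvement = best - mu)."""
    K = rbf(X, X) + 1e-6 * np.eye(len(X))
    Kinv = np.linalg.inv(K)
    ks = rbf(cands, X)
    mu = ks @ Kinv @ y
    var = np.clip(1.0 - np.einsum("ij,jk,ik->i", ks, Kinv, ks), 1e-12, None)
    sigma = np.sqrt(var)
    best = y.min()
    z = (best - mu) / sigma
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + np.vectorize(math.erf)(z / math.sqrt(2)))
    ei = (best - mu) * cdf + sigma * pdf
    return int(np.argmax(ei))

# Start from a few random measurements, then let the surrogate guide the search.
tried = random.sample(range(len(space)), 3)
for _ in range(8):
    X = np.array([space[i] for i in tried], float)
    y = np.array([kernel_time(space[i]) for i in tried])
    idx_map = [i for i in range(len(space)) if i not in tried]
    cands = np.array([space[i] for i in idx_map], float)
    tried.append(idx_map[propose(X, y, cands)])

best_cfg = min((space[i] for i in tried), key=kernel_time)
print("best configuration:", best_cfg, "time:", round(kernel_time(best_cfg), 3))
```

The point of the surrogate is the one the abstract makes: instead of exhaustively benchmarking all configurations, the optimizer spends its limited kernel launches on the configurations the model predicts are most promising, so good parameters are found in fewer iterations.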
Acknowledgements
This work is funded in part by the Key Research and Development Program of Shaanxi (Program No. 2022ZDLGY01-09), GHfund A No. 202107014474, GHfund 202202036165, the Wuhu and Xidian University special fund for industry-university-research cooperation (Project No. XWYCXY-012021013), and the Cloud Computing Key Laboratory of Gansu Province.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhu, H., Liu, C., Zhang, L., Dong, X. (2024). Bayesian Optimization for Auto-tuning Convolution Neural Network on GPU. In: Tari, Z., Li, K., Wu, H. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2023. Lecture Notes in Computer Science, vol 14492. Springer, Singapore. https://doi.org/10.1007/978-981-97-0811-6_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0810-9
Online ISBN: 978-981-97-0811-6