Abstract
We investigate two-layer networks with the rectified power unit, also called the \(\text {ReLU}^k\) activation function, for approximating functions and their derivatives. By extending and calibrating the corresponding Barron space, we show that two-layer networks with the \(\text {ReLU}^k\) activation function are well suited to approximating an unknown function and its derivatives simultaneously. For noisy measurements, we propose a Tikhonov-type regularization method and provide error bounds under an appropriate choice of the regularization parameter. Several numerical examples support the efficiency of the proposed approach.
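As an illustration of the setting described in the abstract, the following minimal sketch fits a two-layer \(\text {ReLU}^k\) network to noisy samples and reads off a derivative estimate from the same network. It is not the authors' method: the inner weights `w` and biases `b` are drawn at random and frozen, only the outer weights are fitted, and the Tikhonov-type penalty is assumed to be a plain squared \(\ell ^2\) norm on the outer weights. The target function, noise level, network width, and regularization parameter `alpha` are all hypothetical choices.

```python
import numpy as np

def relu_k(z, k):
    """Rectified power unit: max(z, 0)**k."""
    return np.maximum(z, 0.0) ** k

def relu_k_prime(z, k):
    """Derivative of relu_k with respect to z (assumes k >= 2)."""
    return k * np.maximum(z, 0.0) ** (k - 1)

def fit_outer_weights(x, y, w, b, k, alpha):
    """Minimize (1/n) * ||Phi a - y||^2 + alpha * ||a||^2 over outer weights a.

    Assumption: inner parameters (w, b) stay fixed, so the problem is linear.
    """
    phi = relu_k(np.outer(x, w) + b, k)          # shape (n_samples, n_neurons)
    gram = phi.T @ phi / len(x) + alpha * np.eye(len(w))
    return np.linalg.solve(gram, phi.T @ y / len(x))

def predict(x, w, b, a, k):
    """Return the network value and its derivative in x (chain rule)."""
    z = np.outer(x, w) + b
    f = relu_k(z, k) @ a
    df = (relu_k_prime(z, k) * w) @ a            # d/dx sigma(w*x + b) = w * sigma'(.)
    return f, df

rng = np.random.default_rng(0)
k, n_neurons, n_samples = 2, 50, 200             # hypothetical sizes

# Random, frozen inner parameters (an assumption of this sketch).
w = rng.choice([-1.0, 1.0], size=n_neurons)
b = rng.uniform(-1.0, 1.0, size=n_neurons)

# Noisy samples of a hypothetical target f(x) = sin(pi * x) on [-1, 1].
x = np.linspace(-1.0, 1.0, n_samples)
y_noisy = np.sin(np.pi * x) + 0.01 * rng.standard_normal(n_samples)

a = fit_outer_weights(x, y_noisy, w, b, k, alpha=1e-6)
f_hat, df_hat = predict(x, w, b, a, k)           # function and derivative estimates
```

Here `df_hat` can be compared against the exact derivative \(\pi \cos (\pi x)\) of the hypothetical target; how `alpha` should scale with the noise level is the subject of the error bounds in the article and is not reproduced in this sketch.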
Acknowledgements
S. Lu is supported by NSFC (No. 11925104), and the Sino-German Mobility Programme (M-0187) by Sino-German Center for Research Promotion. S. Pereverzev is supported by the COMET Module S3AI managed by the Austrian Research Promotion Agency FFG. The authors thank two anonymous referees for their careful reading of the manuscript and valuable remarks which greatly helped to improve the article.
About this article
Cite this article
Li, Y., Lu, S., Mathé, P. et al. Two-layer networks with the \(\text {ReLU}^k\) activation function: Barron spaces and derivative approximation. Numer. Math. 156, 319–344 (2024). https://doi.org/10.1007/s00211-023-01384-6
DOI: https://doi.org/10.1007/s00211-023-01384-6