Two-layer networks with the $\text{ReLU}^k$ activation function: Barron spaces and derivative approximation

Li, Yuanyuan; Lu, Shuai; Mathé, Peter; Pereverzev, Sergei V.

doi:10.1007/s00211-023-01384-6

Published: 23 November 2023

Volume 156, pages 319–344, (2024)

Numerische Mathematik

Yuanyuan Li¹, Shuai Lu¹, Peter Mathé² & Sergei V. Pereverzev³


Abstract

We investigate the use of two-layer networks with the rectified power unit, which is called the $\text {ReLU}^k$ activation function, for function and derivative approximation. By extending and calibrating the corresponding Barron space, we show that two-layer networks with the $\text {ReLU}^k$ activation function are well-designed to simultaneously approximate an unknown function and its derivatives. When the measurement is noisy, we propose a Tikhonov type regularization method, and provide error bounds when the regularization parameter is chosen appropriately. Several numerical examples support the efficiency of the proposed approach.
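
As a rough illustration of the objects named in the abstract, the following Python sketch evaluates a one-dimensional two-layer $\text{ReLU}^k$ network together with its first derivative, and forms a Tikhonov-type penalized objective. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `relu_k`, `network`, `network_prime` and `tikhonov_objective` are hypothetical, the input is scalar, and the squared $\ell^2$ penalty on the outer weights is only a placeholder for whatever penalty the article actually uses.

```python
import random


def relu_k(t, k):
    """Rectified power unit: max(t, 0) ** k."""
    return max(t, 0.0) ** k


def relu_k_prime(t, k):
    """Derivative of relu_k with respect to t (taken as 0 for t <= 0)."""
    if t <= 0.0:
        return 0.0
    return k * t ** (k - 1)


def network(x, params, k):
    """Two-layer network: sum over neurons of a * relu_k(w * x + b)."""
    return sum(a * relu_k(w * x + b, k) for a, w, b in params)


def network_prime(x, params, k):
    """Derivative of the network in x, obtained by the chain rule."""
    return sum(a * w * relu_k_prime(w * x + b, k) for a, w, b in params)


def tikhonov_objective(params, xs, ys, k, alpha):
    """Mean squared data misfit plus alpha times a penalty term.

    Assumption: the penalty is the squared l2 norm of the outer weights,
    chosen only for illustration.
    """
    misfit = sum((network(x, params, k) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    penalty = sum(a * a for a, _, _ in params)
    return misfit + alpha * penalty


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical random parameters (a, w, b) for 8 neurons.
    params = [
        (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        for _ in range(8)
    ]
    k, x0, h = 2, 0.3, 1e-5
    # Compare the chain-rule derivative with a central finite difference.
    fd = (network(x0 + h, params, k) - network(x0 - h, params, k)) / (2 * h)
    print("chain rule:", network_prime(x0, params, k), "finite difference:", fd)
```

For $k \ge 2$ the network is continuously differentiable, so the derivative of the trained network can be evaluated in closed form rather than by differencing noisy data; the regularization parameter `alpha` balances data fit against the size of the weights.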



Acknowledgements

S. Lu is supported by NSFC (No. 11925104), and the Sino-German Mobility Programme (M-0187) by Sino-German Center for Research Promotion. S. Pereverzev is supported by the COMET Module S3AI managed by the Austrian Research Promotion Agency FFG. The authors thank two anonymous referees for their careful reading of the manuscript and valuable remarks which greatly helped to improve the article.

Author information

Authors and Affiliations

School of Mathematical Sciences, Fudan University, No. 220 Handan Road, Shanghai, 200433, China
Yuanyuan Li & Shuai Lu
Weierstrass Institute for Applied Analysis and Stochastics, Mohrenstrasse 39, 10117, Berlin, Germany
Peter Mathé
Johann Radon Institute for Computational and Applied Mathematics, Altenbergerstrasse 69, 4040, Linz, Austria
Sergei V. Pereverzev

Corresponding author

Correspondence to Shuai Lu.

About this article

Cite this article

Li, Y., Lu, S., Mathé, P. et al. Two-layer networks with the $\text {ReLU}^k$ activation function: Barron spaces and derivative approximation. Numer. Math. 156, 319–344 (2024). https://doi.org/10.1007/s00211-023-01384-6

Received: 20 January 2023
Revised: 11 August 2023
Accepted: 30 October 2023
Published: 23 November 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s00211-023-01384-6
