
Accelerated Stochastic Variance Reduction for a Class of Convex Optimization Problems

Journal of Optimization Theory and Applications

Abstract

Katyusha momentum is a well-known and efficient alternative acceleration method for stochastic optimization problems: it can reduce the potential accumulation of error from random sampling that is induced by the classical Nesterov acceleration technique. The natural idea behind Katyusha momentum is to use a convex combination framework instead of the extrapolation framework used in Nesterov's momentum. In this paper, we design a Katyusha-like momentum step, i.e., a negative momentum framework, and incorporate it into the classical stochastic variance reduction gradient algorithm. Based on this negative momentum framework, we propose an accelerated stochastic algorithm, namely the negative momentum-based stochastic variance reduction gradient (NMSVRG) algorithm, for minimizing a class of convex finite-sum problems. Only one extra parameter needs to be tuned in the NMSVRG algorithm, which makes parameter tuning noticeably easier than in the original Katyusha momentum-based algorithm. We provide a rigorous theoretical analysis and show that the proposed NMSVRG algorithm is superior to the SVRG algorithm and comparable, in convergence rate, to the best existing methods in the literature. Finally, experimental results verify our analysis and again show that the proposed algorithm outperforms related state-of-the-art stochastic algorithms.
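To make the construction described above concrete, the following is a minimal sketch (Python/NumPy) of a standard SVRG loop in which the query point is formed by a convex combination with the epoch snapshot, i.e., a Katyusha-like negative-momentum step governed by a single extra mixing parameter. The function name nmsvrg_sketch, the parameter names theta and eta, and all default values are illustrative assumptions; the exact coupling and parameter choices of the paper's NMSVRG algorithm may differ.

# Sketch only: SVRG with a Katyusha-like convex-combination (negative momentum) step.
# Names and defaults are illustrative, not taken from the paper.
import numpy as np

def nmsvrg_sketch(grad_i, full_grad, x0, n, eta=0.01, theta=0.7,
                  epochs=20, inner_steps=None, seed=0):
    """grad_i(x, i): gradient of the i-th component f_i at x.
    full_grad(x): gradient of the full objective (1/n) * sum_i f_i at x.
    theta: the single extra mixing parameter of the convex combination."""
    rng = np.random.default_rng(seed)
    m = 2 * n if inner_steps is None else inner_steps   # common inner-loop length
    snapshot = np.asarray(x0, dtype=float).copy()        # reference point ("snapshot")
    x = snapshot.copy()
    for _ in range(epochs):
        mu = full_grad(snapshot)                          # full gradient at the snapshot
        for _ in range(m):
            # Negative momentum: a convex combination pulls the query point
            # back toward the snapshot instead of extrapolating past the iterate.
            y = theta * x + (1.0 - theta) * snapshot
            i = rng.integers(n)
            g = grad_i(y, i) - grad_i(snapshot, i) + mu   # variance-reduced gradient
            x = y - eta * g
        snapshot = x.copy()                               # refresh the snapshot each epoch
    return x

# Toy least-squares usage: f_i(x) = 0.5 * (a_i^T x - b_i)^2
A = np.random.default_rng(1).standard_normal((200, 10))
b = A @ np.ones(10)
x_hat = nmsvrg_sketch(lambda x, i: A[i] * (A[i] @ x - b[i]),
                      lambda x: A.T @ (A @ x - b) / len(b),
                      np.zeros(10), n=len(b))

The only departure from plain SVRG in this sketch is the line forming y; setting theta = 1 recovers the usual SVRG update, which is what makes the single-parameter tuning claim plausible at the level of the update rule.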



Acknowledgements

The authors would like to thank the editor and the anonymous referees for their valuable suggestions and comments, which have greatly improved the presentation of this paper. This work is supported in part by the National Natural Science Foundation of China under Grant 61573014 and by the Fundamental Research Funds for the Central Universities under Grants JB210717 and YJS2215.

Author information


Corresponding author

Correspondence to Jimin Ye.

Additional information

Communicated by Liqun Qi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

He, L., Ye, J. & Jianwei, E. Accelerated Stochastic Variance Reduction for a Class of Convex Optimization Problems. J Optim Theory Appl 196, 810–828 (2023). https://doi.org/10.1007/s10957-022-02157-1


  • DOI: https://doi.org/10.1007/s10957-022-02157-1
