Unifying Framework for Accelerated Randomized Methods in Convex Optimization

Conference paper in Foundations of Modern Statistics (FMS 2019)

Abstract

In this paper, we consider smooth convex optimization problems with simple constraints and inexactness in the oracle information, such as the value, partial derivatives, or directional derivatives of the objective function. We introduce a unifying framework that allows one to construct different types of accelerated randomized methods for such problems and to prove convergence rate theorems for them. We focus on accelerated random block-coordinate descent, accelerated random directional search, and an accelerated random derivative-free method and, using our framework, provide their versions for problems with inexact oracle information. Our contribution also includes accelerated random block-coordinate descent with inexact oracle and entropy proximal setup, as well as a derivative-free version of this method. Moreover, we present an extension of our framework to strongly convex optimization problems. We also discuss an extension to the case of an inexact model of the objective function.
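To make the oracle information above concrete, the following is a minimal Python sketch written for this summary rather than taken from the paper: a two-point finite-difference estimate of the directional derivative along a random direction, plugged into a plain, non-accelerated random directional search. The function names, step size, and smoothing parameter are hypothetical illustration choices.

```python
import numpy as np

def directional_derivative_estimate(f, x, e, tau=1e-6):
    """Two-point finite-difference estimate of the derivative of f at x
    along the direction e, i.e., an inexact directional-derivative oracle."""
    return (f(x + tau * e) - f(x)) / tau

def random_direction_step(f, x, step_size, rng):
    """One step of a plain (non-accelerated) random directional search:
    draw a direction uniformly on the unit sphere, estimate the
    directional derivative, and move against it."""
    e = rng.standard_normal(x.shape)
    e /= np.linalg.norm(e)
    g = directional_derivative_estimate(f, x, e)
    return x - step_size * g * e

# Toy usage on a smooth convex quadratic.
rng = np.random.default_rng(0)
f = lambda x: 0.5 * float(np.dot(x, x))
x = rng.standard_normal(5)
for _ in range(2000):
    x = random_direction_step(f, x, step_size=0.5, rng=rng)
print(np.linalg.norm(x))  # approaches the minimizer x* = 0
```

The accelerated methods covered by the framework replace this simple update with momentum-type iterates and explicitly account for the bias introduced by the finite-difference parameter and other oracle inexactness; see [21, 38, 48] for the precise schemes.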


References

  1. Agarwal, A., Dekel, O., Xiao, L.: Optimal algorithms for online convex optimization with multi-point bandit feedback. In: COLT 2010—The 23rd Conference on Learning Theory (2010)

  2. Allen-Zhu, Z.: Katyusha: the first direct acceleration of stochastic gradient methods. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, New York, NY, USA, pp. 1200–1205. ACM (2017). https://doi.org/10.1145/3055399.3055448, arXiv:1603.05953

  3. Allen-Zhu, Z., Qu, Z., Richtarik, P., Yuan, Y.: Even faster accelerated coordinate descent using non-uniform sampling. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of The 33rd International Conference on Machine Learning, Proceedings of Machine Learning Research, New York, New York, USA, 20–22 Jun 2016, PMLR, vol. 48, pp. 1110–1119. http://proceedings.mlr.press/v48/allen-zhuc16.html. First appeared in arXiv:1512.09103

  4. Bayandina, A., Gasnikov, A., Lagunovskaya, A.: Gradient-free two-points optimal method for nonsmooth stochastic convex optimization problem with additional small noise. Autom. Remote Control 79 (2018). https://doi.org/10.1134/S0005117918080039, arXiv:1701.03821

  5. Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization. Lecture Notes (2015)

  6. Berahas, A.S., Cao, L., Choromanski, K., Scheinberg, K.: A theoretical and empirical comparison of gradient approximations in derivative-free optimization. Found. Comput. Math. 1–54 (2021). https://doi.org/10.1007/s10208-021-09513-z

  7. Berahas, A.S., Cao, L., Scheinberg, K.: Global convergence rate analysis of a generic line search algorithm with noise. SIAM J. Optim. 31, 1489–1518 (2021). https://doi.org/10.1137/19M1291832

  8. Bogolubsky, L., Dvurechenskii, P., Gasnikov, A., Gusev, G., Nesterov, Y., Raigorodskii, A.M., Tikhonov, A., Zhukovskii, M.: Learning supervised pagerank with gradient-based and gradient-free optimization methods. In: Lee D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29, pp. 4914–4922. Curran Associates, Inc., (2016). http://papers.nips.cc/paper/6565-learning-supervised-pagerank-with-gradient-based-and-gradient-free-optimization-methods.pdf. arXiv:1603.00717

  9. Cohen, M., Diakonikolas, J., Orecchia, L.: On acceleration with noise-corrupted gradients. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, Stockholmsmässan, Stockholm Sweden, 10–15 Jul 2018, vol. 80, pp. 1019–1028. PMLR. http://proceedings.mlr.press/v80/cohen18a.html. arXiv:1805.12591

  10. Conn, A., Scheinberg, K., Vicente, L.: Introduction to Derivative-Free Optimization, Society for Industrial and Applied Mathematics (2009). https://doi.org/10.1137/1.9780898718768, http://epubs.siam.org/doi/abs/10.1137/1.9780898718768

  11. Dang, C.D., Lan, G.: Stochastic block mirror descent methods for nonsmooth and stochastic optimization. SIAM J. Optim. 25, 856–881 (2015). https://doi.org/10.1137/130936361

  12. Danilova, M., Dvurechensky, P., Gasnikov, A., Gorbunov, E., Guminov, S., Kamzolov, D., Shibaev, I.: Recent theoretical advances in non-convex optimization, pp. 79–163. Springer International Publishing, Cham (2020). ISBN 978-3-031-00832-0. https://doi.org/10.1007/978-3-031-00832-0_3. arXiv:2012.06188

  13. d’Aspremont, A.: Smooth optimization with approximate gradient. SIAM J. Optim. 19, 1171–1183 (2008). https://doi.org/10.1137/060676386

  14. Devolder, O., Glineur, F., Nesterov, Y.: First-order methods of smooth convex optimization with inexact oracle. Math. Program. 146, 37–75 (2014). https://doi.org/10.1007/s10107-013-0677-5

  15. Duchi, J.C., Jordan, M.I., Wainwright, M.J., Wibisono, A.: Optimal rates for zero-order convex optimization: the power of two function evaluations. IEEE Trans. Inf. Theory 61, 2788–2806 (2015). https://doi.org/10.1109/TIT.2015.2409256, arXiv:1312.2139

  16. Dvinskikh, D., Ogaltsov, A., Gasnikov, A., Dvurechensky, P., Spokoiny, V.: On the line-search gradient methods for stochastic optimization. IFAC-PapersOnLine 53, 1715–1720 (2020). https://doi.org/10.1016/j.ifacol.2020.12.2284, https://www.sciencedirect.com/science/article/pii/S240589632032944X. 21th IFAC World Congress, arXiv:1911.08380

  17. Dvinskikh, D.M., Turin, A.I., Gasnikov, A.V., Omelchenko, S.S.: Accelerated and non-accelerated stochastic gradient descent in model generality. Matematicheskie Zametki 108, 515–528 (2020). https://doi.org/10.1134/S0001434620090230

  18. Dvurechensky, P., Dvinskikh, D., Gasnikov, A., Uribe, C.A., Nedić, A.: Decentralize and randomize: faster algorithm for Wasserstein barycenters. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R., (eds.) Advances in Neural Information Processing Systems 31, NeurIPS 2018, Curran Associates, Inc., pp. 10783–10793 (2018). http://papers.nips.cc/paper/8274-decentralize-and-randomize-faster-algorithm-for-wasserstein-barycenters.pdf, arXiv:1806.03915

  19. Dvurechensky, P., Gasnikov, A.: Stochastic intermediate gradient method for convex problems with stochastic inexact oracle. J. Optim. Theory Appl. 171, 121–145 (2016). https://doi.org/10.1007/s10957-016-0999-6

  20. Dvurechensky, P., Gasnikov, A., Omelchenko, S., Tiurin, A.: A stable alternative to Sinkhorn’s algorithm for regularized optimal transport. In: Kononov, A., Khachay, M., Kalyagin, V.A., Pardalos, P., (eds.) Mathematical Optimization Theory and Operations Research, pp. 406–423. Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-49988-4_28

  21. Dvurechensky, P., Gorbunov, E., Gasnikov, A.: An accelerated directional derivative method for smooth stochastic convex optimization. Eur. J. Oper. Res. 290, 601–621 (2021). https://doi.org/10.1016/j.ejor.2020.08.027, http://www.sciencedirect.com/science/article/pii/S0377221720307402

  22. Fercoq, O., Richtárik, P.: Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25, 1997–2023 (2015). https://doi.org/10.1137/130949993, First appeared in arXiv:1312.5799

  23. Frostig, R., Ge, R., Kakade, S., Sidford, A.: Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization. In: Bach, F., Blei, D., (eds.) Proceedings of the 32nd International Conference on Machine Learning, Proceedings of Machine Learning Research, Lille, France, 07–09 Jul 2015, vol. 37, pp. 2540–2548. PMLR. http://proceedings.mlr.press/v37/frostig15.html

  24. Gasnikov, A.: Universal gradient descent (2017). arXiv:1711.00394

  25. Gasnikov, A., Dvurechensky, P., Nesterov, Y.: Stochastic gradient methods with inexact oracle. Proc. Mosc. Inst. Phys. Technol. 8, 41–91 (2016). In Russian, first appeared in arXiv:1411.4218

  26. Gasnikov, A., Dvurechensky, P., Usmanova, I.: On accelerated randomized methods. Proc. Mosc. Inst. Phys. Technol. 8, 67–100 (2016). In Russian, first appeared in arXiv:1508.02182

  27. Gasnikov, A., Tyurin, A.: Fast gradient descent for convex minimization problems with an oracle producing a (δ, L)-model of function at the requested point. Comput. Math. Math. Phys. 59, 1085–1097 (2019). https://doi.org/10.1134/S0965542519070078

  28. Gasnikov, A.V., Dvurechensky, P.E.: Stochastic intermediate gradient method for convex optimization problems. Dokl. Math. 93, 148–151 (2016). https://doi.org/10.1134/S1064562416020071

  29. Gasnikov, A.V., Dvurechensky, P.E., Zhukovskii, M.E., Kim, S.V., Plaunov, S.S., Smirnov, D.A., Noskov, F.A.: About the power law of the pagerank vector component distribution. Part 2. The Buckley–Osthus model, verification of the power law for this model, and setup of real search engines. Numer. Anal. Appl. 11, 16–32 (2018). https://doi.org/10.1134/S1995423918010032

  30. Gasnikov, A.V., Gasnikova, E.V., Dvurechensky, P.E., Mohammed, A.A.M., Chernousova, E.O.: About the power law of the pagerank vector component distribution. Part 1. Numerical methods for finding the pagerank vector. Numer. Anal. Appl. 10, 299–312 (2017). https://doi.org/10.1134/S1995423917040024

  31. Gasnikov, A.V., Krymova, E.A., Lagunovskaya, A.A., Usmanova, I.N., Fedorenko, F.A.: Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case. Autom. Remote Control 78, 224–234 (2017). https://doi.org/10.1134/S0005117917020035, arXiv:1509.01679

  32. Gasnikov, A.V., Lagunovskaya, A.A., Usmanova, I.N., Fedorenko, F.A.: Gradient-free proximal methods with inexact oracle for convex stochastic nonsmooth optimization problems on the simplex. Autom. Remote Control 77, 2018–2034 (2016). https://doi.org/10.1134/S0005117916110114, arXiv:1412.3890

  33. Gasnikov, A.V., Nesterov, Y.E.: Universal method for stochastic composite optimization problems. Comput. Math. Math. Phys. 58, 48–64 (2018). https://doi.org/10.7868/S0044466918010052

  34. Ghadimi, S., Lan, G.: Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 23, 2341–2368 (2013). https://doi.org/10.1137/120880811, arXiv:1309.5549

  35. Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155, 267–305 (2016). https://doi.org/10.1007/s10107-014-0846-1, arXiv:1308.6594

  36. Gladin, E., Sadiev, A., Gasnikov, A., Dvurechensky, P., Beznosikov, A., Alkousa, M.: Solving smooth min-min and min-max problems by mixed oracle algorithms. In: Strekalovsky, A., Kochetov, Y., Gruzdeva, T., Orlov, A., (eds.) Mathematical Optimization Theory and Operations Research: Recent Trends. pp. 19–40. Springer International Publishing, Cham (2021). https://link.springer.com/chapter/10.1007/978-3-030-86433-0_2, arXiv:2103.00434

  37. Gorbunov, E., Danilova, M., Shibaev, I., Dvurechensky, P., Gasnikov, A.: Near-optimal high probability complexity bounds for non-smooth stochastic optimization with heavy-tailed noise (2021). arXiv:2106.05958

  38. Gorbunov, E., Dvurechensky, P., Gasnikov, A.: An accelerated method for derivative-free smooth stochastic convex optimization. SIAM J. Optim. 32(2), 1210–1238 (2022). arXiv:1802.09022

  39. Ivanova, A., Gasnikov, A., Dvurechensky, P., Dvinskikh, D., Tyurin, A., Vorontsova, E., Pasechnyuk, D.: Oracle complexity separation in convex optimization. Optim. Methods Softw. 36(4), 720–754 (2021). https://doi.org/10.1080/10556788.2020.1712599. arXiv:2002.02706. WIAS Preprint No. 2711

  40. Juditsky, A., Nesterov, Y.: Deterministic and stochastic primal-dual subgradient algorithms for uniformly convex minimization. Stoch. Syst. 4, 44–80 (2014). https://doi.org/10.1287/10-SSY010

  41. Lan, G., Zhou, Y.: An optimal randomized incremental gradient method. Math. Program. (2017). https://doi.org/10.1007/s10107-017-1173-0

  42. Larson, J., Menickelly, M., Wild, S.M.: Derivative-free optimization methods. Acta Numer. 28, 287–404 (2019). https://doi.org/10.1017/s0962492919000060

  43. Lee, Y.T., Sidford, A.: Efficient accelerated coordinate descent methods and faster algorithms for solving linear systems. In: Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, FOCS ’13, pp. 147–156. IEEE Computer Society, Washington, DC, USA (2013). https://doi.org/10.1109/FOCS.2013.24. First appeared in arXiv:1305.1922

  44. Lin, H., Mairal, J., Harchaoui, Z.: A universal catalyst for first-order optimization. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, pp. 3384–3392. MIT Press, Cambridge, MA, USA (2015). http://dl.acm.org/citation.cfm?id=2969442.2969617

  45. Lin, Q., Lu, Z., Xiao, L.: An accelerated proximal coordinate gradient method. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q., (eds.) Advances in Neural Information Processing Systems, vol. 27, pp. 3059–3067. Curran Associates, Inc., (2014). http://papers.nips.cc/paper/5356-an-accelerated-proximal-coordinate-gradient-method.pdf. First appeared in arXiv:1407.1296

  46. Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22, 341–362 (2012). https://doi.org/10.1137/100802001. First appeared in 2010 as CORE discussion paper 2010/2

  47. Nesterov, Y.: Lectures on convex optimization, vol. 137, Springer, Berlin (2018). https://doi.org/10.1007/978-3-319-91578-4

  48. Nesterov, Y., Spokoiny, V.: Random gradient-free minimization of convex functions. Found. Comput. Math. 17, 527–566 (2017). https://doi.org/10.1007/s10208-015-9296-2. First appeared in 2011 as CORE discussion paper 2011/16

  49. Nesterov, Y., Stich, S.U.: Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27, 110–123 (2017). https://doi.org/10.1137/16M1060182

  50. Rogozin, A., Bochko, M., Dvurechensky, P., Gasnikov, A., Lukoshkin, V.: An accelerated method for decentralized distributed stochastic optimization over time-varying graphs. In: 2021 60th IEEE Conference on Decision and Control (CDC), pp. 3367–3373 (2021). https://doi.org/10.1109/CDC45484.2021.9683110. arXiv:2103.15598

  51. Sadiev, A., Beznosikov, A., Dvurechensky, P., Gasnikov, A.: Zeroth-order algorithms for smooth saddle-point problems. In: Strekalovsky, A., Kochetov, Y., Gruzdeva, T., Orlov, A., (eds.) Mathematical Optimization Theory and Operations Research: Recent Trends, pp. 71–85. Springer International Publishing, Cham (2021). https://link.springer.com/chapter/10.1007/978-3-030-86433-0_5, arXiv:2009.09908

  52. Shalev-Shwartz, S., Zhang, T.: Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization. In: Xing, E.P., Jebara, T., (eds.) Proceedings of the 31st International Conference on Machine Learning, Proceedings of Machine Learning Research, 22–24 Jun 2014, vol. 32, pp. 64–72. PMLR, Bejing, China (2014). http://proceedings.mlr.press/v32/shalev-shwartz14.html. First appeared in arXiv:1309.2375

  53. Shamir, O.: An optimal algorithm for bandit and zero-order convex optimization with two-point feedback. J. Mach. Learn. Res. 18, 52:1–52:11 (2017). http://jmlr.org/papers/v18/16-632.html. First appeared in arXiv:1507.08752

  54. Shibaev, I., Dvurechensky, P., Gasnikov, A.: Zeroth-order methods for noisy Hölder-gradient functions. Optim. Lett. 16(7), 2123–2143 (2022). https://doi.org/10.1007/s11590-021-01742-z. arXiv:2006.11857

  55. Stonyakin, F., Tyurin, A., Gasnikov, A., Dvurechensky, P., Agafonov, A., Dvinskikh, D., Alkousa, M., Pasechnyuk, D., Artamonov, S., Piskunova, V.: Inexact model: a framework for optimization and variational inequalities. Optim. Methods Softw. 36(6), 1155–1201 (2021). https://doi.org/10.1080/10556788.2021.1924714. WIAS Preprint No. 2709, arXiv:2001.09013, arXiv:1902.00990

  56. Stonyakin, F.S., Dvinskikh, D., Dvurechensky, P., Kroshnin, A., Kuznetsova, O., Agafonov, A., Gasnikov, A., Tyurin, A., Uribe, C.A., Pasechnyuk, D., Artamonov, S.: Gradient methods for problems with inexact model of the objective. In: Khachay, M., Kochetov, Y., Pardalos, P., (eds.) Mathematical Optimization Theory and Operations Research, pp. 97–114, Springer International Publishing, Cham (2019). arXiv:1902.09001

  57. Tappenden, R., Richtárik, P., Gondzio, J.: Inexact coordinate descent: complexity and preconditioning. J. Optim. Theory Appl. 170, 144–176 (2016). https://doi.org/10.1007/s10957-016-0867-4. First appeared in arXiv:1304.5530

  58. Tyurin, A.: Mirror version of similar triangles method for constrained optimization problems (2017). arXiv:1705.09809

  59. Vorontsova, E.A., Gasnikov, A.V., Gorbunov, E.A., Dvurechenskii, P.E.: Accelerated gradient-free optimization methods with a non-euclidean proximal operator. Autom. Remote Control 80, 1487–1501 (2019). https://doi.org/10.1134/S0005117919080095

  60. Zhang, Y., Lin, X.: Stochastic primal-dual coordinate method for regularized empirical risk minimization. In: Bach, F., Blei, D., (eds.) Proceedings of the 32nd International Conference on Machine Learning, Proceedings of Machine Learning Research, 07–09 July 2015, vol. 37, pp. 353–361. PMLR, Lille, France. http://proceedings.mlr.press/v37/zhanga15.html

Acknowledgements

The authors are very grateful to Yu. Nesterov and V. Spokoiny for fruitful discussions. Our interest in this field was initiated by the paper [48]. The research was supported by the Ministry of Science and Higher Education of the Russian Federation (Goszadaniye) No. 075-00337-20-03, project No. 0714-2020-0005.

Author information

Correspondence to Pavel Dvurechensky.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Dvurechensky, P., Gasnikov, A., Tyurin, A., Zholobov, V. (2023). Unifying Framework for Accelerated Randomized Methods in Convex Optimization. In: Belomestny, D., Butucea, C., Mammen, E., Moulines, E., Reiß, M., Ulyanov, V.V. (eds) Foundations of Modern Statistics. FMS 2019. Springer Proceedings in Mathematics & Statistics, vol 425. Springer, Cham. https://doi.org/10.1007/978-3-031-30114-8_15
