No-regret algorithms in on-line learning, games and convex optimization

  • Full Length Paper
  • Series B
Mathematical Programming

Abstract

The purpose of this article is to underline the links between some no-regret algorithms used in on-line learning, games and convex optimization, and to compare their continuous-time and discrete-time versions.



Author information

Corresponding author

Correspondence to Sylvain Sorin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Part of this research was supported by a PGMO grant (2015), “COGLED-Convergence of gradient-like and evolutionary dynamics”, with Jérôme Bolte. Successive partial versions were presented at the following events: PGMO Days, October 2015; Tutorial “Topics on strategic learning”, IMS, Singapore, November 2015; JFCO, Toulouse, July 2017; GAMENET, Krakow, September 2018; Variational Day, Ecole Polytechnique, December 2018; Operator Splitting Methods in Data Analysis, Flatiron Institute, NY, March 2019; GAMENET, Cluj, April 2019; MAPLE 2019, Politecnico Milano, September 2019; SPOT, Toulouse, January 2020; SPO, Paris, October 2020; One World Optimization Seminar, March 2021; Multi-Agent Reinforcement Learning and Bandit Learning, Simons Institute, Berkeley, May 2022. I would like to thank J. Bolte, P. Combettes, J. Kwon, R. Laraki, V. Perchet and G. Vigeral for nice discussions and interesting comments.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sorin, S. No-regret algorithms in on-line learning, games and convex optimization. Math. Program. 203, 645–686 (2024). https://doi.org/10.1007/s10107-023-01927-7

  • DOI: https://doi.org/10.1007/s10107-023-01927-7
