No-regret algorithms in on-line learning, games and convex optimization

Sorin, Sylvain

doi:10.1007/s10107-023-01927-7

Full Length Paper
Series B
Published: 23 March 2023

Mathematical Programming, Volume 203, pages 645–686 (2024)

Sylvain Sorin (ORCID: orcid.org/0000-0001-7100-4832)

Abstract

The purpose of this article is to underline the links between some no-regret algorithms used in on-line learning, games and convex optimization and to compare the continuous and discrete time versions.
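
To make the notion of "no-regret" concrete in the discrete-time on-line learning setting, here is a minimal sketch of the exponential weights (Hedge) algorithm. It is a standard textbook example and is not taken from the article itself; the function names, the random loss sequence, and the step size are illustrative assumptions. Losses are assumed to lie in [0, 1], in which case the step size used below is known to give external regret of at most sqrt(T ln(K) / 2) after T rounds with K actions.

```python
import math
import random


def hedge(losses, eta):
    """Run exponential weights on a list of per-round loss vectors in [0, 1].

    Returns (cumulative expected loss of the algorithm,
             cumulative loss of the best fixed action in hindsight).
    """
    n_actions = len(losses[0])
    cum = [0.0] * n_actions  # cumulative loss of each action so far
    alg_loss = 0.0
    for loss in losses:
        # Weights proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials well scaled.
        m = min(cum)
        w = [math.exp(-eta * (c - m)) for c in cum]
        total = sum(w)
        p = [wi / total for wi in w]
        # Expected loss of the mixed action p in this round.
        alg_loss += sum(pi * li for pi, li in zip(p, loss))
        cum = [c + li for c, li in zip(cum, loss)]
    return alg_loss, min(cum)


# Hypothetical data: T rounds, K actions, i.i.d. uniform losses (an assumption).
rng = random.Random(0)
T, K = 2000, 5
losses = [[rng.random() for _ in range(K)] for _ in range(T)]

eta = math.sqrt(8 * math.log(K) / T)      # step size tuned to the horizon T
alg, best = hedge(losses, eta)
regret = alg - best                        # external regret
bound = math.sqrt(T * math.log(K) / 2)     # classical worst-case bound
```

Since the bound grows like sqrt(T), the average regret `regret / T` vanishes as T grows, which is the sense in which the procedure has no regret.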

Author information

Authors and Affiliations

CNRS UMR 7586, Institut de Mathématiques Jussieu-PRG, Sorbonne Université, Campus Pierre et Marie Curie, Paris, France
Sylvain Sorin

Corresponding author

Correspondence to Sylvain Sorin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Part of this research was supported by a PGMO grant (2015): “COGLED-Convergence of gradient-like and evolutionary dynamics”, with Jérôme Bolte. Successive partial versions have been presented at the following events: PGMO Days, October 2015; Tutorial “Topics on strategic learning”, IMS, Singapore, November 2015; JFCO, Toulouse, July 2017; GAMENET, Krakow, September 2018; Variational Day, Ecole Polytechnique, December 2018; Operator Splitting Methods in Data Analysis, Flatiron Institute, NY, March 2019; GAMENET, Cluj, April 2019; MAPLE 2019, Politecnico Milano, September 2019; SPOT, Toulouse, January 2020; SPO, Paris, October 2020; One World Optimization Seminar, March 2021; Multi-Agent Reinforcement Learning and Bandit Learning, Simons Institute, Berkeley, May 2022. I would like to thank J. Bolte, P. Combettes, J. Kwon, R. Laraki, V. Perchet and G. Vigeral for nice discussions and interesting comments.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sorin, S. No-regret algorithms in on-line learning, games and convex optimization. Math. Program. 203, 645–686 (2024). https://doi.org/10.1007/s10107-023-01927-7

Received: 04 January 2021
Accepted: 16 January 2023
Published: 23 March 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s10107-023-01927-7

Mathematics Subject Classification
