Abstract
This article highlights the links between several no-regret algorithms used in on-line learning, games, and convex optimization, and compares their continuous-time and discrete-time versions.
Additional information
Part of this research has been supported by a PGMO grant (2015): “COGLED-Convergence of gradient-like and evolutionary dynamics”, with Jérôme Bolte. Successive partial versions have been presented at the following events: PGMO Days, October 2015; Tutorial “Topics on strategic learning”, IMS, Singapore, November 2015; JFCO, Toulouse, July 2017; GAMENET, Krakow, September 2018; Variational Day, Ecole Polytechnique, December 2018; Operator Splitting Methods in Data Analysis, Flatiron Institute, NY, March 2019; GAMENET, Cluj, April 2019; MAPLE 2019, Politecnico Milano, September 2019; SPOT, Toulouse, January 2020; SPO, Paris, October 2020; One World Optimization Seminar, March 2021; Multi-Agent Reinforcement Learning and Bandit Learning, Simons Institute, Berkeley, May 2022. I would like to thank J. Bolte, P. Combettes, J. Kwon, R. Laraki, V. Perchet, and G. Vigeral for helpful discussions and interesting comments.
About this article
Cite this article
Sorin, S. No-regret algorithms in on-line learning, games and convex optimization. Math. Program. 203, 645–686 (2024). https://doi.org/10.1007/s10107-023-01927-7
DOI: https://doi.org/10.1007/s10107-023-01927-7