Adaptive Sparse Grids in Reinforcement Learning

Garcke, Jochen; Klompmaker, Irene

doi:10.1007/978-3-319-08159-5_9

Part of the book series: Lecture Notes in Computational Science and Engineering (LNCSE, volume 102)

Abstract

We propose a model-based online reinforcement learning approach for continuous domains with deterministic transitions that uses a spatially adaptive sparse grid in the planning stage. The model is learned by Gaussian process regression, which keeps the sample complexity low. The adaptive sparse grid makes it possible to represent the value function during planning in higher-dimensional state spaces. This work gives numerical evidence that adaptive sparse grids are applicable to reinforcement learning.
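
To illustrate why sparse grids help in higher-dimensional state spaces, the following minimal sketch counts the interior points of a regular (non-adaptive) sparse grid and of the full tensor-product grid with the same finest mesh width. It is not the adaptive scheme of the chapter. The level convention (hierarchical subspaces with l_t >= 1, collected under l_1 + ... + l_d <= n + d - 1) and the function names are assumptions made for this example.

```python
from itertools import product


def sparse_grid_size(dim: int, level: int) -> int:
    """Count interior points of a regular sparse grid (assumed convention).

    The hierarchical subspace with level multi-index l (all l_t >= 1)
    contributes prod_t 2**(l_t - 1) points. The sparse grid of level n
    collects every subspace with l_1 + ... + l_d <= n + d - 1.
    """
    total = 0
    for l in product(range(1, level + 1), repeat=dim):
        if sum(l) <= level + dim - 1:
            points = 1
            for l_t in l:
                points *= 2 ** (l_t - 1)
            total += points
    return total


def full_grid_size(dim: int, level: int) -> int:
    """Count interior points of the full grid with mesh width 2**-level."""
    return (2 ** level - 1) ** dim


if __name__ == "__main__":
    # Compare how both counts grow with the dimension at a fixed level.
    for d in (1, 2, 3, 4):
        print(d, sparse_grid_size(d, 4), full_grid_size(d, 4))
```

In one dimension both counts coincide; the gap widens as the dimension grows, which is the effect the planning stage relies on.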

Notes

1. We call a discrete space \(V_{\underline{k}}\) smaller than a space \(V_{\underline{l}}\) if \(\forall t: k_{t} \le l_{t}\) and \(\exists t: k_{t} < l_{t}\). In the same way, a grid \(\varOmega_{\underline{k}}\) is smaller than a grid \(\varOmega_{\underline{l}}\).
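
The ordering in this note is a strict component-wise partial order on level multi-indices. A minimal sketch follows, assuming that each space or grid is identified by its level multi-index given as a tuple of integers; the function name `is_smaller` is hypothetical.

```python
def is_smaller(k, l):
    """Return True if multi-index k is smaller than l in the sense of the note.

    k is smaller than l if k_t <= l_t for every t and k_t < l_t for at
    least one t. Both arguments must have the same length.
    """
    if len(k) != len(l):
        raise ValueError("multi-indices must have the same dimension")
    all_leq = all(k_t <= l_t for k_t, l_t in zip(k, l))
    some_less = any(k_t < l_t for k_t, l_t in zip(k, l))
    return all_leq and some_less


if __name__ == "__main__":
    print(is_smaller((1, 2), (2, 2)))  # smaller in the first component
    print(is_smaller((2, 2), (2, 2)))  # equal indices are not smaller
    print(is_smaller((1, 3), (2, 2)))  # incomparable indices
```

Because the order is only partial, two multi-indices such as (1, 3) and (2, 2) are incomparable: neither is smaller than the other.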

Author information

Authors and Affiliations

University of Bonn, Wegelerstr. 6, 53115, Bonn, Germany
Jochen Garcke

Corresponding author

Correspondence to Jochen Garcke.

Editor information

Editors and Affiliations

FB 12 Mathematik und Informatik, Philipps-Universität Marburg, Marburg, Germany
Stephan Dahlke
Inst. Geometrie & Praktische Mathematik, RWTH Aachen University, Aachen, Germany
Wolfgang Dahmen
Universität Bonn Institut für Numerische Simulation, Bonn, Germany
Michael Griebel
Max-Planck-Inst. f. Mathem. in d. Naturwissenschaften, Leipzig, Sachsen, Germany
Wolfgang Hackbusch
Fachbereich Mathematik, Technische Universität Kaiserslautern, Kaiserslautern, Germany
Klaus Ritter
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
Reinhold Schneider
Seminar für Angewandte Mathematik, ETH Zürich, Zürich, Switzerland
Christoph Schwab
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
Harry Yserentant

Cite this chapter

Garcke, J., Klompmaker, I. (2014). Adaptive Sparse Grids in Reinforcement Learning. In: Dahlke, S., et al. Extraction of Quantifiable Information from Complex Systems. Lecture Notes in Computational Science and Engineering, vol 102. Springer, Cham. https://doi.org/10.1007/978-3-319-08159-5_9

DOI: https://doi.org/10.1007/978-3-319-08159-5_9
Published: 30 September 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08158-8
Online ISBN: 978-3-319-08159-5
eBook Packages: Mathematics and Statistics (R0)
