Abstract
We propose a model-based online reinforcement learning approach for continuous domains with deterministic transitions that uses a spatially adaptive sparse grid in the planning stage. Model learning employs Gaussian process regression, which allows a low sample complexity. The adaptive sparse grid is introduced to represent the value function in the planning stage in higher-dimensional state spaces. This work gives numerical evidence that adaptive sparse grids are applicable to reinforcement learning.
Notes
1. We call a discrete space \(V_{\underline{k}}\) smaller than a space \(V_{\underline{l}}\) if \(\forall t: k_{t} \leq l_{t}\) and \(\exists t: k_{t} < l_{t}\). In the same way, a grid \(\varOmega_{\underline{k}}\) is smaller than a grid \(\varOmega_{\underline{l}}\).
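The ordering in the note above can be checked componentwise on the level multi-indices. The following is a minimal sketch, assuming each level multi-index is given as a sequence of integers. The function name `is_smaller` is hypothetical and does not come from the chapter.

```python
from typing import Sequence


def is_smaller(k: Sequence[int], l: Sequence[int]) -> bool:
    """Return True if level multi-index k is smaller than l.

    Illustrative helper (name and signature are assumptions):
    k is smaller than l if k_t <= l_t for every dimension t
    and k_t < l_t for at least one dimension t.
    """
    if len(k) != len(l):
        raise ValueError("level multi-indices must have the same dimension")
    all_leq = all(kt <= lt for kt, lt in zip(k, l))
    some_less = any(kt < lt for kt, lt in zip(k, l))
    return all_leq and some_less


# Example level multi-indices in two dimensions.
print(is_smaller((1, 2), (2, 2)))  # strictly smaller in the first dimension
print(is_smaller((2, 2), (2, 2)))  # equal indices are not "smaller"
print(is_smaller((1, 3), (2, 2)))  # incomparable: neither is smaller
```

Note that this is only a partial order: for pairs such as (1, 3) and (2, 2), neither space is smaller than the other.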
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Garcke, J., Klompmaker, I. (2014). Adaptive Sparse Grids in Reinforcement Learning. In: Dahlke, S., et al. Extraction of Quantifiable Information from Complex Systems. Lecture Notes in Computational Science and Engineering, vol 102. Springer, Cham. https://doi.org/10.1007/978-3-319-08159-5_9
DOI: https://doi.org/10.1007/978-3-319-08159-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08158-8
Online ISBN: 978-3-319-08159-5
eBook Packages: Mathematics and Statistics (R0)