Abstract
Reinforcement Learning (RL) holds particular promise in the emerging application domain of performance management of computing systems. In recent work, online RL yielded effective server allocation policies in a prototype data center, without explicit system models or built-in domain knowledge. This paper presents a substantially improved and more practical “hybrid” approach, in which RL trains offline on data collected while a queuing-theoretic policy controls the system. This approach avoids the potentially poor performance of live online training. Additionally, we use nonlinear function approximators instead of tabular value functions; this greatly improves scalability and, surprisingly, eliminates the need for exploratory actions. In experiments using both open-loop and closed-loop traffic as well as large switching delays, our results show significant performance improvement over state-of-the-art queuing model policies.
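The offline training loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, since the abstract does not give implementation details: the toy demand process and reward, the rule-based `initial_policy` standing in for the queuing-theoretic policy, the small tanh network standing in for the nonlinear function approximator, and the Sarsa(0)-style update. It only shows the general shape of the idea: log (state, action, reward) tuples while a fixed policy controls the system, then fit a value function to that log with no live exploration.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer tanh network mapping a feature vector to a scalar value."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def update(self, x, target, lr):
        """One gradient step on 0.5 * (predict(x) - target) ** 2."""
        h = self._hidden(x)
        err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - target
        for j, hj in enumerate(h):
            grad_pre = err * self.w2[j] * (1.0 - hj * hj)  # uses w2[j] before its update
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_pre * xi
            self.b1[j] -= lr * grad_pre
        self.b2 -= lr * err


def features(demand, servers, max_servers):
    # Assumed state-action encoding: demand in [0, 1] and the fraction of servers used.
    return [demand, servers / max_servers]


def initial_policy(demand, max_servers):
    """Hypothetical stand-in for the model-based policy: servers proportional to demand."""
    return min(max_servers, max(1, round(demand * max_servers)))


def collect_log(policy, steps, max_servers, rng):
    """Let the initial policy control a toy system and log (state, action, reward)."""
    log, demand = [], 0.5
    for _ in range(steps):
        servers = policy(demand, max_servers)
        capacity = servers / max_servers
        # Toy reward (assumption): penalty for unserved demand plus a per-server cost.
        reward = -max(0.0, demand - capacity) - 0.1 * capacity
        log.append((demand, servers, reward))
        demand = min(1.0, max(0.0, demand + rng.uniform(-0.2, 0.2)))
    return log


def train_offline(log, max_servers, gamma=0.5, lr=0.05, epochs=30, seed=0):
    """Sweep repeatedly over the logged data with a Sarsa(0)-style target."""
    q = TinyMLP(2, 8, random.Random(seed))
    for _ in range(epochs):
        for (s, a, r), (s2, a2, _) in zip(log, log[1:]):
            # The bootstrap uses the action the logged policy actually took next.
            target = r + gamma * q.predict(features(s2, a2, max_servers))
            q.update(features(s, a, max_servers), target, lr)
    return q


def greedy_action(q, demand, max_servers):
    """Pick the server count with the highest learned value for this demand."""
    return max(range(1, max_servers + 1),
               key=lambda a: q.predict(features(demand, a, max_servers)))


if __name__ == "__main__":
    rng = random.Random(1)
    log = collect_log(initial_policy, 200, 5, rng)
    q = train_offline(log, 5)
    print(greedy_action(q, 0.9, 5))
```

Because the log contains only the actions the initial policy chose, values for other actions come entirely from the network's generalization. Whether that is good enough without exploratory actions is an empirical question that a toy like this cannot settle.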
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tesauro, G., Jong, N.K., Das, R., Bennani, M.N. (2006). Improvement of Systems Management Policies Using Hybrid Reinforcement Learning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871842_80
DOI: https://doi.org/10.1007/11871842_80
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45375-8
Online ISBN: 978-3-540-46056-5
eBook Packages: Computer Science, Computer Science (R0)