Abstract
This paper describes the design and implementation of robotic agents for the RoboCup Simulation 2D category that learn using a recently proposed heuristic reinforcement learning algorithm, Heuristically Accelerated Q–Learning (HAQL). This algorithm allows heuristics to be used to speed up the well-known reinforcement learning algorithm Q–Learning. The HAQL algorithm is characterized by a heuristic function that influences the choice of actions. A set of empirical evaluations was conducted in the RoboCup 2D Simulator, and the experimental results show that even very simple heuristics significantly enhance the performance of the agents.
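The idea of a heuristic function that influences action choice can be illustrated with a minimal sketch. The abstract does not give the exact rule, so the code below assumes one common formulation: the heuristic value H(s, a), scaled by a weight `xi`, is added to Q(s, a) only when the greedy action is selected, while the Q–Learning update itself is left unchanged. The class name `HAQLAgent`, the parameter names, and the toy heuristic in the usage note are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


class HAQLAgent:
    """Tabular Q-Learning with a heuristic bias on action selection (sketch)."""

    def __init__(self, actions, heuristic, alpha=0.1, gamma=0.9,
                 epsilon=0.1, xi=1.0, seed=0):
        self.actions = list(actions)
        self.heuristic = heuristic      # assumed signature: H(state, action) -> float
        self.alpha = alpha              # learning rate
        self.gamma = gamma              # discount factor
        self.epsilon = epsilon          # exploration probability
        self.xi = xi                    # assumed weight of the heuristic
        self.q = defaultdict(float)     # Q-table, defaults to 0.0
        self.rng = random.Random(seed)

    def choose_action(self, state):
        # Explore with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        # Otherwise act greedily on Q(s, a) + xi * H(s, a).
        return max(
            self.actions,
            key=lambda a: self.q[(state, a)] + self.xi * self.heuristic(state, a),
        )

    def update(self, state, action, reward, next_state):
        # Standard Q-Learning update; the heuristic does not enter here.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

As a usage example, `HAQLAgent(["dribble", "kick"], lambda s, a: 1.0 if a == "kick" else 0.0, epsilon=0.0)` prefers `"kick"` while the Q-table is still empty, and the learned values can later override that preference once they exceed the heuristic bonus. Keeping the heuristic out of the update is a design choice in this sketch: it biases exploration early on without changing what the Q-values estimate.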
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Celiberto, L.A., Ribeiro, C.H.C., Costa, A.H.R., Bianchi, R.A.C. (2008). Heuristic Reinforcement Learning Applied to RoboCup Simulation Agents. In: Visser, U., Ribeiro, F., Ohashi, T., Dellaert, F. (eds) RoboCup 2007: Robot Soccer World Cup XI. RoboCup 2007. Lecture Notes in Computer Science, vol 5001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68847-1_19
DOI: https://doi.org/10.1007/978-3-540-68847-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68846-4
Online ISBN: 978-3-540-68847-1
eBook Packages: Computer Science (R0)