Abstract
This paper examines the application of temporal difference methods to learning a linear state value function approximation for the game of give-away checkers. Empirical results show that the TD(λ) algorithm can be used successfully to improve the quality of the playing policy in this domain. Training games against both strong and random opponents were considered. The results show that learning only from negative game outcomes improved the performance of the learning player against strong opponents.
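To make the setting concrete, the following is a minimal sketch of a TD(λ) weight update for a linear state value function over one finished game. It is not taken from the paper: the feature vectors, the step size `alpha`, the trace decay `lam`, the outcome encoding (+1 win, 0 draw, -1 loss), the use of accumulating traces, and the helper `update_on_losses_only` are all assumptions made for illustration. In particular, the paper's own evaluation function and feature set may differ.

```python
from typing import List, Sequence


def linear_value(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear state value: dot product of weights and state features."""
    return sum(w * x for w, x in zip(weights, features))


def td_lambda_episode(
    weights: Sequence[float],
    states: Sequence[Sequence[float]],
    outcome: float,
    alpha: float = 0.1,   # assumed step size
    lam: float = 0.7,     # assumed trace decay (the lambda in TD(lambda))
) -> List[float]:
    """Run one pass of online TD(lambda) over the states of a finished game.

    `states` holds the feature vector of each position seen by the learner;
    `outcome` is the final result used as the target for the last position.
    No discounting is applied (undiscounted episodic task).
    """
    w = list(weights)
    trace = [0.0] * len(w)
    for t, features in enumerate(states):
        v_t = linear_value(w, features)
        # Target is the next state's value, or the game outcome at the end.
        if t + 1 < len(states):
            target = linear_value(w, states[t + 1])
        else:
            target = outcome
        delta = target - v_t
        # Accumulating eligibility trace; the gradient of a linear value
        # function with respect to the weights is the feature vector itself.
        trace = [lam * e + x for e, x in zip(trace, features)]
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w


def update_on_losses_only(
    weights: Sequence[float],
    states: Sequence[Sequence[float]],
    outcome: float,
    alpha: float = 0.1,
    lam: float = 0.7,
) -> List[float]:
    """Hypothetical filter: apply the update only when the game was lost."""
    if outcome >= 0:
        return list(weights)
    return td_lambda_episode(weights, states, outcome, alpha, lam)
```

A call such as `td_lambda_episode([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], outcome=1.0)` returns the updated weight list, while `update_on_losses_only` leaves the weights unchanged for wins and draws. Treating a draw as a non-negative outcome is a choice made here, not something the abstract specifies.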
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mańdziuk, J., Osman, D. (2004). Temporal Difference Approach to Playing Give-Away Checkers. In: Rutkowski, L., Siekmann, J.H., Tadeusiewicz, R., Zadeh, L.A. (eds) Artificial Intelligence and Soft Computing - ICAISC 2004. ICAISC 2004. Lecture Notes in Computer Science, vol 3070. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24844-6_141
DOI: https://doi.org/10.1007/978-3-540-24844-6_141
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22123-4
Online ISBN: 978-3-540-24844-6
eBook Packages: Springer Book Archive