Abstract
A matrix stochastic game model is developed for decision making under uncertainty. A Q-learning method is proposed for solving a stochastic game with a priori unknown payoff matrices. The game problem is formulated, and a Markov recurrent method and an algorithm for solving it are described. Computer simulation results for the stochastic game with Q-learning are presented and analyzed. The parameter ranges of the game Q-method that ensure convergence to one of the Nash equilibrium points are established experimentally. The convergence rate of the game Q-method increases as the discounting parameter for current winnings increases, the variance of current winnings decreases, and the order of change of the learning step decreases.
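To make the setting concrete, the following is a minimal sketch of two agents learning a matrix game from noisy winnings only. It is not the Markov recurrent method of the paper, whose update rule is not given in the abstract; it uses a generic stateless Q-learning update with epsilon-greedy action selection. The payoff matrices `MEAN_A` and `MEAN_B`, the helper names, and the parameters `discount`, `step_order`, `noise_std` and `epsilon` are all assumptions introduced for illustration, chosen to mirror the three quantities named above (discounting parameter, variance of current winnings, order of the learning step).

```python
import random

# Hypothetical 2x2 coordination game: mean payoffs, hidden from the agents.
# Its pure Nash equilibria are the joint actions (0, 0) and (1, 1).
MEAN_A = [[1.0, 0.0], [0.0, 1.0]]  # row player's mean payoffs
MEAN_B = [[1.0, 0.0], [0.0, 1.0]]  # column player's mean payoffs


def learning_step(t, step_order):
    """Decaying learning step t**(-step_order) for t = 1, 2, ..."""
    return t ** (-step_order)


def choose(q, epsilon, rng):
    """Epsilon-greedy choice over one agent's Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])


def is_pure_nash(i, j, payoff_a, payoff_b):
    """True if neither player gains by deviating alone from (i, j)."""
    row_ok = all(payoff_a[i][j] >= payoff_a[k][j]
                 for k in range(len(payoff_a)))
    col_ok = all(payoff_b[i][j] >= payoff_b[i][k]
                 for k in range(len(payoff_b[i])))
    return row_ok and col_ok


def play(steps=5000, discount=0.5, step_order=0.6, noise_std=0.1,
         epsilon=0.1, seed=0):
    """Run independent stateless Q-learning for both agents (assumed rule)."""
    rng = random.Random(seed)
    q_a, q_b = [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        i = choose(q_a, epsilon, rng)
        j = choose(q_b, epsilon, rng)
        # Agents observe only noisy current winnings, never the matrices.
        r_a = MEAN_A[i][j] + rng.gauss(0.0, noise_std)
        r_b = MEAN_B[i][j] + rng.gauss(0.0, noise_std)
        step = learning_step(t, step_order)
        q_a[i] += step * (r_a + discount * max(q_a) - q_a[i])
        q_b[j] += step * (r_b + discount * max(q_b) - q_b[j])
    greedy = (q_a.index(max(q_a)), q_b.index(max(q_b)))
    return greedy, q_a, q_b
```

Calling `greedy, q_a, q_b = play()` and then `is_pure_nash(greedy[0], greedy[1], MEAN_A, MEAN_B)` checks whether the learned greedy joint action is one of the equilibria. The sketch does not guarantee that outcome; varying `discount`, `noise_std` and `step_order` is one way to explore the kind of parameter dependence the abstract reports.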
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kravets, P., Lytvyn, V., Dobrotvor, I., Sachenko, O., Vysotska, V., Sachenko, A. (2021). Matrix Stochastic Game with Q-learning for Multi-agent Systems. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds) Advances in Computer Science for Engineering and Education IV. ICCSEEA 2021. Lecture Notes on Data Engineering and Communications Technologies, vol 83. Springer, Cham. https://doi.org/10.1007/978-3-030-80472-5_26
DOI: https://doi.org/10.1007/978-3-030-80472-5_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80471-8
Online ISBN: 978-3-030-80472-5
eBook Packages: Intelligent Technologies and Robotics (R0)