Matrix Stochastic Game with Q-learning for Multi-agent Systems

  • Conference paper
Advances in Computer Science for Engineering and Education IV (ICCSEEA 2021)

Abstract

A model of a matrix stochastic game for decision making under uncertainty is developed. A Q-learning method is proposed for solving a stochastic game with a priori unknown payoff matrices. The game problem is formulated, and a Markov recurrent method and an algorithm for solving it are described. Results of computer simulation of the stochastic game with Q-learning are obtained and analyzed. The ranges of the parameters of the game Q-method that ensure convergence to one of the Nash equilibrium points are established experimentally. As the value of the current-win discounting parameter increases, the variance of current winnings decreases, the order of change of the learning step decreases, and the convergence rate of the game Q-method increases.
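The abstract does not reproduce the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes two independent, stateless Q-learners with epsilon-greedy action selection playing a repeated two-player matrix game whose payoffs are observed with Gaussian noise. All names and parameters are hypothetical: `gamma` stands in for the discounting parameter, `beta` for the order of the learning step (step size `t ** -beta`), `noise_sd` for the spread of current winnings, and the 2x2 payoff matrices are an invented example with a single pure Nash equilibrium.

```python
import random


def is_pure_nash(payoffs_a, payoffs_b, i, j):
    """True if neither player gains by unilaterally deviating from (i, j)."""
    best_a = all(payoffs_a[i][j] >= payoffs_a[k][j] for k in range(len(payoffs_a)))
    best_b = all(payoffs_b[i][j] >= payoffs_b[i][k] for k in range(len(payoffs_b[i])))
    return best_a and best_b


def play_game(payoffs_a, payoffs_b, rounds=20000, gamma=0.5, beta=0.6,
              epsilon=0.1, noise_sd=0.1, seed=0):
    """Repeated matrix game with two independent Q-learners (illustrative only).

    The agents never read the payoff matrices directly; each one sees only
    its own noisy winnings for the pair of actions actually played.
    """
    rng = random.Random(seed)
    q_a = [0.0] * len(payoffs_a)       # player A: one Q-value per row
    q_b = [0.0] * len(payoffs_a[0])    # player B: one Q-value per column

    def greedy(q):
        return max(range(len(q)), key=q.__getitem__)

    def choose(q):
        # Epsilon-greedy: explore uniformly with probability epsilon.
        if rng.random() < epsilon:
            return rng.randrange(len(q))
        return greedy(q)

    for t in range(1, rounds + 1):
        i, j = choose(q_a), choose(q_b)
        # Noisy current winnings (assumed Gaussian perturbation).
        r_a = payoffs_a[i][j] + rng.gauss(0.0, noise_sd)
        r_b = payoffs_b[i][j] + rng.gauss(0.0, noise_sd)
        step = t ** -beta              # decaying learning step (assumed form)
        q_a[i] += step * (r_a + gamma * max(q_a) - q_a[i])
        q_b[j] += step * (r_b + gamma * max(q_b) - q_b[j])

    return greedy(q_a), greedy(q_b), q_a, q_b


if __name__ == "__main__":
    # Hypothetical payoffs: action 0 strictly dominates for both players,
    # so (0, 0) is the only pure Nash equilibrium.
    A = [[1.0, 3.0], [0.0, 2.0]]
    B = [[1.0, 0.0], [3.0, 2.0]]
    i, j, q_a, q_b = play_game(A, B)
    print("learned actions:", (i, j), "pure Nash:", is_pure_nash(A, B, i, j))
```

Varying `gamma`, `beta`, and `noise_sd` in such a sketch is one way to explore how parameter ranges affect convergence, though the behaviour of this simplified learner need not match the results reported in the paper.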

Author information

Corresponding author

Correspondence to Anatoliy Sachenko.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kravets, P., Lytvyn, V., Dobrotvor, I., Sachenko, O., Vysotska, V., Sachenko, A. (2021). Matrix Stochastic Game with Q-learning for Multi-agent Systems. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds) Advances in Computer Science for Engineering and Education IV. ICCSEEA 2021. Lecture Notes on Data Engineering and Communications Technologies, vol 83. Springer, Cham. https://doi.org/10.1007/978-3-030-80472-5_26
