Cooperative Multi-agent Reinforcement Learning with Hierarchical Communication Architecture

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2022 (ICANN 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13530)

Abstract

Communication is essential for coordination in multi-agent systems. By sharing local observations and intentions over a communication channel, agents can better cope with a dynamic environment and thus make better decisions. However, because the channel is limited, agents must convey more informative messages with fewer communication resources. In this article, we propose a two-level hierarchical multi-agent reinforcement learning algorithm in which the two levels operate on different timescales. Communication happens only between the high-level policies, at a coarser timescale, to generate sub-goals that convey each agent's intention to its low level. The low level is responsible for carrying out these sub-goals by selecting primitive actions at every tick of the environment. Sub-goals are the core of this hierarchical communication architecture: the high level must communicate efficiently to produce them, and they in turn guide the low level to coordinate. This architecture has several benefits: 1) it coarsens the granularity of collaboration and reduces the communication requirement, since communication occurs only at the high level and at a coarser timescale; 2) it lets the high level focus on coordinating goals rather than on their implementation, which improves communication efficiency; and 3) it yields better control by dividing a complex multi-agent cooperative task into multiple single-agent tasks. In experiments, we apply our approach to vehicle collision-avoidance tasks and achieve better performance than the baselines.
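To make the two timescales concrete, the following is a minimal sketch of the control loop described above: high-level policies exchange messages and emit sub-goals only every K environment ticks, while low-level policies output primitive actions at every tick conditioned on the current sub-goal. All names, the mean-pooling of messages, the placeholder random policies, and the choice of K are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two-timescale hierarchy described in the abstract.
# Everything here (names, mean-pooled messages, placeholder policies, K) is
# an illustrative assumption, not the authors' implementation.
import numpy as np

K = 5          # high-level timescale: communicate and reset sub-goals every K ticks
GOAL_DIM = 2   # dimensionality of a sub-goal
OBS_DIM = 4    # dimensionality of a local observation
rng = np.random.default_rng(0)


def high_level_step(observations):
    """All agents' high levels exchange messages and emit sub-goals.

    A message is simply the local observation here, and sub-goals are a
    noisy function of the pooled messages; a learned communication network
    would replace both.
    """
    messages = [obs.copy() for obs in observations]   # broadcast local observations
    pooled = np.mean(messages, axis=0)                # aggregate received messages
    return [pooled[:GOAL_DIM] + 0.1 * rng.standard_normal(GOAL_DIM)
            for _ in observations]


def low_level_step(obs, subgoal):
    """One agent's low level maps (observation, sub-goal) to a primitive action."""
    # Placeholder policy: move a small step in the direction of the sub-goal.
    return 0.1 * subgoal


def run_episode(n_agents=3, horizon=20):
    observations = [rng.standard_normal(OBS_DIM) for _ in range(n_agents)]
    subgoals = [np.zeros(GOAL_DIM) for _ in range(n_agents)]

    for t in range(horizon):
        if t % K == 0:                                # coarse timescale: communicate
            subgoals = high_level_step(observations)
        actions = [low_level_step(o, g)               # fine timescale: act every tick
                   for o, g in zip(observations, subgoals)]
        # Placeholder environment transition: observations drift with the actions.
        observations = [o + np.pad(a, (0, OBS_DIM - GOAL_DIM))
                        for o, a in zip(observations, actions)]


if __name__ == "__main__":
    run_episode()
```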

This work was supported in part by the Natural Science Foundation of China under Grant 61902035, Grant 61876023, Grant 62001054 and Grant 62102041, and in part by the Fundamental Research Funds for the Central Universities.



Author information


Corresponding author

Correspondence to Shifan Liu.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, S., Yuan, Q., Chen, B., Luo, G., Li, J. (2022). Cooperative Multi-agent Reinforcement Learning with Hierarchical Communication Architecture. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13530. Springer, Cham. https://doi.org/10.1007/978-3-031-15931-2_2

  • DOI: https://doi.org/10.1007/978-3-031-15931-2_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15930-5

  • Online ISBN: 978-3-031-15931-2

  • eBook Packages: Computer Science, Computer Science (R0)
