Abstract
Communication is essential for coordination in multi-agent systems. By sharing local observations and intentions over a communication channel, agents can better handle dynamic environments and thus make better decisions. However, because channel capacity is limited, agents must use fewer communication resources to transmit more informative messages. In this article, we propose a two-level hierarchical multi-agent reinforcement learning algorithm that operates on a different timescale at each level. Communication happens only between high levels, at a coarser timescale, to generate sub-goals that convey each agent's intention to the low level. The low level then implements these sub-goals by controlling primitive actions at every tick of the environment. Sub-goals are the core of this hierarchical communication architecture: they require the high level to communicate efficiently and to provide guidance for the low level to coordinate. This architecture offers several benefits: 1) it coarsens the collaborative granularity and reduces communication requirements, since communication happens only at the high level on a larger timescale; 2) it lets the high level focus on coordinating goals without attending to their implementation, which improves communication efficiency; and 3) it achieves better control by decomposing a complex multi-agent cooperative task into multiple single-agent tasks. In experiments, we apply our approach to vehicle collision avoidance tasks and achieve better performance than the baselines.
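The two-timescale control loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the policies here are toy stand-ins (the high level emits a sub-goal by averaging an agent's own observation with the messages it receives; the low level steps toward its sub-goal), and all names (`high_level_policy`, `low_level_policy`, `run_episode`, `K`) are assumptions chosen for illustration. What the sketch does show is the structural point of the abstract: inter-agent communication occurs only every `K` ticks at the high level, while primitive actions are taken at every tick.

```python
# Sketch of the two-timescale hierarchy: high-level communication every K
# ticks produces sub-goals; the low level acts on every environment tick.
# All names and the toy policies are illustrative, not the paper's API.

K = 4  # high-level (communication) timescale, in environment ticks


def high_level_policy(obs, messages):
    """Combine own observation with messages from other agents
    and emit a sub-goal (here: the mean of all values)."""
    values = [obs] + messages
    return sum(values) / len(values)


def low_level_policy(obs, subgoal):
    """Pick a primitive action that moves the observation toward
    the current sub-goal (+1 / -1 on a 1-D toy state)."""
    return 1 if subgoal > obs else -1


def run_episode(init_obs, steps=12):
    obs = list(init_obs)
    subgoals = [0.0] * len(obs)
    for t in range(steps):
        if t % K == 0:
            # Communication only at the coarse timescale: each agent
            # receives the other agents' observations as messages.
            msgs = [[o for j, o in enumerate(obs) if j != i]
                    for i in range(len(obs))]
            subgoals = [high_level_policy(obs[i], msgs[i])
                        for i in range(len(obs))]
        # Low level runs at every tick, conditioned on the fixed sub-goal.
        actions = [low_level_policy(obs[i], subgoals[i])
                   for i in range(len(obs))]
        obs = [o + a * 0.5 for o, a in zip(obs, actions)]
    return obs
```

With two agents starting at 0.0 and 4.0, both drift toward the shared sub-goal of 2.0 while exchanging messages only every fourth tick, mirroring the claim that coarsening the communication timescale still suffices to coordinate low-level behavior.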
This work was supported in part by the Natural Science Foundation of China under Grant 61902035, Grant 61876023, Grant 62001054 and Grant 62102041, and in part by the Fundamental Research Funds for the Central Universities.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Liu, S., Yuan, Q., Chen, B., Luo, G., Li, J. (2022). Cooperative Multi-agent Reinforcement Learning with Hierachical Communication Architecture. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13530. Springer, Cham. https://doi.org/10.1007/978-3-031-15931-2_2
Print ISBN: 978-3-031-15930-5
Online ISBN: 978-3-031-15931-2