Abstract
In multiagent systems, social optimality is a desirable goal in that it maximizes the global efficiency of the system. We study the problem of coordinating on socially optimal outcomes in a population of agents, in which each agent is randomly paired with another agent from the population each round. Previous work [Hales and Edmonds 2003; Matlock and Sen 2007, 2009] mainly modifies the interaction protocol from random interaction to tag-based interaction, and focuses only on symmetric games. Moreover, in previous work the agents' decision making is usually based on evolutionary learning, which typically incurs high communication cost and high variance in the coordination rate. To address these problems, we propose an alternative social learning framework with two major contributions. First, we introduce an observation mechanism to reduce the amount of communication required among agents. Second, we base the agents' learning strategies on reinforcement learning techniques instead of evolutionary learning: each agent explicitly keeps a record of its current state and learns its optimal policy for each state independently. As a result, the learning performance is much more stable, and the framework is applicable to both symmetric and asymmetric games. The performance of this social learning framework is extensively evaluated on a testbed of two-player general-sum games and compared with previous work [Hao and Leung 2011; Matlock and Sen 2007]. The influences of different factors on the learning performance of the social learning framework are investigated as well.
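The core learning loop described above can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the observation mechanism and the evaluation protocol are omitted, and the payoff matrix, class names, and parameter values below are hypothetical. It shows only the basic idea of a population in which randomly paired agents each maintain an independent per-state Q-table and update it with a standard Q-learning rule.

```python
import random

# Hypothetical 2x2 coordination game (values chosen for illustration only):
# (0, 0) is the socially optimal outcome; mismatches pay nothing.
PAYOFF = {(0, 0): 4.0, (1, 1): 3.0, (0, 1): 0.0, (1, 0): 0.0}

class SocialLearner:
    """A sketch of a per-state reinforcement learner (not the paper's agent)."""

    def __init__(self, n_states=2, n_actions=2, alpha=0.1, epsilon=0.1):
        self.alpha, self.epsilon = alpha, epsilon
        # One row of Q-values per state; each state is learned independently.
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.state = 0

    def act(self):
        if random.random() < self.epsilon:          # epsilon-greedy exploration
            return random.randrange(len(self.q[self.state]))
        row = self.q[self.state]
        return row.index(max(row))                  # greedy action

    def update(self, action, reward):
        # Each round is a one-shot interaction, so the update has no
        # successor-state bootstrap term.
        row = self.q[self.state]
        row[action] += self.alpha * (reward - row[action])

def run(n_agents=20, rounds=2000):
    agents = [SocialLearner() for _ in range(n_agents)]
    for _ in range(rounds):
        a, b = random.sample(agents, 2)             # random pairwise matching
        ca, cb = a.act(), b.act()
        a.update(ca, PAYOFF[(ca, cb)])
        b.update(cb, PAYOFF[(cb, ca)])
    # Fraction of agents whose greedy action is the socially optimal one.
    return sum(ag.q[ag.state].index(max(ag.q[ag.state])) == 0
               for ag in agents) / n_agents

if __name__ == "__main__":
    random.seed(0)
    print(run())
```

Under this setup the population tends to converge on a common action; with the observation mechanism and per-state bookkeeping of the full framework, the paper reports more stable coordination than evolutionary-learning approaches.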
- Allison, P. D. 1992. The cultural evolution of beneficent norms. Social Forces, 279--301.
- Bowling, M. H. and Veloso, M. M. 2003. Multiagent learning using a variable learning rate. Artif. Intell. 136, 215--250.
- Brafman, R. I. and Tennenholtz, M. 2004. Efficient learning equilibrium. Artif. Intell. 159, 27--47.
- Chao, I., Ardaiz, O., and Sanguesa, R. 2008. Tag mechanisms evaluated for coordination in open multi-agent systems. In Proceedings of the 8th International Workshop on Engineering Societies in the Agents World. 254--269.
- Claus, C. and Boutilier, C. 1998. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of AAAI'98. 746--752.
- Conitzer, V. and Sandholm, T. 2006. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In Proceedings of ICML'06. 83--90.
- Crandall, J. W. and Goodrich, M. A. 2005. Learning to teach and follow in repeated games. In Proceedings of the AAAI Workshop on Multiagent Learning.
- Greenwald, A. and Hall, K. 2003. Correlated Q-learning. In Proceedings of ICML'03. 242--249.
- Hales, D. 2000. Cooperation without space or memory: Tags, groups and the Prisoner's Dilemma. In Multi-Agent-Based Simulation, Lecture Notes in Artificial Intelligence.
- Hales, D. and Edmonds, B. 2003. Evolving social rationality for MAS using "tags". In Proceedings of AAMAS'03. 497--503.
- Hao, J. Y. and Leung, H. F. 2010. Strategy and fairness in repeated two-agent interaction. In Proceedings of ICTAI'10. 3--6.
- Hao, J. Y. and Leung, H. F. 2011. Learning to achieve social rationality using tag mechanism in repeated interactions. In Proceedings of ICTAI'11. 148--155.
- Hao, J. Y. and Leung, H. F. 2012. Learning to achieve socially optimal solutions in general-sum games. In Proceedings of PRICAI'12. 88--99.
- Hoen, P. J., Tuyls, K., Panait, L., Luke, S., and Poutré, J. A. L. 2005. An overview of cooperative and competitive multiagent learning. In Proceedings of the 1st International Workshop on Learning and Adaption in Multi-Agent Systems. 1--46.
- Hogg, L. M. and Jennings, N. R. 1997. Socially rational agents. In Proceedings of the AAAI Fall Symposium on Socially Intelligent Agents. 61--63.
- Hogg, L. M. J. and Jennings, N. R. 2001. Socially intelligent reasoning for autonomous agents. IEEE Trans. Syst. Man Cybern., Part A: Syst. Humans. 381--393.
- Holland, J. H., Holyoak, K., Nisbett, R., and Thagard, P. 1986. Induction: Processes of Inference, Learning, and Discovery. MIT Press, Cambridge, MA.
- Howley, E. and O'Riordan, C. 2005. The emergence of cooperation among agents using simple fixed bias tagging. In Proceedings of the IEEE Congress on Evolutionary Computation.
- Hu, J. and Wellman, M. 1998. Multiagent reinforcement learning: Theoretical framework and an algorithm. In Proceedings of ICML'98.
- Kapetanakis, S. and Kudenko, D. 2002. Reinforcement learning of coordination in cooperative multi-agent systems. In Proceedings of AAAI'02. 326--331.
- Littman, M. 1994. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of ICML'94. 322--328.
- Matlock, M. and Sen, S. 2005. The success and failure of tag-mediated evolution of cooperation. In Proceedings of the 1st International Workshop on Learning and Adaption in Multi-Agent Systems. Springer, 155--164.
- Matlock, M. and Sen, S. 2007. Effective tag mechanisms for evolving coordination. In Proceedings of AAMAS'07. 1340--1347.
- Matlock, M. and Sen, S. 2009. Effective tag mechanisms for evolving cooperation. In Proceedings of AAMAS'09. 489--496.
- Maynard Smith, J. 1982. Evolution and the Theory of Games. Cambridge University Press, Cambridge, UK.
- Nowak, M. and Sigmund, K. 1993. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game. Nature, 56--58.
- Osborne, M. J. and Rubinstein, A. 1994. A Course in Game Theory. MIT Press, Cambridge, MA.
- Panait, L. and Luke, S. 2005. Cooperative multi-agent learning: The state of the art. Auton. Agents Multi-Agent Syst. 387--434.
- Pitt, J., Schaumeier, J., Busquets, D., and Macbeth, S. 2012. Self-organising common-pool resource allocation and canons of distributive justice. In Proceedings of SASO'12. 119--128.
- Riolo, R. and Cohen, M. D. 2001. Cooperation without reciprocity. Nature, 441--443.
- Sen, S. and Airiau, S. 2007. Emergence of norms through social learning. In Proceedings of IJCAI'07. 1507--1512.
- Verbeeck, K., Nowé, A., Parent, J., and Tuyls, K. 2006. Exploring selfish reinforcement learning in repeated games with stochastic rewards. Auton. Agents Multi-Agent Syst. 239--269.
- Villatoro, D., Sabater-Mir, J., and Sen, S. 2011. Social instruments for robust convention emergence. In Proceedings of IJCAI'11. 420--425.
- Wakano, J. Y. and Yamamura, N. 2001. A simple learning strategy that realizes robust cooperation better than Pavlov in iterated Prisoner's Dilemma. J. Ethol. 19, 9--15.
- Watkins, C. J. C. H. and Dayan, P. 1992. Q-learning. Mach. Learn. 279--292.