Abstract
In recent years, important results have been achieved in decision-making in uncertain environments, where an action yields a direct reward and also has long-term ramifications, because it brings in additional information that can improve future decisions. We propose a line of work in which this exploration/exploitation tradeoff is applied to distributed settings with interacting independent agents.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pfrommer, J. (2015). Exploration/Exploitation in Stochastic Distributed Constraint Optimization Settings. In: Bajo, J., et al. Trends in Practical Applications of Agents, Multi-Agent Systems and Sustainability. Advances in Intelligent Systems and Computing, vol 372. Springer, Cham. https://doi.org/10.1007/978-3-319-19629-9_28
DOI: https://doi.org/10.1007/978-3-319-19629-9_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19628-2
Online ISBN: 978-3-319-19629-9
eBook Packages: Engineering, Engineering (R0)