Exploration/Exploitation in Stochastic Distributed Constraint Optimization Settings

Pfrommer, Julius

doi:10.1007/978-3-319-19629-9_28

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 372)

Abstract

In recent years, important results have been achieved in decision-making under uncertainty, where an action yields a direct reward and also has long-term ramifications: it brings in additional information that can be used to improve future decisions. We propose a line of work that applies this exploration/exploitation tradeoff to distributed settings with interacting, independent agents.
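
As an illustration of the single-agent form of this tradeoff, the following is a minimal sketch of a UCB1-style index policy on a multi-armed bandit. It is not the method proposed in the chapter, and it does not cover the distributed setting. The function names `ucb1_select` and `run_bandit`, the Bernoulli reward model, and the arm probabilities are assumptions chosen for illustration only.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Return the arm index that maximizes the UCB1 score at round t."""
    # Play every arm once first so that each empirical mean is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Exploitation term (empirical mean) plus an exploration bonus that
    # shrinks as an arm is sampled more often.
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(success_probs, rounds, seed=0):
    """Simulate UCB1 on hypothetical Bernoulli arms; return pull counts and means."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    means = [0.0] * len(success_probs)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean reward of the pulled arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Illustrative arm probabilities (assumed, not taken from the chapter).
counts, means = run_bandit([0.2, 0.5, 0.8], rounds=2000)
```

Each pull both earns a reward and tightens the estimate for that arm, so the bonus term directs early pulls toward poorly known arms and later pulls toward the arm with the best empirical mean.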

Author information

Authors and Affiliations

Fraunhofer Institute of Optronics, System Technologies and Image Exploitation, Karlsruhe, Germany
Julius Pfrommer

Corresponding author

Correspondence to Julius Pfrommer.

Editor information

Editors and Affiliations

Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid, Madrid, Spain
Javier Bajo
Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid, Madrid, Spain
Josefa Z. Hernández
Lille University of Science and Technology, Villeneuve d’Ascq Cédex, France
Philippe Mathieu
Department of Computer Science, Dartmouth College, Hanover, USA
Andrew Campbell
Computing Systems Department, Universidad de Castilla-La Mancha ESII, Albacete, Spain
Antonio Fernández-Caballero
Departamento de Informática y Automática, Universidad de Salamanca, Salamanca, Spain
María N. Moreno
Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain
Vicente Julián
Computer Science Department, Laboratory for Research and Development in Artificial Intelligence (LIDIA), University of A Coruña, A Coruña, Spain
Amparo Alonso-Betanzos
Facultat de Lletres de la Universitat Rovira i Virgili, Tarragona, Spain
María Dolores Jiménez-López
Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain
Vicente Botti

About this paper

Cite this paper

Pfrommer, J. (2015). Exploration/Exploitation in Stochastic Distributed Constraint Optimization Settings. In: Bajo, J., et al. Trends in Practical Applications of Agents, Multi-Agent Systems and Sustainability. Advances in Intelligent Systems and Computing, vol 372. Springer, Cham. https://doi.org/10.1007/978-3-319-19629-9_28

DOI: https://doi.org/10.1007/978-3-319-19629-9_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19628-2
Online ISBN: 978-3-319-19629-9
eBook Packages: Engineering, Engineering (R0)
