Assume-Guarantee Reinforcement Learning

Authors

  • Milad Kazemi, King's College London, UK
  • Mateo Perez, University of Colorado Boulder, USA
  • Fabio Somenzi, University of Colorado Boulder, USA
  • Sadegh Soudjani, Max Planck Institute for Software Systems, Germany
  • Ashutosh Trivedi, University of Colorado Boulder, USA
  • Alvaro Velasquez, University of Colorado Boulder, USA

DOI:

https://doi.org/10.1609/aaai.v38i19.30116

Keywords:

General

Abstract

We present a modular approach to reinforcement learning (RL) in environments consisting of simpler components evolving in parallel. A monolithic view of such modular environments may be prohibitively large to learn, or may require unrealizable communication between the components in the form of a centralized controller. Our proposed approach is based on the assume-guarantee paradigm where the optimal control for the individual components is synthesized in isolation by making assumptions about the behaviors of neighboring components, and providing guarantees about their own behavior. We express these assume-guarantee contracts as regular languages and provide automatic translations to scalar rewards to be used in RL. By combining local probabilities of satisfaction for each component, we provide a lower bound on the probability of satisfaction of the complete system. By solving a Markov game for each component, RL can produce a controller for each component that maximizes this lower bound. The controller utilizes the information it receives through communication, observations, and any knowledge of a coarse model of other agents. We experimentally demonstrate the efficiency of the proposed approach on a variety of case studies.

Published

2024-03-24

How to Cite

Kazemi, M., Perez, M., Somenzi, F., Soudjani, S., Trivedi, A., & Velasquez, A. (2024). Assume-Guarantee Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21223-21231. https://doi.org/10.1609/aaai.v38i19.30116

Section

AAAI Technical Track on Safe, Robust and Responsible AI Track