Assume-Guarantee Reinforcement Learning

Authors

  • Milad Kazemi, King's College London, UK
  • Mateo Perez, University of Colorado Boulder, USA
  • Fabio Somenzi, University of Colorado Boulder, USA
  • Sadegh Soudjani, Max Planck Institute for Software Systems, Germany
  • Ashutosh Trivedi, University of Colorado Boulder, USA
  • Alvaro Velasquez, University of Colorado Boulder, USA

DOI:

https://doi.org/10.1609/aaai.v38i19.30116

Keywords:

General

Abstract

We present a modular approach to reinforcement learning (RL) in environments consisting of simpler components evolving in parallel. A monolithic view of such modular environments may be prohibitively large to learn, or may require unrealizable communication between the components in the form of a centralized controller. Our proposed approach is based on the assume-guarantee paradigm where the optimal control for the individual components is synthesized in isolation by making assumptions about the behaviors of neighboring components, and providing guarantees about their own behavior. We express these assume-guarantee contracts as regular languages and provide automatic translations to scalar rewards to be used in RL. By combining local probabilities of satisfaction for each component, we provide a lower bound on the probability of satisfaction of the complete system. By solving a Markov game for each component, RL can produce a controller for each component that maximizes this lower bound. The controller utilizes the information it receives through communication, observations, and any knowledge of a coarse model of other agents. We experimentally demonstrate the efficiency of the proposed approach on a variety of case studies.

Published

2024-03-24

How to Cite

Kazemi, M., Perez, M., Somenzi, F., Soudjani, S., Trivedi, A., & Velasquez, A. (2024). Assume-Guarantee Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21223-21231. https://doi.org/10.1609/aaai.v38i19.30116

Section

AAAI Technical Track on Safe, Robust and Responsible AI Track