Robust Phi-Divergence MDPs

Ho, Chin Pang; Petrik, Marek; Wiesemann, Wolfram

arXiv:2205.14202 (math)

[Submitted on 27 May 2022 (v1), last revised 12 Jan 2023 (this version, v2)]

Abstract: In recent years, robust Markov decision processes (MDPs) have emerged as a prominent modeling framework for dynamic decision problems affected by uncertainty. In contrast to classical MDPs, which only account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, robust MDPs additionally account for ambiguity by optimizing in view of the most adverse transition kernel from a prescribed ambiguity set. In this paper, we develop a novel solution framework for robust MDPs with s-rectangular ambiguity sets that decomposes the problem into a sequence of robust Bellman updates and simplex projections. Exploiting the rich structure present in the simplex projections corresponding to phi-divergence ambiguity sets, we show that the associated s-rectangular robust MDPs can be solved substantially faster than with state-of-the-art commercial solvers as well as a recent first-order solution scheme, thus rendering them attractive alternatives to classical MDPs in practical applications.
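
Illustration (not from the paper): to make the notion of a robust Bellman update concrete, the sketch below evaluates one update for a Kullback-Leibler ambiguity set, the KL divergence being one member of the phi-divergence family. It assumes the simpler (s,a)-rectangular setting with deterministic action choice, rather than the s-rectangular setting and projection-based scheme the abstract describes, and it uses the standard dual form of the KL-constrained worst-case expectation, maximized by a plain ternary search. All names (kl_worst_case_expectation, robust_bellman_update, kappa, gamma, p_bar) are hypothetical.

```python
import math


def kl_worst_case_expectation(v, p_bar, kappa, iters=200):
    """Approximate min_p sum_i p_i * v_i  s.t.  KL(p || p_bar) <= kappa.

    Uses the dual  max_{lam > 0}  -lam * log(sum_i p_bar_i * exp(-v_i / lam)) - lam * kappa,
    which is concave in lam, so a ternary search suffices.
    """
    # A finite KL divergence forces p to vanish wherever p_bar does.
    support = [(vi, pi) for vi, pi in zip(v, p_bar) if pi > 0.0]
    nominal = sum(vi * pi for vi, pi in support)
    v_min = min(vi for vi, _ in support)
    if kappa <= 0.0 or nominal - v_min <= 0.0:
        return nominal  # no ambiguity, or v is constant on the support

    def dual(lam):
        # Shift by v_min so every exponent is non-positive (no overflow).
        s = sum(pi * math.exp(-(vi - v_min) / lam) for vi, pi in support)
        return v_min - lam * math.log(s) - lam * kappa

    # By Jensen's inequality dual(lam) <= nominal - lam * kappa, so the
    # maximizer cannot exceed (nominal - v_min) / kappa.
    lo = 1e-12
    hi = max((nominal - v_min) / kappa, 2.0 * lo)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dual(m1) < dual(m2):
            lo = m1
        else:
            hi = m2
    # The worst case can never fall below the smallest attainable value.
    return max(v_min, dual(0.5 * (lo + hi)))


def robust_bellman_update(v, rewards, p_bar, kappa, gamma):
    """One (s,a)-rectangular robust update: max over actions of the worst case.

    rewards[s][a] is the immediate reward; p_bar[s][a] is the nominal
    distribution over next states for state s and action a.
    """
    return [
        max(
            rewards[s][a] + gamma * kl_worst_case_expectation(v, p_bar[s][a], kappa)
            for a in range(len(rewards[s]))
        )
        for s in range(len(v))
    ]
```

Iterating robust_bellman_update to a fixed point gives a robust value iteration for this simplified setting; a larger kappa enlarges the ambiguity set and can only lower the resulting values.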

Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG)
Cite as:	arXiv:2205.14202 [math.OC]
	(or arXiv:2205.14202v2 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2205.14202
Journal reference:	Advances in Neural Information Processing Systems (NeurIPS), 2022

Submission history

From: Chin Pang Ho
[v1] Fri, 27 May 2022 19:08:55 UTC (304 KB)
[v2] Thu, 12 Jan 2023 11:18:41 UTC (660 KB)
