When is Agnostic Reinforcement Learning Statistically Tractable?

Jia, Zeyu; Li, Gene; Rakhlin, Alexander; Sekhari, Ayush; Srebro, Nathan

arXiv:2310.06113 (cs)

[Submitted on 9 Oct 2023]

Abstract: We study the problem of agnostic PAC reinforcement learning (RL): given a policy class $\Pi$, how many rounds of interaction with an unknown MDP (with a potentially large state and action space) are required to learn an $\epsilon$-suboptimal policy with respect to $\Pi$? Towards that end, we introduce a new complexity measure, called the \emph{spanning capacity}, that depends solely on the set $\Pi$ and is independent of the MDP dynamics. With a generative model, we show that for any policy class $\Pi$, bounded spanning capacity characterizes PAC learnability. However, for online RL, the situation is more subtle. We show there exists a policy class $\Pi$ with a bounded spanning capacity that requires a superpolynomial number of samples to learn. This reveals a surprising separation for agnostic learnability between generative access and online access models (as well as between deterministic/stochastic MDPs under online access). On the positive side, we identify an additional \emph{sunflower} structure, which in conjunction with bounded spanning capacity enables statistically efficient online RL via a new algorithm called POPLER, which takes inspiration from classical importance sampling methods as well as techniques for reachable-state identification and policy evaluation in reward-free exploration.
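
One way to read the agnostic PAC goal stated in the abstract, written as a sketch: here $V^{\pi}$ (the expected cumulative reward of policy $\pi$ in the unknown MDP), the learner's output $\hat{\pi}$, and the confidence parameter $\delta$ are notation assumed for illustration and do not appear in the abstract itself.

$$
\Pr\Big[\, V^{\hat{\pi}} \;\ge\; \max_{\pi \in \Pi} V^{\pi} - \epsilon \,\Big] \;\ge\; 1 - \delta
$$

Under this reading, "agnostic" means that no policy in $\Pi$ is assumed to be optimal for the MDP: the benchmark is only the best policy within $\Pi$, and the sample complexity is the number of interaction rounds needed to meet the guarantee.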

Comments:	Accepted to NeurIPS 2023
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2310.06113 [cs.LG]
	(or arXiv:2310.06113v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.06113

Submission history

From: Gene Li
[v1] Mon, 9 Oct 2023 19:40:54 UTC (1,180 KB)
