Robust Imitation of a Few Demonstrations with a Backwards Model

Park, Jung Yeon; Wong, Lawson L. S.

arXiv:2210.09337 (cs)

[Submitted on 17 Oct 2022]

Abstract: Behavior cloning of expert demonstrations can learn optimal policies more sample-efficiently than reinforcement learning. However, the learned policy extrapolates poorly to states outside the demonstration data, which leads to covariate shift (the agent drifting away from the demonstrations) and compounding errors. In this work, we tackle this issue by extending the region of attraction around the demonstrations so that the agent can learn how to get back onto the demonstrated trajectories if it veers off course. We train a generative backwards dynamics model and use it to generate short imagined trajectories backwards from states in the demonstrations. By imitating both the demonstrations and these model rollouts, the agent learns the demonstrated paths and how to get back onto them. With optimal or near-optimal demonstrations, the learned policy will be both optimal and robust to deviations, with a wider region of attraction. On continuous control domains, we evaluate robustness to initial states unseen in the demonstration data. While both our method and other imitation learning baselines successfully solve the tasks for initial states in the training distribution, our method is considerably more robust to different initial states.
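
The data-augmentation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy dynamics s' = s + a, and the hand-written `toy_backward_model` (standing in for a learned generative backwards dynamics model) are all assumptions made for illustration. The sketch rolls the backwards model out for a few steps from each demonstration state and merges the imagined (state, action) pairs with the demonstrations, giving a dataset that a behavior cloning learner could then be trained on.

```python
import numpy as np


def toy_backward_model(next_state, rng, max_step=0.1):
    """Hypothetical stand-in for a learned generative backwards model.

    Assumes toy dynamics s' = s + a, so sampling an action a
    implies the predecessor state s = s' - a.
    """
    action = rng.uniform(-max_step, max_step, size=next_state.shape)
    prev_state = next_state - action
    return prev_state, action


def backward_rollouts(demo_states, backward_model, horizon,
                      rollouts_per_state, rng):
    """Generate short imagined trajectories that lead into demo states."""
    states, actions = [], []
    for demo_state in demo_states:
        for _ in range(rollouts_per_state):
            current = demo_state
            for _ in range(horizon):
                # Step backwards: which state and action lead to `current`?
                prev_state, action = backward_model(current, rng)
                states.append(prev_state)
                actions.append(action)
                current = prev_state
    return np.array(states), np.array(actions)


def build_imitation_dataset(demo_states, demo_actions, horizon=3,
                            rollouts_per_state=4, seed=0):
    """Concatenate demonstrations with imagined backward rollouts."""
    rng = np.random.default_rng(seed)
    model_states, model_actions = backward_rollouts(
        demo_states, toy_backward_model, horizon, rollouts_per_state, rng
    )
    states = np.concatenate([demo_states, model_states])
    actions = np.concatenate([demo_actions, model_actions])
    return states, actions


# Toy demonstration: move right along the x-axis in steps of 0.1.
demo_states = np.stack([np.linspace(0.0, 0.9, 10), np.zeros(10)], axis=1)
demo_actions = np.tile([0.1, 0.0], (10, 1))

states, actions = build_imitation_dataset(demo_states, demo_actions)
print(states.shape, actions.shape)
```

In this toy setting every imagined pair, replayed forwards, moves the agent one step closer to a demonstrated state, which is the sense in which such rollouts widen the region of attraction. In practice the backwards model would be learned from data and the policy fit with any supervised learner.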

Comments:	Conference on Neural Information Processing Systems (NeurIPS) 2022
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2210.09337 [cs.LG]
	(or arXiv:2210.09337v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2210.09337

Submission history

From: Jung Yeon Park
[v1] Mon, 17 Oct 2022 18:02:19 UTC (5,352 KB)
