State Representation Learning from Demonstration

Conference paper in Machine Learning, Optimization, and Data Science (LOD 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12566)

Abstract

Robots could learn a representation of their own state and of their environment from perception, experience, and observation, without supervision. This desirable goal is the main focus of our field of interest, State Representation Learning (SRL). Indeed, a compact state representation helps a robot make sense of its environment in order to interact with it, and the properties of this representation have a strong impact on the adaptive capability of the agent. Our approach uses imitation learning from demonstration to build a representation shared across multiple tasks in the same environment. The imitation learner is a multi-head neural network in which a shared state representation feeds one task-specific agent per task. As expected, generalization demands task diversity during training to obtain better transfer learning effects. Our experiments show that our method compares favorably with other SRL strategies and that end-to-end Reinforcement Learning (RL) is more efficient with our shared representation than with independently learned tasks.
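
A minimal sketch may make the architecture concrete: a shared encoder maps observations to a compact state representation, which then feeds one small policy head per task. Only the shared-trunk/multi-head structure and the 24-dimensional state (see note 2 below) come from the paper; all layer sizes, names, and module choices are illustrative assumptions, not the authors' published code.

    import torch
    import torch.nn as nn

    class MultiHeadImitation(nn.Module):
        """Shared state encoder feeding one task-specific policy head per task.

        Hypothetical sketch for exposition only."""

        def __init__(self, obs_dim: int, action_dim: int, n_tasks: int,
                     state_dim: int = 24):
            super().__init__()
            # Shared trunk: maps raw observations to a compact state representation.
            self.encoder = nn.Sequential(
                nn.Linear(obs_dim, 256), nn.ReLU(),
                nn.Linear(256, state_dim),
            )
            # One small head per task; every head reads the same shared state.
            self.heads = nn.ModuleList(
                nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                              nn.Linear(64, action_dim))
                for _ in range(n_tasks)
            )

        def forward(self, obs: torch.Tensor, task_id: int) -> torch.Tensor:
            state = self.encoder(obs)           # shared representation
            return self.heads[task_id](state)   # task-specific action prediction

Training all heads jointly forces the encoder to produce a representation that serves every task, which is the transfer effect the abstract refers to.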

Notes

  1. Roughly, different tasks refer to objectives of different natures, whereas different instances of a task differ only in the task's parameters. For example, reaching various locations with a robotic arm corresponds to different instances of the same reaching task.

  2. The state dimension of 24 was chosen empirically: it is not very large, yet it leads to good RL results.
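
To illustrate how such a state dimension enters training, here is a hypothetical behavioral-cloning update for the multi-head sketch given after the abstract, with demonstrations pooled from several tasks. The MSE loss, Adam settings, and placeholder dimensions are assumptions, not the authors' exact setup.

    import torch
    import torch.nn.functional as F

    # Reuses the MultiHeadImitation sketch above; all dimensions are placeholders
    # except state_dim=24, the value reported in the paper's note.
    model = MultiHeadImitation(obs_dim=48, action_dim=7, n_tasks=4, state_dim=24)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    def bc_step(obs: torch.Tensor, expert_actions: torch.Tensor,
                task_id: int) -> float:
        """One behavioral-cloning step: regress the task head's output onto the
        demonstrated actions; gradients also flow into the shared encoder."""
        pred = model(obs, task_id)
        loss = F.mse_loss(pred, expert_actions)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()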

Acknowledgments

This article was supported within the Labex SMART, backed by French state funds managed by the ANR under the Investissements d'Avenir program (references ANR-11-LABX-65 and ANR-18-CE33-0005 HUSKI). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

Author information

Correspondence to Astrid Merckling.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Merckling, A., Coninx, A., Cressot, L., Doncieux, S., Perrin, N. (2020). State Representation Learning from Demonstration. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol 12566. Springer, Cham. https://doi.org/10.1007/978-3-030-64580-9_26

  • DOI: https://doi.org/10.1007/978-3-030-64580-9_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64579-3

  • Online ISBN: 978-3-030-64580-9

  • eBook Packages: Computer Science, Computer Science (R0)
