State Representation Learning from Demonstration

Conference paper in Machine Learning, Optimization, and Data Science (LOD 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12566)

Abstract

Robots could learn a representation of their own state and of their environment from perception, experience, and observation, without supervision. This desirable goal is the main focus of our field of interest, State Representation Learning (SRL). Indeed, a compact state representation helps a robot make sense of its environment in order to interact with it, and the properties of this representation have a strong impact on the adaptive capability of the agent. Our approach uses imitation learning from demonstration to build a representation shared across multiple tasks in the same environment. The imitation learner is a multi-head neural network in which a shared state representation feeds one task-specific agent per task. As expected, generalization demands task diversity during training to obtain better transfer learning effects. Our experiments show that our method compares favorably with other SRL strategies and that end-to-end Reinforcement Learning (RL) is more efficient with our shared representation than with independently learned tasks.
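
A minimal sketch may make the architecture concrete: a shared encoder maps observations to a compact state representation, which then feeds one small policy head per task. Only the shared-trunk/multi-head structure and the 24-dimensional state (see note 2 below) come from the paper; all layer sizes, names, and module choices are illustrative assumptions, not the authors' published code.

    import torch
    import torch.nn as nn

    class MultiHeadImitation(nn.Module):
        """Shared state encoder feeding one task-specific policy head per task.

        Hypothetical sketch for exposition only."""

        def __init__(self, obs_dim: int, action_dim: int, n_tasks: int,
                     state_dim: int = 24):
            super().__init__()
            # Shared trunk: maps raw observations to a compact state representation.
            self.encoder = nn.Sequential(
                nn.Linear(obs_dim, 256), nn.ReLU(),
                nn.Linear(256, state_dim),
            )
            # One small head per task; every head reads the same shared state.
            self.heads = nn.ModuleList(
                nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                              nn.Linear(64, action_dim))
                for _ in range(n_tasks)
            )

        def forward(self, obs: torch.Tensor, task_id: int) -> torch.Tensor:
            state = self.encoder(obs)           # shared representation
            return self.heads[task_id](state)   # task-specific action prediction

Training all heads jointly forces the encoder to produce a representation that serves every task, which is the transfer effect the abstract refers to.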

Notes

  1. Roughly, different tasks refer to objectives of different natures, whereas different instances of a task differ only in the task's parameters. For example, reaching various locations with a robotic arm corresponds to different instances of the same reaching task.

  2. The state dimension of 24 was chosen empirically: it is not very large, yet it leads to good RL results.
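
To illustrate how such a state dimension enters training, here is a hypothetical behavioral-cloning update for the multi-head sketch given after the abstract, with demonstrations pooled from several tasks. The MSE loss, Adam settings, and placeholder dimensions are assumptions, not the authors' exact setup.

    import torch
    import torch.nn.functional as F

    # Reuses the MultiHeadImitation sketch above; all dimensions are placeholders
    # except state_dim=24, the value reported in the paper's note.
    model = MultiHeadImitation(obs_dim=48, action_dim=7, n_tasks=4, state_dim=24)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    def bc_step(obs: torch.Tensor, expert_actions: torch.Tensor,
                task_id: int) -> float:
        """One behavioral-cloning step: regress the task head's output onto the
        demonstrated actions; gradients also flow into the shared encoder."""
        pred = model(obs, task_id)
        loss = F.mse_loss(pred, expert_actions)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()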

Acknowledgments

This article was supported within the Labex SMART, backed by French state funds managed by the ANR under the Investissements d'Avenir program (references ANR-11-LABX-65 and ANR-18-CE33-0005 HUSKI). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

Author information

Correspondence to Astrid Merckling.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Merckling, A., Coninx, A., Cressot, L., Doncieux, S., Perrin, N. (2020). State Representation Learning from Demonstration. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol 12566. Springer, Cham. https://doi.org/10.1007/978-3-030-64580-9_26

  • DOI: https://doi.org/10.1007/978-3-030-64580-9_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64579-3

  • Online ISBN: 978-3-030-64580-9

  • eBook Packages: Computer Science, Computer Science (R0)
