
Goal-Conditioned Variational Autoencoder Trajectory Primitives with Continuous and Discrete Latent Codes

  • Original Research
  • Published in: SN Computer Science

Abstract

Imitation learning is an intuitive approach for teaching motion to robotic systems. Although previous studies have proposed various methods for modeling demonstrated movement primitives, a limitation of existing methods is that the shape of the trajectory is encoded in a high-dimensional space. The high dimensionality of the trajectory representation can be a bottleneck in subsequent processes, such as planning a sequence of primitive motions. We address this problem by learning a latent space of robot trajectories. If the latent variables of the trajectories can be learned, they can be used to tune a trajectory in an intuitive manner, even when the user is not an expert. We propose a framework for modeling demonstrated trajectories with a neural network that learns a low-dimensional latent space. Our network structure is built on the variational autoencoder (VAE) with discrete and continuous latent variables. We extend the structure of the existing VAE so that the decoder is conditioned on the goal position of the trajectory, allowing generalization to different goal positions. Although the inference performed by the VAE is not exact, the positioning error at the generalized goal position can be reduced to less than 1 mm by incorporating a projection onto the solution space. To cope with the need for massive training data, we use a trajectory augmentation technique inspired by the data augmentation commonly used in the computer vision community. In the proposed framework, the latent variables that encode multiple types of trajectories are learned in an unsupervised manner, whereas existing methods usually require label information to model diverse behaviors. The learned decoder can be used as a motion planner in which the user specifies the goal position and the trajectory type by setting the latent variables. The experimental results show that our neural network can be trained with a limited number of demonstrated trajectories and that interpretable latent representations can be learned.
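Two of the ideas sketched above, trajectory augmentation and the projection onto the solution space, can be illustrated with a minimal numpy example. This is our own simplified sketch, not the authors' implementation: the function names, the rigid-translation augmentation, and the linear blending schedule for the endpoint correction are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(traj, n=10, scale=0.05):
    """Create perturbed copies of a demonstrated trajectory by rigidly
    translating the whole path toward randomly shifted goal positions.
    traj: (T, d) array of waypoints."""
    copies = []
    for _ in range(n):
        shift = rng.normal(0.0, scale, size=traj.shape[1])
        copies.append(traj + shift)  # same shape, shifted endpoint
    return copies

def project_to_goal(traj, goal):
    """Project a generated trajectory so its final waypoint lands
    exactly on the requested goal: blend in the endpoint error with a
    weight growing linearly from 0 at the start to 1 at the end, so the
    start point is unchanged and the trajectory shape is preserved."""
    err = goal - traj[-1]
    w = np.linspace(0.0, 1.0, len(traj))[:, None]
    return traj + w * err
```

In this sketch, a decoder's output trajectory (however it is generated) can be passed through `project_to_goal` to eliminate the residual positioning error at the goal, which is the role the projection step plays in the framework described above.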



Notes

  1. We customized the color to show the motion more clearly.

  2. Available at https://github.com/Schlumberger/joint-vae.

  3. Please note that the frames shown in the figures are not synchronized, due to a limitation of our implementation.


Funding

T.O. was supported by JSPS KAKENHI Grant Number 19K20370, and S.I. was supported by JSPS KAKENHI Grant Numbers 18H01410 and 19K22875.

Author information

Corresponding author

Correspondence to Takayuki Osa.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Osa, T., Ikemoto, S. Goal-Conditioned Variational Autoencoder Trajectory Primitives with Continuous and Discrete Latent Codes. SN COMPUT. SCI. 1, 303 (2020). https://doi.org/10.1007/s42979-020-00324-7
