Abstract
The field of robot learning has made great advances in developing behaviour-learning methodologies capable of learning policies for tasks ranging from manipulation to locomotion. However, the combined learning of behaviour and robot structure, here called co-adaptation, is less well studied. Most current co-adaptation approaches rely on model-free algorithms or assume access to an a priori known dynamics model, which requires considerable human engineering. In this work, we investigate the potential of combining model-free and model-based reinforcement learning for co-adaptation problems with unknown dynamics functions. Classical model-based reinforcement learning is concerned with learning the forward dynamics of a specific agent or robot in its environment. When behaviour and morphology are learned jointly, however, each individual agent design implies its own specific dynamics function. The challenge is therefore to learn a dynamics model that generalises across the dynamics functions of different designs; in other words, the learned model approximates a multi-dynamics function with the goal of generalising between agent designs. We present a reinforcement learning algorithm that uses such a learned multi-dynamics model to co-adapt a robot's behaviour and morphology using imagined rollouts. We show that imagining transitions with a multi-dynamics model can improve the performance of model-free co-adaptation, but open challenges remain.
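The core idea of the abstract — a single dynamics model conditioned on design parameters, used to generate imagined transitions for different morphologies — can be illustrated with a minimal sketch. The architecture, names, and dimensions below are illustrative assumptions, not the paper's actual implementation: a single-hidden-layer network stands in for the learned multi-dynamics model, and `imagined_rollout` stands in for model-based trajectory generation.

```python
import numpy as np

rng = np.random.default_rng(0)

class MultiDynamicsModel:
    """Sketch of a design-conditioned forward model f(s, a, xi) -> s'.

    Concatenating the design parameters xi with state and action lets one
    model cover many morphologies instead of one model per design.
    (Illustrative only; the paper's architecture and training may differ.)
    """
    def __init__(self, state_dim, action_dim, design_dim, hidden=64):
        in_dim = state_dim + action_dim + design_dim
        self.W1 = rng.normal(0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.1, (hidden, state_dim))
        self.b2 = np.zeros(state_dim)

    def predict(self, state, action, design):
        x = np.concatenate([state, action, design])
        h = np.tanh(x @ self.W1 + self.b1)
        # Predict a state delta and add it to the current state,
        # a common parameterisation for learned dynamics models.
        return state + h @ self.W2 + self.b2

def imagined_rollout(model, policy, s0, design, horizon=5):
    """Generate an imagined trajectory for one particular design.

    The model-free learner can then be trained on these imagined
    transitions in addition to real environment interaction.
    """
    states, actions = [s0], []
    s = s0
    for _ in range(horizon):
        a = policy(s, design)
        s = model.predict(s, a, design)
        actions.append(a)
        states.append(s)
    return states, actions
```

Because both the model and the (here trivial) policy receive the design vector, the same objects can imagine rollouts for any candidate morphology by swapping `design`.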
Acknowledgments
This work was supported by the Research Council of Finland Flagship programme: Finnish Center for Artificial Intelligence FCAI and by the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) as part of Germany’s Excellence Strategy – EXC 2050/1 – Project ID 390696704 – Cluster of Excellence “Centre for Tactile Internet with Human-in-the-Loop” (CeTI) of Technische Universität Dresden.
The authors wish to acknowledge the generous computational resources provided by the Aalto Science-IT project and the CSC – IT Center for Science, Finland.
We thank the reviewers for their insightful comments, which helped improve the manuscript.
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Sliacka, M., Mistry, M., Calandra, R., Kyrki, V., Luck, K.S. (2024). Co-imagination of Behaviour and Morphology of Agents. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14505. Springer, Cham. https://doi.org/10.1007/978-3-031-53969-5_24
Print ISBN: 978-3-031-53968-8
Online ISBN: 978-3-031-53969-5