Abstract
The field of robot learning has made great advances in developing behaviour-learning methodologies capable of learning policies for tasks ranging from manipulation to locomotion. However, the combined learning of behaviour and robot structure, here called co-adaptation, is less well studied. Most current co-adaptation approaches rely on model-free algorithms or assume access to an a priori known dynamics model, which requires considerable human engineering. In this work, we investigate the potential of combining model-free and model-based reinforcement learning for co-adaptation problems with unknown dynamics functions. Classical model-based reinforcement learning is concerned with learning the forward dynamics of a specific agent or robot in its environment. When behaviour and morphology are learned jointly, however, each individual agent design implies its own specific dynamics function. The challenge is therefore to learn a dynamics model that generalises across the dynamics functions of different designs; in other words, the learned model approximates a multi-dynamics function with the goal of generalising between agent designs. We present a reinforcement learning algorithm that uses such a learned multi-dynamics model to co-adapt a robot's behaviour and morphology using imagined rollouts. We show that imagining transitions with a multi-dynamics model can improve the performance of model-free co-adaptation, but open challenges remain.
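The core idea of the abstract — a single dynamics model conditioned on design parameters, used to generate imagined transitions for different morphologies — can be illustrated with a minimal sketch. The architecture, names, and dimensions below are illustrative assumptions, not the paper's actual implementation: a single-hidden-layer network stands in for the learned multi-dynamics model, and `imagined_rollout` stands in for model-based trajectory generation.

```python
import numpy as np

rng = np.random.default_rng(0)

class MultiDynamicsModel:
    """Sketch of a design-conditioned forward model f(s, a, xi) -> s'.

    Concatenating the design parameters xi with state and action lets one
    model cover many morphologies instead of one model per design.
    (Illustrative only; the paper's architecture and training may differ.)
    """
    def __init__(self, state_dim, action_dim, design_dim, hidden=64):
        in_dim = state_dim + action_dim + design_dim
        self.W1 = rng.normal(0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.1, (hidden, state_dim))
        self.b2 = np.zeros(state_dim)

    def predict(self, state, action, design):
        x = np.concatenate([state, action, design])
        h = np.tanh(x @ self.W1 + self.b1)
        # Predict a state delta and add it to the current state,
        # a common parameterisation for learned dynamics models.
        return state + h @ self.W2 + self.b2

def imagined_rollout(model, policy, s0, design, horizon=5):
    """Generate an imagined trajectory for one particular design.

    The model-free learner can then be trained on these imagined
    transitions in addition to real environment interaction.
    """
    states, actions = [s0], []
    s = s0
    for _ in range(horizon):
        a = policy(s, design)
        s = model.predict(s, a, design)
        actions.append(a)
        states.append(s)
    return states, actions
```

Because both the model and the (here trivial) policy receive the design vector, the same objects can imagine rollouts for any candidate morphology by swapping `design`.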
Acknowledgments
This work was supported by the Research Council of Finland Flagship programme: Finnish Center for Artificial Intelligence FCAI and by the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) as part of Germany’s Excellence Strategy – EXC 2050/1 – Project ID 390696704 – Cluster of Excellence “Centre for Tactile Internet with Human-in-the-Loop” (CeTI) of Technische Universität Dresden.
The authors wish to acknowledge the generous computational resources provided by the Aalto Science-IT project and the CSC – IT Center for Science, Finland.
We thank the reviewers for their insightful comments, which helped improve the manuscript.
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Sliacka, M., Mistry, M., Calandra, R., Kyrki, V., Luck, K.S. (2024). Co-imagination of Behaviour and Morphology of Agents. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14505. Springer, Cham. https://doi.org/10.1007/978-3-031-53969-5_24
Print ISBN: 978-3-031-53968-8
Online ISBN: 978-3-031-53969-5