Abstract
Imitation learning allows social robots to learn new skills from human teachers without substantial manual programming, but it is difficult for robotic imitation learning systems to generalize demonstrated skills as well as human learners do. Contemporary neurocomputational approaches to imitation learning achieve limited generalization at the cost of data-intensive training, and often produce opaque models that are difficult to understand and debug. In this study, we explore the viability of developing purely neural controllers for social robots that learn to imitate by reasoning about the underlying intentions of demonstrated behaviors. We present a novel hypothetico-deductive reasoning algorithm that combines bottom-up abductive inference with top-down predictive verification and captures important aspects of human causal reasoning that are relevant to a broad range of cognitive domains. We also present NeuroCERIL, a neurocognitive architecture that implements this algorithm using only neural computations and produces generalizable and human-readable explanations for demonstrated behavior. Our empirical results demonstrate that NeuroCERIL can learn various procedural skills in a simulated robotic imitation learning domain. We also show that its causal reasoning procedure is computationally efficient, and that its memory use is dominated by highly transient short-term memories, much like human working memory. We conclude that NeuroCERIL is a viable neural model of human-like imitation learning that can improve human-robot collaboration and contribute to investigations of the neurocomputational basis of human cognition.
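The interplay of bottom-up abduction and top-down verification described above can be illustrated with a small symbolic sketch. This is not NeuroCERIL's algorithm or its neural implementation: the knowledge base `CAUSES`, the action names, and the functions `abduce`, `verify`, `explain_once`, and `explain` are all hypothetical, chosen only to show the general shape of hypothesize-then-verify reasoning over a hierarchy of intentions.

```python
from typing import Dict, List, Tuple

# Hypothetical causal knowledge (an assumption for illustration): each
# intention (cause) maps to the ordered effects it is expected to produce.
CAUSES: Dict[str, Tuple[str, ...]] = {
    "pick-up": ("reach", "grasp", "lift"),
    "put-down": ("lower", "release"),
    "move-block": ("pick-up", "put-down"),
}


def abduce(effect: str) -> List[str]:
    """Bottom-up step: propose causes whose first predicted effect matches."""
    return [c for c, effects in CAUSES.items() if effects[0] == effect]


def verify(cause: str, observed: List[str], start: int) -> bool:
    """Top-down step: check the cause's full prediction against observations."""
    predicted = CAUSES[cause]
    return tuple(observed[start:start + len(predicted)]) == predicted


def explain_once(observed: List[str]) -> List[str]:
    """One pass: replace each verified run of effects with its cause."""
    result: List[str] = []
    i = 0
    while i < len(observed):
        for cause in abduce(observed[i]):
            if verify(cause, observed, i):
                result.append(cause)
                i += len(CAUSES[cause])
                break
        else:
            # No hypothesis survived verification; keep the raw observation.
            result.append(observed[i])
            i += 1
    return result


def explain(observed: List[str]) -> List[str]:
    """Repeat passes until no further hypotheses are verified."""
    current = list(observed)
    while True:
        higher = explain_once(current)
        if higher == current:
            return current
        current = higher


demo = ["reach", "grasp", "lift", "lower", "release"]
print(explain(demo))  # ['move-block']
```

In this toy, a demonstration that cannot be fully explained (for example `["reach", "grasp"]`) is returned unchanged, since the only candidate cause fails verification.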
Data Availability
The datasets generated and/or analysed during the current study are available in the NeuroCERIL repository (https://github.com/vicariousgreg/neuroceril), which includes an implementation of the model as well as the encodings of behavioral demonstrations and model outputs generated during testing.
Notes
The model was tested on a GPU-accelerated desktop computer, which completed one million timesteps of model execution in \(\sim\)88 min using \(\sim\)20.5 GB of GPU memory.
We used slightly more complex versions of the IL and AI block stacking tasks that include more blocks and actions than those reported in [8].
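As a rough reading of the timing figure in the first note, and assuming (which the source does not state) that execution time is spread uniformly across timesteps and that the approximate 88 min figure is taken at face value:

\[
\frac{88\ \text{min} \times 60\ \text{s/min}}{10^{6}\ \text{timesteps}} = 5.28\ \text{ms per timestep}, \qquad \frac{10^{6}\ \text{timesteps}}{5280\ \text{s}} \approx 189\ \text{timesteps per second}.
\]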
References
Jones SS (2009) The development of imitation in infancy. Philos Trans R Soc B Biol Sci 364(1528):2325–2335
Meltzoff AN, Kuhl PK, Movellan J, Sejnowski TJ (2009) Foundations for a new science of learning. Science 325(5938):284–288
Ravichandar H, Polydoros AS, Chernova S, Billard A (2020) Recent advances in robot learning from demonstration. Ann Rev Control Robot Autonom Syst 3:297–330
Hussein A, Gaber MM, Elyan E, Jayne C (2017) Imitation learning: a survey of learning methods. ACM Comput Surv (CSUR) 50(2):1–35
Billard A, Calinon S, Dillmann R, Schaal S (2008) Survey: robot programming by demonstration. Springer, Technical report
Schaal S (1999) Is imitation learning the route to humanoid robots? Trends Cogn Sci 3(6):233–242
Trafton JG, Cassimatis NL, Bugajska MD, Brock DP, Mintz FE, Schultz AC (2005) Enabling effective human–robot interaction using perspective-taking in robots. IEEE Trans Syst Man Cybern Part A Syst Hum 35(4):460–470
Katz G, Huang D-W, Hauge T, Gentili R, Reggia J (2017) A novel parsimonious cause-effect reasoning algorithm for robot imitation and plan recognition. IEEE Trans Cognit Dev Syst 10(2):177–193
Bandura A (2017) Psychological modeling: conflicting theories. Transaction Publishers, New Jersey
Meltzoff AN (1995) Understanding the intentions of others: re-enactment of intended acts by 18-month-old children. Dev Psychol 31(5):838
Baldwin DA, Baird JA (2001) Discerning intentions in dynamic human action. Trends Cogn Sci 5(4):171–178
Tomasello M, Kruger AC, Ratner HH (1993) Cultural learning. Behav Brain Sci 16(3):495–511
Oztop E, Kawato M, Arbib MA (2013) Mirror neurons: functions, mechanisms and models. Neurosci Lett 540:43–55
Jackson PL, Meltzoff AN, Decety J (2006) Neural circuits involved in imitation and perspective-taking. Neuroimage 31(1):429–439
Fogassi L, Ferrari PF, Gesierich B, Rozzi S, Chersi F, Rizzolatti G (2005) Parietal lobe: from action organization to intention understanding. Science 308(5722):662–667
Köster M, Langeloh M, Kliesch C, Kanngiesser P, Hoehl S (2020) Motor cortex activity during action observation predicts subsequent action imitation in human infants. Neuroimage 218:116958
Argall BD, Chernova S, Veloso M, Browning B (2009) A survey of robot learning from demonstration. Robot Auton Syst 57(5):469–483
Lee J (2017) A survey of robot learning from demonstrations for human–robot collaboration. arXiv:1710.08789
Barros JJO, dos Santos VMF, da Silva FMTP (2015) Bimanual haptics for humanoid robot teleoperation using ROS and V-REP. In: 2015 IEEE international conference on autonomous robot systems and competitions. IEEE, pp 174–179
Fitzgerald T, Goel AK, Thomaz AL (2014) Representing skill demonstrations for adaptation and transfer. In: 2014 AAAI fall symposium series
Wu Y, Su Y, Demiris Y (2014) A morphable template framework for robot learning by demonstration: integrating one-shot and incremental learning approaches. Robot Auton Syst 62(10):1517–1530
Abbeel P, Coates A, Ng AY (2010) Autonomous helicopter aerobatics through apprenticeship learning. Int J Robot Res 29(13):1608–1639
Argall B, Browning B, Veloso M (2011) Learning mobile robot motion control from demonstrated primitives and human feedback. Robot Res 70:417–432
Ho J, Ermon S (2016) Generative adversarial imitation learning. Adv Neural Inf Process Syst 29
Osa T, Pajarinen J, Neumann G, Bagnell JA, Abbeel P, Peters J (2018) An algorithmic perspective on imitation learning. Found Trends Robot 7(1–2):1–179
MacGlashan J, Littman ML (2015) Between imitation and intention learning. In: Twenty-fourth international joint conference on artificial intelligence
Sun S-H, Noh H, Somasundaram S, Lim J (2018) Neural program synthesis from diverse demonstration videos. In: International conference on machine learning. PMLR, pp 4790–4799
Xu D, Nair S, Zhu Y, Gao J, Garg A, Fei-Fei L, Savarese S (2018) Neural task programming: learning to generalize across hierarchical tasks. In: 2018 IEEE international conference on robotics and automation (ICRA). IEEE, pp 3795–3802
Boteanu A, Kent D, Mohseni-Kabir A, Rich C, Chernova S (2015) Towards robot adaptability in new situations. In: 2015 AAAI fall symposium series
Le H, Jiang N, Agarwal A, Dudik M, Yue Y, Daumé H III (2018) Hierarchical imitation and reinforcement learning. In: Proceedings of the 35th international conference on machine learning. Proceedings of machine learning research, vol. 80, pp 2917–2926
Friesen AL, Rao RP (2010) Imitation learning with hierarchical actions. In: 2010 IEEE 9th international conference on development and learning. IEEE, pp 263–268
De Haan P, Jayaraman D, Levine S (2019) Causal confusion in imitation learning. Adv Neural Inf Process Syst 32
Zhang J, Kumor D, Bareinboim E (2020) Causal imitation learning with unobserved confounders. Adv Neural Inf Process Syst 33:12263–12274
Swamy G, Choudhury S, Bagnell D, Wu S (2022) Causal imitation learning under temporally correlated noise. In: International conference on machine learning. PMLR, pp 20877–20890
Reggia JA, Katz GE, Davis GP (2018) Humanoid cognitive robots that learn by imitating: implications for consciousness studies. Front Robot AI 5:1
Duan Y, Andrychowicz M, Stadie B, Ho J, Schneider J, Sutskever I, Abbeel P, Zaremba W (2017) One-shot imitation learning. In: Proceedings of the 31st international conference on neural information processing systems, pp 1087–1098
Liu Y, Gupta A, Abbeel P, Levine S (2018) Imitation from observation: learning to imitate behaviors from raw video via context translation. In: 2018 IEEE international conference on robotics and automation (ICRA). IEEE, pp 1118–1125
Bunel R, Hausknecht M, Devlin J, Singh R, Kohli P (2018) Leveraging grammar and reinforcement learning for neural program synthesis. arXiv:1805.04276
Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587):484–489
Kalyan A, Mohta A, Polozov O, Batra D, Jain P, Gulwani S (2018) Neural-guided deductive search for real-time program synthesis from examples. arXiv:1804.01186
Davis GP, Katz GE, Gentili RJ, Reggia JA (2021) Compositional memory in attractor neural networks with one-step learning. Neural Netw 138:78–97
Katz GE, Davis GP, Gentili RJ, Reggia JA (2019) A programmable neural virtual machine based on a fast store-erase learning rule. Neural Netw 119:10–30
Sylvester J, Reggia J (2016) Engineering neural systems for high-level problem solving. Neural Netw 79:37–52
Davis GP, Katz GE, Gentili RJ, Reggia JA (2022) NeuroLISP: high-level symbolic programming with attractor neural networks. Neural Netw 146:200–219
Katz GE, Akshay, Davis GP, Gentili RJ, Reggia JA (2021) Tunable neural encoding of a symbolic robotic manipulation algorithm. Front Neurorobot 167
Gentili RJ, Oh H, Huang D-W, Katz GE, Miller RH, Reggia JA (2015) A neural architecture for performing actual and mentally simulated movements during self-intended and observed bimanual arm reaching movements. Int J Soc Robot 7(3):371–392
Lawson AE (2000) How do humans acquire knowledge? and what does that imply about the nature of knowledge? Sci Educ 9(6):577–598
Sprenger J (2011) Hypothetico-deductive confirmation. Philos Compass 6(7):497–508
Marcum JA (2012) An integrated model of clinical reasoning: dual-process theory of cognition and metacognition. J Eval Clin Pract 18(5):954–961
Reggia JA, Peng Y (1987) Modeling diagnostic reasoning: a summary of parsimonious covering theory. Comput Methods Programs Biomed 25(2):125–134
Lawson AE (2000) The generality of hypothetico-deductive reasoning: making scientific thinking explicit. Am Biol Teach 62(7):482–495
Huang D-W, Katz G, Langsfeld J, Gentili R, Reggia J (2015) A virtual demonstrator environment for robot imitation learning. In: 2015 IEEE international conference on technologies for practical robot applications (TePRA). IEEE, pp 1–6
Erol K, Hendler JA, Nau DS (1994) UMCP: a sound and complete procedure for hierarchical task-network planning. AIPS 94:249–254
Lake BM, Ullman TD, Tenenbaum JB, Gershman SJ (2017) Building machines that learn and think like people. Behav Brain Sci 40
Hupkes D, Dankers V, Mul M, Bruni E (2020) Compositionality decomposed: How do neural networks generalise? J Artif Intell Res 67:757–795
Lake B, Baroni M (2018) Generalization without systematicity: on the compositional skills of sequence-to-sequence recurrent networks. In: International conference on machine learning, pp 2873–2882
Loula J, Baroni M, Lake B (2018) Rearranging the familiar: testing compositional generalization in recurrent networks. In: Proceedings of the 2018 EMNLP workshop BlackboxNLP: analyzing and interpreting neural networks for NLP, pp 108–114
Reggia JA, Katz GE, Davis GP (2019) Modeling working memory to identify computational correlates of consciousness. Open Philos 2(1):252–269
Lea C, Flynn MD, Vidal R, Reiter A, Hager GD (2017) Temporal convolutional networks for action segmentation and detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 156–165
Farha YA, Gall J (2019) MS-TCN: multi-stage temporal convolutional network for action segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3575–3584
Simonyan K, Zisserman A (2014) Two-stream convolutional networks for action recognition in videos. Adv Neural Inf Process Syst 27
Acknowledgements
This work was supported by ONR award N00014-19-1-2044.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Ethical approval
Our submitted work is original and has not been published or submitted elsewhere for review. There are no human/animal subjects or biological data involved in this work, and no confidential information of any kind.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Davis, G.P., Katz, G.E., Gentili, R.J. et al. NeuroCERIL: Robotic Imitation Learning via Hierarchical Cause-Effect Reasoning in Programmable Attractor Neural Networks. Int J of Soc Robotics 15, 1277–1295 (2023). https://doi.org/10.1007/s12369-023-00997-z
DOI: https://doi.org/10.1007/s12369-023-00997-z