Symbol grounding through robotic manipulation in cognitive systems

https://doi.org/10.1016/j.robot.2007.07.011

Abstract

Though proposals have been put forth to solve the classical symbol grounding problem through robotic sensorimotor interactions, little progress has been made in this direction with actual working systems, and symbol grounding through physical interaction has rarely been addressed. We address this problem in the context of robotic manipulation for cognitive systems, and claim that there are symbols which do not refer simply to physical objects but rather to the embodied interactions between the robot and the objects in its environment. Through the description of two manipulation experiments we offer a proposal on which to build a theory of symbolic representations for physical interactions. We also briefly describe some important neuroscience studies that support our view.

Introduction

The symbol grounding problem is a classical challenge for cognitive science [1]: the symbols in a symbol system are systematically interpretable as meaning something; however, in a traditional AI system, that interpretation is not intrinsic to the system; it is always given by an external interpreter (e.g., the designer of the system). Neither the symbol system in itself nor the computer, as an implementation of the symbol system, can ground its symbols in anything other than more symbols. And yet, when we reason, unlike computers, we use symbol systems that need no external interpreter to have meaning. The meanings of our thoughts are intrinsic: the connection between our thoughts and their meanings is direct and causal, and cannot be mediated by an interpreter. To assume that they are interpretable by someone else would lead to an infinite regress. Some authors have speculated about robotic sensorimotor interactions as a solution to this paradox [2], [3]; they claim that in a cognitive robotic system the symbols could be grounded in the system’s own capacity to interact physically with what its symbols are about. Such a system should be able to perceive, manipulate, recognize, categorize, modify, and reason about the real-world objects and situations that it encounters. In this way, its symbols would be grounded in the same sense that a person’s symbols are grounded, because they are about precisely those objects and situations. If we think of a symbol that corresponds to a word, for instance, we ground it when we first learn our mother tongue through interaction with the outer world, because we obviously cannot ground it in more words. Our symbol systems are thus directly related to our experience, and the meaning of symbols depends on our sensory abilities and the way we can interact with the world. For example, for a blind child the meanings of his/her symbol system necessarily differ from those of a child with intact vision, because his/her interaction with the world is impaired [4].

However, though robotics has advanced significantly in recent years, little progress has been made in symbol grounding by means of actual working systems, and most of it relates to the visual recognition of individual objects. In this context, the term anchoring has been coined to refer to the symbolic representation of perceivable physical objects [5], whereas symbol grounding through actual sensorimotor interaction within the manipulation context has rarely been addressed [6]. We claim that there is an additional class of symbols, fundamental for cognitive robotics, which do not refer simply to physical objects but rather to the embodied physical interactions between the robot itself and the objects in the world. Such symbols are more closely related to sensorimotor action, and they seem to have appeared in the evolutionary landscape long before vision as “sight” [7].

This article addresses this rather neglected issue, namely symbol grounding through robotic manipulation in cognitive systems. Through the description of two grasping and manipulation experiments we offer a proposal to bridge the gap between robotics and cognitive science, in which the symbols related to the interaction between the agent and the target object can be inferred from the execution of a planned action. The robotic research performed by our group, summarized in this paper, serves as a base on which to build a theory of symbolic representations for physical interactions. Our proposal is supported by important neuroscience studies, especially those related to the two streams of the visual cortex and the mirror system, which are also briefly described.

In Section 2 we discuss how the distinction between “vision for action” as opposed to “vision for perception” is supported by neurophysiological findings. Then, as proof of concept, we present two working systems: in Section 3 we describe an implementation for robotic grasping, in which each symbol refers to a certain interaction between the robot hand and a physical object. Since neural networks can be a feasible mechanism for learning the invariants in the analog sensory projection on which categorization is based [8], a possible answer to the question of how to ground such symbols is the use of connectionism; in Section 4 we provide a detailed example of this approach in the context of another manipulation task: the peg-in-hole insertion problem. We conclude with Section 5 in which we discuss the implications of our contributions in the context of symbol grounding for cognitive systems.

Section snippets

Lessons from neurophysiology

The strategies employed by our brain when we interact with the world, and the more or less explicit symbolic meanings we assign to such interactions, are an insightful source of inspiration for dealing with the symbol grounding problem in artificial agents. Here we briefly review some recent neuroscience findings relevant to our purposes.

Grasping in robots: An emerging categorization of synthesized grips

Within the context of a robotic application for grasping and manipulation in a semi-structured environment, we devised a framework for characterizing candidate grips in a natural way according to their properties in relation to the execution of the grasping action. Clustering of the candidate grip configurations occurs due to implicit properties of the grips, and a symbolic meaning eventually emerges through the interaction of the agent with its world.
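As a purely illustrative sketch (not the code used in the experiment), the following fragment shows how candidate grips, described by a few hypothetical quality-related features, could be clustered without predefined labels so that grip categories emerge from the data; the feature names and values are assumptions made for the example.

```python
# Minimal sketch (not the authors' implementation): cluster candidate grips
# by assumed quality-related features so that symbolic categories such as
# "reliable" vs. "unreliable" can emerge from the data itself.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical grip descriptors: one row per candidate grip; the columns are
# assumed features (e.g. force margin, finger-spread symmetry, distance of
# the grip axis from the object centroid).
grips = np.array([
    [0.8, 0.9, 0.1],
    [0.7, 0.8, 0.2],
    [0.2, 0.3, 0.7],
    [0.1, 0.2, 0.9],
])

# Unsupervised clustering: the groups are found from the implicit properties
# of the grips, not from predefined labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(grips)
print(kmeans.labels_)  # two emergent grip categories (label order may vary)
```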

The experimental setup we used is the UJI service …

Physical interactions as symbols: The peg-in-hole case

In the previous example, the symbolic meaning extracted from the sensorimotor experience and assigned to each grip was used to compare grips and predict the reliability of upcoming actions. This section describes our approach for extracting explicit symbolic information from sensor data through the use of artificial neural networks. In this case, the symbolic relation refers to the contact states between an already grasped object and the goal position. Our study is based on the two-dimensional …
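To make the connectionist mapping concrete, here is a minimal, self-contained sketch, not the authors' actual implementation, of a small self-organizing map (in the spirit of the SOM_PAK simulations mentioned in the Acknowledgments) that maps synthetic force/torque readings onto map units; each unit could then be labelled with a contact-state symbol. The sensor features and contact-state names are assumptions made for the example.

```python
# Minimal self-organizing map sketch: sensor readings -> discrete map units
# that can be labelled with contact-state symbols. All data are synthetic.
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(5, 5), epochs=200, lr0=0.5, sigma0=2.0):
    """Train a small rectangular SOM on data (n_samples x n_features)."""
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    for t in range(epochs):
        lr = lr0 * (1.0 - t / epochs)              # decaying learning rate
        sigma = sigma0 * (1.0 - t / epochs) + 0.5  # decaying neighbourhood radius
        for x in data[rng.permutation(len(data))]:
            # best-matching unit (closest weight vector) for this sample
            bmu = np.unravel_index(np.argmin(((weights - x) ** 2).sum(-1)), (h, w))
            dist2 = ((coords - np.array(bmu)) ** 2).sum(-1)
            g = np.exp(-dist2 / (2.0 * sigma ** 2))[..., None]  # neighbourhood kernel
            weights += lr * g * (x - weights)       # pull nearby units towards the sample
    return weights

def best_matching_unit(weights, x):
    """Map coordinates of the unit whose weight vector is closest to x."""
    return np.unravel_index(np.argmin(((weights - x) ** 2).sum(-1)), weights.shape[:2])

# Hypothetical force/torque readings (fx, fy, tz) for three contact situations.
readings = np.array(
    [[0.0, 0.0, 0.0],   # no contact
     [0.9, 0.1, 0.2],   # single-point contact
     [0.8, 0.7, 0.6]]   # two-point contact
    * 20) + rng.normal(0.0, 0.05, (60, 3))

som = train_som(readings)
# The returned unit could be labelled with a symbol such as "single-point contact".
print(best_matching_unit(som, np.array([0.85, 0.12, 0.18])))
```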

Final discussion: Hand-object interactions as symbols

The two case studies presented above show how symbolic meanings can naturally arise from the physical interaction between an agent and the objects in its environment, suggesting that motor primitives indeed constitute a consistent source of symbolic knowledge. Looking once more at cutting-edge research in neurophysiology, one of the most important discoveries of the last 20 years, that of mirror neurons, supports the idea of extending the symbol concept to motor behaviors.

Mirror neurons were first found …

Conclusion

We have discussed the existence of symbols that are fundamental for robotic cognition and do not refer directly to physical objects but are grounded in the physical sensorimotor interactions between the robot itself and the objects in the world. These symbols are directly related to sensorimotor experience and allow the system to build a representation of its interaction with the surrounding environment. They seem to have appeared in the evolutionary landscape long before vision as “sight”.

Acknowledgments

This paper describes research done in the Robotic Intelligence Laboratory. Support for this laboratory is provided in part by the Spanish Ministry of Education and Science under projects DPI2004-01920 and DPI2005-08203-C02-01, by Generalitat Valenciana under project GV/2007/109, and by Fundació Caixa-Castelló under projects P1-1B2005-28 and P1-1A2006-11. The neural network simulations were done with the SOM_PAK software package, developed at the Helsinki University of Technology.

References (23)

  • A. del Pobil et al., Objects, actions and physical interactions

Eris Chinellato received his M.Sc. in Artificial Intelligence, together with the Best Student Prize, from the University of Edinburgh (UK) in 2002, and his Industrial Engineering Degree from the Università degli Studi di Padova (Italy) in 1999. He is now pursuing his Ph.D. in the Robotic Intelligence Lab of Jaume I University (Spain). His interdisciplinary research is mainly focused on, but not restricted to, the use of visual information for grasping actions in natural and artificial systems. He has published in influential journals and proceedings in robotics, neuroscience, and computational neuroscience. He has served as a reviewer and program committee member for international journals and conferences, and has collaborated with renowned scientists such as M.A. Goodale and R.B. Fisher.

Antonio Morales holds a B.S. in Computer Engineering and a Ph.D. in Computer Science, obtained from Jaume I University (Spain) in September 1996 and January 2004, respectively. During this time he has also collaborated with the Fraunhofer AIS Institute of Autonomous Intelligent Systems (Germany), the Laboratory for Perceptual Robotics of the University of Massachusetts (US), and the Institute of Computer Science and Engineering of the University of Karlsruhe (Germany). He is currently an associate professor at the Robotic Intelligence Lab in the Department of Computer Engineering and Science of Jaume I University. He is the author of over a dozen technical publications and proceedings, and has served as a reviewer for several journals and conference committees. His research interests include dexterous manipulation, sensor-guided grasping, application of computational neuroscience to robotics, mobile manipulators, service robotics, and multimedia technology.

Dr. Enric Cervera (b. 1970) completed his Ph.D. in Computer Science in 1997 and became an Associate Professor of Computer Science and Artificial Intelligence at Jaume I University in 1999. Since then, he has led several research projects funded by the Spanish Government, and he is currently collaborating in an FP6 European project. He has published several research articles in international journals, conference proceedings, and book chapters. He has served as a reviewer for several top international journals and on the program committees of several international conferences (IROS’04-06, IASTED). His present research deals with collaborative approaches to robotics and sensor-based control.

Angel P. del Pobil is Professor of Computer Science and Artificial Intelligence at Jaume I University (Spain), and founding director of the Robotic Intelligence Lab. He holds a B.S. in Physics (Electronics, 1986) and a Ph.D. in Engineering (Robotics, 1991), both from the University of Navarra (Spain). He is Co-Chair of the Research Key Area of EURON-II (European Robotics Network, 2004–2008) and has been Co-Chair of other important societies. He has over 140 publications, including nine books. He was co-organizer of several workshops and tutorials at IROS and ICRA, Program Co-Chair of IEA/AIE-98, and General Chair of the last five editions of the International Conference on Artificial Intelligence and Soft Computing (2004–2008). He has served on the program committees of 64 international conferences. He has been involved in robotics research for the last twenty years; his past and present research interests include motion planning, visually-guided grasping, humanoid robots, service robotics, mobile manipulators, internet robots, collective robotics, sensorimotor transformations, visual servoing, self-organization in robot perception, and the interplay between neurobiology and robotics.
