Beyond spatial reasoning: Challenges for ecological problem solving

Abstract: This vision piece reflects upon virtues of early computer science that arose from the scarcity and high cost of computational resources. It critically assesses divergences between real-world problems and their computational counterparts in commonsense problem solving. The paper points out the different objectives of commonsense versus scientific approaches to problem solving. It describes how natural cognitive systems exploit space and time without explicitly representing their properties and why purely computational approaches are less efficient than their natural role models, as they depend on explicit representations. We argue for investigating spatio-temporally integrated methods for spatial problem solving. We contrast these methods with sequential computational approaches that require digital twins of the environment and cannot make direct use of simultaneous spatio-temporal interactions. The paper concludes by predicting future developments in problem solving and weighing the relative merits of different routes to be taken. It advocates the translation of fundamental cognitive principles into technical robotic solutions.


Past
Once upon a time, when computers were big, their processor words short, their cycle times long, and their memory capacities tiny and expensive, computer scientists and artificial intelligence (AI) researchers strove for understanding the structural essence of problems to be solved. They investigated ways of solving problems in order to develop efficient and effective algorithms that could be run on these computers within reasonable time frames. They asked questions like:
• Which are the essential pieces of information needed to solve the problem?
• What spatial and temporal resolution is required for tackling the problem?
• What precision is needed for the different attributes?
• Can we form clusters of concepts to operate on a more abstract level?
• Can we find meaningful categories rather than working on quantities too fine to be meaningful?
• Do we need to introduce absolute reference systems to obtain objective observations? or
• How does my algorithm scale with the size of the problem?
Such questions helped protagonists tremendously to understand problems in depth and to identify problem-adequate solutions. Cognitive scientists looked for inspiration from nature where similar problems seemed to be solved in smart ways. They learned about the environmental conditions in which problems need to be solved. Over the years, computers became cheaper and ubiquitous, their memory capacities and speed became enormous, the existential need to conserve resources disappeared, the virtues of ecological computing were forgotten, and the sense of the nature of a problem often got lost. We now can implement amazing neural networks and general purpose deep learning algorithms which generate solutions to big problems without deeply understanding them. Unfortunately, in many cases nobody knows to what extent the problems they solve in the computer correspond to the problems in the world that they were supposed to solve. One reason is that the appropriateness and correctness of the problem representation usually is not thoroughly scrutinized.

Natural cognitive systems
We can learn much about smart approaches from nature by investigating spatial and temporal problem solving [14]. So, what is special about space and time? Natural cognitive agents (humans, animals, and even plants) first and foremost must deal with their spatiotemporal environments [1,7,16]. They benefit particularly from resources and social interactions in their vicinities, as these can be managed most economically in terms of perception, action, and energy. To get around in their environments and to make use of them, cognitive agents initially depend completely on local perception and action that provide information about the spatial and temporal organization of their surroundings [10]. They have neither an abstract model of the world nor ways of reasoning on the basis of such a model. Therefore these agents first of all develop a spatial and temporal understanding of the world.
The spatial conception of the world develops from coarse to fine, as initially there is no perception for fine distinctions and consequently there are no concepts for them [8].
In the course of interacting with the environment finer distinctions successively develop, for example, when a toddler discovers that not every four-legged moving object is a dog. In developing an understanding of spatio-temporal environments cognitive agents receive strong structural support from their own anatomy as well as from the environment [18]. Our perception and motor organs as well as the neural maps connecting them are spatially organized such that near things in the perceived environment correspond to near things in perceptual organs (e.g., the retina) and often to near things in the mental representation of the cortex [9,12].
This holds for most of our senses, including the visual, auditory, haptic, olfactory, and gustatory senses. In other words, perception, action, and their neural representations are largely spatially organized. Through perception, action, and mental processes they also become temporally organized, with a strong correspondence between spatial and temporal structure [16]. As perception and action operate on the external world (the spatial environment), the spatial structure of the environment participates in the spatial organization of our internal representations of the perceived world; these, in turn, form the basis of concepts about the world including dynamics and interactions.
I put forward the hypothesis that concepts acquired through perception and action in the spatial world form the basis of more abstract concepts such as strength, consumption, accessibility, wealth, etc. that cannot be perceived directly but can easily be related to perceivable features. In this way, the spatial organization of our bodies and our environments forms the foundation of what we perceive, how we think, and how we act. A notion of time evolves as a direct correlate to spatial distance as our motion modalities ensure that nearby locations will be reached more quickly and directly than more distant locations, both in locomotion and perception.
With this design we have cognitive systems that are spatio-temporally organized and whose concepts develop from coarse to fine. Various perceptual relations are intimately related to one another and therefore are strongly intertwined. For example, in a 2-dimensional image of a 3-dimensional landscape scene, the vertical dimension represents height and depth information in a geometrically systematic way; distances and orientations are interrelated in systematic ways; and time becomes accessible through speed, which in turn is strongly related to distance and perceivable through its relation to the sizes of moving objects. In other words, in the realm of our perception and actions, aspects of space and time are inherently connected; they become separated only when we shift our attention from integrated configurations and events to individual aspects, which tend to be less meaningful in terms of our overall experience.
A highly interesting aspect of this kind of organization of cognitive systems is that they can operate on the world level early on by means of their aspectually integrated systems. They may not (yet) be able to distinguish strongly related notions like "amount", "height", and "volume" [11] and therefore may not be able to answer correctly whether an amount of fluid has increased or decreased when poured from a narrow tall container into a wide flat container. Nevertheless, they may be able to perform the pouring operation and obtain a valid result ("amount remains equal"). The valid result is guaranteed by space-inherent dependencies that cannot be violated in the real world. In such ways, integrated cognitive systems correctly solve spatial problems without representing all aspects of the world or understanding all aspects of the result.
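The pouring example can be made concrete with a small sketch. In a formal model, the conservation of the amount has to be written down explicitly as a constraint, whereas the physical world enforces it without any representation. The container dimensions and the function names below (`cylinder_volume`, `height_after_pouring`) are illustrative assumptions, and idealized cylinders stand in for real containers.

```python
import math


def cylinder_volume(radius: float, height: float) -> float:
    """Volume of an idealized cylindrical liquid column."""
    return math.pi * radius ** 2 * height


def height_after_pouring(radius_from: float, height_from: float,
                         radius_to: float) -> float:
    """Fill height in the target container.

    The conservation constraint ("amount remains equal") is stated
    explicitly here; in the real world it holds without being represented.
    """
    volume = cylinder_volume(radius_from, height_from)
    return volume / (math.pi * radius_to ** 2)


# Hypothetical containers: a narrow tall one and a wide flat one.
tall_radius, tall_height = 2.0, 18.0
wide_radius = 6.0

wide_height = height_after_pouring(tall_radius, tall_height, wide_radius)

print(f"height in tall container: {tall_height:.2f}")
print(f"height in wide container: {wide_height:.2f}")
print("amounts equal:",
      math.isclose(cylinder_volume(tall_radius, tall_height),
                   cylinder_volume(wide_radius, wide_height)))
```

The height changes by the squared ratio of the radii while the amount stays the same, which is exactly the distinction between "height" and "amount" that an integrated system can act on correctly before it can describe it.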
In the process of learning about the world spatially organized systems can refine their concepts in a top-down manner, focusing on individual aspects of the integrated overall systems rather than on the wholes [4]. For example, they may distinguish between dogs and cats or between "amount", "height", and "volume". Such top-down processes from coarse to finer can proceed indefinitely, as new methods or new distinctions become available.

Formal AI systems
Formal representations and conceptions of the world in AI are structured differently: they are copies of aspects of the world ("digital twins") designed to describe the world and solutions to problems from an external (scientist's) rather than from an internal or developmental perspective. A major discovery was that we can solve problems by reasoning about assertions about the world. This turned out to be such a powerful method, not only for solving problems but also for proving the correctness of solutions (assuming the formal representation of the problems is correct!), that reasoning about problems became a universal approach to solving them. It is the standard approach of weak AI [13].
In reasoning successfully, we may have neglected that commonsense cognitive agents have different objectives than scientists. When humans (or animals) want to meet their friend, they want to establish certain spatio-temporal relations with the person, and are happy when they recognize and meet the friend. Commonsense agents do not verify a world model that leads to the conclusion that the person they encountered actually is the targeted friend (although this may be an observing scientist's objective). In other words, the commonsense agent's perspective is an internal perspective that focuses on the primary objective (establishing a relation); it does not care about how and why the target was met. On the other hand, the scientist's perspective is external and focuses on the validity of inference steps that lead to the final result; the established result itself may not have a particular significance. It is this scientist's perspective and not the commonsense reasoner's perspective that is implemented in typical AI "commonsense reasoning" systems.
Formal approaches to representing space start in an assumption-free abstract space in which we can introduce three independent (!) spatial dimensions and an additional independent temporal dimension. From the independent dimensions we build an absolute reference system that uses scales at a resolution that will cover the foreseen relevant distinctions. We represent atomic entities from which we construct compound systems in a bottom-up manner. A feature of this approach is that we may be able to derive properties of the compound from properties of its parts. Knowledge about dependencies between the compound elements (e.g., between dimensions) can be added as they become known or relevant for solving specific problems. But once the aspects to be represented and the reference system and scales to be used have been decided upon, it becomes difficult to conceptually modify or grow the system. In contemporary big data approaches, we also feed our systems with data collected under pre-selected reference frames and scales; demand-driven interactive data acquisition generally is not supported.

Towards cognitively oriented AI systems
Qualitative spatial and temporal reasoning (QSTR) [2] offers approaches to account for important aspects of natural cognitive systems in the framework of formal AI settings. In particular, QSTR supports internal reference frames, coarse-to-fine representation and processing, aggregation of neighboring relations, as well as adapting to problem-centered granularities (instead of using absolute reference frames). In this way, QSTR supports conceptual neighborhood [6] in horizontal (similarity) and vertical (abstraction) dimensions.
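One way to illustrate coarse, problem-centered granularity is with Allen's thirteen interval relations, a standard qualitative temporal calculus that is not spelled out in this paper. The sketch below classifies two intervals qualitatively and then aggregates the fine relations into coarser classes. The three-way grouping in `COARSE` and all function names are hypothetical choices made for illustration; they are not the author's method.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds from interval a to interval b."""
    a1, a2 = a
    b1, b2 = b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    # From here on the interiors of the two intervals intersect.
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"


# Hypothetical coarse granularity: aggregate neighboring fine relations.
COARSE = {
    "before": "precedes", "meets": "precedes",
    "after": "follows", "met-by": "follows",
}


def coarse_relation(a: Interval, b: Interval) -> str:
    """Coarse view: 'precedes', 'follows', or 'shares-time'."""
    return COARSE.get(allen_relation(a, b), "shares-time")


breakfast, commute, meeting = (7.0, 8.0), (8.0, 9.0), (8.5, 10.0)
print(allen_relation(breakfast, commute), coarse_relation(breakfast, commute))
print(allen_relation(commute, meeting), coarse_relation(commute, meeting))
```

No absolute time scale is needed to answer a question such as "does the commute collide with the meeting?"; the coarse relation suffices, and the finer relation is consulted only when the problem demands it.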
QSTR presents a first step towards problem solving on the basis of spatial structures and supports efficient spatial and temporal reasoning. It also points the way towards a more drastic exploration of spatial structures for spatial problem solving. In [5] we describe an approach that avoids the generation of digital twins in favor of employing spatially faithful structures that maintain inherent spatial relations and interactions rather than formally describing them. We named this approach "strong spatial cognition", as it fulfills essential requirements of strong AI [13].

Future
I predict that we will continue to virtualize real-world structures by building "digital twins" of everything we can get hold of. From a scientific point of view, this will be quite rewarding, as we will learn a lot about how our environments and our agents are structured and perceived. Solving problems by means of digital twins, however, requires much more computational effort than the mental effort needed to solve comparable everyday problems, as each piece of knowledge must be made explicit and must be computationally addressed. Presently, we can estimate and compare the effort between natural and artificial problem solutions in terms of "intelligence per kilowatt hour" [15,17], but as research proceeds, we will learn more about structural differences concerning hardware and software in neural and artificial systems and we may become more aware of unwarranted power consumption for computational processing. This will eventually lead to technical problem-solving approaches that will integrate the spatio-temporal real world in a way similar to nature.
A highly significant scientific issue we have not really tackled is the fact that with current computer architectures "digital twins" operate on linear descriptions of space. These do not maintain essential spatial structures that make spatial problem solving effective and efficient. There are two ways of addressing this issue: (1) developing computer hardware that maintains interacting spatial relations according to affordances and constraints as individual relations change (it appears conceivable that human minds incorporate such structures capable of modest imagery processes as part of their working memory); and (2) combining physical spatial structures with sequential computational control structures to directly benefit from spatial updating capabilities. The latter is the approach most commonly found in biological problem-solving systems that immerse their perception and actuation apparatus in spatio-temporal environments to make use of spatial affordances and constraints. In this way, spatial structure need not be described for reasoning; it can simply be used.
My vision is that, besides logic-based computational reasoning engines that are particularly useful for processing sophisticated scientific knowledge, we develop spatial relation-based inference engines that structurally support perception and action processes for everyday problem solving [3]. This should become a very natural approach to autonomous mobile robotics, as robots need to interact directly with their environments. I hope we will return to addressing fundamental principles of cognitive processing and understanding their ecological roles for intelligent problem solving. At times when big data is en vogue, we continue to look into the power of small data.