Cognition: Differential-geometrical View on Neural Networks

A neural network, taken as a model of a trainable system, appears to be nothing but a dynamical system evolving on a tangent bundle with a changeable metric. In other words, to learn means to change the metric of a definite manifold.


I. INTRODUCTION
The application of differential or integro-differential calculus to the modeling of dynamical and self-organizing processes in social and natural systems has been a tradition since the works of A. J. Lotka, who released the book "Elements of Physical Biology" (Baltimore, 1925), and V. Volterra, whose paper "Sulla periodicità delle fluttuazioni biologiche" appeared in 1927. Many complicated problems in mathematics, physics, astronomy, chemistry and biology find their solutions (Hilborn and Tufillaro, 1997) when the modern, carefully elaborated nonlinear-dynamical approach is applied. It combines dynamical systems theory (Katok and Hasselblatt, 1995) and category theory, topology (Akin, 1993) and differential geometry, ergodic theory (Pollicott and Michiko, 1997) and fixed point theory, combinatorics (Harper and Mandelbaum, 1985), representation theory (Vershik, 1992), domain theory (Potts, 1995), etc. This pleiad of theories works wonderfully for natural phenomena, yet is helpless as soon as one tries to apply any of them to social and cultural events.
* E-mail: pznr.rff@elefot.tsu.tomsk.ru
Today one should ask oneself whether the formalism of integro-differential equations applied in the social realm is sufficient for an adequate synergetic exposition of a phenomenon such as socioeconomic development. To be fair, the most frequent answer is going to be "no". The reason lies in man.
Models of natural, socioeconomic, political and other dynamical and self-organizing processes should take into account the anthropological factor intrinsic to them. A man, with his diverse set of behavioral patterns, enriches any kind of human-loaded phenomena (HLP) with unpredictability and enormous complexity.
In this humanitarian context, in particular, the cognitive activity of a human being appears to be the part of HLP that is almost the most difficult to explicate, and at the same time a generic feature of a carrier of cultural patterns and archetypes. In the modeling of synergetic aspects of physical, chemical and other "behavioral systems" there is no such difficulty. Therefore due regard for cognition in social-synergetic models, being an independent scientific problem in itself, is a suitable criterion of their completeness.
The article presents an elaboration of HLP models that use differential calculus by introducing a mathematical caption of cognition through the consideration of a dynamical system embedded in a manifold with an inconstant metric.
The author shows that such a system is nothing but an "intellectually and mentally inspired" neural network (Buffalov, 1998; Ripley, 1996), capable of generalizing and forecasting. It is also shown that metric alteration is actually the training of this neural network.
There is an alternative attempt by Scott and Fucks (1995) to depict some features of the human brain using the theories of attractors and Sil'nikov chaos. It gives a notion of the complexity and perpetuity of the dynamics by means of dynamical systems theory; we use the same theory together with differential geometry to show how to endow a dynamical system with intellectual and mental properties so as to make it suitable for modeling social and cultural HLP.
Intellectual systems with cognition and self-regulation usually represent a wide class of complex adaptive living beings studied by the humanitarian, medical and biological sciences. Machine learning theory (Mitchell, 1997) reflects on man-made self-training devices analogous to their biological prototypes. We address the neural network studied by this theory as one such artificial system, endowed with a synthetic intellect and cognition, suitable for the "intellectual" sophistication of ordinary differential calculus.

II. NEURAL NETWORKS
Neural networks (Ripley, 1996) are an information processing technique based on the way biological nervous systems, such as the brain, process information. The fundamental concept of neural networks is the structure of the information processing system. Composed of a large number of highly interconnected processing elements, or neurons, a neural network uses the human-like technique of learning by example to resolve problems. The neural network is configured for a specific application, such as data classification or pattern recognition, through a learning process called training. Just as in biological systems, learning involves adjustments to the synaptic connections that exist between the neurons. Neural networks can differ in: the way their neurons are connected; the specific kinds of computations their neurons do; the way they transmit patterns of activity throughout the network; and the way they learn, including their learning rate.
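As a minimal concrete illustration of learning as adjustment of synaptic connections, the following sketch (not from the source; the function names, learning rate and the AND-function example are our own illustrative choices) trains a single sigmoid neuron by example with the classical delta rule:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_neuron(samples, lr=0.5, epochs=2000):
    """Train a single sigmoid neuron with the delta rule.

    `samples` is a list of (inputs, target) pairs; the synaptic
    weights (and a bias) are the quantities adjusted by learning.
    """
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]
    b = 0.0
    for _ in range(epochs):
        for x, t in samples:
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = t - y
            grad = err * y * (1.0 - y)  # error scaled by the sigmoid's slope
            w = [wi + lr * grad * xi for wi, xi in zip(w, x)]
            b += lr * grad
    return w, b

# Learning the logical AND function by example.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_neuron(data)
outputs = [round(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b))
           for x, _ in data]
```

After training, the rounded outputs reproduce the four training targets: the information is stored entirely in the adjusted weights, exactly in the sense described above.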
In this article we use the differential-geometrical formalism to describe neural networks of a certain architecture outlined in Petritis (1995), and to implement them for the "intellectualization" of the differential formalism and of dynamical systems in particular. This approach is rather new, although there were attempts in Potts (1995), concerning forgetful neural networks, to derive the embedding strength decay rate of the stored patterns using recent advances in domain theory and topology.
We consider neural networks which can be defined as a cascade conjunction of several properly constructed layers. The typical one has the following structure (Petritis, 1995):
(1) A level of input neurons fed with a vector of external signals.
(2) A linear transformation level. Here the input vector is multiplied by a matrix of synaptic weights responsible for information storage.
(3) A nonlinear transformation level (a set of neurons with nonlinear transfer functions). Here the linearly transformed signal is nonlinearly converted.
(4) A level of output neurons.
The last layer is fed back to the first one.
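The four-level cascade with feedback can be sketched in a few lines of code; the tanh transfer function, the example weight matrix and the function names below are illustrative assumptions of ours, not the paper's own construction:

```python
import math

def tanh_vec(v):
    """Nonlinear level: apply a transfer function componentwise."""
    return [math.tanh(x) for x in v]

def matvec(G, v):
    """Linear level: multiply the input vector by the weight matrix G."""
    return [sum(gij * vj for gij, vj in zip(row, v)) for row in G]

def step(G, y, transfer=tanh_vec):
    """One pass through the cascade: input y, linear level G,
    nonlinear level `transfer`; the output is fed back as the
    next input vector."""
    return transfer(matvec(G, y))

# Iterate the feedback loop a few times with a fixed weight matrix.
G = [[0.5, -0.2], [0.1, 0.4]]
y = [1.0, 0.0]
for _ in range(5):
    y = step(G, y)
```

Training, in this picture, would consist of changing the entries of `G` between passes, which is exactly the role the metric plays in the later sections.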
The previous passage outlines the neural network's descriptive framework, giving a strict definition of its structure, which is of fundamental importance in the neural network technique. Relying on that fact, we assume that any system, including a dynamical one, which allows a description within that framework can be treated as "intellectual" and possessing cognition in the same measure as a neural network. Now one can transfer the concept of cognition to the scene of differential calculus and the theory of dynamical systems in a very simple and universal fashion: just develop a generalized description of dynamical systems in such a manner that it incorporates the neural network's description as a particular case. Such a unifying generalization will automatically assign all properties of the neural network to the dynamical system and vice versa. The context accompanying the assignment will define the differential-geometric content of cognition.

III. DIFFERENTIAL GEOMETRY BACKGROUND
A topological space is a set of points with certain subsets designated as open. It is required that an arbitrary union of open sets and the intersection of any finite number of open sets be open as well. The set itself and the empty set should be open. We will work with an important particular case of a topological space, the metric space, for any two points x and y of which there is defined a function ρ(x, y), called the distance between x and y, with the following properties:
1. ρ(x, y) = ρ(y, x);
2. ρ(x, x) = 0 and ρ(x, y) > 0 if x ≠ y;
3. the triangle inequality: ρ(x, y) ≤ ρ(x, z) + ρ(z, y).
Let M be a differentiable manifold. We say that M is a Riemannian manifold if there is an inner product g_x(·, ·) defined on each tangent space T_xM for x ∈ M, such that for any smooth vector fields X and Y on M the function x ↦ g_x(X(x), Y(x)) is a smooth function of x.
In every neighborhood U_i with local coordinates (x^1, ..., x^n) a positive-definite symmetric matrix g_ij(x^1, ..., x^n) sets a Riemannian metric, so that for any vector ξ at a point x the equality |ξ|² = Σ_{i,j} g_ij ξ^i ξ^j holds.
Given a set M, one says that there is a structure of an n-dimensional differentiable manifold on M if for each x ∈ M there exists a neighborhood U of x and a homeomorphism h from U to an open ball in R^n. We call (U, h) a chart (or system of local coordinates) about x.

A metric g_ij(y^1, ..., y^n) is said to be Euclidean if there exists a system of coordinates x^1, ..., x^n, with x^i = x^i(y^1, ..., y^n), i = 1, ..., n, and det(∂x^i/∂y^j) ≠ 0, such that

g_ij = Σ_{k=1}^{n} (∂x^k/∂y^i)(∂x^k/∂y^j).

These coordinates x^1, ..., x^n are called Euclidean.

If M is a manifold and x ∈ M is a point, then we define the tangent space to M at x (denoted T_xM) to be the set of all vectors tangent to M at x.

The tangent bundle of M, denoted TM, is defined to be the disjoint union over x ∈ M of T_xM, i.e. TM = ∪_{x∈M} T_xM. We think of TM as the set of pairs (x, v), where x ∈ M and v ∈ T_xM. The tangent bundle is in fact a manifold itself. One can introduce the cotangent bundle by considering a covector instead of a vector.

IV. DYNAMICAL SYSTEMS BACKGROUND
For the purposes of this paper a dynamical system is a topological metric space X together with a continuous vector field F. The system is denoted as the pair (X, F). Locally it is described by a system of ordinary differential equations of the first order.
There exist two principal approaches to dynamical systems which presuppose a developed theoretical base: the Lagrangian and the Hamiltonian formalisms. The first is a particular case of the second; that is why we restrict ourselves to Hamiltonian dynamical systems.
In any space R^n with coordinates (y^1, ..., y^n) and metric g_ij, i, j = 1, ..., n, it is possible to define a scalar product and the raising of indices. Thus the gradient ∇f of a function f(y^1, ..., y^n) looks like

(∇f)^i = g^{ij} ∂f/∂y^j.

The vector field ∇f has a corresponding system of differential equations

ẏ^i = (∇f)^i,

called a gradient system.
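The gradient system ẏ^i = g^{ij} ∂f/∂y^j can be illustrated numerically. In the sketch below the quadratic test function and the rescaled (non-Euclidean) inverse metric are hypothetical choices of ours; the point is only that the metric g^{ij} reshapes the flow while the equilibrium stays at the critical point of f:

```python
def grad_flow(f_grad, g_inv, y0, tau=0.01, steps=1000):
    """Integrate the gradient system  dy^i/dt = g^{ij} * df/dy^j
    by explicit Euler steps; `g_inv` is the inverse metric g^{ij}."""
    y = list(y0)
    n = len(y)
    for _ in range(steps):
        df = f_grad(y)
        # raise the index of the differential with the inverse metric
        v = [sum(g_inv[i][j] * df[j] for j in range(n)) for i in range(n)]
        y = [yi + tau * vi for yi, vi in zip(y, v)]
    return y

# f(y) = -((y1 - 1)^2 + (y2 + 2)^2): the flow converges to the maximum (1, -2).
f_grad = lambda y: [-2.0 * (y[0] - 1.0), -2.0 * (y[1] + 2.0)]
g_inv = [[1.0, 0.0], [0.0, 0.5]]  # a rescaled, non-Euclidean inverse metric
y_end = grad_flow(f_grad, g_inv, [0.0, 0.0])
```

The second coordinate relaxes at half the rate of the first because of the metric factor 0.5, yet both coordinates reach the same fixed point.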
A 2n-dimensional space with a skew-symmetric metric is called a phase space if it admits coordinates (q, p) in which the metric takes the block form

G = (  0   I
      -I   0 ),

where I is the unit matrix, p is a covector and (q, p) belongs to the cotangent bundle of a configuration manifold.
A gradient system in a phase space is called a Hamiltonian dynamical system. In general, an even-dimensional manifold (a phase space), a symplectic structure on it (the integral Poincaré invariant) and a function on it (the Hamiltonian) completely define a Hamiltonian system.

V. "INTELLECTUALIZATION" OF DYNAMICAL SYSTEMS
For the time being we avoid considering an arbitrary dynamical system and address a Hamiltonian one embedded in the cotangent bundle of a configuration manifold with the Riemannian skew-symmetric metric G = (g_ij), i, j = 1, ..., 2n. Let it be described by the Hamilton equations for the generalized coordinates q and impulses p, which can be written in the form

ẏ^i = G^{ij} F_j(y, t),   (1)

where y^i = q^i, y^{n+i} = p_i, i = 1, ..., n, and F_j(y, t) = ∂H(y, t)/∂y^j, j = 1, ..., 2n.

In the case of an arbitrary, not necessarily gradient, dynamical system

q̇^i = Q^i(y, t),   ṗ_i = P_i(y, t),   i = 1, ..., n,

in a cotangent bundle, the quantities in Eq. (1) have the following meaning: F_i(y, t) = Q_i(y, t), F_{n+i}(y, t) = P_i(y, t), and G(y, t) = (∂f^i/∂y^j), i, j = 1, ..., 2n, is the Jacobi matrix of a frame transformation (GG^T being the Euclidean metric). Equation (1) can be written in the form of a finite-difference scheme with a sufficiently small time-discretization step τ. According to the Euler method we obtain an iterative process whose nth step gives

y_n = y_{n-1} + τ G_n F(y_{n-1}, t_{n-1}).   (2)

It is easily interpreted in terms of a neural network with input vector y, linear transformation G, a nonlinear transformation, i.e. a set of transfer functions F, and a feedback signal decay rate τ. It is known from numerical methods that the accuracy of the approximation (2) can be substantially improved by adding to the right part of (2) a vector of errors calculated by the first formula of Runge,

R = (ỹ − ŷ)/(k − 1),   (3)

where ỹ and ŷ are the approximations calculated with decay rates τ/k and τ for any integer k. This procedure can be interpreted as a fruitful discussion between two neural networks with different decay rates (or "intellectual levels").
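A minimal numerical sketch of the iterative process (2) together with the Runge error correction (3). For simplicity we assume G is the identity and take the scalar test equation ẏ = y with exact solution e^t; both simplifications are ours, not the paper's:

```python
def euler(F, y0, t0, t1, tau):
    """Iterate  y_n = y_{n-1} + tau * F(y_{n-1}, t_{n-1})  -- Eq. (2),
    with the identity matrix in place of G for simplicity."""
    n = round((t1 - t0) / tau)
    y, t = y0, t0
    for _ in range(n):
        y = y + tau * F(y, t)
        t += tau
    return y

# dy/dt = y, y(0) = 1; the exact value at t = 1 is e = 2.71828...
F = lambda y, t: y
k = 2
tau = 0.001
y_coarse = euler(F, 1.0, 0.0, 1.0, tau)      # step tau
y_fine = euler(F, 1.0, 0.0, 1.0, tau / k)    # step tau/k
# First formula of Runge for a first-order method: the error of the
# finer approximation is estimated from the two runs via (k - 1).
R = (y_fine - y_coarse) / (k - 1)
y_improved = y_fine + R
```

Adding the error vector R to the finer run cancels the leading error term, which is the numerical content of the "discussion between two networks" metaphor.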
The discretization of Eq. (1) provides two ways of displaying cognition in the framework of dynamical systems by means of interpretations held in terms of neural network theory.

Mathematical caption of cognition through metric alterations: Any Hamiltonian dynamical system (X, F) evolving in the phase space X with the changeable Riemannian skew-symmetric metric G defines a neural network with the set of transfer functions F_i, i = 1, ..., 2n, and G as the matrix of synaptic weights.
Mathematical caption of cognition through frame alterations: An arbitrary non-Hamiltonian dynamical system (X, F) evolving in the metric space X, together with an inconstant Jacobi matrix G of frame transformation, defines a neural network with the set of transfer functions F_i, i = 1, ..., 2n, and G as the matrix of synaptic weights.
Resting upon one of these interpretations, one can:
- treat a dynamical system in a differentiable manifold with a changeable metric as a neural network in the course of a training process;
- give a more explicit solution of the central problem of neural network theory: the memorization of an arbitrary set of patterns and the determination of their attraction basins. With a certain network architecture in hand, this problem is solved by an appropriate choice of its transfer functions (i.e. a vector field F_i, i = 1, ..., n, which is a dynamical system in fact) and training algorithm (a law of evolution of the manifold metric). In other words, the solution is given by the correct setting up of a dynamical system (X, F), where X is a metric space with a metric G providing the best (according to a given criterion) memorization of patterns;
- make use of the rich toolkit of topology and smooth theories for the investigation of "knowledge" structures generated by the neural network that are invariant to continuous and smooth changes of coordinates, i.e. patterns remaining stable in the memory of the network under its training. Such patterns can be called unconditioned reflexes;
- address fixed point theory as the most powerful tool for the perception of patterns stable under the network's "cognitive" dynamics when recognizing, generalizing, predicting etc. In particular, these patterns can be called conditioned reflexes obtained through learning for certain external inputs;
- sophisticate and deepen the research of neural networks using Lie algebras of vector fields and the phase portrait of the trained neural network (the dynamics of its output signal during recognition etc.), namely, of the appropriate dynamical system in a curved manifold;
- generalize one's investigations by means of the categories of topological spaces and vector fields elaborated in category theory.

VI. METRIC ALTERATION VERSUS TRAINING
"Intellectualization" endows a dynamical system (X, F) with one more degree of freedom, revealed in the plasticity of the quantities defining the metric of X. This plasticity reflects the training abilities of the neural network associated with the dynamical system. Let us consider the autonomous differential equations establishing an arbitrary training algorithm, Eq. (4), in which y is defined through an integral with G in the integrand [refer to Eq. (2)].
If, close enough to the end of the training process, the integro-differential equation (4) pertaining to G is simplified, it reduces to an ordinary differential equation (see Appendix)

g̈_ij = R^{kr}_{ij}(y, t) g_kr,   (5)

where the tensor R^{kr}_{ij}(y, t) is expressed through ∂G/∂t and F_r(y, t), i, j, k, r = 1, ..., 2n.
Here and further we imply summation over all repeated (dummy) indices.
As one can see, the metric evolution equations (5) describe the motion of (2n)² coupled oscillators.

VII. SOLUTION OF THE METRIC EVOLUTION EQUATIONS
We rewrite Eqs. (5) in the concise matrix form

γ̈ = Rγ,   (6)

where γ is a vector representation of the metric tensor G = (g_ij), i, j = 1, ..., 2n, and R is an operator representation of the tensor R^{kr}_{ij}.
If R is only implicitly time-dependent and γ(y_0, t_0) = γ_0, γ̇(y_0, t_0) = γ̇_0 are the entry conditions, then Eq. (6) has a solution of the oscillatory form

γ(t) = W(y) (C_1 cos Ω(t − t_0) + C_2 sin Ω(t − t_0)),   (8)

where W(y) is the matrix of eigenvectors of R, Ω is the diagonal matrix of frequencies determined by its eigenvalues, and the constant vectors C_1, C_2 are fixed by the entry conditions.
Equation (8) is the solution of the metric evolution equation (6). It describes the complicated oscillatory dynamics of the neural network's synaptic weights defining the metric of the manifold. Such a solution is very interesting from the neuro-dynamical point of view, since it allows one to speak of the existence, in neural network theory, of an analog of the unfading oscillatory electrochemical activity of the neocortex, i.e. the brain's rhythms (Haken and Stadler, 1990).
During training the behavior of γ(y, t) is rather complicated because of the constantly varying amplitudes, frequencies and phases of the coupled harmonics in (8). But at the very moment when the neural network is trained, all these magnitudes take fixed values and no longer vary in time. The network passes into a phase of unfading oscillations whose parameters reflect the information stored by it.
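The phase of unfading oscillations can be caricatured by a toy linear system for two metric components. This is a deliberately simplified first-order stand-in for Eqs. (5)-(8), and the skew-symmetric matrix and frequency below are illustrative assumptions of ours; its only purpose is to show weight components that oscillate forever instead of settling:

```python
import math

def integrate(R, g0, tau=0.001, steps=20000):
    """Euler-integrate the linear system  d(gamma)/dt = R * gamma
    for a toy 2-component vector gamma of metric components."""
    g = list(g0)
    n = len(g)
    for _ in range(steps):
        dg = [sum(R[i][j] * g[j] for j in range(n)) for i in range(n)]
        g = [gi + tau * di for gi, di in zip(g, dg)]
    return g

# A skew-symmetric R has purely imaginary eigenvalues +/- i*omega, so the
# components oscillate with frequency omega instead of decaying or settling.
omega = 2.0
R = [[0.0, omega], [-omega, 0.0]]
g_end = integrate(R, [1.0, 0.0])
radius = math.hypot(g_end[0], g_end[1])  # stays near 1 for small tau
```

The trajectory circles the origin at a fixed amplitude: the "stored information" lives in the unchanging amplitude, frequency and phase of the oscillation, as in the trained phase described above.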

VIII. CATASTROPHE
As soon as the dynamical system (2) settles down to some fixed point y*, i.e. F(y*, t) = 0, the elements of the metric tensor (the matrix of synaptic weights) are subjected to an unbounded linear growth in time.
This becomes evident if one considers Eq. (6) with the right part set to zero. Such a catastrophic outcome occurs only if y* is a stable fixed point and the "cognitive" dynamics of the neural network fades (imagine that our brain stops functioning; impossible!). Otherwise, when y* is unstable, the output signals of the network evolve endlessly and never settle down. The catastrophe never occurs, but another problem, that of everlasting dynamics, appears.
To solve this problem, and to make the training procedure of the neural network decaying, one has to restrict the scope of the synaptic weights' evolution by choosing a special kind of dynamical system (2).
One of the possible ways, which lies in wonderful agreement with experiment, is to consider a dynamical system displaying Sil'nikov chaos (Scott and Fucks, 1995). In this case it never actually settles into a stable fixed point at all, but continuously evolves in the vicinity of a saddle focus.
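The source does not specify which system Scott and Fucks used. As a hedged stand-in, the Rössler system, a standard textbook example exhibiting Sil'nikov-type (saddle-focus) chaos for the classical parameters a = b = 0.2, c = 5.7, shows a trajectory that keeps circulating near a saddle focus without ever settling:

```python
def rossler_step(s, tau, a=0.2, b=0.2, c=5.7):
    """One explicit Euler step of the Roessler system, whose attractor
    organizes itself around a saddle focus (Sil'nikov-type chaos)."""
    x, y, z = s
    return (x + tau * (-y - z),
            y + tau * (x + a * y),
            z + tau * (b + z * (x - c)))

s = (1.0, 1.0, 1.0)
xs = []
for i in range(200000):  # integrate to t = 200 with step tau = 0.001
    s = rossler_step(s, 0.001)
    if i % 1000 == 0:
        xs.append(s[0])  # sample the x-coordinate along the trajectory
```

The sampled x-coordinate keeps swinging between positive and negative values while staying bounded: the dynamics is everlasting but confined to the vicinity of the saddle focus, which is exactly the regime proposed above for avoiding the catastrophe.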
So, to avoid the catastrophe and to provide an adequate memorization of a given set of patterns, we should construct an appropriate dynamical system (2) exhibiting Sil'nikov chaos, and a training algorithm (4), in such a manner that:
- any given pattern is a stable fixed point of the map G_tr;
- any stable fixed point of the map G_tr coincides with one of the saddle foci lying on homoclinic orbits of the dynamical system, i.e. of the transfer functions of the neural network.
Now we say that the neural network is trained when its output signal dynamics is restrained to the vicinity of one of the saddle foci. At this very moment the amplitudes, frequencies and phases of the coupled harmonics in (8) take "fixed" values that vary only insignificantly in time. The network passes into a phase of unfading, slowly varying oscillations whose parameters reflect the information stored by it.

IX. CONCLUSION
We have tried to pay due regard to cognition in social-synergetic models of HLP that use differential calculus by introducing a mathematical caption of cognition through the consideration of a dynamical system embedded in a manifold with a changeable metric.
Any dynamical system (X, F) evolving in the phase space X with a changeable Riemannian metric G appears to be a neural network with transfer functions F_i, i = 1, ..., n, and G as the matrix of synaptic weights. Such an interpretation has two very important consequences. First, it exceedingly enriches neural network theory with the theoretical and computational power of topology and smooth theories, category and ergodic theories, dynamical systems and fixed point theories, Lie algebras, the phase portrait technique etc. Second, it endows social-synergetic models with extra "cognitive" degrees of freedom, giving a real possibility to grasp the anthropological dimension of natural, cultural, socioeconomic, political and other dynamical and self-organizing processes.
Close enough to a fixed point, the dynamics of the synaptic weights defining the metric G is described by a system of differential equations for (2n)² coupled oscillators. We find this solution to be in wonderful coherence with the fact of the oscillatory activity of the neocortex.
The idea of a dynamical system embedded in a manifold with an inconstant metric plays a considerable role in the new understanding of neural networks and of the nature of training. The interpretation offered here does not claim generality or completeness of exposition in all details. Its main purpose is to designate the new approach to the comprehension of the anthropological dimension in social-synergetic models, and to the understanding of neural networks within the framework of nonlinear dynamics (synergetics).
