Application of spike neural network for stabilizing pendulum in the nonlinear formulation

The article describes the solution to the problem of stabilizing a non-linear system using machine learning methods. Neural networks are one of the promising directions in this area. The article describes a model of a spiking neural network, which differs from previous generations of networks in its similarity to biological neurons. A pendulum on an elastic foundation was chosen as the dynamic system for the study. The input layer of the neural network is the so-called sensory neuron, which receives information about the deviation of the pendulum from the equilibrium position. The Leaky Integrate-and-Fire model of the spiking neural network was used. The article shows the process of stabilization of a pendulum on an elastic foundation. The closed system was built and a method for its numerical solution was implemented. Two configurations of control functions have been considered. It is shown that the time required to bring the system into a steady equilibrium state depends on the choice of the control function.


Introduction
Despite the existence of classical approaches for solving the optimal control problem, the need for other approaches remains. Solving this class of problems using artificial intelligence has recently become popular, and neural networks are one of the promising areas in this field. A spiking neural network is an artificial neural network of the third generation whose mathematical model is as close as possible to biological processes. The process of generating a spike (the neuron output value) depends on the sequence of signal pulses received at the input [1,2]. Despite the complexity of the model and the need to solve a system of differential equations at each calculation step, the advantage of spiking neural networks over multilayer perceptrons has been shown [3]. Due to their higher computational power, such networks require fewer neurons in their architecture.
The application of spiking neural networks in control problems is gaining popularity [4][5][6][7][8][9]. Such an approach makes it possible to depart from the classical methods for solving optimal control problems: Pontryagin's maximum principle and Bellman's dynamic programming. Since these methods require complex computations, they are not always convenient to implement in practice, especially for discontinuous control functions or complex dynamic systems. Using the example of a virtual insect model, [10] showed how the potential of neural networks can be used in the future to solve practical problems. Other works [11][12][13][14][15][16] show how a spiking neural network can be used to build a self-learning control system for a virtual agent. However, the aforementioned articles used simplified mathematical models (e.g., they do not take the inertia force into account), which motivated our study. The purpose of this paper is to apply spiking neural networks to more complicated dynamic systems.

Materials and methods
The design of a spiking neural network differs fundamentally from that of a multilayer network in the mathematical model of the neuron itself. From a biological point of view, neurons are complex structures, and a number of scientific studies have been devoted to their mathematical description. As a result, several mathematical models of the neuron currently exist. In this research, the Leaky Integrate-and-Fire (LIF) model of a neuron was used. The LIF model is described by equation (1), in which each excitatory input m contributes a term of the form g_{e,m}(E_e − V). Here τ_mem is the membrane time constant, V is the membrane voltage, E_leak is the reversal potential for the leak, E_e is the reversal potential for excitatory (depolarizing) inputs, E_i is the reversal potential for inhibitory inputs, g_e is the excitatory conductance, g_i is the inhibitory conductance, and N_e and N_i are the numbers of excitatory and inhibitory neurons, respectively. The excitatory conductance is described by equation (2), where τ_e is the time constant of the postsynaptic potential, w_e is the strength of the excitatory synapse, t_s is the time of an excitatory input spike, and δ is the Dirac delta function. The inhibitory conductance is described by equation (3), where τ_i is the time constant of the postsynaptic potential, w_i is the strength of the inhibitory synapse, and t_p is the time of an inhibitory input spike. In this approach, each neuron has a characteristic value, its potential, which is compared to the neuron's threshold value. If the potential exceeds the threshold, the neuron sends an impulse to the next layer and its potential drops to a resting level. This process is called a spike. Otherwise, the potential continues to accumulate.
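The threshold-and-reset behavior described above can be illustrated with a minimal Python sketch of a single LIF neuron with one excitatory input, integrated with the forward Euler method. The sketch assumes a commonly used conductance-based form, τ_mem dV/dt = (E_leak − V) + g_e (E_e − V), with dg_e/dt = −g_e/τ_e and a jump of w_e in g_e at each input spike. This form is an assumption chosen to match the symbol definitions above and is not necessarily the exact system used in the study. The time constants and reversal potentials default to the values quoted in the Results section; the threshold, reset level, synaptic weight, step size, and the function name `simulate_lif` are hypothetical and serve illustration only.

```python
def simulate_lif(input_spike_times, t_end=5.0, dt=0.001,
                 tau_mem=0.2, e_leak=-60.0, e_exc=0.0, tau_e=0.05,
                 w_e=2.0, v_thresh=-50.0, v_reset=-60.0):
    """Forward-Euler integration of one LIF neuron with one excitatory input.

    Times are in ms, voltages in mV. w_e, v_thresh, v_reset and dt are
    hypothetical values, not taken from the source. No refractory period
    is modeled.
    """
    n_steps = int(round(t_end / dt))
    # Map each input spike time to the nearest integration step.
    spike_steps = {int(round(t / dt)) for t in input_spike_times}
    v = e_leak      # membrane voltage starts at the leak reversal potential
    g_e = 0.0       # excitatory conductance
    output_spikes = []
    for step in range(n_steps):
        if step in spike_steps:
            g_e += w_e                      # Dirac-delta input: jump in g_e
        dv = ((e_leak - v) + g_e * (e_exc - v)) / tau_mem
        dg = -g_e / tau_e
        v += dt * dv
        g_e += dt * dg
        if v >= v_thresh:                   # threshold crossing: emit a spike
            output_spikes.append((step + 1) * dt)
            v = v_reset                     # potential drops to the reset level
    return output_spikes
```

Without input the potential stays at E_leak and no spike is emitted. A sufficiently strong input drives the potential across the threshold, while a weak one only produces a sub-threshold deflection that decays again.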
Let us consider a pendulum on an elastic foundation. To derive the equations of motion, we use Lagrange's equations; the angle of deviation from the equilibrium position ϕ and the vertical displacement x are taken as generalized coordinates (figure 1).
The kinetic energy of the system is expressed in terms of the rod mass m, the rod length l, and the generalized coordinates x and ϕ. Together with the generalized forces, it yields the differential equations of motion (6), which include the generalized control functions U_ϕ and U_x.
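For orientation, the derivation can be sketched under explicit assumptions that are not taken from the source: the pendulum is a uniform rod hinged at its upper end, the hinge moves only vertically on a linear spring of stiffness k, x is measured downward from the static equilibrium position (so the constant weight term cancels against the static spring force), and gravity g acts downward. Under these assumptions Lagrange's equations give the following expressions; the actual model of the study may differ in geometry or sign conventions.

```latex
% Illustrative derivation under assumed geometry (uniform rod, vertically
% moving hinge on a spring, x downward from static equilibrium).
\begin{align*}
T &= \tfrac{1}{2} m \dot{x}^{2}
     - \tfrac{1}{2} m l\, \dot{x}\,\dot{\varphi}\sin\varphi
     + \tfrac{1}{6} m l^{2} \dot{\varphi}^{2}, \\
Q_x &= -k x + U_x, \qquad
Q_\varphi = -\tfrac{1}{2} m g l \sin\varphi + U_\varphi, \\
m \ddot{x} - \tfrac{1}{2} m l \left( \ddot{\varphi}\sin\varphi
     + \dot{\varphi}^{2}\cos\varphi \right) &= -k x + U_x, \\
\tfrac{1}{3} m l^{2} \ddot{\varphi}
     - \tfrac{1}{2} m l\, \ddot{x}\sin\varphi
     &= -\tfrac{1}{2} m g l \sin\varphi + U_\varphi .
\end{align*}
```

Two limiting cases serve as a consistency check: with ϕ held at zero the first equation reduces to a mass on a spring, and with x held fixed the second reduces to a physical pendulum with moment of inertia m l²/3 about the hinge.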
Consider the problem of stabilizing the pendulum. The behavior of the dynamical system is described by two generalized coordinates corresponding to angular and translational oscillations. Let us introduce control functions for these oscillations and assume that they depend on the corresponding pairs of generalized coordinates. These control functions can be force factors matching the character of the motion, namely a moment for angular oscillations and a force for translational oscillations (figure 2(a)), or two forces (figure 2(b)).
At the initial moment of time, the pendulum is in a certain state defined by its initial velocity and position. It is required to stabilize the system by means of the control functions. Four networks are used at once, each consisting of a single neuron responsible for one of the four control functions. The input layer for the first two neurons is the sensory neuron, which receives a discrete signal of constant amplitude with a frequency proportional to the deviation of the generalized coordinate from the equilibrium position. When the deviation is zero, no signal is applied. If the output spike of the corresponding neuron is denoted by L, the generalized control functions are introduced by equation (7), where λ denotes the generalized coordinate (x or ϕ), R is the corresponding damping coefficient, and C is the corresponding scale factor. Equations (1)-(3), (6), and (7), together with the initial state of the dynamic system, form a closed system for solving the stabilization problem.
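The rate coding performed by the sensory neuron, as described above, can be sketched in Python as follows. The sketch assumes an evenly spaced spike train whose frequency equals a gain multiplied by the absolute deviation. The even spacing, the use of the absolute value, the parameter `gain`, and the function name `sensory_spike_times` are assumptions introduced for illustration; the constant amplitude of each pulse is left implicit and would be carried by the synaptic weight of the receiving neuron.

```python
import math

def sensory_spike_times(deviation, gain, t_end):
    """Return spike times of a regular train with frequency gain * |deviation|.

    deviation: offset of a generalized coordinate from equilibrium
    gain:      hypothetical proportionality factor (spikes per unit time
               per unit deviation)
    t_end:     length of the encoding window
    """
    rate = gain * abs(deviation)
    if rate == 0.0:
        return []                      # zero deviation: no signal is applied
    # Number of whole periods that fit in the window (small guard for rounding).
    n_spikes = math.floor(t_end * rate + 1e-12)
    return [i / rate for i in range(1, n_spikes + 1)]
```

For example, `sensory_spike_times(0.5, 10.0, 1.0)` yields five spikes, and doubling the deviation doubles the number of spikes in the same window.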
Two options for the generalized control function are considered for the angular coordinate. Their definitions involve e_ϕ, the direction cosines of the pendulum, and e_h, the direction cosines of the corresponding force direction. We refer to these options as case 1 and case 2.

Results and discussion
A numerical simulation of the stabilization of a pendulum on an elastic foundation using a spiking neural network was performed. The constants in equations (1)-(3) were as follows: τ_mem = 0.2 ms, E_leak = −60 mV, E_e = 0, E_i = −80 mV, τ_e = 0.05 ms, N_e = 1, N_i = 0. The constants in equations (6) were as follows: m = 0.8 kg, l = 0.5 m, k = 100 N/m. Values were likewise assigned to the constants in equations (7). Equations (1)-(3), (6), and (7) were solved numerically.
Let us consider the stabilization results in more detail, starting with case 1. In figure 3(a), the time point at about 250 s is noteworthy: at this time the first jump in the envelope of the angle response appears. The second jump appears after 1500 s. Figure 3(b) shows that the stabilizing moment decreases to zero, and at the same time there is a sharp decrease in the oscillation amplitude in figure 3(a). This means that the system is stabilized instantaneously. As soon as a constant amplitude of the applied force is established in figure 3(d), the oscillation ceases in figure 3(c). It is also worth noting that the oscillations in the two directions are damped at the same point in time. The sharp decrease in the amplitude of the control function is explained by the neuron's spike "capturing" the near-equilibrium region, as if the control system "catches" the dynamic system near the equilibrium position. The weak point of such a control system is its amplitude-jumping pattern. In practice, such behavior would cause wobble in the revolute joint.
The results for case 2 differ in several respects. Figures 3(e) and 3(f) show a smoother envelope of the oscillation during attenuation than in the previous case. This can be explained by the fact that in this case the control function is not constant but depends on the coordinates of the phase space.
The time required to stabilize the system decreases. The control function becomes continuous, and as a result the behavior of the generalized coordinates is smoother. In contrast to the first case, there are no envelope jumps. Despite the specific mechanical implementation of the control function, the results obtained demonstrate a more suitable operating mode of the dynamic system.
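The numerical solution mentioned in this section can be illustrated on the mechanical subsystem alone. The following Python sketch integrates an assumed two-degree-of-freedom model with the classical fourth-order Runge-Kutta method and no control (U_x = U_ϕ = 0). The model is an assumption, not the source's equations (6): a uniform rod hinged at its upper end, a hinge that moves vertically on a spring of stiffness k, x measured downward from static equilibrium, and g = 9.81 m/s². The values of m, l, and k follow the text; the integration scheme, step size, and all function names are illustrative choices.

```python
import math

def accelerations(state, m=0.8, l=0.5, k=100.0, g=9.81, u_x=0.0, u_phi=0.0):
    """Solve the assumed coupled equations of motion for (x'', phi'')."""
    x, phi, vx, vphi = state
    s, c = math.sin(phi), math.cos(phi)
    # Symmetric mass matrix [[a, b], [b, d]] of the assumed uniform-rod model.
    a = m
    b = -0.5 * m * l * s
    d = m * l * l / 3.0
    # Right-hand sides: velocity coupling, spring, gravity, control inputs.
    r1 = 0.5 * m * l * vphi * vphi * c - k * x + u_x
    r2 = -0.5 * m * g * l * s + u_phi
    det = a * d - b * b            # positive for every angle
    return (r1 * d - b * r2) / det, (a * r2 - b * r1) / det

def derivative(state):
    """Time derivative of the state (x, phi, x', phi')."""
    ax, aphi = accelerations(state)
    return (state[2], state[3], ax, aphi)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(p + h * q for p, q in zip(base, slope))
    k1 = derivative(state)
    k2 = derivative(shifted(state, k1, 0.5 * dt))
    k3 = derivative(shifted(state, k2, 0.5 * dt))
    k4 = derivative(shifted(state, k3, dt))
    return tuple(y + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
                 for y, p, q, r, w in zip(state, k1, k2, k3, k4))

def energy(state, m=0.8, l=0.5, k=100.0, g=9.81):
    """Total mechanical energy of the uncontrolled assumed model."""
    x, phi, vx, vphi = state
    kinetic = (0.5 * m * vx ** 2
               - 0.5 * m * l * vx * vphi * math.sin(phi)
               + m * l ** 2 * vphi ** 2 / 6.0)
    potential = 0.5 * k * x ** 2 - 0.5 * m * g * l * math.cos(phi)
    return kinetic + potential

def simulate(state, t_end=2.0, dt=1e-3):
    """Integrate from the given initial state and return the final state."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
    return state
```

Because no control or damping acts in this sketch, the total energy should remain nearly constant, which gives a simple check of the integrator before a control law is attached through `u_x` and `u_phi`.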

Conclusion
The article presents a method for stabilizing a nonlinear dynamic system using machine learning tools. The LIF model of the spiking neural network was used. The article shows the process of stabilization of a pendulum on an elastic foundation. The closed system was built and a method for its numerical solution was implemented. Two configurations of control functions have been considered. It is shown that the time required to bring the system into a steady equilibrium state depends on the choice of the control function. Further development of this approach and its application to complex systems with a large number of degrees of freedom is envisaged.
This research was funded by RFBR, grant number 20-01-00535. The results were obtained with support from the Kazan Federal University Strategic Academic Leadership Program ("PRIORITY-2030").

Figure 1. Pendulum with two degrees of freedom and generalized coordinates ϕ and x.

Figure 2. (a) The system with stabilizing forces and moments. (b) The system with stabilizing forces; here h is the distance from the suspension point of the pendulum to the point of thread fastening.

Figure 3. (a) The angle ϕ (deg) versus time (s). (c) The vertical displacement x (m) versus time (s). (b), (d) The generalized control functions for the moment and the force, respectively, versus time (s). (e) The angle ϕ (deg) versus time (s). (f) The vertical displacement x (m) versus time (s).