Learning from Adaptive Neural Control of Electrically-Driven Mechanical Systems

This study presents deterministic learning from adaptive neural control of unknown electrically-driven mechanical systems. An adaptive neural network system and a high-gain observer are employed to derive the controller. The stable adaptive tuning laws of the network weights are derived in the sense of Lyapunov stability theory. It is rigorously shown that convergence of partial network weights to their optimal values and locally accurate NN approximation of the unknown closed-loop system dynamics can be achieved in a stable control process, because a partial Persistent Excitation (PE) condition of some internal signals in the closed-loop system is satisfied. The learned knowledge, stored as a set of constant neural weights, can be used to improve the control performance and can also be reused in the same or a similar control task. A numerical simulation is presented to show the effectiveness of the proposed control scheme.


INTRODUCTION
The motion tracking control of uncertain mechanical systems described by a set of second-order differential equations has attracted the interest of researchers over the years. For mechanical systems without actuator dynamics, many approaches have been introduced to treat the motion tracking control problem and various adaptive control algorithms have been found (Ge et al., 1997; Zhang et al., 2008; Wai, 2003; Lee and Choi, 2004; Chang and Yen, 2005; Sun et al., 2001; Xu et al., 2009; Chang and Chen, 2005). However, as pointed out by Tarn et al. (1991), the inclusion of the actuator dynamics in mechanical systems is very significant for constructing high-performance tracking controllers, especially in the cases of high-velocity movements and highly varying loads. The incorporation of the actuator dynamics into the mechanical model considerably complicates the equations of motion. In particular, electrically-driven mechanical systems are described by third-order differential equations (Tarn et al., 1991) and the number of degrees of freedom is larger than the number of control inputs.
Many works addressing the tracking problem of mechanical systems with actuator dynamics have been described in Dawson et al. (1998), Su and Stepanenko (1996, 1998), Chang (2002) and Driessen (2006). These works were based on the integrator backstepping technique. In backstepping design procedures, a regression matrix is required and the procedures become very tedious for mechanical systems with multiple degrees of freedom. Based on the universal approximation ability of Neural Networks (NNs) and Fuzzy Neural Networks (FNNs), adaptive neural/fuzzy neural control schemes have been developed to treat the tracking control of uncertain electro-mechanical systems (Kwan et al., 1998; Huang et al., 2003, 2008; Kuc et al., 2003; Wai and Chen, 2004, 2006; Wai and Yang, 2008). In Kwan et al. (1998), two-layer NNs were used to approximate two very complicated nonlinear functions with the NN weights being tuned on-line; the designed controller guaranteed the Uniformly Ultimately Bounded (UUB) stability of the tracking errors and NN weights under some conditions. In Huang et al. (2003), an NN controller was developed to further relax the stability conditions of Kwan et al. (1998). In Huang et al. (2008), an adaptive NN control algorithm was proposed for reducing the dimension of the NN inputs. In Kuc et al. (2003), employing three neural networks, the designed controller achieved global asymptotic stability of the learning control system. In Wai and Chen (2004, 2006), robust neural fuzzy network control was derived for robot manipulators including actuator dynamics and favorable tracking performance was obtained for complex robot systems. In Wai and Yang (2008), an adaptive FNN controller using only joint position information was designed to cope with the problem caused by the assumption in Wai and Chen (2004, 2006) that all system state variables are measurable. In the adaptive neural control schemes above, NNs were used to approximate the nonlinear components in the electrically-driven mechanical systems and Lyapunov stability theory was employed to design the closed-loop control systems. However, the learning ability of approximation-based control is actually very limited and the problem of whether the neural networks employed in adaptive neural controllers indeed implement their function approximation ability has been less investigated. As a consequence, most adaptive neural controllers have to recalculate the control parameters even when repeating the same control task.
Recently, a deterministic learning approach was proposed for identification and adaptive control of nonlinear systems (Wang and Hill, 2006, 2009). By using the localized RBF network, a partial PE condition, i.e., the PE condition of a certain regression subvector constructed out of the RBFs along the recurrent trajectory, is proven to be satisfied. This partial PE condition leads to exponential stability of the closed-loop error system, which is in the form of a class of Linear Time-Varying (LTV) systems. Consequently, accurate NN approximation of the unknown closed-loop system dynamics is achieved within a local region along the recurrent trajectory. The deterministic learning approach provides an effective solution to the problem of learning in dynamic environments and is useful in many applications (Wang and Chen, 2011; Wu et al., 2012; Zeng et al., 2012; Dai et al., 2012).
This study addresses learning from adaptive neural control of unknown electrically-driven mechanical systems. An adaptive neural control algorithm is proposed using RBF networks. A partial PE condition of some internal signals in the closed-loop system is satisfied during tracking control to a recurrent reference trajectory. Consequently, the convergence of partial neural weights to their optimal values and learning of the unknown closed-loop system dynamics are implemented in the closed-loop control process. The learned knowledge, stored as a set of constant neural weights, can be used to improve the control performance and can also be reused in the same or a similar control task. Compared with the backstepping scheme, the designed adaptive neural controller uses only one RBF network, which significantly reduces the complexity of the controller, so that the proposed controller can be easily implemented in practice.

Consider the following electrically-driven mechanical systems (Tarn et al., 1991; Dawson et al., 1998; Su and Stepanenko, 1996):
where $q(t) \in R^n$ denotes the angular displacement vector; $M(q) \in R^{n \times n}$ is the symmetric positive definite inertia matrix; $V_m(q, \dot q) \in R^{n \times n}$ is the Coriolis and centripetal forces matrix; $G(q) \in R^n$ is the gravity vector; $F(\dot q) \in R^n$ is the dynamic frictional force vector; $i(t) \in R^n$ is the motor armature current vector; $u(t) \in R^n$ is the control input voltage vector and $y \in R^n$ is the measurement output vector. The multi-axis mechanical systems are driven by the same motor; $K_T \in R^{n \times n}$ is the positive definite constant diagonal matrix which characterizes the electro-mechanical conversion between current and torque, with $K_T = k_t I_{n \times n}$; $L \in R^{n \times n}$ is a positive definite constant diagonal matrix denoting the electrical inductance, with $L = l I_{n \times n}$; $R(i, \dot q) \in R^n$ represents the electrical resistance and the motor back-electromotive force vector; and $M(q)$, $V_m(q, \dot q)$, $G(q)$, $K_T$, $L$ and $R(i, \dot q)$ are all unknown. Some fundamental properties of the mechanical system dynamics are stated as follows (Tarn et al., 1991; Dawson et al., 1998; Su and Stepanenko, 1996).
Property 1: The Coriolis and centripetal forces matrix can always be selected so that the matrix $[\dot M(q) - 2V_m(q, \dot q)]$ is skew symmetric.
Property 2: The inertia matrix $M(q)$ is symmetric and uniformly positive definite, i.e., for some positive constants $m_2 \ge m_1 > 0$, $m_1 I \le M(q) \le m_2 I$, $\forall q \in R^n$.
Let $x_1 = q$, $x_2 = \dot x_1 = \dot q$ and $x_3 = i$; then Eq. (1) can be transformed as follows:
The objective of the paper is: given a bounded and smooth recurrent reference output $y_d(t)$, design an adaptive neural controller using localized RBF networks for the system (1) such that the output $y$ tracks the desired output $y_d$ and both control and learning can be achieved. It is assumed that $y_d(t)$ and its derivatives up to the third order are uniformly bounded and are known smooth recurrent orbits.
In the following, we show that Eq. (2) can be transformed into the normal form with respect to newly defined state variables.
Let $z_1 = y$, $z_2 = \dot z_1$, $z_3 = \dot z_2 = F_2(x_1, x_2) + G_2(x_1, x_2)x_3$. The derivative of $z_3$ is derived as:
where,
Therefore, the electrically-driven mechanical systems defined by Eq. (1) can be described by the following normal form with respect to the new state variables:
It should be noted that, apart from the fact that $F_z(x)$ and $G_z(x)$ are functions of $x$, they are completely unknown.
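The displayed equations for the model and its normal form are not reproduced in this text. As a reading aid, the following LaTeX sketch shows one form that is consistent with the symbol definitions above and with the standard model in the cited literature. It is an assumption, not necessarily the authors' exact Eq. (1) to Eq. (4); in particular, the placement of $F(\dot q)$ and the expressions given for $F_2$, $G_2$, $F_3$ and $G_3$ are hypothetical.

```latex
% Assumed model (sketch of Eq. (1)):
\begin{align*}
M(q)\ddot q + V_m(q,\dot q)\dot q + G(q) + F(\dot q) &= K_T\, i, \\
L\,\dot i + R(i,\dot q) &= u, \qquad y = q .
\end{align*}
% With x_1 = q, x_2 = \dot q, x_3 = i (sketch of Eq. (2)):
\begin{align*}
\dot x_1 &= x_2, \\
\dot x_2 &= F_2(x_1,x_2) + G_2(x_1,x_2)\,x_3,
  & F_2 &= -M^{-1}(V_m x_2 + G + F), & G_2 &= M^{-1}K_T, \\
\dot x_3 &= F_3(x) + G_3\,u,
  & F_3 &= -L^{-1}R(x_3,x_2), & G_3 &= L^{-1}.
\end{align*}
% Normal form: z_1 = y, z_2 = \dot z_1, z_3 = \dot z_2 = F_2 + G_2 x_3, so
\begin{align*}
\dot z_3 &= \frac{\partial F_2}{\partial x_1}x_2
          + \frac{\partial F_2}{\partial x_2}\bigl(F_2 + G_2 x_3\bigr)
          + \dot G_2\, x_3 + G_2\bigl(F_3 + G_3 u\bigr)
          = F_z(x) + G_z(x)\,u, \\
G_z(x) &= G_2 G_3 = M^{-1}(q)\,K_T\,L^{-1}.
\end{align*}
```

Under these assumptions, and with $K_T = k_t I_{n \times n}$ and $L = l I_{n \times n}$, one obtains $G_z^{-1}(x) = (l/k_t) M(q)$, which is symmetric and positive definite by Property 2. This would be consistent with the later use of $r^T G_z^{-1} r / 2$ as a Lyapunov function and with the role of $\dot M(q)$ in Remark 1.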
Property 4: From Property 3, it is also noted that there exist constants
Remark 1: For mechanical systems, the components of $M(q)$ are only linear combinations of constants and trigonometric functions of $q$, so the components of $\dot M(q)$ are only combinations of $\dot q$ and trigonometric functions of $q$, and $\dot M(q)$ is bounded. According to the boundedness of $\dot M(q)$, the related terms, including $\dot G_z^{-1}(x)$, are also bounded.

ADAPTIVE NEURAL CONTROL AND LEARNING
High gain observer design: From Eq. (4), we note that $z_i$ is not computable, since $F_2(x)$, $F_3(x)$, $G_2(x)$ and $G_3(x)$ are unknown nonlinear functions. The high-gain observer (HGO) used to estimate the state $z_i$ is the same as the one in (Ge et al., 1999; Eehow et al., 2010) and is described by the following equations:
where $\varepsilon$ is a small positive design constant and the parameters $d_j$ ($j = 1, \ldots, n-1$) are chosen such that the polynomial $s^n + d_1 s^{n-1} + \cdots + d_{n-1} s + 1$ is Hurwitz. Then, there exist positive constants $h$ and $t^*$ such that $\forall t > t^*$ we have:
To prevent peaking (Khalil, 2002), saturation functions can be employed on the observer signals whenever they are outside the domain of the set $\Omega$, as follows:
Adaptive neural controller design: For the system defined by Eq. (4) and the recurrent reference orbit $y_d(t)$, an adaptive neural controller using RBF networks is designed as follows. The vectors $Y_d$, $E$ and the filtered tracking error vector $r$ are defined as:
where $E = [e^T, \dot e^T, \ddot e^T]^T$, $e = y - y_d$ is the output tracking error vector and $K = [\lambda_2 I, \lambda_1 I]^T$ is appropriately chosen such that the polynomial $s^2 + \lambda_1 s + \lambda_2$ is Hurwitz.
Differentiating $r$:
where,
Choosing the control:
are used to approximate the unknown functions:
are the NN inputs, $W^*$ is the optimal NN weight vector and $\varepsilon(X)$ are the NN approximation errors, with $\|\varepsilon(X)\| < \varepsilon^*$ ($\varepsilon^* > 0$), $\forall X \in \Omega_X$. The weight update law is given by:
where $\Gamma = \Gamma^T > 0$ is a constant design matrix and $\sigma > 0$ is a small positive constant.
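The observer and the filtered tracking error described above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the observer equations are an assumed third-order form chosen so that their characteristic polynomial is $s^3 + d_1 s^2 + d_2 s + 1$, and the values of `eps`, `d1`, `d2`, `lam1`, `lam2`, the step size and the function names are all hypothetical. Saturation against peaking is omitted.

```python
import numpy as np

def is_hurwitz(coeffs):
    """True if every root of the polynomial lies in the open left half-plane."""
    return bool(np.all(np.roots(coeffs).real < 0))

def simulate_hgo(y_func, eps=0.01, d1=3.0, d2=3.0, dt=5e-4, t_end=4.0):
    """Euler-integrate an assumed third-order high-gain observer driven by y(t).

    Assumed observer form (hypothetical reconstruction):
        eps * xi1' = xi2
        eps * xi2' = xi3
        eps * xi3' = -d1*xi3 - d2*xi2 - xi1 + y
    Returns the time grid and the estimates [xi1, xi2/eps, xi3/eps**2]
    of [y, y', y''].
    """
    assert is_hurwitz([1.0, d1, d2, 1.0]), "s^3 + d1 s^2 + d2 s + 1 must be Hurwitz"
    A = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [-1.0, -d2, -d1]])
    b = np.array([0.0, 0.0, 1.0])
    scale = np.array([1.0, eps, eps ** 2])
    steps = int(round(t_end / dt))
    ts = np.arange(steps) * dt
    est = np.empty((steps, 3))
    xi = np.zeros(3)
    for k in range(steps):
        est[k] = xi / scale                      # scaled observer states
        xi = xi + (dt / eps) * (A @ xi + b * y_func(ts[k]))
    return ts, est

def filtered_error(E, lam1, lam2):
    """r = e'' + lam1*e' + lam2*e for rows E = [e, e', e'']."""
    return E[:, 2] + lam1 * E[:, 1] + lam2 * E[:, 0]

# Example: the output follows y_d = 0.8 sin t exactly, so the estimated
# tracking error and r should stay small once the observer has settled.
ts, z_hat = simulate_hgo(lambda t: 0.8 * np.sin(t))
Y_d = np.column_stack((0.8 * np.sin(ts), 0.8 * np.cos(ts), -0.8 * np.sin(ts)))
lam1, lam2 = 2.0, 1.0        # hypothetical gains: s^2 + 2s + 1 = (s + 1)^2
r = filtered_error(z_hat - Y_d, lam1, lam2)
settled = ts > 1.0           # skip the observer transient
print(np.abs(z_hat - Y_d)[settled].max(), np.abs(r[settled]).max())
```

In this sketch the residual estimation error scales with `eps`, which mirrors the statement that the observer error bound can be reduced by decreasing $\varepsilon$.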
The overall closed-loop system consisting of the system defined by Eq. (1), the filtered tracking error defined by Eq. (10), the controller defined by Eq. (15) and the NN adaptive law defined by Eq. (17) can be summarized into the following form:
where,
Theorem 1: Consider the closed-loop system defined by Eq. (18). For any given recurrent reference orbit starting from initial condition $y_d(0) \in \Omega_d$ ($\Omega_d$ is a compact set) and with initial condition $y(0) \in \Omega_0$ ($\Omega_0$ is a compact set) and $\hat W(0) = 0$, we have that:
• All signals in the closed-loop system remain uniformly ultimately bounded.
• There exists a finite time $T_1$ ($T_1 > t^*$) such that the state tracking errors $E = [e^T, \dot e^T, \ddot e^T]^T$ converge to a small neighborhood around zero for all $t \ge T_1$ by appropriately choosing design parameters.

Proof:
• Consider the following Lyapunov function:
Thus, it follows that if $\|\hat W\| > s^*/\sigma$, then $\dot V_w < 0$, where $s^*$ is the upper bound of $\|S(X)\|$ (see Slotine and Li, 1991). This leads to the UUB of $\|\hat W\|$ as $\|\hat W\| \le s^*/\sigma$. According to $\tilde W = \hat W - W^*$, we have that $\tilde W$ is UUB as follows:
Take the Lyapunov function $V_r = r^T G_z^{-1} r / 2$. Differentiating $V_r$, we have:
Note that the equality relating $z$ and $E$ can be easily induced from Eq. (9) and Eq. (12) as follows:
Eq. (22) is further derived as:
This implies that the filtered tracking error vector $r$ is UUB as follows:
whose bound can be made small enough by increasing the control gain $k_v$ and decreasing $\varepsilon$.
Because $r = \ddot e + \lambda_1 \dot e + \lambda_2 e$ is stable by appropriately choosing the design parameters $\lambda_1$, $\lambda_2$, and $y_d$, $\dot y_d$, $\ddot y_d$ are bounded, $z_1$, $z_2$, $z_3$ are bounded. Since $S(X)$ is bounded for all values of $X$, we conclude that the control $u$ is also bounded. Thus, all the signals in the closed-loop system remain uniformly ultimately bounded.
• Consider the following Lyapunov function:
The derivative of $V_r$ is:
where $\bar c_\lambda = k_v c_\lambda \|z\| = O(\varepsilon h)$, $s^*$ is the upper bound of $\|S(X)\|$ and $\tilde W^*$ is the upper bound of $\|\tilde W\|$, which is given in Eq. (21). Then Eq. (27) becomes:
Let $\delta = (\tilde W^{*2} s^{*2} + \varepsilon^{*2} + \bar c_\lambda^2)/(4\bar k_{v2})$. It is clear that $\delta$ can be made small enough using a large enough $k_v$, so we have:
The above equation implies that there exists a finite time $T_1$, determined by $\delta$ and $\bar k_{v1}$, such that for all $t \ge T_1$, the filtered tracking error $r$ satisfies:
where $\beta$ is the size of a small residual set that can be made small enough by appropriately choosing the design parameters. By choosing a large $k_v$, the filtered tracking error $r$ can be made small enough $\forall t \ge T_1$. That is to say, there exists a $T_1$ such that the state tracking errors converge to a small neighborhood around zero for all $t \ge T_1$ by appropriately choosing design parameters (Slotine and Li, 1991), so that the tracking states $z(t)|_{t \ge T_1}$ closely follow $Y_d(t)|_{t \ge T_1}$.
Remark 2: Theorem 1 indicates that the system orbits $z(t)$ will become as recurrent as $Y_d(t)$ after time $T_1$ ($T_1 > t^*$), so $x(t)$ will also become recurrent. $\hat v$ converges to a small neighborhood around $y_d^{(3)}$ for $t \ge T_1$, which indicates that $\hat v$ is as recurrent as $y_d^{(3)}$. Since $X = [x^T, \hat v^T]^T$ are selected as the RBF network inputs, according to Theorem 2.7 in Wang and Hill (2009), $S(X)$ will satisfy the partial PE condition, i.e., along $Y_d(t)|_{t \ge T_1}$, $S_\xi(X)$ satisfies the PE condition.
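The partial PE condition in Remark 2 can be illustrated with a small numerical experiment. The sketch below is only an illustration under stated assumptions: it uses a hypothetical two-dimensional input, Gaussian RBFs on a regular lattice and a circular periodic orbit, none of which come from the paper. It compares the smallest eigenvalue of the averaged Gram matrix of the full regressor $S(X)$ with that of the subvector $S_\xi(X)$ formed by the neurons close to the orbit.

```python
import numpy as np

def rbf_vector(X, centers, width):
    """Gaussian RBF regressor S(X) with entries exp(-||X - c_i||^2 / width^2)."""
    d2 = np.sum((centers - X) ** 2, axis=1)
    return np.exp(-d2 / width ** 2)

# Hypothetical lattice of 7 x 7 centers over [-1.2, 1.2]^2.
grid = np.linspace(-1.2, 1.2, 7)
centers = np.array([[a, b] for a in grid for b in grid])
width = 0.25

# Hypothetical periodic (recurrent) orbit sampled over one period.
t = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
orbit = np.column_stack((0.8 * np.sin(t), 0.8 * np.cos(t)))
S = np.array([rbf_vector(X, centers, width) for X in orbit])   # shape (2000, 49)

# Averaged Gram matrix (1/T) * integral of S S^T dt, approximated by a mean.
gram_full = S.T @ S / len(t)

# Neurons "close to the orbit": peak activation along the orbit above a threshold.
near = S.max(axis=0) > 0.3
S_near = S[:, near]
gram_near = S_near.T @ S_near / len(t)

lam_full = np.linalg.eigvalsh(gram_full).min()
lam_near = np.linalg.eigvalsh(gram_near).min()
print(int(near.sum()), lam_full, lam_near)
```

In this setting the full regressor has a smallest eigenvalue that is numerically close to zero, because neurons far from the orbit are barely excited, while the subvector of nearby neurons keeps a clearly positive smallest eigenvalue. This is the sense in which only a regression subvector is persistently exciting along a recurrent orbit.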

Remark 3: The key aspect of the proposed method is that the electrically-driven mechanical systems are transformed into an affine nonlinear system in the normal form by a state transformation. Thus, learning and stability analysis avoid using virtual control terms and their time derivatives, which require complex analysis and computation. In the proposed approach, only one RBF network is employed to approximate the unknown lumped nonlinear system dynamics, which shows the superiority of the proposed learning control scheme.
Learning from adaptive neural control stability of a class of LTV systems: For deterministic learning from adaptive neural control of nonlinear systems with an unknown affine term, the associated LTV system is extended in the following form (Liu et al., 2009):
where,
where diag refers to block diagonal form and $C(t) := \Gamma B(t) H(t)$.
Assumption 1: (Loría and Panteley, 2002). There exists a $\phi_M > 0$ such that, for all $t \ge 0$, the following bound is satisfied:
Assumption 2: (Liu et al., 2009). There exist symmetric matrices $P(t)$ and $Q$
Lemma 1: (Liu et al., 2009). With Assumptions 1 and 2 satisfied in a compact set $\Omega$, the system defined by Eq. (36) is uniformly exponentially stable in the compact set $\Omega$ if $S(t)$ satisfies the PE condition.
Learning from adaptive neural control: Using the localization property of the RBF network, after time $T_1$, the system defined by Eq. (18) can be expressed in the following form along the tracking orbits $X(t)|_{t \ge T_1}$:
where $S_\xi(X)$ is a subvector of $S(X)$; $\hat W_\xi$ is the corresponding weight subvector; the subscript $\bar\xi$ stands for the region far away from the trajectories $X(t)|_{t \ge T_1}$; $\varepsilon_\xi$ are the local approximation errors and $\|\varepsilon_\xi\|$ is small.
Theorem 2: Consider the closed-loop system defined by Eq. (38). For any given recurrent reference orbit starting from initial condition $y_d(0) \in \Omega_d$ and with initial condition $y(0) \in \Omega_0$, $\hat W(0) = 0$ and control parameters appropriately chosen, we have that along the tracking orbits $X(t)|_{t \ge T}$, the neural weight estimates $\hat W_\xi$ converge to a small neighborhood of the optimal values $W^*_\xi$ and locally accurate approximations of the unknown closed-loop system dynamics $\psi(X)$ are obtained by $\hat W^T S(X)$ as well as $\bar W^T S(X)$, where $\bar W$ is obtained from:
where $[t_a, t_b]$ ($t_b > t_a > T > T_1$) represents a time segment after the transient process of $\hat W$.
Proof: Let $\theta = G_z^{-1}(x) r$ and $\eta = \tilde W_\xi$; then the system defined by Eq. (38) is transformed into:
Rewriting Eq. (41) in matrix form, we have:
are small, the system defined by Eq. (42) can be considered as a perturbed system (Wang and Chen, 2011).
where,
The satisfaction of Assumption 1 can be easily checked. With $G_z(\cdot)$, $\dot G_z(\cdot)$ and $\dot G_z^{-1}(\cdot)$ being bounded, $k_v$ can be designed such that $2G_z(k_v = \dot G_z^{-1})G_z = \dot G_z$ is strictly positive definite and the negative definiteness of $\dot P + PA + A^T P$ is guaranteed. Thus, Assumption 2 is satisfied.
After time $T_1$, the NN inputs $X(t)$ follow recurrent orbits and the partial PE condition (Wang and Hill, 2006, 2009) can be satisfied by the regression subvector $S_\xi(X)$, which consists of RBFs with centers located in a small neighborhood of the tracking orbits $X(t)|_{t \ge T_1}$. Thus, uniform exponential stability of the nominal system of the system defined by Eq. (42) is guaranteed by Lemma 1. For the perturbed system defined by Eq. (42), using Lemma 4.6 in Khalil (2002), the parameter errors $\eta = \tilde W_\xi$ converge exponentially to a small neighborhood of zero in a finite time $T$ ($T > T_1$), with the size of the neighborhood being determined by the NN approximation ability and the state tracking errors.
The convergence of $\hat W_\xi$ to a small neighborhood of $W^*_\xi$ implies that along the tracking trajectories $X(t)|_{t \ge T}$, the unknown closed-loop system dynamics $\psi(X)$ can be represented by the regression subvector $S_\xi(X)$ with small error, i.e.:
where,
Eq. (47) can be expressed as:
where $\bar W_\xi$ is the corresponding subvector of $\bar W$ and $\bar\varepsilon_\xi$ are the errors using $\bar W_\xi^T S_\xi(X)$ to approximate the unknown closed-loop system dynamics. After the transient process, $\bar W^T S(X)$ can approximate the unknown nonlinear functions $\psi(X)$ along the tracking trajectories $X(t)|_{t \ge T}$.
Remark 4: Theorem 2 reveals that deterministic learning (i.e., parameter convergence) can be achieved during tracking control to a recurrent reference orbit.
The learned knowledge can be stored in the constant RBF networks $\bar W^T S(X)$, whereas it is generally difficult to represent and store the learned knowledge using time-varying neural weights. Through deterministic learning, the representation and storage of past experiences become a simple task.
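The step from time-varying estimates to stored constant weights can be sketched as follows. This is a minimal illustration that assumes, following the deterministic learning literature (Wang and Hill, 2006, 2009), that the constant weights are the time average of $\hat W(t)$ over a segment $[t_a, t_b]$ after the transient. The function name `average_weights` and the synthetic weight history are hypothetical.

```python
import numpy as np

def average_weights(t, W_hat, t_a, t_b):
    """Time-average the weight estimates over [t_a, t_b] (trapezoidal rule).

    t     : (N,) increasing sample times
    W_hat : (N, m) weight estimates, one row per sample time
    """
    t = np.asarray(t, dtype=float)
    W_hat = np.asarray(W_hat, dtype=float)
    mask = (t >= t_a) & (t <= t_b)
    ts, Ws = t[mask], W_hat[mask]
    dt = np.diff(ts)[:, None]
    integral = np.sum(0.5 * (Ws[1:] + Ws[:-1]) * dt, axis=0)
    return integral / (ts[-1] - ts[0])

# Synthetic weight history (hypothetical): a decaying transient plus a small
# steady oscillation around the values [1.0, -2.0].
t = np.linspace(0.0, 20.0, 2001)
W_hat = np.column_stack((1.0 + np.exp(-t) + 0.05 * np.sin(2.0 * np.pi * t),
                         -2.0 + 0.5 * np.exp(-t)))

# Average only after the transient, then keep W_bar as a constant vector.
W_bar = average_weights(t, W_hat, t_a=10.0, t_b=20.0)
print(W_bar)
```

Averaging over a window after the transient removes the small residual oscillation of the estimates, so that a single constant vector can be stored and later evaluated as $\bar W^T S(X)$ without running the adaptation law again.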

SIMULATION
A single-link robotic manipulator coupled to a DC motor is considered. The dynamic equations of the system are:
Eq. (50) can be expressed in the following form:
The initial state values are $x_1(0) = 0$, $x_2(0)$ and $x_3(0) = 0$ and the desired output is set to $y_d = 0.8 \sin t$.
Figures 1 to 5 show the simulation results. The tracking performance of the system is shown to be good in Figs. 1 and 2 and the tracking performance become ...
[Figure panel (b): control input $u$ using $\bar W^T S(X)$]
... closed-loop system dynamics during tracking control to a recurrent reference orbit. The learned knowledge stored as a set of constant neural weights can be used to improve the control performance of the system.

CONCLUSION
In this study, we have investigated deterministic learning from adaptive neural control of electrically-driven mechanical systems with completely unknown system dynamics. Compared with the backstepping scheme, the key factor of the proposed method is that the electrically-driven mechanical systems are transformed into affine nonlinear systems in the normal form, which avoids backstepping in the controller design. Only one RBF network was used to approximate the unknown lumped nonlinear function, which shows the superiority of the proposed control algorithm. The designed controller not only achieves the UUB of all signals in the closed-loop system, but also achieves learning of the unknown closed-loop system dynamics during the stable adaptive control process. The learned knowledge, stored as a set of constant neural weights, can be used to improve the control performance and can also be reused in the same or a similar control task, so that the electrically-driven mechanical systems can be controlled with little effort.
q = The angular position
J = The inertia of the actuator's rotor
L 0 = The length of the link
m = The mass of the link
M 0 = The payload mass
R 0 = The radius of the payload
g = The gravitational constant
K T = The torque constant
K B = The back-EMF constant
R = The armature resistance
L = The armature inductance
I = The armature current
u = The armature voltage