Adaptive Robot Control – An Experimental Comparison

This paper presents an experimental comparison between stable adaptive controllers for robotic manipulators based on Model-Based Adaptive, Neural Network and Wavelet-Based control. The control methods are compared in terms of computational efficiency, need for an accurate mathematical model of the manipulator, and tracking performance. An original management algorithm for the Wavelet Network control scheme has been designed, with the aim of constructing the net automatically during trajectory tracking, without the need to tune it to the trajectory itself. Experimental tests, carried out on a planar two-link manipulator, show that the Wavelet-Based control scheme, with the new management algorithm, outperforms the conventional Model-Based schemes in the presence of structural uncertainties in the mathematical model of the robot, without pre-training and more efficiently than the Neural Network approach.


Introduction
Model Based Adaptive (MBA) control of robotic manipulators has been widely investigated because it can cope with system modifications that can be modelled as parameter variations, even large ones, of the nonlinear mathematical model of the manipulator [1][2][3][4][5], such as those caused by wear and ageing. The main assumption in the synthesis of MBA controllers is that the structure of the mathematical model of the system is known, whereas the parameters appearing in this model can be unknown.
Unfortunately, manipulators can rarely be modelled perfectly. For example, a manipulator that grasps a load with unknown mechanical characteristics is modelled with uncertainties both in the structure and in the parameters of the mathematical model. Moreover, it is difficult to model the stiction components of friction torques and, consequently, it is not convenient to take them into account explicitly in the model used for the MBA controller design.
Obviously, the presence of structural uncertainties in the mathematical model of the plant can deteriorate the performance of the above MBA controller, depending on their magnitude, resulting in unsatisfactory tracking or instability [6][7][8]. To overcome this drawback, some authors propose adding further loops to the MBA controller (cf. for example [6]), whereas other authors propose non-parametric adaptive control structures, such as Neural Networks (NN) [8]-[19]. The NN is usually designed to identify the inverse dynamics of the system to be controlled and is then implemented online as a feedforward controller [8]-[12]. An optimal structure of the NN can be obtained using Genetic Algorithms (GA) [12]-[14]. Furthermore, if parameter variations occur during operation, online adaptation of the weights of the net can be introduced, the so-called "specialised learning phase", which yields a neural controller whose performance is optimised for the considered reference trajectories [9], [11].
The gradient descent optimisation method, employed for adjustment of the parameters of the net, is effective in practice in the presence of structural modifications to the system [7], [12]; however, currently there is no systematic way to ensure these methods will be successfully employed. Moreover, offline pre-training of the net is often necessary. The analysis of the system becomes further complicated when learning and control are attempted simultaneously.
In two studies [19] and [20], NN controllers for robot manipulators are designed with enhanced performances while retaining the stability and convergence characteristics of MBA controllers. These algorithms treat the inverse dynamics of the manipulator as an unknown function and try to approximate this function. Practical applications of these techniques crucially depend on both the accuracy of the function approximation and efficiency of the computational structure of the approximate function.
The structure of a neural net is the topological representation of a weighted superposition of functions, particularly in the case of the single-hidden-layer perceptron and, consequently, function decomposition over a given functional basis can be represented as a neural net. There are two main topics related to function approximation via NN; the first topic regards the network structure, i.e., the choice of the topology, the family of activation functions of the nonlinear nodes, and the number of these nodes. The second topic concerns the choice of an updating law for the output weights, so that the net output could approximate the unknown function in a context of adaptive nonlinear approximation.
Within the context of function approximation by means of NNs, the kind of activation function is of great importance. In three studies [21]-[23], fuzzy membership functions have been used. In this context, wavelets are potentially a good choice as activation functions because they have a practically bounded support both in space and frequency [24], [26]. These characteristics allow the dynamic management of the net, adding and removing nodes from its structure without interfering with the weights of the other nodes.
Another advantage of wavelets over other basis functions, like splines or even sinusoids, is the multiresolution property which allows more efficient representation of local details of the approximated function. The multiresolution capabilities make the wavelets a very efficient tool, since by retaining only the coefficients which are above a certain threshold, it is possible to obtain a "sparse" representation which does not imply a significant loss of details [26].
An interesting control structure is that illustrated in Cannon and Slotine's study [27], which uses a dynamic wavelet network, the structure of which is updated according to the hard-threshold algorithm [26]. In Sanner and Slotine's study [28], the approach used in [27] was extended to the adaptive control of robotic manipulators (referred to in the following as WN) and a simple algorithm was proposed to manage the network nodes: for each active neuron, the nearest neurons are activated if the weight of the active neuron is greater than a given threshold. A drawback of this mechanism is that the behaviour of the system is greatly influenced by the choice of this threshold, so that the net can end up with a number of nodes that is either too low or too high for the task to be executed.
In Alonge et al.'s study [29], a new activation mechanism is proposed, based on the observation that in order to ensure a good tracking of a given trajectory, the active nodes of the net must cover at least all the points of the net input space, which are actually part of the trajectory itself. To this end, the following procedure is employed: a) during the evolution of the system, a new node at a maximum scale is inserted each time the trajectory of the system reaches an interval of the net input space not already covered by other nodes; b) insertion and removal of nodes with activation function at minor scales is carried out so that local details of the new unknown dynamics can be better approximated.
Step b) is executed according to a routine based on the threshold algorithm previously described, using local thresholds instead of global ones. In particular, for each maximum-scale node inserted in the network structure, a pair of thresholds is considered and updated starting from the local tracking error. When the weight of a node exceeds the insertion threshold, the nodes which are adjacent in the wavelet index space are inserted; in the same way, nodes are removed when their weights go below the removal threshold. Since the thresholds are locally updated on the basis of the tracking error, insertion of new nodes is more probable in those areas of the input space in which the dynamics are insufficiently modelled. In this paper, the algorithm for basis function selection described above is adopted. The resulting adaptive algorithm, named WN+, was experimentally compared with the MBA and NN approaches. Moreover, the complexity of the resulting network is compared with the complexity of the WN controller illustrated in Sanner and Slotine's study [28].
In Section 2 the MBA approach considered in the paper is first reviewed, and then the WN+ approach is introduced together with the new network management algorithm. In Section 3, experimental results on a planar two-link manipulator are presented, with the aim of comparing the WN approach with the MBA approach of [2] and [5] and the NN-based control laws of [7], [12] and [28]. Finally, Section 4 draws some conclusions.

Model-Based Adaptive Controller
A robotic manipulator is a dynamic structure consisting of a number of links connected by joints, typically actuated by motors coupled to the joints themselves. For a manipulator with $m$ joints, a typical mathematical representation of the dynamics of the system is the Euler-Lagrange model given by:

$$B(q)\,\ddot q + C(q,\dot q)\,\dot q + E(q,\dot q) = \tau \qquad (1)$$

where $\tau \in \mathbb{R}^m$ is the vector of the torques generated by the motors connected to the joints at the generic instant $t$ and $q \in \mathbb{R}^m$ is the vector of the generalized joint coordinates at the same instant. The matrix $B(q)$ is the inertia matrix and collects terms relative to the inertia effects at the joints, the matrix $C(q,\dot q)$ collects Coriolis and centripetal torques and the vector $E(q,\dot q)$ represents every other torque acting on the joints due, for example, to environmental forces such as gravity, and to dynamic and static friction.
Given an assigned desired trajectory in the joint space, specified by a pair of position and velocity functions $q_d(t)$, $\dot q_d(t)$, it is possible to force the system to track this trajectory asymptotically by generating at the joints the control torque vector (2) given in [2], where $\hat B(q)$, $\hat C(q,\dot q)$ and $\hat E(q,\dot q)$ are estimates of the corresponding matrices. The control law (2) is based on an inverse dynamic model of the system, used to compensate the nonlinearities of the model (1), so that the resulting closed-loop system, decoupled and linear, can be driven by the PD action $K_D s$. This kind of approach, called "inverse dynamic control", allows asymptotically exact tracking of the desired trajectory, provided that the inverse dynamics model is known exactly. Since this assumption rarely holds, various techniques have been introduced to make the control structure robust against uncertainties in the model, such as the use of a sliding controller in the control loop (cf. for example [6]), which allows it to cope with unmodelled dynamics such as some components of friction. By applying such robust techniques, the control structure can be used when only an approximate model of the inverse dynamics of the manipulator is available.
If model (1) admits the linear parameterisation (3), where $Y$ is a matrix of known nonlinear functions and $p$ is a vector of unknown constant parameters, an adaptive version of (2) is given by the control law (4) and the parameter updating law (5), where the gain matrices, among them $K_D$, are positive definite.
However, some nonlinear systems which can be modelled using (1) cannot be parameterized as in (3) and, consequently, the adaptive approach (4) and (5) cannot be applied even if the structure of the matrices $B(q)$, $C(q,\dot q)$ and $E(q,\dot q)$ is known.
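For orientation, the relations referred to as (2)-(5) can be sketched in the widely used Slotine-Li form. This is a reconstruction under stated assumptions rather than the paper's own typesetting: the error definition $e = q - q_d$, the symbols $\Lambda$ and $\Gamma$ (assumed positive definite gains), and all signs are assumed conventions, and the original may define $e$ and $s$ with the opposite sign.

```latex
\begin{align*}
e &= q - q_d, \qquad \dot q_r = \dot q_d - \Lambda e, \qquad
     s = \dot q - \dot q_r = \dot e + \Lambda e \\
\tau &= \hat B(q)\,\ddot q_r + \hat C(q,\dot q)\,\dot q_r + \hat E(q,\dot q) - K_D s
     && \text{(cf. (2))} \\
B(q)\,\ddot q_r + C(q,\dot q)\,\dot q_r + E(q,\dot q)
     &= Y(q,\dot q,\dot q_r,\ddot q_r)\,p && \text{(cf. (3))} \\
\tau &= Y(q,\dot q,\dot q_r,\ddot q_r)\,\hat p - K_D s && \text{(cf. (4))} \\
\dot{\hat p} &= -\Gamma\, Y^{T}(q,\dot q,\dot q_r,\ddot q_r)\, s && \text{(cf. (5))}
\end{align*}
```

With these conventions the term $-K_D s$ acts as a PD action on the tracking error, since $s$ combines $\dot e$ and $e$.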

Wavelet Network Controllers
Following [25], let us assume that the matrices $B(q)$, $C(q,\dot q)$ and $E(q,\dot q)$ are unknown. In order to design an adaptive control structure following (2), it is convenient to express the dynamic model of the robotic manipulator in terms of a matrix $M(x)$ of unknown functions. With this notation, the control law (2) can be rewritten with $N(x)$, an estimate of $M(x)$, in place of the estimated model matrices. Each element of the matrix $N(x)$ can be approximated by a linear combination of functions from a suitable set of basis functions consisting of wavelets. The weights of this linear combination can be updated online, thereby realizing a kind of nonlinear adaptive control.
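The dimensions quoted later in this section ($x$ of size $2m \times 1$, $M(x)$ of size $m \times (2m+1)$) are consistent with the following stacking of the model terms. It is offered only as one plausible form, assuming the reference signals $\dot q_r$, $\ddot q_r$ and the filtered error $s$ of the Slotine-Li formulation; the paper's own definitions may differ in ordering or sign.

```latex
\begin{align*}
x &= \begin{bmatrix} q \\ \dot q \end{bmatrix} \in \mathbb{R}^{2m}, \qquad
M(x) = \begin{bmatrix} B(q) & C(q,\dot q) & E(q,\dot q) \end{bmatrix}
       \in \mathbb{R}^{m \times (2m+1)} \\
B(q)\,\ddot q_r + C(q,\dot q)\,\dot q_r + E(q,\dot q)
  &= M(x) \begin{bmatrix} \ddot q_r \\ \dot q_r \\ 1 \end{bmatrix} \\
\tau &= N(x) \begin{bmatrix} \ddot q_r \\ \dot q_r \\ 1 \end{bmatrix} - K_D s
\end{align*}
```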
Wavelets are a family of functions with useful properties in the field of function approximation, which make them a powerful tool for signal analysis and synthesis. As mentioned previously, the main advantages of wavelet analysis over other methods of function decomposition, like splines or even Fourier analysis, are the multi-resolution property and the compact support.
A wavelet basis is a family of functions having a double index: each function of the set has a translation index, which is a vector if the set is defined over a multidimensional input space, and a scaling index, which is often a scalar. Multidimensional scale indices are also used in the so-called multiscaling wavelet frame. A wavelet basis is thus defined as an infinite collection of translated and scaled versions of a single mother wavelet. Since the mother wavelet is a function with a compact support, the multiple scales of the wavelets allow the analysed function to be represented with various levels of detail, while maintaining the localization of the information about its characteristics. The multi-resolution analysis also makes wavelets a very efficient tool: by retaining only the basis functions whose coefficients are above a certain threshold, it is possible to obtain a "sparse" representation of the function which does not involve a significant loss of detail.
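The sparse representation obtained by thresholding can be illustrated with a few lines of Python. This is a minimal sketch: the function name `hard_threshold` and the coefficient values are hypothetical and only serve to show the rule "keep a coefficient only if its magnitude exceeds the threshold".

```python
def hard_threshold(coeffs, threshold):
    """Keep coefficients whose magnitude exceeds `threshold`; zero the others."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


# Hypothetical wavelet coefficients of some function.
coeffs = [0.9, -0.02, 0.4, 0.01, -0.6, 0.03]
sparse = hard_threshold(coeffs, threshold=0.05)

# Number of basis functions that survive the thresholding.
kept = sum(1 for c in sparse if c != 0.0)
```

In this example only three of the six basis functions are retained, while the discarded coefficients are all small in magnitude.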
Within the context of function approximation by dynamic neural networks, wavelets are potentially a good choice as activation functions; in fact, while the compact support allows the management of the network dynamically, adding and removing nodes from the net structure without interfering with the weights of the other nodes, the multi-resolution property allows efficient representation of the local details of the approximated function.
The wavelet expansion used for function approximation can be considered as a neural net with a single layer, the nodes of which have wavelets as activation functions.
In order to approximate the function matrix $M(x)$ it is necessary to use a multidimensional wavelet representation; this has been obtained as a multidimensional radial wavelet frame based on the Mexican hat scalar mother wavelet. For an $n$-dimensional input space, the corresponding wavelet family (10) is obtained by scaling and translating the mother wavelet, where $j$ and $k$ are the scaling and translation indices.
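A radial wavelet of this kind can be sketched as follows. The unnormalised Mexican hat $\psi(r) = (1 - r^2)e^{-r^2/2}$, the dyadic scaling $2^j$ and the amplitude factor $2^{jn/2}$ are common textbook choices assumed here for illustration; the exact form used in (10) may differ.

```python
import math


def mexican_hat(r):
    """Scalar Mexican hat mother wavelet (unnormalised, assumed form)."""
    return (1.0 - r * r) * math.exp(-0.5 * r * r)


def radial_wavelet(x, j, k):
    """Radial wavelet at scale index j and translation index k.

    x and k are sequences of equal length n; dyadic scaling is an assumption.
    """
    n = len(x)
    r = math.sqrt(sum((2.0 ** j * xi - ki) ** 2 for xi, ki in zip(x, k)))
    return 2.0 ** (j * n / 2.0) * mexican_hat(r)


# At scale j = 1 the wavelet with translation (1, 0) is centred at x = (0.5, 0).
peak = radial_wavelet([0.5, 0.0], 1, [1, 0])
```

The rapid decay of `mexican_hat` away from the centre is what makes the support "practically bounded" and lets nodes be added or removed locally.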
Wavelets of the form (10) satisfy the so-called "frame" properties [13], which ensure that a function $f \in L^2(\mathbb{R}^n)$ having a compact support can be approximated within a given error by a superposition of a finite number of elements of the form (10). This error arises from the truncation of the frame, which is needed to obtain a neural network with a finite number of nodes and, consequently, one that is physically realisable and has an acceptable computational cost, so that it can be implemented efficiently on a digital system.
In order to effect the frame truncation it is convenient, first of all, to choose a domain of the space $\mathbb{R}^n$ of $x$, where $n = 2m$, in which a good approximation of $M(x)$ has to be obtained. This domain, denoted by $I_d$, is the one containing the desired trajectories of the manipulator. Moreover, the elements of $M(x)$ are continuous functions but do not belong to $L^2(\mathbb{R}^n)$; consequently, the conditions for their approximation, with a given error, by superposition of functions of a frame, are not satisfied. To overcome this problem, a domain $I_{ex}$ containing $I_d$ is considered, and the approximation is carried out only inside it. Outside $I_{ex}$, $N(x)$ is zero and the neural control component is useless. The presence of unknown dynamics outside $I_{ex}$ suggests the use of a robust control component which stabilizes the whole system. Finally, when the state is inside $I_{ex}$ but outside $I_d$, it is convenient to modulate the robust and the neural components.
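The blending of the two components between $I_d$ and $I_{ex}$ can be sketched with a simple modulation function. The linear ramp, the scalar `distance` measure and the names `modulation` and `blended_control` are illustrative assumptions; the text only states that the two components are modulated in the transition region.

```python
def modulation(distance, margin):
    """Return 0 inside I_d, 1 outside I_ex, and a linear ramp in between.

    distance: how far the state lies outside I_d (<= 0 means inside I_d).
    margin:   assumed width of the band between I_d and I_ex.
    """
    if distance <= 0.0:
        return 0.0
    if distance >= margin:
        return 1.0
    return distance / margin


def blended_control(u_neural, u_robust, distance, margin):
    """Mix the neural and the robust torque components joint by joint."""
    m = modulation(distance, margin)
    return [(1.0 - m) * un + m * ur for un, ur in zip(u_neural, u_robust)]


# Halfway through the transition band the two components are weighted equally.
u = blended_control([2.0, 4.0], [0.0, 0.0], distance=0.25, margin=0.5)
```

A continuous modulation of this kind avoids a jump in the control torque when the state crosses the boundary of $I_d$.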
The previous considerations suggest an architecture of the control law in which $K_D$ is a diagonal matrix and $u_{sl}$ is a sliding-mode-type robust control component, whose gain $K_s$ is a positive definite diagonal matrix with elements satisfying suitable conditions. With reference to the structure of the net, observe that $x$ and $M(x)$, and consequently $N(x)$, are, respectively, $2m \times 1$ and $m \times (2m+1)$ matrices; consequently, the output layer consists of $m(2m+1)$ nodes, whereas the $2m$ inputs are directly connected to the nodes of the hidden layer. The output layer is connected to the hidden layer by weighted connections whose weights are the coefficients of the wavelet expansion of the elements of $N(x)$, the estimate of $M(x)$. These weights are updated according to an adaptive law obtained from the stability analysis of the whole system. The input space of the net is bounded, since the joint positions range over a bounded interval for rotational joints and over $[0, L]$ for translational joints. Moreover, the velocities at the joints are also bounded due to the performance of the actuators.
The upper limit on $j$ is chosen so that the support of the wavelet has the minimum dimension of $I_d$, whereas the lower limit is chosen according to the spectral content of the unknown function to be approximated. The limits on $k_i$ for an assigned $j$ are denoted by $k_{i,\min}$ and $k_{i,\max}$. The resulting index set, denoted by $K_{pot}$, contains the potential wavelets and, consequently, the potential nodes of the net which can be activated to participate in the reconstruction of the unknown dynamics. Since the number of these nodes is usually high, it is convenient to use a dynamical structure of the net, i.e., an adaptive structure in which the neurons are either added or removed according to the actual trajectory to track. It follows that the set of active neurons at a given instant of time, denoted by $K_{net}$, is a subset of $K_{pot}$. Moreover, in order to avoid discontinuities in the control signals, the weights of the nodes to be activated start from zero and are updated gradually using the adaptive updating law, whereas the weights of the nodes to be removed are gradually brought to zero using another updating law.
The generic element $n_{il}(x)$ of $N(x)$ is computed online as a weighted sum of the activation functions of the active nodes, where the quantities $w^{il}_{j,k}$ are the output weights of the net corresponding to the active node whose activation function is $\psi_{j,k}$. As shown in [10], if the weights of the active nodes are chosen according to the adaptive updating law, and a separate law gradually brings to zero the weights of those nodes which are selected for deletion, the overall system is stable and converges asymptotically as long as the filtered tracking error is greater than a threshold given by (17), where: a) $E_r$, $E_d$ and $E_T$ are, respectively, the error due to the gradual updating of the weights of the nodes to be removed, the error due to the dynamical management of the net, which implies that some nodes are removed and consequently the wavelet expansion involves a reduced number of wavelets of the chosen frame, and the error due to the truncation of the frame; b) $N_{rem}$ and $w_{\max}$ are the number of the nodes to be removed and the maximum absolute value of the weights of the net. The proof is similar to that shown in the study by Gomi and Kawato [10].
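The online computation of one element of $N(x)$ over the set of active nodes can be sketched as below. The wavelet form in `psi` (unnormalised radial Mexican hat with dyadic scaling) and the dictionary layout `{(j, k): weight}` are assumptions made for illustration only.

```python
import math


def psi(x, j, k):
    """Radial Mexican hat wavelet at scale j and translation k (assumed form)."""
    r2 = sum((2.0 ** j * xi - ki) ** 2 for xi, ki in zip(x, k))
    return 2.0 ** (j * len(x) / 2.0) * (1.0 - r2) * math.exp(-0.5 * r2)


def net_element(x, weights):
    """Sum w * psi_{j,k}(x) over the active nodes stored in `weights`."""
    return sum(w * psi(x, j, k) for (j, k), w in weights.items())


# Two active nodes; the second one has just been inserted with zero weight.
active = {(0, (0, 0)): 0.5, (1, (2, 0)): 0.0}
value = net_element((0.0, 0.0), active)
```

Because a newly activated node starts from zero weight, inserting it leaves the network output unchanged at that instant, which is how discontinuities in the control signal are avoided.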
Note that (17) implies that the filtered error $s$ and, consequently, the tracking error $e$ do not converge to zero. To increase the tracking performance of the whole system, the error terms appearing in (17) need to be reduced. Great importance therefore attaches to the algorithm which governs the insertion and removal of the nodes of the net. In the study by Gomi and Kawato [10], for a similar control system, an algorithm based on a simple threshold was proposed. In this solution, whenever the weight of an active node exceeds a certain insertion threshold, all the nodes which have adjacent wavelet indices are inserted in the net structure. Similarly, all the nodes whose weights fall below a given removal threshold are removed from the structure, after their weights have been gradually brought to zero. A drawback of this solution is that the behaviour of the system is greatly influenced by the choice of the numerical values of the thresholds, so that the result may be either insensitivity to the weight growth, which keeps the number of nodes of the network too low, or oversensitivity, which generates a network of excessive complexity. Within a context of adaptive control, it seems unrealistic that the algorithm managing the net needs to be tuned to the problem in order to give good results.
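One pass of the global-threshold rule described above can be sketched as follows. The adjacency used here (same scale, translation index plus or minus one, in one dimension) is a simplifying assumption, and the gradual decay of the weights before removal is omitted for brevity.

```python
def neighbours(node):
    """Hypothetical adjacency: same scale j, translation k - 1 and k + 1 (1-D)."""
    j, k = node
    return [(j, k - 1), (j, k + 1)]


def manage_nodes(weights, insert_thr, remove_thr):
    """Apply one pass of the global-threshold rule to a dict node -> weight."""
    updated = dict(weights)
    for node, w in weights.items():
        if abs(w) > insert_thr:
            for nb in neighbours(node):
                updated.setdefault(nb, 0.0)  # new nodes start from zero weight
    for node, w in weights.items():
        if abs(w) < remove_thr:
            updated.pop(node, None)          # decay to zero omitted in this sketch
    return updated


weights = {(0, 0): 0.8, (0, 3): 0.01}
result = manage_nodes(weights, insert_thr=0.5, remove_thr=0.05)
```

Even in this toy example the outcome depends entirely on the two threshold values, which is the sensitivity the text identifies as a drawback.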
Moreover, in order to give the net an efficient structure, the algorithm managing the net must correlate the subset of the input space effectively covered by current trajectories with the position of the active neurons, and the number of wavelets with minor scale index with the tracking error, which is a measure of the quality of the local identification. Following the above consideration, in order to ensure a good approximation of the unknown dynamics of the manipulator, the active nodes in the net must, at least, cover all the points of the net input space which are actually part of the trajectory of the system. Therefore, in the proposed algorithm, since each activation function has a bounded support and the centres are located on a multidimensional grid, during the evolution of the system a new node is inserted each time the input vector $x$ reaches an interval of the grid which is not already covered by another node.
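The coverage policy can be sketched as a lookup on a regular grid. The uniform `spacing`, the function names and the sample trajectory are hypothetical; the sketch only shows that a maximum-scale node is created, with zero initial weight, the first time the input enters a grid interval.

```python
import math


def grid_cell(x, spacing):
    """Index of the grid interval containing x (one integer per dimension)."""
    return tuple(math.floor(xi / spacing) for xi in x)


def cover_trajectory(points, spacing):
    """Insert a maximum-scale node, with zero weight, for each newly visited cell."""
    nodes = {}
    for x in points:
        cell = grid_cell(x, spacing)
        if cell not in nodes:
            nodes[cell] = 0.0
    return nodes


# Hypothetical samples of the net input along a trajectory.
trajectory = [(0.1, 0.1), (0.4, 0.2), (1.2, 0.3), (1.7, 0.9), (0.2, 0.8)]
nodes = cover_trajectory(trajectory, spacing=1.0)
```

Only the cells actually visited receive a node, so the size of the net follows the trajectory rather than the full volume of the input space.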
Besides this main policy of node insertion, which ensures at least the minimum degree of coverage of the input space, a second routine governs the insertion and removal of nodes with activation functions at minor scales, so that local details of the unknown dynamics can be better approximated. This routine is based on the threshold algorithm described previously but uses local thresholds instead of global ones. For each maximum-scale node inserted in the network structure, a pair of thresholds is considered, the values of which are updated during the control action on the basis of the local tracking error. This approach favours the insertion of those wavelets which give a valuable contribution to the reduction of the tracking error, thus taking advantage of the space localisation properties of such functions.
The update of the local thresholds is implemented as a discrete-time task with period $T$. Each time this task is activated, it checks the maximum value of the filtered tracking error $s$ measured during the previous time interval and compares it with a given global threshold (Error Threshold). If the filtered error is greater than the Error Threshold, the local thresholds of the nodes whose centre is nearest to the current value of the network input, $x$, are modified. After this threshold adjustment, the algorithm is applied as in the study by Gomi and Kawato [10]; thus, when the weight of a node exceeds the insertion threshold, the nodes which are adjacent in the wavelet index space are inserted; in the same way, nodes are removed when their weights go below the removal threshold. Since the thresholds are locally updated on the basis of the tracking error, insertion of new nodes is more probable in those areas where the dynamics are insufficiently modelled, which leads to the presence of nodes at multiple resolutions. The described procedure can alleviate the problem of the unmanageable growth of the number of basis functions with the dimensionality of the input space. In fact, as a result of the algorithm, the wavelets are allocated near the actual trajectories in the net input space, which makes the problem nearly always 1-dimensional.
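The periodic task described above can be sketched as follows. The multiplicative rule with `factor`, the data layout and all names are assumptions for illustration: the text states that the local thresholds near $x$ are modified when the error exceeds the Error Threshold, but the specific update used here is not taken from the paper.

```python
def nearest_node(x, centres):
    """Return the key of the maximum-scale node whose centre is closest to x."""
    return min(
        centres,
        key=lambda n: sum((xi - ci) ** 2 for xi, ci in zip(x, centres[n])),
    )


def update_local_thresholds(thresholds, centres, x, s_max, error_threshold,
                            factor=0.5):
    """Run once per period T: relax the thresholds near x if tracking is poor.

    thresholds maps node -> (insertion_threshold, removal_threshold).
    Scaling both by `factor` is an assumed rule, not the paper's law.
    """
    if s_max > error_threshold:
        node = nearest_node(x, centres)
        ins, rem = thresholds[node]
        thresholds[node] = (ins * factor, rem * factor)
    return thresholds


centres = {"a": (0.0, 0.0), "b": (1.0, 0.0)}
thresholds = {"a": (0.4, 0.04), "b": (0.4, 0.04)}
update_local_thresholds(thresholds, centres, x=(0.9, 0.1),
                        s_max=0.2, error_threshold=0.1)
```

Lowering the insertion threshold only around the node nearest to the current input makes refinement more likely exactly where the tracking error was observed.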

Experimental tests
Experimental tests were carried out in order to assess the practical applicability of the WN control law and to compare it with both the NN controller described in [7] and [12] and the MBA control described in [5] and [6]. In this section, the results of these experiments are discussed.
The experimental equipment is illustrated in Figure 1. The manipulator to be controlled is a two-DOF SCARA-type robot whose structure is illustrated in Figure 2. The matrices and parameters of the mathematical model used for the implementation of the MBA control law are given in Tables 1-3. The elements of vector $p$ are considered unknown and were updated online with an MBA approach.
An application of the MBA control law is also considered, based on the same model as above but without load (i.e., with the load parameters $M_p$ and $I_p$, and the remaining load parameter, set to zero). The corresponding control law, named MBA-, is considered in this paper in order to test the robustness of MBA controllers to uncertainties in the manipulator structure. Note that the presence of an asymmetric load implies both structure and parameter variations in the mathematical model, because both the matrix $Y$ and the nominal values of the parameters $p$ change.
Finally, an NN controller was considered in the paper because its implementation, like that of WN, does not require knowledge of the mathematical model of the manipulator. The considered approach is that described by Alonge et al. [12], in which a multi-layer sigmoidal perceptron was used to control the manipulator sketched in Figure 2, but without load. The final structure of the net was obtained from an offline training phase with input-output data acquired from a closed-loop scheme with a PD controller, followed by an optimization of the net structure based on a Genetic Algorithm, which determines an optimal net architecture: number of hidden layers, number of neurons, connectivity percentage for each layer and percentage of connections between layers. The resulting neural network has the following characteristics: 7 input variables, 14 neurons in the hidden layer, a connectivity percentage of 57% towards the hidden layer and 91% towards the output layer, and 2 output neurons. Online learning starts from an ANN which gives good initial inverse modelling after the offline batch training process, but the use of a multilayer sigmoidal perceptron implies about one hundred parameters to adapt in the online application.
In all tests the same desired trajectory is used, and the parameters of the PD control component, for all the controllers, are chosen as (700, 80). In order to show the capabilities of the proposed network management algorithm, two experiments were performed using the same control scheme with two different values of the maximum allowed error. The results displayed in Table 4 show that improving the tracking performance, by decreasing the maximum allowed error, increases the complexity of the net.
The experimental results are displayed in Figures 3-9. Figures 3-8 show the results obtained by applying the WN+ control law with the Error Threshold chosen as in the first column of Table 4. In particular, Figures 3 and 4 show the joint tracking errors, Figures 5 and 6 show the applied control torques and Figure 7 shows the shape of the end-point tracking error. Finally, Figure 8 shows the number of nodes at different scales automatically obtained by the node allocation mechanism. Figure 9 shows the displacement of the filtered end-point tracking errors for the WN, NN, MBA and MBA- control laws, the last being obtained from MBA without modelling the grasped load. Filtering was necessary in order to extract, from the end-point tracking errors, the average displacement over a suitable temporal window. The chosen temporal window was 5 seconds. The comparative tests show that MBA is the best control approach if a sufficiently accurate model of the manipulator is known. However, if the model does not take into account, for example, the grasped load, the performance deteriorates (MBA-). NN can learn the structural modification, as illustrated by the corresponding shape of the filtered tracking error, but this approach requires long pre-training and optimisation phases. A comparison between the WN+ and NN controllers shows that with the NN controller the tracking error starts from a lower average value because of its pre-trained structure, but WN+ outperforms NN after a short time. Moreover, the final structure of WN has 14 neurons, whereas NN has 100.
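The windowed averaging of the end-point error can be sketched with a causal moving average. The choice of a plain moving average of the absolute error, the sample period and the sample values are assumptions for illustration; the text only states that a 5-second window was used.

```python
from collections import deque


def windowed_mean(samples, window_len):
    """Causal moving average of |e| over the last `window_len` samples."""
    window = deque(maxlen=window_len)
    out = []
    for e in samples:
        window.append(abs(e))
        out.append(sum(window) / len(window))
    return out


sample_period = 1.0   # seconds per sample (hypothetical value)
window_seconds = 5.0  # window length stated in the text
errors = [2.0, -2.0, 4.0, 0.0, 2.0, 7.0]
filtered = windowed_mean(errors, int(window_seconds / sample_period))
```

Averaging over a window removes the oscillation of the raw error within each trajectory cycle, so that the slow learning trend of each controller becomes visible.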
Finally, a third experiment was performed in order to compare the performance of the WN and WN+ algorithms in terms of the complexity of the net in both cases. The results illustrated in Figure 10 show that WN+ yields a simpler structure than WN, as theoretically expected.

Conclusions
In this paper an experimental comparison between Model Based Adaptive (MBA), Neural Network (NN) and Wavelet Network (WN) control has been presented. The comparative tests show that MBA is the best control approach if a sufficiently accurate model of the manipulator is known. However, if the model does not take into account, for example, the grasped load, the performance deteriorates. Moreover, with NN control the tracking error starts from a lower average value than with the WN approach because of its pre-trained structure, but WN outperforms NN after a short time. Furthermore, the final structure of WN has 14 neurons whereas NN has 100. Wavelet Network controllers, with the new network management strategy, can learn the complex dynamics of manipulators without a pre-training phase, more efficiently and more accurately than the NN and MBA- approaches.