Support Vector Regression for Approximation and Generation of Motion in Humanoid Robots

This paper analyzes the approximation of human movement that preserves dynamic balance under perturbations, using Support Vector Machine (SVM) regression. The quality of approximation was evaluated by two criteria. The first is the deviation of the approximated motion from the recorded one; the second is the position of the Zero Moment Point (ZMP), because dynamic balance has to be maintained. For the human movement, the ZMP is constantly within the support area. For the approximated motion applied to a humanoid, the position of the ZMP is calculated. The ZMP may leave the support area because the approximated motion deviates from the recorded one and because the dynamic parameters of the humanoid deviate from those of the human. The initial motion was recorded from humans and then approximated. The approximated data were applied to a humanoid robot model and the resulting motion was observed. The motion obtained by SVM regression approximation was compared with that obtained by cubic spline approximation. The approximated motion and the calculated ZMP were then used to train a new SVM, which was used to generate motion of a humanoid robot based on the desired ZMP position. Comparative analysis of the results indicates that there are significant potential applications of SVM regression in humanoid robotics for approximation and generation of motion, as well as for other tasks which require the use of


INTRODUCTION
Due to the constant presence of disturbances, the primary task of any humanoid robot is to maintain dynamic balance [1]. Small disturbances [2] are always present and cannot be avoided. Such disturbances are usually compensated by conventional PID control. In case of larger disturbances (e.g. stumbling upon an obstacle, shoving aside, etc.) maintaining dynamic balance becomes more complicated. Compensatory actions in humans mostly represent a coordinated, vigorous, and synchronized movement [2,3]. After a vigorous motion aimed at preserving the dynamic balance, a human uses slow movement to restore the state from which he/she continues to perform the motion disrupted by the disturbance. In order for such motion to be performed by a humanoid robot, it is necessary to analyze previous human motion.
By recording movement made by a human, we collect motion data in every joint. Since there are obvious differences in kinematic and dynamic parameters between humans and humanoid robots, it is not possible to use identical movements on a humanoid robot (i.e. to emulate the change of values of internal coordinates which happens in human motion). Therefore, it is necessary to modify the recorded motion data in such a way that the effects remain intact, in our case to maintain the dynamic balance of a robot. As an indicator of the quality of robot dynamic balance we shall use the Zero Moment Point (ZMP) [4]. Besides maintaining the dynamic balance, the recorded data must also be modified so as to emulate the form of human motion for the same type of disturbance, and thus produce the same effect. This results in a humanoid robot motion which not only maintains dynamic balance but also corresponds to the characteristics of human movement.
The recorded data were approximated. Cubic spline approximation is one of the most popular approximators, but there are also a number of other methods, such as approximation by B-spline curves, polynomial approximation, etc. Another way to approach approximation is to apply machine learning algorithms, which are used to train Artificial Neural Networks (ANN) and Support Vector Machines (SVM).
The approximated motions were simulated on a model of a humanoid robot. The results of motion simulation obtained by SVM regression were compared to the motion obtained by cubic spline. During simulation, the ZMP was calculated for every motion to check the maintenance of dynamic balance. Since SVM regression can yield any type of nonlinear relationship between input and output [5], this paper also presents the training of an SVM which finds a relationship between the predefined trajectory of the ZMP and the motion of a humanoid robot which corresponds to that trajectory. Human movement and humanoid robot motion obtained from the trained SVM were compared, as well as the measured and the predefined ZMP trajectory.

SVM REGRESSION
Supervised learning is one of the machine learning approaches and is also used to train SVMs. It determines the unknown relationships between input and output values based on experimental data. Inputs and outputs represent the training data set, while the training process results in an approximating function $f_a(x, w)$. Vector $x$ represents the input, while $w$ is the weight coefficient matrix.
To obtain the approximating function by SVM regression, it is necessary to minimize the expected error $R(w)$:

$$R(w) = \int L(y, f_a(x, w))\, dP(x, y) \qquad (1)$$

In (1), the loss function $L(y, f_a(x, w))$ is calculated over the training data set and can be the L1, L2, or any other norm. For the loss function, Vapnik [6] introduced a linear loss function with an ε-insensitivity zone:

$$|y - f_a(x, w)|_\varepsilon = \begin{cases} 0, & |y - f_a(x, w)| \le \varepsilon \\ |y - f_a(x, w)| - \varepsilon, & \text{otherwise} \end{cases} \qquad (2)$$

Thus, the error equals zero if the difference between the approximated and original value is less than ε (Fig. 2.1). Vapnik's loss function with ε-insensitivity zone defines an ε-tube around the output data.
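As an illustration of the ε-insensitivity zone, the following minimal Python sketch evaluates the loss for a single sample. The function name `eps_insensitive_loss` and the numeric values are hypothetical and chosen only for this example.

```python
def eps_insensitive_loss(y, f, eps):
    """Linear loss with an eps-insensitivity zone for one sample.

    y   : recorded (target) value
    f   : approximated value
    eps : half-width of the insensitivity tube
    """
    residual = abs(y - f)
    # Errors inside the eps-tube cost nothing; outside it the cost grows linearly.
    return max(0.0, residual - eps)


# Hypothetical values: one sample inside the tube, two outside it.
for target, approx in [(1.00, 1.05), (1.00, 1.30), (1.00, 0.50)]:
    print(target, approx, eps_insensitive_loss(target, approx, eps=0.1))
```

A sample whose residual is smaller than ε contributes nothing to the error, which is why widening the tube tends to reduce the number of samples that influence the solution.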

The training data set is $D = \{(x_i, y_i),\ i = 1, \dots, l\}$, where the inputs are n-dimensional vectors $x_i \in R^n$, while the outputs are continuous values $y_i \in R$. Based on these data the algorithm is "learning" the input-output relations of the system, thus forming the input/output relationship in the form of a function.
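If the relationship between input and output data can be established as the linear function $f(x, w) = w^T x + b$, the linear regression is solved by minimizing the following function:

$$R = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} |y_i - f(x_i, w)|_\varepsilon \qquad (3)$$

In (3), $x_i$ and $y_i$ represent the input and the target output of the $i$-th of $l$ training pairs. Function $f(x_i, w)$ is the approximation of the output, while $w$ is the weight coefficient matrix. $C$ and $|\cdot|_\varepsilon$ represent the penalty coefficient and the ε-insensitivity zone, respectively.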
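The roles of the penalty coefficient and the ε-insensitivity zone can be illustrated numerically. The sketch below assumes the commonly used form of the linear SVM regression objective, namely half the squared weight norm plus C times the sum of ε-insensitive errors. The function name `svr_objective` and the data are hypothetical.

```python
def svr_objective(w, b, xs, ys, C, eps):
    """Evaluate 0.5*||w||^2 + C * sum(|y_i - f(x_i, w)|_eps) for a linear model.

    Assumed form of the objective; w is a list of weights, b a bias term.
    """
    flatness = 0.5 * sum(wj * wj for wj in w)
    penalty = 0.0
    for x, y in zip(xs, ys):
        f = sum(wj * xj for wj, xj in zip(w, x)) + b  # linear model w^T x + b
        penalty += max(0.0, abs(y - f) - eps)         # eps-insensitive error
    return flatness + C * penalty


# Hypothetical one-dimensional data set with one sample off the line y = x.
xs = [[0.0], [1.0], [2.0]]
ys = [0.0, 1.0, 2.5]
value = svr_objective(w=[1.0], b=0.0, xs=xs, ys=ys, C=10.0, eps=0.1)
print(value)
```

A larger C makes every error outside the tube more expensive, so the minimizer follows the data more closely; a larger ε lets more samples fall inside the tube and contribute nothing.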
Real-life problems often demand the solution of non-linear input/output relationships. Non-linear regression using support vectors is resolved by mapping the input vector $x \in R^n$ into a vector $z \in R^f$, where the vector $z$ belongs to a space of higher dimensionality than that of vector $x$. Thus $z = \Phi(x)$, where $\Phi$ is the mapping $R^n \to R^f$. In the following step, the linear regression is solved in the higher dimensionality space. Mapping function $\Phi$ is selected in advance and represents a fixed function for a given problem. The goal of such a mapping is to obtain, in the space of vector $z$, a problem which can be solved by linear regression. The solution is a regression hypersurface $f(x, w) = w^T \Phi(x) + b$ which is linear in space $R^f$. This results in a nonlinear hypersurface in the initial space $R^n$ to which the input vector $x$ belongs. Functions $\Phi$ (the so-called kernels) which are most often used for mapping into a higher dimensionality space are the polynomials and the Radial Basis Function (RBF).

There are a number of parameters which can vary in the process of solving the regression problem using SVM. The two parameters in (3) which directly impact the solution of approximation are the ε-insensitivity zone and the penalty coefficient $C$. Fig. 2.2 shows the example of approximation of a noise-affected sine function, where we can see the influence of an increase of the ε-insensitivity zone on the smoothness of the function [5]. The increase of the ε-insensitivity zone in fact causes a reduction of the approximation accuracy. The number of support vectors is also reduced, resulting in a smoother function. Table 2.1 presents the steps in creating the SVM.

When the number of data pairs becomes large, the quadratic optimization problem becomes extremely complex, which represents the major drawback of SVM. Various methods have been devised to overcome this problem, but they are not within the scope of this paper. One of them is the chunking method proposed by Vapnik [6], which is based on data decomposition into smaller sets.
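To make the kernel idea concrete, the following sketch evaluates a Gaussian (RBF) kernel and the corresponding regression function written as a weighted sum of kernels centred on the support vectors. This kernel-expansion form is the standard way a trained SVM regressor is evaluated; the support vectors, coefficients, bias and kernel width used here are hypothetical and do not come from a trained model.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, coeffs, b, sigma):
    """Evaluate f(x) = sum_i coeff_i * K(sv_i, x) + b."""
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, coeffs)) + b


# Hypothetical model with two support vectors in a one-dimensional input space.
support_vectors = [[0.0], [1.0]]
coeffs = [1.0, -1.0]
print(svr_predict([0.0], support_vectors, coeffs, b=0.5, sigma=1.0))
```

Only the support vectors enter the sum, so a wider ε-insensitivity zone, which leaves fewer support vectors, also yields a cheaper and smoother regression function.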

CONDITION OF DYNAMIC BALANCE
The quality of motion approximation by SVM regression shall be tested from the aspect of internal synergy, as well as maintaining dynamic balance. We shall therefore first explain the concept of dynamic balance and the conditions required for its fulfillment.

Dynamic balance:
A humanoid remains dynamically balanced as long as the support surface is maintained, i.e. as long as the humanoid has not rotated about the edge of the support surface, which would result in a fall. For dynamic balance it is necessary and sufficient that the resultant of the normal pressure forces from the foot (or feet) to the ground acts at a point that is inside the support area (excluding the edges) [7]. However, if the support surface area is insufficiently large to encompass the required location of the acting point of the reaction force, force $R$ shall act at the edge of the foot (it is important to note that the reaction force cannot leave the support surface), while the unbalanced part of the horizontal component of the moment shall cause the mechanism to rotate about the edge of the foot, which ultimately leads to the fall of the locomotion system.
Accordingly, the necessary and sufficient condition of the dynamic equilibrium of the locomotion system is that the ground reaction force acts within the support surface area. Therefore, at that point the following conditions hold for the horizontal components of the reaction moment:

$$M_X = 0, \qquad M_Y = 0 \qquad (4)$$

Considering the conditions from (4), point P in Fig. 3.1 is called the Zero Moment Point (ZMP).
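The balance condition can be checked numerically once a ZMP location and the outline of the support area are known. The sketch below assumes a convex support polygon with vertices listed counter-clockwise and treats points on the edges as not balanced, in line with the exclusion of the edges stated above. The function name and the foot dimensions are hypothetical.

```python
def strictly_inside_convex(point, polygon):
    """Return True if point lies strictly inside a convex polygon.

    polygon: list of (x, y) vertices in counter-clockwise order.
    Points on an edge are reported as outside.
    """
    px, py = point
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The cross product is positive when the point is to the left of the edge.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross <= 0.0:
            return False
    return True


# Hypothetical rectangular footprint, 0.25 m long and 0.10 m wide.
foot = [(0.0, 0.0), (0.25, 0.0), (0.25, 0.10), (0.0, 0.10)]
print(strictly_inside_convex((0.10, 0.05), foot))  # ZMP under the foot
print(strictly_inside_convex((0.25, 0.05), foot))  # ZMP on the foot edge
```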

EXPERIMENT
Results presented in the paper were obtained by simulation. Two SVMs were formed. The first performs SVM regression for motion approximation (the location of the ZMP was not taken into account), and the results of its motion simulation were compared to the simulation of motion approximated with cubic splines. The second SVM was used to generate the complete motion of a humanoid robot by defining the required ZMP trajectory.

Data recording
In co-operation with the research of the Holodeck Gait Laboratory, which is part of the Laboratory for Computer Science and Artificial Intelligence at MIT (Massachusetts Institute of Technology), the data were recorded using the VICON 512 system, which operates at 120 fps. For recording, 33 markers were employed, whose locations were recorded with ~1 mm accuracy. Three adults participated in the experiment. Each person was asked to stand on his/her left foot while leaning against an obstacle with the left shoulder. The leaning force was measured, and once it reached the limit of 20 N the obstacle was abruptly removed. The movement made in an attempt to prevent the fall was recorded. The 20 N limit was chosen so that the projection of a person's centre of gravity falls outside of the support surface. In this case, to maintain dynamic balance, each subject had to perform an energetic movement which brought the projection of the centre of gravity back under the foot while the ZMP constantly remained within the support surface.
During every movement, a force plate (Advanced Mechanical Technology Inc., Watertown, MA) was used to measure and record the location of the ground reaction force, which in this case coincided with the ZMP. The ZMP location measurement accuracy was approximately 2 mm. Each subject repeated the movement 10 times, thus a total of 30 movements was recorded [3]. In this paper only the data of one recorded movement were used and processed.

Model of a humanoid robot
The kinematic structure of the model of a humanoid robot used in this experiment consists of four kinematic chains, as shown in Fig. 4.1. The first kinematic chain represents the legs, the second forms the body and right arm, the third represents the left arm, while the fourth kinematic chain represents the neck and head. The joints with multiple Degrees of Freedom (DoFs) were modeled as a set of virtual segments (segments with zero mass and negligible length) connected by 1-DoF joints. For instance, the hip joint, which is a spherical joint with 3 DoF, was modeled as a set of three 1-DoF segments whose axes of rotation are mutually orthogonal [8].
If the recorded movement is applied in full to a humanoid robot, the calculated ZMP location will significantly deviate from the measured one. The reasons are as follows:
• There are differences in the kinematic and dynamic parameters between a humanoid and a human subject involved in the experiment. Thus, the same movement applied to a humanoid whose segment parameters are different from those of a human subject causes different dynamic effects.
• In the simulation, the foot was treated as a rigid body immobile with respect to the support surface. However, there exist small movements between the foot and the support surface which were too small to measure, but nevertheless influence the behavior of the system.
• There is a small relative movement between the markers and the human body to which they are attached, etc.
Since those small movements are constantly present and it is not possible to determine the correct parameters of the human body segments, the values of the internal co-ordinates to be applied to the humanoid should be approximated by smooth functions in such a way as to preserve the movement character while maintaining the system's dynamic balance. This demands that the values of the velocities and the accelerations in joints be modified, because they have a direct impact on the ZMP location [3].
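Joint velocities and accelerations are not recorded directly; they follow from the sampled internal co-ordinates. The sketch below estimates them by central finite differences at the 120 fps sampling rate of the motion capture system. Central differences are an assumption made for illustration, since the differentiation scheme is not specified here, and the trajectory is synthetic.

```python
def central_differences(samples, dt):
    """Estimate first and second derivatives at the interior samples.

    samples: joint angle values taken at a constant time step dt.
    Returns (velocities, accelerations), each two entries shorter than samples.
    """
    vel, acc = [], []
    for k in range(1, len(samples) - 1):
        vel.append((samples[k + 1] - samples[k - 1]) / (2.0 * dt))
        acc.append((samples[k + 1] - 2.0 * samples[k] + samples[k - 1]) / (dt * dt))
    return vel, acc


dt = 1.0 / 120.0                              # frame period at 120 fps
angles = [(k * dt) ** 2 for k in range(5)]    # synthetic trajectory q(t) = t^2
velocities, accelerations = central_differences(angles, dt)
print(velocities, accelerations)
```

Because the second difference divides by dt squared, measurement noise of the order of the marker accuracy is strongly amplified in the accelerations, which is one reason a smooth approximation of the recorded data is needed before the ZMP is computed.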
Also, if the recorded movement is applied on humanoids of different size*, the calculated ZMP location will also differ. To apply recorded human movement on different robots one can use a semi-inverse method to calculate whole body motion. If the semi-inverse method is used, the motion of the legs is obtained from the recorded movement. With the prescribed ZMP trajectory, the trunk motion can be calculated so that the condition of dynamic balance preservation is satisfied. The problem of applying the recorded movement on different robots is covered as part of the scope of this paper. Approximation of the recorded motion and motion generation will be described on a humanoid robot model whose parameters are very close to those of a human subject.
Two methods were used for approximation of the recorded motion: cubic spline approximation and approximation by SVM regression. Fig. 4.3 shows stick diagrams of a humanoid robot motion (cubic spline approximation with smoothing parameter of 0.9) and a ZMP trajectory. This high smoothing parameter indicates an approximation which is very close to the recorded movement.
* If two humanoid robots are different in size, it is clear that the corresponding kinematic and dynamic parameters are also different.
Motion approximation by SVM regression is illustrated in the following examples. A Gaussian (normal distribution) function was adopted as the kernel function. The approximation was performed without taking into account the ZMP location (it was calculated only as a consequence of the motion). The simulation was performed for eight different cases, varying the values of the ε-insensitivity zone and the penalty coefficient. In Table 4.1 the results are systematized so that each combination of ε-insensitivity zone and penalty coefficient is paired with the maximum deviation in the ZMP trajectory and the maximum deviation of the approximated values of the internal co-ordinates from the recorded ones. It should be noted that the form of movement strongly depends on the value of the ε-insensitivity zone. For higher values of ε the movement loses its form, which is the basic reason that we decided to use just two values of the ε-insensitivity zone: 0 and 0.01. It is obvious from the table that an increase of the penalty coefficient lowers the maximum deviations of the internal co-ordinates, but increases the ZMP deviations. Obviously, a trade-off must be established between the SVM training parameters, i.e. between the deviations of the ZMP and of the internal co-ordinates. This is illustrated in Figs. 4.4 to 4.6.
Let us first compare the cases shown in Fig. 4.4 and 4.5. In both cases the ε-insensitivity zone was 0.01. In the example shown in Fig. 4.4 the penalty coefficient was 1000, while in Fig. 4.5 it equals 10, which reveals its influence on the ZMP trajectory. The following discussion illustrates how an increase of the ε-insensitivity zone impacts the motion approximation. The case illustrated in Fig. 4.6 was additionally simulated. The approximation was also performed by SVM regression. The value of the ε-insensitivity zone was 0.1, while the penalty coefficient was 10, just as in Fig. 4.5. Consequently, the ZMP trajectory also completely changed as compared to the previous cases.
Thus, selection of the ε-insensitivity zone requires careful consideration.A wider ε results in smoother motion approximations with lower accelerations in the joints.However, ε must not be too large because the distortion of the desired form of movement can appear.
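The two quantities used to compare parameter combinations, the maximum deviation of an internal co-ordinate and the maximum deviation of the ZMP, can be computed as in the following sketch. Measuring the ZMP deviation as a Euclidean distance in the ground plane is an assumption made for illustration, and all numeric values are hypothetical.

```python
import math


def max_abs_deviation(recorded, approximated):
    """Largest absolute difference between two equally sampled signals."""
    return max(abs(r - a) for r, a in zip(recorded, approximated))


def max_zmp_deviation(zmp_reference, zmp_simulated):
    """Largest planar distance between corresponding ZMP samples (x, y)."""
    return max(math.hypot(xr - xs, yr - ys)
               for (xr, yr), (xs, ys) in zip(zmp_reference, zmp_simulated))


# Hypothetical joint angle samples [rad] and ZMP samples [m].
q_recorded = [0.00, 0.10, 0.25, 0.30]
q_approx = [0.01, 0.10, 0.22, 0.31]
zmp_measured = [(0.10, 0.05), (0.12, 0.05), (0.15, 0.06)]
zmp_calculated = [(0.10, 0.05), (0.13, 0.05), (0.15, 0.04)]

print(max_abs_deviation(q_recorded, q_approx))
print(max_zmp_deviation(zmp_measured, zmp_calculated))
```

Evaluating both numbers for every pair of ε and C gives one row of a comparison table, from which the trade-off between tracking accuracy and ZMP deviation can be read.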

Motion generation based on defined ZMP trajectory
Since the priority task of every humanoid is to maintain dynamic balance, it is important to generate motion which allows the ZMP to be within the support surface.
In other words, motion should be synthesized so as to be similar in form to the human's (anthropomorphic) while providing dynamic balance.
This section describes the SVM which generates motion of all joints based on a predefined ZMP trajectory, while maintaining anthropomorphic form of the movement.Thus generated changes of all internal coordinates are used to calculate (or measure, in the case of a real robot) the ZMP location which is then compared to the desired one.If the real ZMP location does not significantly deviate from the desired ZMP location, i.e. if it does not leave the support surface (providing that the generated motion has the desired form) the synthesized motion satisfies the requirements.
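Calculating the ZMP location from a generated motion can be sketched with a simplified model. The code below assumes that every segment is reduced to a point mass on flat ground at height zero and that the rotational inertia of the segments is neglected; a full multi-body model would add the corresponding inertia terms. The function name and the numeric values are hypothetical.

```python
G = 9.81  # gravitational acceleration [m/s^2]


def zmp_point_masses(masses, positions, accelerations):
    """ZMP (x, y) on the ground plane z = 0 for a set of point masses.

    masses        : list of segment masses [kg]
    positions     : list of (x, y, z) mass positions [m]
    accelerations : list of (ax, ay, az) mass accelerations [m/s^2]
    """
    num_x = num_y = den = 0.0
    for m, (x, y, z), (ax, ay, az) in zip(masses, positions, accelerations):
        vertical = m * (az + G)           # vertical inertial plus gravity load
        num_x += vertical * x - m * ax * z
        num_y += vertical * y - m * ay * z
        den += vertical
    return num_x / den, num_y / den


# Hypothetical single mass 0.9 m above the ground, accelerating forward.
print(zmp_point_masses([50.0], [(0.10, 0.02, 0.90)], [(1.0, 0.0, 0.0)]))
```

For a motionless system the result coincides with the ground projection of the centre of mass; a horizontal acceleration shifts the ZMP in the opposite direction, which is how changes in joint accelerations move the ZMP.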
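In this case, the training of the SVM should produce a function $f_a$:

$$f_a : (p_{ZMP}, \dot{p}_{ZMP}) \to \ddot{q} \qquad (5)$$

where $p_{ZMP}$ and $\dot{p}_{ZMP}$ are the ZMP location and velocity and $\ddot{q}$ denotes the joint angular accelerations, such that the relationship between the segment accelerations and the ZMP location is established. The data for the training set were collected as follows.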
Four movements (out of the 30 mentioned before) were selected and approximated with different parameters.
For each approximated motion, the ZMP location and velocity were calculated (Fig. 4.7). The total number of time samples taken from the four movements was 5400, resulting in 5400 input-output pairs. From eq. (5) it follows that the input data are the ZMP location and the ZMP velocity, while the output data are the angular accelerations in the 62 joints. Bearing in mind that SVM performance deteriorates with larger training sets, the initial set of 5400 input-output pairs was reduced to 2000 randomly selected pairs. Upon completion of the training process, the SVM was tested by using the desired ZMP trajectory (selected in a way which allows the ZMP to remain constantly within the support surface) to generate the corresponding motion. The test shows that the form of the motion conforms to the recorded one. However, the motion amplitudes are still lower than those generated by a human.
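The reduction of the training set can be sketched as a random selection without replacement that keeps each input aligned with its output. The data below are placeholders that only reproduce the sizes mentioned above (5400 pairs, 62 outputs per pair); the function name and the fixed seed are hypothetical.

```python
import random


def subsample_pairs(inputs, outputs, n_keep, seed=0):
    """Randomly keep n_keep aligned input-output pairs, without replacement."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    indices = sorted(rng.sample(range(len(inputs)), n_keep))
    return [inputs[i] for i in indices], [outputs[i] for i in indices]


# Placeholder data: each input stands for ZMP location and velocity,
# each output for the angular accelerations of 62 joints.
inputs = [(k * 0.001, 0.0) for k in range(5400)]
outputs = [[float(k)] * 62 for k in range(5400)]

x_train, y_train = subsample_pairs(inputs, outputs, 2000)
print(len(x_train), len(y_train), len(y_train[0]))
```

Sorting the selected indices preserves the time order of the retained samples, which is convenient for inspection but not required for training.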

CONCLUSION
Human movements recorded by a motion capture system cannot be directly used to generate motion of humanoid robots. To apply these movements to humanoid robots, it is necessary to modify the data so that they correspond to the kinematic and dynamic parameters of a robot. Adjustment of movements was performed using SVM regression as a novel method for data approximation. The results presented in this paper show that such approximation requires a trade-off between the tracking accuracy of the recorded motion† and the ZMP trajectory. The decrease of deviation of internal co-ordinates leads to higher deviations in the ZMP trajectory. Alternatively, if the goal is to decrease the ZMP deviations, it is necessary to generate smoother joint motions. However, this leads to higher deviations from the originally recorded human movements.
Further investigation should include the impact of other SVM parameters on motion approximation, including experiments with various kernel functions.
In [9], SVM and neural network based control algorithms for maintaining dynamic balance were compared. It was shown that SVM algorithms are faster than their ANN counterparts (up to 50 times) and can be used in real time. This implies that SVM could be used to generate reflex movements, e.g. to counterbalance large and abrupt disturbances (shoulder shove, tripping over an obstacle, etc.) for real humanoids performing the same motion. Upon completed training, and based on the detected type of disturbance and the desired trajectory (location) of the ZMP, a humanoid could choose the most suitable compensatory movement. This will be one of our future research directions.
In this paper, the proposed approach was illustrated by just a single movement. The authors believe that such an approach allows the system to train well and to maintain dynamic balance regardless of the generated movement (walking, stair climbing, turning aside, etc.).
† It should be noted that the motion of humanoid robots does not always require exact following of a trajectory. It is clear that the effect that a sway by hand or leg has on the ZMP location is far more important than following an exact trajectory.

[Figure 4.8 legend: desired ZMP trajectory, real ZMP trajectory, measured ZMP trajectory]

Figure 2.1. Loss function with ε-insensitivity zone

Although primarily developed for classification problems, the SVM method is successfully used in regression problems, i.e. for function approximation. Generally, regression problems are approached in the following way. Regression implies finding input-output relationships from a training data set.

Figure 2.2. Impact of the ε-insensitivity zone on the quality of the regression function (left diagram: ε = 0.1, right diagram: ε = 0.5)

To deduce the formal, analytical conditions of dynamic balance, let us consider the humanoid during a single-support phase (Fig. 3.1 a), where the foot rests on the support with its entire surface. For the sake of simplicity, the influence of the part of the humanoid which is above the ankle joint of the support foot (point A) shall be replaced by force $F_A$ and moment $M_A$ (Fig. 3.1 b). The weight of the foot itself acts at the centre of gravity (point G). At point P, the support surface reaction force acts upon the foot, which maintains the balance of the entire mechanism. Force $R$ and the reaction moment of the support surface $M$ can each be broken down into three components, $(R_X, R_Y, R_Z)$ and $(M_X, M_Y, M_Z)$. The vertical reaction force $R_Z$ represents the reaction of the support surface which counterbalances the vertical component of force $F_A$ and the foot weight. The horizontal components of the support surface reaction force $(R_X, R_Y)$ counterbalance the horizontal component of force $F_A$. Under the assumption of a sufficiently large friction coefficient between the foot and the ground, the vertical component of the reaction moment $M_Z$, caused by the vertical component of moment $M_A$ and the vertical component of the moment at point P due to force $F_A$, is counterbalanced by friction and cannot be the cause of any movement. The horizontal components of moment $M_A$ and the moment caused by forces $F_A$ and G are counterbalanced by the location change of the acting point of the reaction force $R_Z$ within the support surface. This is shown in Fig. 3.1 where, for the sake of simplicity, a y-z plane case is shown. Moment $M_{AX}$ is counterbalanced by the change in the location of the acting point of the ground reaction force $R_Z$. The magnitude of force $R_Z$ is defined by the equilibrium of the vertical components of all forces acting upon the foot. It is important to note that, as long as the acting point of the reaction force lies within the surface area which is in contact with the foot, the change of the moment acting at the joint shall be counterbalanced by the shift of the acting point of the reaction force. Therefore, at point P there are no horizontal components $M_X$ and $M_Y$.

Approximation of motion
The recorded motion and the measured ZMP location are shown in Fig. 4.2, where marker locations are represented by circles. It should be noted that Fig. 4.2 depicts a visualization of the recorded movement and illustrates the measured location of the ZMP. It is obvious that the human very skillfully maintains the ZMP within the support surface.

Figure 4.1. Mechanical structure of the model of a humanoid robot with 62 degrees of freedom

Figure 4.3. Stick diagram and locations of the markers on a human (left); ZMP locations (right); the data were approximated by cubic splines with smoothing parameter of 0.9

Comparison of the cases shown in Fig. 4.5 and Fig. 4.6 reveals the loss of the required form of movement.

Figure 4.7. ZMP locations for the four approximated movements which were used for the training set

Fig. 4.8 shows the result of this test. The light grey thick line indicates the desired ZMP trajectory, while the thin black line indicates the real ZMP trajectory. To allow comparison, the measured ZMP location is also shown for the human-made movement.

Table 2.1. Steps in creating the SVM

SVM training algorithms perform very well with medium-size data sets. However, when the number of data pairs increases ($l > 2000$), the quadratic optimization problem becomes extremely complex.

Table 4.1. Influence of the ε-insensitivity zone and the penalty coefficient on deviation of internal co-ordinates and the ZMP