Walking Stability Compensation Strategy of a Small Humanoid Robot Based on the Error of Swing Foot Height and Impact Force

Abstract To reduce the impact force on the swing leg and improve the walking stability of a small humanoid robot, a set of impact dynamics equations is derived from the Lagrange equation of the second kind, and an impact compensation control strategy is designed around a BP network optimized by a particle swarm algorithm. The core of the compensation controller is the replacement of error back propagation with particle swarm optimization. By regulating the knee, hip and ankle joints, the walking process becomes more stable than before. The experimental results show that when the left swing leg lands, the impact force drops by 2 N and 1.5 N at 4.5 s and 10.5 s respectively. The compensation strategy therefore reduces the impact force effectively and improves walking stability.


Introduction
As robots become better at imitating human behaviour, reducing the landing impact of the swing leg during walking has become increasingly important for walking stability. Establishing impact dynamics equations and compensating joint angles during walking in order to improve stability has therefore become an active area of study around the world. Many institutes are working in this area and have achieved considerable results.
Based on the concept of a change in dynamics, Ken'ichiio N proposed a coordinated movement control [1]. With this method, dynamics models can be generated synchronously to maintain balance for smooth walking. Its effectiveness has been validated on QRIO, which was introduced by SONY. Honda created ASIMO, which has an autonomous movement technology [2], and developed the I-WALK movement model, which effectively reduces impact force and regulates the walking process by predicting the centre of gravity. Kenji H of Waseda University changed the foot structure to adapt to walking environments and so reduce the impact force [3][4]; this special foot structure can absorb vibrations when the foot hits the floor. M2, made by MIT, can walk more naturally: a ball screw drives a spring joint so that the torque is monitored at all times [5]. A compensation controller was installed on the THBIP-II made by Tsinghua University [6]; the height error and pace error of the swing leg are its inputs, and the three joint compensation angles are its outputs. The National University of Defense Technology designed four generations of humanoid robots with orthogonal axes [7]. The strategy of the BHR-2, a robot created by the Beijing Institute of Technology, is to use sensors to interact with the environment and to reduce the impact through slow foot regulation. An approach to correcting the errors of the foot sensor system was also proposed [8]. Other researchers use flexible foot structures to reduce impact force. For example, a cushion material was installed in the foot panel to offset impact energy [9][10], and a foot structure that adapts to the walking environment can be designed directly without feedback components [11][12][13].
To sum up, there are mainly three means to compensate for the joint angle. (1) Utilizing passive flexibility to reduce impact force directly. (2) Revising moving parameters according to the environment. (3) Regulating joint angles based on dynamics models.
The above measures can modify walking characteristics to improve walking quality. Measure (1) appears simple but cannot be used in most settings. Measure (2) increases the flexibility of a robot, but gait adjustment alone cannot solve every kind of impact problem. Measure (3) revises walking parameters according to sensor feedback and remains valid in complicated environments. Therefore, building on past achievements, this paper formulates a BP network optimized by Particle Swarm Optimization (PSO-BP) to compensate for impact force and so improve walking stability.

The Robot Prototype
The most notable difference between humanoid robots and other forms of robots is that humanoid robots walk on two feet in a human-like manner. The main structure of this prototype is designed on the basis of the human body, as are the size of each part, the mass distribution and the selection of DOFs.
Based on the human body structure, a 3D model of the prototype was established in SolidWorks. As shown in Figure 1, when the robot is standing upright, the height from head to feet is 375 mm, the width between the left and right shoulders is 289 mm, and the body weighs about 2.2 kg. The lower limb length (from feet to hip) is about 206 mm and the ankle height is about 31 mm. The number and allocation of the DOFs mirror the distribution of human bone and muscle tissue. The DOF distribution is shown in Figure 2. This research focuses on the lower limb mechanism, so the distribution of the lower limb DOFs is emphasized: hip 2 (pitch + roll), knee 1 (pitch), ankle 2 (pitch + roll).
After the sensor tests are finished, the sensors are installed on the lower limbs of the robot: the potentiometer shaft is coupled to the shaft of the servo (steering gear) to measure the real-time rotation angle, and a strain-type pressure sensor is bonded to the foot base plate. The strain-type pressure sensor and the potentiometer can be seen on the right of Figure 3.

Collision Dynamics Equations
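The Lagrange equation of the second kind for a potential (conservative) system is [14]:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} = 0 \qquad (1)$$
where $L = T - V$ is the Lagrange function, $T$ is the total kinetic energy of the system, $V$ is the total potential energy, $\dot{q}_j$ are the generalized velocities and $q_j$ are the generalized coordinates.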
The leg mechanism of the small humanoid robot can be treated as a set of rigid bodies during motion. The position and orientation of the leg model do not change during the walking impact, and neither do its shape and inertia. The impulse $F$ exerted by the ground on the swing foot during the collision has a constant magnitude and direction, so it can be treated as having a potential, described by a function of position relative to a reference point. Equation (3) can therefore be obtained from equation (1), where $V_1$ is the impulse potential.
After the implicit equation has been introduced, the explicit equation must be established to make its parameters clear. To this end, the D-H coordinate system is established as depicted in Figure 4. Based on the robot prototype body and this D-H coordinate system, the coordinate transformation matrices $A_{i+1}$ are created to describe the relationship between adjacent coordinate frames [15].
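As an illustration of how a transformation matrix between adjacent D-H frames can be built and chained, the following is a minimal sketch. It assumes the standard (not modified) D-H convention with parameters theta, d, a and alpha; the paper's actual D-H table is given in Figure 4 and is not reproduced here, so the function names and the joint angles in the example are hypothetical. Only the link lengths (85 mm and 90 mm) are taken from the text.

```python
import math


def dh_transform(theta, d, a, alpha):
    """Homogeneous transform between adjacent frames (standard D-H, assumed)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(m, n):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# Hypothetical planar two-link chain: lower leg (85 mm) then upper leg (90 mm).
# The joint angles are illustrative only.
shank = dh_transform(math.radians(10.0), 0.0, 85.0, 0.0)
thigh = dh_transform(math.radians(-20.0), 0.0, 90.0, 0.0)
hip_in_ankle_frame = matmul4(shank, thigh)
```

The last column of `hip_in_ankle_frame` holds the hip position expressed in the ankle frame; longer chains are obtained by multiplying further link transforms in order.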
According to the structure of the humanoid robot and the established D-H coordinate system, the coordinate transformation matrices $A_1$ to $A_i$ can be expressed as:
Each rigid bar $i$ has its own centroid, whose position $r_i$ can be described as:
The position of each bar's centroid $r_i$ relative to the origin of the coordinates is then:
The centroid velocity relative to the origin of the coordinates is therefore:
Let $dT_i$ be the kinetic energy of a mass element $dm$ on component $i$ ($i = 1, 2, \ldots, k$); the kinetic energy of the component is:
where Tr denotes the trace of a matrix.
Substituting the velocity $V_i$ into the above expression, rearranging, and introducing $W_i$ to simplify the result gives:
where $W_i$ depends only on the mass distribution of component $i$ and is independent of its position and velocity [16]. The total kinetic energy of the system is therefore:
Thus far, the total kinetic energy of the system has been established. The next step is to introduce the potential of the impulse.
Consider the collision of the humanoid robot prototype during walking, taking component 0 (the support leg) as the reference body. A force of known direction and magnitude acts on component 6, namely the impact on the swinging leg caused by the ground while landing. $P_6$ denotes the impulse potential on component 6, and $r$ denotes the coordinates of the impact point. At the moment the swinging leg lands, the lower limbs of the robot are affected by only this one impulse, so the total impulse potential $V_1$ is equal to $P_6$.
From these formulas, the kinetic energy and the impulse potential of the lower limbs of the small humanoid robot are obtained. Applying them to the collision equation yields its explicit form.
In this equation, the two velocity terms stand for the foot velocity before and after the walking impact. After the walking impact the foot velocity is 0. Substituting into the Lagrange form of the collision equation gives:
With this, the theoretical collision dynamics equations of the humanoid robot have been established. The next section uses these theoretical results in the design of the compensation controller.

Compensation Controller Design
Before designing the compensation controller, the most important step is to calculate the swing foot height in the plane of motion. The lower limb structure is simplified to the model shown in Figure 5, in which the distance from the foot plane to the ground is taken as the swing foot height. In Figure 5, $l_6$ is the ankle height (31 mm), $l_1$ is the lower leg length (85 mm) and $l_2$ is the upper leg length (90 mm). Using the simplified model in Figure 5, formula (18) is established to calculate the swing foot height during walking [17].
h=(l cosq l cos(q q )) cos l -l cos(q (q q q )) cos( )+l cos(q q (q q q )) cos( ) (18)
Substituting the joint angles of each degree of freedom from the planned gait data into formula (18) gives the expected swing foot height trajectory, which serves as the reference to be tracked.
The swing foot height error and the impact force error arise from the interaction of the many servos and from motion errors. In addition, the inevitable discrepancy between the robot body and the simplified model, together with the servos' tracking error, produces a swing foot height error. This paper therefore compensates the relevant joint angles to eliminate the swing foot height error and the impact force error.
The hip, knee and ankle joint angles are adjusted to compensate for the swing foot height error and the impact force error. According to the impact dynamics equations and formula (18), the foot height, the impact force and the servo joint angles are nonlinearly related, so calculating the compensated joint angles directly would require many complex inverse kinematics calculations. The swing leg compensation controller is therefore designed on a BP neural network optimized by PSO, as shown in Figure 6.
The process of impact compensation control for the humanoid robot is as follows:
Step 1: The servo control board outputs the angle $\theta_{swknd}$ to the knee joint, while the real knee joint angle during walking is $\theta_{swkna}$. The error between $\theta_{swknd}$ and $\theta_{swkna}$, $e_{swkn}$, is defined as one parameter of the control rules. The structure of the control rules is described later in the text.
Step 2: $\Delta F$ is the error between the theoretical impact force $F_S$ and the real impact force $F$ during walking. $\Delta F$ is one of the BP controller's input parameters.
Step 3: The real swing foot height $H_1$ is calculated by formula (18), and $H_S$ is the theoretical swing foot height. The error $\Delta H_1$ between them is the other input parameter of the BP controller.
Step 4: The knee adjustment increment $\Delta\theta_{rkn}$ is calculated by the BP controller from the inputs $\Delta F$ and $\Delta H_1$, and is defined as another parameter of the control rules.
Step 5: Once the inputs of the "control rules" are set, the actual compensated knee joint angle $\theta_{rkn}$ is calculated through the control law. The "control rules" are introduced in Table 1.
Step 6: Similarly, $e_{swhip}$ is the error between the angle $\theta_{swhipd}$ output by the servo control board to the hip joint and the real hip joint angle $\theta_{swhip}$; it is one of the control rule parameters.
Step 7: $\Delta F$ is the difference between the theoretical impact force $F_S$ and the real impact force $F$ during walking. $\Delta F$ is one of the BP controller's input parameters.
Step 8: After compensation of the knee joint, the real swing foot height changes to $H_2$, while the theoretical value is still $H_S$. The resulting error $\Delta H_2$ is the other input parameter of the BP controller.
Step 9: The hip adjustment increment $\Delta\theta_{rhip}$ is calculated by a BP controller from the inputs $\Delta F$ and $\Delta H_2$, and is defined as another parameter of the control rules.
Step 10: Once the inputs of the "control rules" are set, the actual compensated hip joint angle $\theta_{thip}$ is calculated using the control law.
Step 11: Because the robot's upper body remains upright, the ankle angle is determined by the compensated knee and hip angles through a constraint function, so the compensated ankle angle is calculated directly.
Step 12: The compensated knee, hip and ankle angles are sent to the servo control board as inputs, which completes the compensation process.
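The order of operations in Steps 1 to 12 can be sketched as a single control cycle. This is a simplified illustration under stated assumptions, not the paper's controller: the control rules of Table 1 are replaced by a plain additive rule (desired angle plus increment), the angle-tracking errors $e_{swkn}$ and $e_{swhip}$ are omitted, the height error is supplied by a hypothetical callable standing in for formula (18), and the ankle relation is assumed to be the planar constraint ankle = -(knee + hip). All function and parameter names are hypothetical.

```python
from typing import Callable, Tuple

# A controller maps (force error, height error) to a joint-angle increment.
Controller = Callable[[float, float], float]


def compensate_swing_leg(
    knee_desired: float,
    hip_desired: float,
    force_error: float,
    height_error: Callable[[float, float], float],
    knee_controller: Controller,
    hip_controller: Controller,
) -> Tuple[float, float, float]:
    """Return compensated (knee, hip, ankle) angles for one control cycle."""
    # Steps 2-5: knee increment from the force error and the first height error.
    dh1 = height_error(knee_desired, hip_desired)
    knee = knee_desired + knee_controller(force_error, dh1)  # assumed additive rule

    # Steps 7-10: the height error is re-evaluated after knee compensation.
    dh2 = height_error(knee, hip_desired)
    hip = hip_desired + hip_controller(force_error, dh2)  # assumed additive rule

    # Step 11: assumed planar constraint for an upright upper body.
    ankle = -(knee + hip)
    return knee, hip, ankle


# Illustrative call with stub controllers and a constant height error.
knee, hip, ankle = compensate_swing_leg(
    knee_desired=10.0,
    hip_desired=-5.0,
    force_error=2.0,
    height_error=lambda k, h: 1.0,
    knee_controller=lambda d_f, d_h: 0.1 * d_f + d_h,
    hip_controller=lambda d_f, d_h: -0.5 * d_h,
)
```

The knee is compensated first and the hip second because the hip controller needs the height error that remains after the knee has moved, which is the dependency between Steps 5 and 8.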
The paper will present the control rules as follows.
From the controller process described in the above steps, the servo control board output angle $\theta_{swknd}$ and the actual compensated joint angle $\theta_{rkn}$ take the following 6 forms, and each of the eight forms of "control rules" has a corresponding actual compensation angle. Because the hip control scheme and the knee control scheme are similar, the control rules are given only for knee joint compensation. The algorithm of the BP controller is shown in Figure 7.
Based on an analysis of the traditional BP neural network and the advantages of the PSO algorithm, a PSO-BP network is designed, in which the PSO algorithm is used to optimize the parameters of the BP network. With the PSO-BP structure established, the relevant parameters are designed as follows.
First of all, the number of hidden layers is chosen. It has been proved that a network with enough hidden layer neurons can approximate any continuous function with arbitrary precision. Since the BP network here only has to fit one output from two input parameters, a single hidden layer is used in this paper.
Secondly, the number of hidden nodes is chosen. Because network designs differ between fields, the optimal number of hidden nodes is usually determined experimentally. The Hecht-Nielsen rule $h = 2n + 1$ is used as the starting point in this paper (where $h$ is the number of hidden layer nodes and $n$ is the input dimension); the number of hidden nodes is then increased and the training and test results of the different schemes are compared.
Thirdly, the transfer functions are chosen [18]. Because the final compensation angle requires nonlinear fitting, Tan-Sigmoid functions are used in the hidden layer, whereas linear functions (Purelin) are used in the output layer.
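The forward pass of such a network, with two inputs, one hidden layer using a tan-sigmoid activation and a linear output, can be sketched as follows. The tan-sigmoid function is mathematically the hyperbolic tangent, so `math.tanh` is used. The weights in the example are arbitrary placeholders, not trained values from the paper, and the function name is hypothetical.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: tanh hidden layer followed by a linear output neuron.

    x        : list of inputs, e.g. [force error, height error]
    w_hidden : one weight row per hidden node, each of len(x)
    b_hidden : one threshold (bias) per hidden node
    w_out    : one weight per hidden node for the single output
    b_out    : output threshold (bias)
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Placeholder 2-5-1 network; the numbers are illustrative only.
w_hidden = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6], [0.7, -0.8], [0.9, 1.0]]
b_hidden = [0.0, 0.1, -0.1, 0.2, -0.2]
w_out = [0.5, -0.5, 0.25, -0.25, 0.1]
b_out = 0.05
angle_increment = forward([2.0, 1.0], w_hidden, b_hidden, w_out, b_out)
```

Because every hidden activation lies strictly between -1 and 1, the output is bounded by the sum of the absolute output weights plus the absolute output bias, while the linear output layer lets the network produce angle increments of either sign.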
Finally, after the forward propagation of the data, the error back propagation stage begins, in which the network weights and thresholds are optimized from their initial values. The PSO algorithm is used in place of the traditional BP algorithm to optimize the weights and thresholds of the neural network.
In this way, the performance defects of the traditional BP algorithm (such as easily falling into local minima and over-fitting) can be avoided.
From the above analysis, a single hidden layer with 5 nodes is taken as the preliminary configuration of the BP network in this paper, and the experiments show that the BP neural network generalizes better with this configuration.
The particle swarm parameters are set as follows. Firstly, the swarm population size is chosen. Because the problem studied here is of a general nature and does not belong to a particularly difficult or special type, the swarm population size is set to 10.
Secondly, the particle length follows from the network structure. With the BP network fixed as a single hidden layer with five nodes, the particle length is $N = p \cdot a + a \cdot o + a + o = 21$, where $p$, $a$ and $o$ are the numbers of input, hidden and output nodes (2, 5 and 1 here), so each particle is 21-dimensional.
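The count of 21 can be checked, and a particle mapped back to network parameters, with a short sketch. The ordering of weights and thresholds inside the particle (hidden weights, hidden thresholds, output weights, output thresholds) is an assumption made for illustration; the paper does not state the layout, and the function names are hypothetical.

```python
def particle_length(n_in, n_hidden, n_out):
    """Number of weights and thresholds in a single-hidden-layer network."""
    return n_in * n_hidden + n_hidden * n_out + n_hidden + n_out


def decode_particle(particle, n_in=2, n_hidden=5, n_out=1):
    """Split a flat particle into (w_hidden, b_hidden, w_out, b_out).

    The layout is assumed: hidden weights first, then hidden thresholds,
    then output weights, then output thresholds.
    """
    if len(particle) != particle_length(n_in, n_hidden, n_out):
        raise ValueError("particle has the wrong length for this network")
    values = iter(particle)
    w_hidden = [[next(values) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [next(values) for _ in range(n_hidden)]
    w_out = [[next(values) for _ in range(n_hidden)] for _ in range(n_out)]
    b_out = [next(values) for _ in range(n_out)]
    return w_hidden, b_hidden, w_out, b_out


# A 2-5-1 network needs 2*5 + 5*1 + 5 + 1 parameters.
n_params = particle_length(2, 5, 1)
w_hidden, b_hidden, w_out, b_out = decode_particle(list(range(n_params)))
```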
Next, the maximum velocity $v_{max}$ is chosen; it determines the resolution with which the region between the current position and the best position is searched. If $v_{max}$ is too large, a particle may fly past the minimum point; if $v_{max}$ is too small, the particle may become stuck in a local extremum. Based on experience, $v_{max}$ is set to 2.
Then the inertia factor $w$ and the learning factors $c_1$ and $c_2$ are chosen. The inertia factor $w$ governs the ability to extend the search space, while $c_1$ and $c_2$ are the acceleration weights with which particles fly towards $P_{best}$ and $G_{best}$. Here $w = 4.1$ and $c_1 = c_2 = 2$.
Finally, the termination conditions are set. The algorithm stops when the error falls below 0.005 or when the maximum number of iterations, 5000, is reached.
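A minimal global-best PSO loop with these settings (population of 10, velocity clamped to $v_{max} = 2$, $c_1 = c_2 = 2$, stop at an error below 0.005 or after 5000 iterations) might look like the sketch below. It is a generic textbook formulation rather than the authors' Matlab program. Two points are assumptions: the inertia weight defaults to 0.5 here, which is not the value listed in the text, and a simple sphere function stands in for the network's training error. In the controller setting, `loss` would decode the 21-dimensional particle into network weights and return the fitting error.

```python
import random


def pso(loss, dim, n_particles=10, v_max=2.0, w=0.5, c1=2.0, c2=2.0,
        tol=0.005, max_iter=5000, init_range=1.0, seed=0):
    """Global-best PSO; returns (best position, best error, iterations used)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-init_range, init_range) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_err = [loss(p) for p in pos]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = pbest[g][:], pbest_err[g]

    iters = 0
    # Stop when the error is small enough OR the iteration budget is spent.
    while gbest_err >= tol and iters < max_iter:
        for i in range(n_particles):
            for d in range(dim):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp to +/- v_max
                pos[i][d] += vel[i][d]
            err = loss(pos[i])
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
        iters += 1
    return gbest, gbest_err, iters


def sphere(p):
    """Stand-in objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(x * x for x in p)


best, best_err, used = pso(sphere, dim=3)
```

Clamping the velocity implements the trade-off described for $v_{max}$: a large bound lets particles overshoot a minimum, while a small bound slows escape from a local extremum.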
Thus far, the parameters of the BP controller have been set in Matlab, and the program is built after its performance has been tested and analysed.
An effective program is established on the basis of the experimental data. After the network has been trained, the performance curve shown in Figure 8 is obtained. Figure 8 shows that the error has fallen to the expected precision after 307 steps, with a training time of 4 seconds, so the BP controller is efficient and feasible. To characterise the training quantitatively, ten training runs with five hidden layer neurons are taken as a benchmark; the specific parameters are listed in Table 2. Table 2 shows that, with a learning rate of 0.01, the training time varies from 5 s to 12 s and the number of training iterations varies from 485 to 1109, so the fluctuation is obvious. Calculating the squared error between the test results and the expected output shows that the mean square errors are all smaller than 0.1: the minimum mean square error is only 0.0367 and the maximum is 0.0772. This suggests that the network structure is well designed and that the expected results can be obtained.

Walking Testing
In the experimental tests, the control strategy is embedded in the humanoid robot during walking, and the curves of the three joint angles against time are recorded before and after compensation. Figures 9 to 11 show the knee, hip and ankle joint angles of the swinging left leg against time, before and after compensation. Figures 12 to 14 show the corresponding curves of the three joints when the right leg swings during walking. The instants at which the swinging left leg lands (4.5 s and 10.5 s) are examined both before and after compensation. Joint fluctuations are obvious before compensation; for example, at the landing collision of the second step the knee joint angle changes by about 2° to 3°. After compensation, the change in the knee joint angle caused by the impact is clearly reduced, by about 1°. Similarly, the hip and ankle joint angles vary only slightly when the leg comes into contact with the ground after compensation. This indicates that the influence of the impact is smaller when the left leg comes into contact with the ground.
In the same gait, the impact force affects walking stability for the right leg only when it comes into contact with the ground at 7.5 s, so the three joint angles must be compensated at 7.5 s in particular. As with the left swinging leg, Figures 12 to 14 show clear signs of compensation for the swinging right leg at 7.5 s, and the joint angles also change differently after the landing collision.
By compensating the six joints of the robot's lower limbs, fluctuations of the joint angles are effectively reduced, so that the servos rotate more smoothly and do not change abruptly when a leg undergoes a landing collision.
During the compensated walking gait, the impact force on the feet is recorded, as shown in Figures 15 and 16. The recorded collision impact of the swinging feet shows the change in impact force before and after compensation. At 4.5 s and 10.5 s, when the left swinging leg lands, the collision impact rises abruptly from zero to its peak both before and after compensation. In the first step, the peak impact force decreases from about 17 N before compensation to about 15.5 N after compensation and, after a short adjustment, settles at 15 N. This indicates that the walking robot experiences a smaller collision impact when landing after compensation, and that its mechanical body is less affected than before. Likewise, the collision impact of the second step is even larger before compensation, at about 17.5 N; after compensation the peak is limited to about 16 N.
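The size of these reductions can be made explicit with a short calculation. The force values are the approximate readings quoted above; the helper name is hypothetical, and the percentages are derived here rather than reported in the paper.

```python
def peak_reduction(before_n, after_n):
    """Return (absolute drop in N, relative drop in percent)."""
    drop = before_n - after_n
    return drop, 100.0 * drop / before_n


# Approximate peak forces quoted in the text, in newtons.
first_step_peak = peak_reduction(17.0, 15.5)      # landing at 4.5 s
first_step_settled = peak_reduction(17.0, 15.0)   # after the short adjustment
second_step_peak = peak_reduction(17.5, 16.0)     # landing at 10.5 s
```

On these readings the peak force falls by 1.5 N at both landings (roughly 9 % of the uncompensated peak), and by 2 N in the first step once the force has settled at 15 N.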

Conclusions
The paper introduces research into a compensation control for the impact force of a small humanoid robot