
1 Introduction

Bipedal walking for humanoid robots is one of the most interesting challenges in robotics. In earlier papers [1–3], we investigated the possibility of creating a dynamically stable and energy-efficient gait without an area of support. Here, the absence of an area of support means that the ankle joint can move freely while the foot is on the ground. In the sagittal direction, the robot’s Center of Mass (CoM) falls forward until the foot of the swing leg touches the ground. In the lateral direction, the robot balances above the stance foot in the single support phase and falls towards the new stance foot in the double support phase. This fall towards the new stance foot is stopped by applying a force with the new stance leg. The resulting gait was subsequently evaluated on a real Nao robot. The stability of the gait has been validated on flat ground but not on uneven terrain: because the controller has no feedback, the robot cannot adjust the gait parameters to compensate for an uneven floor. In this paper, we improve the gait’s lateral stability on uneven terrain by introducing a controller with such feedback.

The remainder of this paper is organized as follows. The next section gives a brief overview of existing research on kinematic models for humanoid robots, stability criteria, and approaches to energy-efficient bipedal walking. Section 3 briefly describes the new gait that we developed and presented in [1]. We used the Inverted Pendulum Model (IPM) to investigate the energy consumption in the sagittal plane. Subsequently, we extended the model to the lateral plane and described a gait controller with multiple parameters for a 3D full-body humanoid robot. The controller achieves a stable gait on a physical robot after its parameters are optimized with Policy Gradient Reinforcement Learning (PGRL). Section 4 introduces our work on the neural network controller that enhances the lateral stability. Section 5 concludes the paper with a brief summary of the results and an outline of future research.

2 Related Work

2.1 Movement Models

Humanoid robots have complex bodies with irregular shape and mass distribution. Therefore, it is advantageous to obtain an elemental representation of the robot’s dynamics. Ideal features of a model are simplicity and a representation of the dynamics of the real system that is accurate both conceptually and mathematically. The main approaches used to model the kinematics of humanoid robots are based on the Inverted Pendulum Model (IPM) [4], which simplifies the body of the robot. The IPM represents the whole body of the robot as a point mass located at the center of mass (CoM) of the actual robot. The point mass is linked to the base of the robot by a massless telescopic leg. Restricting the movement of the CoM to a horizontal plane simplifies the motion equation of the IPM. The resulting model is known as the Linear Inverted Pendulum Model (LIPM), which [5] proposed to describe humanoid robot locomotion. The LIPM provides an efficient means of representing the kinematic behavior of the robot and is therefore a popular tool for understanding and manipulating the balance of a humanoid robot. With the LIPM and the zero moment point (ZMP) stability criterion [6], institutes and companies have successfully built biped robots that can walk with various gaits adapted to different walking situations (e.g. [7–10]).
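As an illustration of the model described above, the LIPM with the CoM held at a constant height \(z_c\) obeys \(\ddot{x} = (g / z_c)\,x\), where x is the horizontal offset of the CoM from the support point. The sketch below evaluates the standard closed-form solution of this equation. The function names and all numerical values are assumptions chosen for illustration and are not taken from the cited work.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def lipm_state(x0, v0, t, z_c, g=G):
    """Closed-form solution of the LIPM equation x'' = (g / z_c) * x.

    x0, v0: initial horizontal CoM offset (m) and velocity (m/s)
    t: elapsed time (s)
    z_c: constant CoM height (m)
    """
    tc = math.sqrt(z_c / g)  # time constant of the pendulum
    x = x0 * math.cosh(t / tc) + tc * v0 * math.sinh(t / tc)
    v = (x0 / tc) * math.sinh(t / tc) + v0 * math.cosh(t / tc)
    return x, v


def orbital_energy(x, v, z_c, g=G):
    """Quantity conserved along an LIPM trajectory: v^2/2 - g*x^2/(2*z_c)."""
    return 0.5 * v * v - 0.5 * (g / z_c) * x * x
```

The orbital energy stays constant along a trajectory, which gives a quick sanity check for any numerical implementation of the model.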

2.2 Energy Consumption

However, the movement model is not the only factor to be considered. The energy consumption of a gait is also an important issue, and various approaches have been proposed to reduce it. One of these approaches is passive-dynamic walking, in which the robot’s dynamics are designed so that it can walk down slight slopes without any control input, powered only by gravity. This is explained well in [11]. According to [12], passive-dynamic walkers have three primary flaws: they can only walk down slopes, their gaits are restricted by their dynamics, and they are sensitive to perturbations. Recognizing these limitations, researchers [13] have sought to improve passive-dynamic walkers by adding actuators.

A second approach to energy-efficient bipedal walking is the application of mechanical compliance. In the work of [13] and [14], springs were added across the hip, thigh, knee and ankle simultaneously. [15] exploited parallel knee compliance on the robot ERNIE and discussed how soft and stiff springs affect the energy efficiency at different walking speeds. [16] described the implementation of series-elastic actuation on Spring Flamingo (a planar bipedal walking robot built at MIT) to enable the control of the ground reaction forces during walking.

A third approach to energy-efficient bipedal walking is the design of gaits that minimize the energetic cost of walking. The most common design method is to apply parametric optimization to the parameters that specify the gait of the robot. For example, [17] used parametric optimization to design fourth-degree polynomial functions that give the joint motions over a step as functions of time. In contrast, [18] used cubic splines connected at points uniformly distributed along the motion time to generate complete optimal steps, including a double-support phase.

Parametric optimization methods have also been used to optimize the walking generator on humanoid Nao robots. The method proposed in [19] models omni-directional motion as a combination of a set of periodic signals. The parameters controlling the characteristics of the signals are encoded into genes, and an evolutionary strategy is used to learn an optimal set of parameters, with Nao humanoid robots as the test platform. [20] augmented the 3D inverted pendulum with a spring model and used policy search to optimize the parameters of the walking engine on Nao robots. [21] introduced a two-stage learning algorithm for a Central Pattern Generator (CPG) for the Nao robot’s bipedal walking.

3 Our New Gait

This section briefly describes our new gait presented in [1–3]. We first analyzed the gait without an area of support using an IPM with telescopic legs. We then designed a controller that implements the gait on a real Nao robot.

Fig. 1. The abstract Inverted Pendulum Model

Fig. 2. The CoM lateral movement during double support phase.

3.1 Kinematics Model in Sagittal Direction

The IPM with telescopic legs allows the length of the virtual support leg to vary during a step. We proposed the leg-length policy \(\delta : [-\frac{\pi }{2},\frac{\pi }{2}] \rightarrow [0,1]\), which determines how much the virtual support leg is shortened as a function of the angle between the stance leg and the vertical axis. The shortening of the stance leg is realized by bending the knee joint; see the right side of Fig. 1.
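To make the link between leg shortening and knee bending concrete, the following sketch computes the knee flexion of a two-segment leg from the law of cosines. It assumes that \(\delta \) can be read as the fraction by which the fully stretched leg is shortened; this reading, the function name, and the segment lengths are illustrative assumptions rather than details given in the text.

```python
import math


def knee_angle_for_shortening(delta, thigh=0.100, shank=0.103):
    """Knee flexion (rad) that shortens the hip-ankle distance by `delta`.

    delta: assumed fractional shortening in [0, 1) of the stretched leg
    thigh, shank: segment lengths in metres (placeholder values)
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    full = thigh + shank
    target = (1.0 - delta) * full  # desired hip-ankle distance
    if target < abs(thigh - shank):
        raise ValueError("shortening not reachable with these segments")
    # Law of cosines: target^2 = thigh^2 + shank^2 - 2*thigh*shank*cos(inner)
    cos_inner = (thigh**2 + shank**2 - target**2) / (2.0 * thigh * shank)
    cos_inner = max(-1.0, min(1.0, cos_inner))  # guard against rounding
    inner = math.acos(cos_inner)  # interior knee angle, pi when stretched
    return math.pi - inner  # flexion angle, 0 when stretched
```

Under these assumptions, a larger shortening always requires a larger knee flexion, which is consistent with the knee torque being the dominant energy cost of shortening the leg.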

To identify the leg-length policy that minimizes the energy consumption of a robot, we make use of the fact that the robot has to bend the knee in order to shorten the leg. The knee torque is the main factor determining the energy consumption [1]. Figure 3 shows the optimal leg-length policy \(\delta (\alpha )\) that we identified as a function of the angle \(\alpha \) from the beginning to the end of the step, and Fig. 4 shows its realization using the 5-link model. Detailed information can be found in our previous publication [1].

Since we assumed the absence of an area of support, and to further reduce the total energy cost, we set the stiffness of both ankle joints to almost zero. Thus, the stance leg of the robot can rotate freely around the ankle joint, and the area of support reduces to a point.

Fig. 3. The optimal leg-length policy

Fig. 4. Kinematics of sagittal motion

3.2 Kinematics Model in Lateral Direction

For a simple forward step, it is insufficient to consider only the 2D dynamics in the sagittal direction. To address the lateral stability, we designed a lateral controller, proposed in [3], that regulates the lateral movement of the CoM during the double support phase. We use the upper body tilt to initiate the lateral movement of the CoM towards the swing foot. Next, we use a force generated by the swing leg to stop the movement when the CoM is balanced above the swing foot, which then becomes the new stance foot. The force generated by the swing leg is described by a force policy. To smooth the CoM transition trajectory, we determine the shape of the force policy by means of a quadratic Bezier curve, which introduces the quadratic Bezier point \(\theta _7\), one of the controller parameters described in the next subsection.
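A quadratic Bezier curve is defined by \(B(t) = (1-t)^2 P_0 + 2(1-t)t P_1 + t^2 P_2\) for \(t \in [0,1]\). The sketch below shows how a middle control point such as \(\theta _7\) could shape a scalar force profile. The start and end levels, the sampling, and the function names are hypothetical, since the text does not state the end points of the curve.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Evaluate B(t) = (1-t)^2 p0 + 2(1-t)t p1 + t^2 p2 for t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    u = 1.0 - t
    return u * u * p0 + 2.0 * u * t * p1 + t * t * p2


def force_policy(theta7, start=0.0, end=1.0, samples=11):
    """Sample a scalar force profile whose middle control point is theta7.

    start, end: assumed force levels at the beginning and end of the
    double support phase (hypothetical, not taken from the paper).
    """
    if samples < 2:
        raise ValueError("need at least two samples")
    return [quadratic_bezier(start, theta7, end, i / (samples - 1))
            for i in range(samples)]
```

With the assumed end points, the profile always starts at `start` and ends at `end`; the value of `theta7` only bends the curve in between, which is what makes a single parameter sufficient to tune the shape.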

3.3 Controller Design

We designed a controller that implements this gait on a Nao robot. Because of the differences between the abstract model and the Nao, several parameters of the controller need to be fine-tuned. This subsection presents the parameters of a gait controller that realizes the leg-length policy described in Sect. 3.1 and the parameters that control the lateral movement of the CoM in the double support phase (DSP). We identified 9 parameters that are essential in controlling a dynamic gait:

  • Step Length \((\theta _1)\): Defines the distance that the Nao moves in a single step (sagittal).

  • Step Height \((\theta _2)\): Defines the maximal height of the lifted foot above the ground. A high step height requires the swing leg to move faster and may cause horizontal instability. A low step height increases the possibility of tripping and limits the step length.

  • Knee Bending \((\theta _3)\): Defines the maximal bending of the swing leg at the beginning of the double support phase which determines the value of \(\delta (\alpha _b)\), see Fig. 3. This parameter determines the sagittal velocity and the energy cost.

  • Step Time \((\theta _4)\): Defines how long a single step lasts. This parameter determines the sagittal walking velocity.

  • Stretch Time \((\theta _5)\): Defines how long it takes for the stance leg to stretch from \(\theta _3\) (angle of bent knee) to its full length at the beginning of the single support phase, see Fig. 3.

  • Torso Pitch Inclination \((\theta _6)\): Defines the maximum angle by which the torso leans in the sagittal direction at the beginning of the first step. If positive, it moves the center of mass (CoM) in the sagittal direction. If it is set inappropriately, a fall will occur. In our experiments, the inclination lasts for 200 ms.

  • Quadratic Bezier point \((\theta _7)\): Defines the magnitude of the middle control point of the quadratic Bezier curve, which determines the force policy on the swing leg (introduced in Sect. 3.2).

  • Torso Roll Inclination \((\theta _8)\): Defines the maximum angle by which the torso leans in the lateral direction. If positive, it moves the center of mass (CoM) towards the swing leg in the lateral plane, as discussed in Sect. 3.2.

  • Ratio of single support duration \((\theta _9)\): Defines how long the single support phase lasts in one single step. The single support phase duration equals this parameter times step time \(\theta _4\).

All parameters except \(\theta _1\) (the step length) are optimized in the experiments. We do not optimize the step length because it must vary when the velocity changes. We manually set a different walking velocity v in each experiment and determined the optimal Step Time \(\theta _4\). The corresponding step length is given by \(\theta _1 = v \theta _4\).
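The two timing relations stated above, \(\theta _1 = v \theta _4\) and a single support duration of \(\theta _9 \theta _4\), can be written as a short worked example. The step time and ratio used in the usage note are made-up values for illustration only; the learned values belong to Table 1.

```python
def step_length(velocity, step_time):
    """theta_1 = v * theta_4, in metres for v in m/s and theta_4 in s."""
    if velocity < 0.0 or step_time <= 0.0:
        raise ValueError("velocity must be >= 0 and step_time > 0")
    return velocity * step_time


def single_support_duration(ratio, step_time):
    """Single support duration = theta_9 * theta_4, in seconds."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    return ratio * step_time
```

For instance, at a walking velocity of 0.06 m/s and a hypothetical step time of 0.5 s, `step_length(0.06, 0.5)` gives a step of about 3 cm, and a hypothetical ratio of 0.6 gives a single support phase of about 0.3 s.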

Algorithm for Learning Controller Parameters. We use a policy gradient reinforcement learning method [22] to automatically search the space of possible parameters with the goal of finding a stable walk with low energy cost. To generate a gait that is both energy efficient and stable, we used a fitness function based on the total energy cost and the stability over a certain distance of forward walking. The energy cost determines 30 % of the fitness function value, and the distance that the robot walks without falling determines 70 % [1].
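One possible reading of the 30 % / 70 % weighting is a weighted sum of two normalised scores. The sketch below is only that: the normalisation by a reference energy and a target distance, and all names, are assumptions, and the exact fitness function is the one defined in [1].

```python
def fitness(energy_used, distance_walked, energy_ref, distance_target,
            w_energy=0.3, w_distance=0.7):
    """Weighted fitness in [0, 1]; higher is better.

    energy_ref: assumed reference energy budget used for normalisation
    distance_target: assumed target walking distance
    """
    if energy_ref <= 0.0 or distance_target <= 0.0:
        raise ValueError("reference values must be positive")
    # Less energy than the reference budget yields a higher energy score.
    energy_score = max(0.0, 1.0 - energy_used / energy_ref)
    # Distance covered before falling, capped at the target distance.
    distance_score = min(1.0, max(0.0, distance_walked / distance_target))
    return w_energy * energy_score + w_distance * distance_score
```

With these weights, a policy that completes the walk at moderate energy cost scores higher than a very frugal policy that falls early, which matches the stated priority of stability over energy.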

Learning Optimal Parameters in the Simulator. To generate the optimal gait parameters and validate the gait’s performance, we loaded the controller of our proposed gait, together with an implementation of the policy gradient algorithm, into the Webots simulator. We used a relatively elementary hand-tuned gait as the starting policy for the policy gradient algorithm. Each new policy was evaluated by letting a robot walk a fixed distance of 0.75 m, during which the energy consumption and stability were determined. The policy gradient algorithm converged to the parameter set P shown in Table 1.
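The following is a generic finite-difference policy gradient step of the kind often used for gait tuning: random perturbations of the current parameters are evaluated, and the parameters are moved along the estimated gradient. It is a sketch under assumptions; the algorithm of [22] may differ in its details, and `evaluate` stands in for a simulator rollout that returns a fitness score.

```python
import random


def policy_gradient_step(theta, evaluate, epsilon, eta,
                         n_policies=20, rng=None):
    """Return an updated parameter list after one gradient estimate.

    theta: current parameters; epsilon: per-parameter perturbation sizes
    evaluate: callable mapping a parameter list to a score (higher = better)
    eta: step size along the normalised gradient estimate
    """
    if rng is None:
        rng = random.Random()
    dims = len(theta)
    # Each trial policy perturbs every parameter by -eps, 0, or +eps.
    trials = []
    for _ in range(n_policies):
        signs = [rng.choice((-1, 0, 1)) for _ in range(dims)]
        candidate = [p + s * e for p, s, e in zip(theta, signs, epsilon)]
        trials.append((signs, evaluate(candidate)))

    gradient = []
    for d in range(dims):
        groups = {-1: [], 0: [], 1: []}
        for signs, score in trials:
            groups[signs[d]].append(score)
        means = {s: (sum(v) / len(v) if v else None)
                 for s, v in groups.items()}
        plus, zero, minus = means[1], means[0], means[-1]
        if plus is None or minus is None:
            gradient.append(0.0)  # not enough samples for this dimension
        elif zero is not None and zero > plus and zero > minus:
            gradient.append(0.0)  # the unperturbed value looks best
        else:
            gradient.append(plus - minus)

    norm = sum(g * g for g in gradient) ** 0.5
    if norm == 0.0:
        return list(theta)
    return [p + eta * g / norm for p, g in zip(theta, gradient)]
```

In a setting like the one described above, `evaluate` would run one walk in the simulator and return the fitness of that walk, so each call to `policy_gradient_step` costs `n_policies` rollouts.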

Table 1. Learned parameter set P

The algorithm presented here converges to a local optimum. To investigate whether this result could be a global optimum, we repeated the learning experiment 500 times at the same velocity, each time starting from a randomly generated parameter vector \(x^\pi \). The results of these experiments indicate that the local optimum in Table 1 is probably the global optimum. Therefore, the parameter set P most likely results in the most energy-efficient gait.

The accompanying video material shows the Nao robot walking on flat ground with our proposed gait controller at a speed of 6 cm/s. We also compared the new gait with the standard gait that Aldebaran supplies with the Nao. The energy consumption of the new gait is 41 % lower than that of the Aldebaran gait.

4 Walking on Uneven Floor

The gait without an area of support that we proposed has been validated as dynamically stable on flat ground. However, the controller cannot compensate for external disturbances. Because the ankle stiffness is set to almost zero, any disturbance, such as a push or a step on uneven terrain, may jeopardize the robot’s balance. To enhance the walking stability, the gait should adapt to unknown disturbances. For example, if the robot is standing on a slope or stepping on a bump in the floor, the feet are not at the same height in the lateral plane, which may cause the CoM to undershoot or overshoot the balance position when the stance leg is switched. To make the problem tractable, we make a simplifying assumption: since stepping on uneven terrain places the robot’s two feet at different heights, walking on an uneven floor can be viewed as walking on a slope in the lateral direction. The gait controller should adapt the gait parameters to compensate for this lateral slope.
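The simplifying assumption above can be illustrated with elementary trigonometry: a height difference between the two feet corresponds to a lateral slope whose angle depends on the horizontal distance between the feet. The function name and the foot separation used in any call are assumptions for illustration.

```python
import math


def equivalent_lateral_slope(height_diff, foot_separation):
    """Slope angle (rad) equivalent to a height difference between the feet.

    height_diff: height of one foot above the other (m)
    foot_separation: horizontal lateral distance between the feet (m),
        assumed to be known from the robot's kinematics
    """
    if foot_separation <= 0.0:
        raise ValueError("foot_separation must be positive")
    return math.atan2(height_diff, foot_separation)
```

Converting the result with `math.degrees` gives the slope in degrees, which is convenient when slope angles are reported in both units.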

Fig. 5. Backpropagation neural network

Fig. 6. Experiment scenario: the Nao robot stands on a slope of angle p

4.1 Controller Design

As a first step in designing a controller that can handle disturbances influencing the lateral stability, we determined the optimal control parameters \(\theta _2\) to \(\theta _9\) when walking on a certain slope in the lateral direction. Since the left and right feet are at different heights, the control parameters for the left and right legs might differ. Therefore, \(\theta _3\) to \(\theta _9\) are split into parameters for the left leg \(\theta ^L_i\) and the right leg \(\theta ^R_i\). Next, we addressed how to adapt the control parameters.

The robot does not know that it is walking on a slope, nor is it aware of other disturbances. The only information it has available is (1) the angular speed \(\dot{\beta }\) of the rotation of the CoM around the ankle of the stance leg; (2) the lateral acceleration measured by the Inertial Measurement Unit (IMU), which approximates the angular acceleration \(\ddot{\beta }\); and (3) the angle \(\beta '\) of the CoM w.r.t. the swing foot. Figure 2 shows the three parameters (\(\beta '\),\(\dot{\beta }\),\(\ddot{\beta }\)). We chose these three parameters as inputs for the controller that adapts the gait parameters.

We designed a series of experiments in the Webots simulator to obtain the optimal control parameters and the input vector (\(\beta '\),\(\dot{\beta }\),\(\ddot{\beta }\)) corresponding to certain slopes. In the experiments, the policy gradient reinforcement learning method of Subsect. 3.3 was used to find control parameters that generate a stable walking gait on different slopes. The robot was placed on slopes whose tilt angle varied from 0.00 to 0.139 rad in the robot’s lateral plane (see Fig. 6). We kept the robot walking on the slope and ran the policy search to obtain the corresponding controller parameters that ensure the robot’s stability. Each new policy was evaluated by letting a Nao robot move for 5 seconds. The fitness function assigns a policy a high score if the robot remains stable and a penalized score otherwise. Once the policy search had converged and the robot’s movement had become stable, the control parameters were recorded. Table 2 shows the results of those experiments. Next, we sampled the data from the IMU and joint sensors every 10 ms from the beginning to the end of the DSP in order to determine the \(({\beta '},\dot{\beta },\ddot{\beta })\) values during the DSP for each lateral slope.

Table 2. Learned control parameters adaptive to different slope angles

Fig. 7. Sampled stiffness over angle \(\beta '\)

A controller that uses \((\beta ',\dot{\beta },\ddot{\beta })\) as inputs cannot adapt \(\theta _7\). At the end of the DSP, \((\beta ',\dot{\beta },\ddot{\beta })\) must become equal to (0, 0, 0) for every \(\theta _7\) value, so in the neighborhood of (0, 0, 0) the correct \(\theta _7\) value is not well defined. Therefore, the stiffness determined by \(\theta _7\) is used instead of \(\theta _7\) itself. The stiffness values are determined by sampling the Bezier curve determined by \(\theta _7\) for different \(\beta '\) values; see Fig. 7.

From Table 2, we know that the parameters \(\theta _3\), \(\theta _5\) and \(\theta _8\) are also influenced by the slope angle. The parameter \(\theta _3\) depends solely on the relative elevation between the stance and swing legs, and its value can be determined at the end of the DSP, where \(\beta '\) should become zero. Since \(\theta _3\) encodes the information about the slope angle, its value can be used to set the values of the other two parameters, \(\theta _5\) and \(\theta _8\).

Fig. 8. Architecture for the identification process of the lateral controller parameters in double support phase

Fig. 9. The roll angle of both legs under different slopes of 0\(^\circ \), 4\(^\circ \), 7\(^\circ \).

4.2 Controller Implementation and Evaluation

We implemented two neural networks to control and improve the robot’s lateral stability by adapting the controller parameters to an unknown slope. The backpropagation method was applied to train multi-layer feedforward networks; see Fig. 5. With the trained networks, we designed a simple neural network controller that maintains the robot’s walking stability on uneven terrain in the lateral direction. Figure 8 depicts the general architecture of the lateral stability controller implemented in this paper. When the robot is walking on an uneven floor and a new DSP begins, the first neural network takes the data retrieved from the joint sensors (\(\beta '\) and \(\dot{\beta }\)) and the IMU (\(\ddot{\beta }\)) as its input vector and outputs the new stiffness values for the swing leg during the DSP. The second network takes the value of \(\theta _3\) at the end of the DSP as input and outputs the values of \(\theta _5\) and \(\theta _8\), which are used by the gait controller. Together with the other, fixed parameters, the gait controller generates updated joint commands to compensate for the uneven terrain. Figure 9 shows the roll angle trajectories of the left and right legs when the robot is walking on slopes of 0.00 rad, 0.07 rad (\(\approx \) 4\(^\circ \)) and 0.12 rad (\(\approx \) 7\(^\circ \)). The figure shows that the right foot is higher than the left one when a slope is present, so the joints of the right leg rotate through smaller angles to let the CoM approach its balance point. Moreover, the time for one step does not change across the different slopes, which means that our proposed gait produces a stable walk on different slopes without loss of walking velocity. The accompanying video material shows the Nao robot walking on uneven terrain with our proposed gait controller in the Webots simulator, demonstrating that our controller can handle the height difference in foot placement and adjust the control parameters to maintain balance.
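As a rough sketch of a feedforward network trained with backpropagation, such as the first network that maps (\(\beta '\),\(\dot{\beta }\),\(\ddot{\beta }\)) to a stiffness value, consider the minimal implementation below. The layer sizes, the tanh activation, the learning rate, and the class and function names are all assumptions; the text does not specify the network architecture.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer feedforward network trained with backpropagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        """Return (hidden activations, linear outputs) for input x."""
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, output

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.05):
        """One gradient step on the squared error; returns that error."""
        hidden, output = self.forward(x)
        err = [o - t for o, t in zip(output, target)]
        # Backpropagate the output error through w2 and the tanh derivative
        # (computed before w2 is modified).
        grad_hidden = [
            sum(err[k] * self.w2[k][j] for k in range(len(err)))
            * (1.0 - hidden[j] ** 2)
            for j in range(len(hidden))
        ]
        for k, e in enumerate(err):
            for j, h in enumerate(hidden):
                self.w2[k][j] -= lr * e * h
            self.b2[k] -= lr * e
        for j, g in enumerate(grad_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * g * xi
            self.b1[j] -= lr * g
        return 0.5 * sum(e * e for e in err)


def mean_error(net, data):
    """Average squared error of `net` over (input, target) pairs."""
    total = 0.0
    for x, y in data:
        total += 0.5 * sum((o - t) ** 2 for o, t in zip(net.predict(x), y))
    return total / len(data)
```

In the setting described above, the training pairs would come from the samples recorded every 10 ms during the DSP, with the sampled stiffness as the target; a second, smaller network of the same form could map \(\theta _3\) to \(\theta _5\) and \(\theta _8\).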

5 Conclusion

In previous work, we presented a new gait for humanoid robots. An implementation of the gait on a Nao robot reduces the energy consumption by 41 % compared to the standard gait of the Nao. An important feature of the new gait is that it does not use an area of support; that is, the robot can rotate freely around the ankle joint while walking. This makes the new gait suited to uneven terrain because the feet can adapt to the slope of the terrain.

The absence of an area of support implies that, in principle, the robot is unstable. In the sagittal plane, the robot falls forwards in each step, and in the lateral plane, the robot balances above the stance foot in the single support phase and falls towards the swing foot in the double support phase. Nevertheless, experiments with a Nao robot on an almost flat floor consisting of wooden planks showed that the gait is stable.

In this paper, we investigated how to improve the lateral stability of the gait when walking on uneven terrain. The most important aspect of walking on uneven terrain is the lateral stability. Since the robot balances above the stance leg in the single support phase while it can turn freely around the ankle joint, a bump or a hole in the walking surface may disrupt the lateral stability. As a result, during the double support phase the robot may over- or undershoot its stable end point, namely, balancing stably above the new stance foot. The paper presents a feedback controller based on a feedforward neural network that adapts the gait parameters in order to ensure lateral stability while walking on uneven terrain.

The feedback controller for the lateral stability also enables the robot to handle, to some degree, slopes in the sagittal plane. In future work, we will extend the feedback controller to specifically address the effects of walking uphill or downhill.