Hybrid autonomous controller for bipedal robot balance with deep reinforcement learning and pattern generators

https://doi.org/10.1016/j.robot.2021.103891
Under a Creative Commons license
open access

Highlights

  • Bipedal robots face the real-world problem of recovering from an abrupt push.

  • Asynchronous combination of pattern generators with deep reinforcement learning for recovering after a push.

  • Offline and onboard training capabilities of the controller.

  • Ability to utilize the pattern generators after disconnecting the deep reinforcement network.

Abstract

Recovering after an abrupt push is essential for bipedal robots in real-world applications, particularly in environments where humans must collaborate closely with robots. There are several balancing algorithms for bipedal robots in the literature; however, most of them rely on either hard coding or power-hungry algorithms. To address this problem, we propose a hybrid autonomous controller that hierarchically combines two separate, efficient systems. The lower-level system is a reliable, high-speed, full-state controller that was hardcoded on a microcontroller to be power efficient. The higher-level system is a low-speed reinforcement learning controller implemented on a low-power onboard computer. While one controller offers speed, the other provides trainability and adaptability. Together they form an efficient controller that does not sacrifice adaptability to new dynamic environments. Additionally, as the higher-level system is trained via deep reinforcement learning, the robot could continue to learn after deployment, which is ideal for real-world applications. The system’s performance is validated with a real robot recovering after a random push in less than 5 s, with minimal steps from its initial position. The training was conducted using simulated data.
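
The two-rate hierarchy described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the gains, the loop rates (a 1 kHz low-level loop and a 20 Hz high-level update), the hand-written stand-in for the trained policy, and the toy double-integrator dynamics are all assumptions chosen only to show how a slow supervisory layer can adjust the setpoint of a fast full-state feedback loop.

```python
# Illustrative two-rate hierarchical control loop.
# Every name and number here is an assumption, not taken from the article.

def low_level_control(state, setpoint, gains):
    """Fast loop: full-state feedback, u = -K (x - x_ref)."""
    return -sum(k * (x - r) for k, x, r in zip(gains, state, setpoint))


def high_level_policy(state):
    """Slow loop: stand-in for a trained reinforcement learning policy.

    A hand-written rule is used here: shift the angle reference against
    the current tilt rate and ask for zero rate.
    """
    _angle, rate = state
    return (-0.1 * rate, 0.0)


def simulate(push_rate=1.0, dt=0.001, steps=5000, high_level_every=50):
    """Simulate recovery from a push on a toy double integrator."""
    state = (0.0, push_rate)   # (tilt angle, tilt rate) just after the push
    gains = (40.0, 10.0)       # hypothetical feedback gains
    setpoint = (0.0, 0.0)
    for i in range(steps):
        # The high-level layer runs once every `high_level_every` ticks.
        if i % high_level_every == 0:
            setpoint = high_level_policy(state)
        # The low-level layer runs on every tick.
        u = low_level_control(state, setpoint, gains)
        angle, rate = state
        rate += u * dt          # toy dynamics: acceleration equals the command
        angle += rate * dt
        state = (angle, rate)
    return state


if __name__ == "__main__":
    print(simulate())
```

In this sketch the low-level loop keeps stabilizing around the last setpoint it received, so it continues to work if the high-level layer stops updating.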

Keywords

Bipedal robot
Pattern generator
Reinforcement learning
Hybrid controller

Christos Kouppas is a Ph.D. student in Computer Science at Loughborough University in the United Kingdom. He obtained a BEng in Mechanical Engineering from the University of Cyprus, graduating first in his class in 2015. He also graduated first in his class from The University of Sheffield with an M.Sc. in Advanced Control & Systems Engineering in 2016. During his master's studies, he was awarded the Laverick Webster Hewitt Prize for outstanding performance and The Eric Rose Prize for the best project.

Mohamad Saada is a KTP researcher at Loughborough University in the United Kingdom. His main research interests include, but are not limited to, autonomous and automatic systems, mobile and stationary robotics, and advanced control algorithms with artificial intelligence.

Qinggang Meng is a Professor in robotics and autonomous systems at Loughborough University, UK. His main research interests and expertise include: cognitive robotics, multi-robot/UAV cooperation, AI, machine learning and computer vision, driverless vehicles, human–robot interaction, and ambient assisted living.

Mark King is a Professor in Sports Biomechanics at Loughborough University, UK, specializing in using subject-specific computer simulation models to understand optimum performance and injury risk in sport. Integral to this work is the role of muscle and technique in optimum performance and how force is transmitted and energy dissipated through the body. He has been at Loughborough since 1990, graduating in Sports Science and Mathematics in 1993 and obtaining his Ph.D. in computer simulation of dynamic jumps in 1998.

Dennis Majoe is a senior scientist in a scientific collaboration with the Ecole Polytechnique de Lausanne, advising on high-fidelity medical wearable devices. He is also CEO of Motion Robotics Ltd, U.K. His main area of research is robotics as applied to assistive living.

The code (and data) in this article has been certified as Reproducible by Code Ocean: (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.