Abstract
Mapless navigation for Unmanned Ground Vehicles (UGVs) using Deep Reinforcement Learning (DRL) has attracted rapidly growing attention in the robotics and related research communities. Avoiding dynamic obstacles, such as pedestrians and other vehicles, in unstructured environments is one of the key challenges for mapless navigation. This paper proposes a DRL algorithm based on heuristic correction learning for the autonomous navigation of a UGV in a mapless configuration. The input of the reinforcement learning agent combines 24-dimensional lidar readings with the target position and the velocity of the UGV, and the agent outputs the actions of the UGV. The proposed algorithm has been trained and evaluated in both static and dynamic environments. The experimental results show that, while ensuring safety, our algorithm reaches the target in less time and over shorter distances than competing algorithms. In particular, in the dynamic environment the success rate of our algorithm is 2.05 times higher than that of the second-best algorithm, and the trajectory efficiency is improved by \(24\%\). Finally, the proposed algorithm is deployed on a real robot in a real-world environment to validate and evaluate its performance. The results show that the proposed algorithm transfers directly and robustly to real robots.
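As a concrete illustration of the state representation summarised above, the sketch below shows one way such an observation vector could be assembled in Python: 24 lidar range readings concatenated with the target position expressed in the robot frame and the UGV's current velocities. The function name, field layout, and normalisation scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def build_observation(lidar_ranges, robot_pose, target_xy, linear_vel, angular_vel,
                      max_range=10.0):
    """Assemble a state vector of the kind described in the abstract:
    24 lidar readings + relative target position + current UGV velocity.
    All names and the normalisation used here are illustrative assumptions."""
    # Normalise the 24 lidar range readings to [0, 1].
    ranges = np.clip(np.asarray(lidar_ranges, dtype=np.float32), 0.0, max_range) / max_range

    # Express the target in the robot frame as (distance, heading error).
    x, y, yaw = robot_pose
    dx, dy = target_xy[0] - x, target_xy[1] - y
    distance = np.hypot(dx, dy)
    heading_error = np.arctan2(dy, dx) - yaw
    heading_error = np.arctan2(np.sin(heading_error), np.cos(heading_error))  # wrap to [-pi, pi]

    # Concatenate lidar readings, target information and current velocities into one vector.
    return np.concatenate([ranges, [distance, heading_error, linear_vel, angular_vel]]).astype(np.float32)

# Example: 24 simulated range readings, robot at the origin facing +x, target at (3, 4).
obs = build_observation(np.full(24, 5.0), (0.0, 0.0, 0.0), (3.0, 4.0), 0.4, 0.1)
print(obs.shape)  # (28,)
```

A vector of this form would be fed to the agent's policy network, which in turn outputs the UGV's velocity commands; the exact dimensions and scaling used by the authors may differ.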
Code Availability
The simulation data will be available upon request.
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 61703138.
Author information
Contributions
All authors contributed to the study conception and design. Changyun Wei and Yongping Ouyang performed material preparation, data collection and analysis. Changyun Wei and Yongping Ouyang wrote the first draft of the manuscript. Yajun Li and Ze Ji revised the manuscript based on previous versions. All authors read and approved the final manuscript.
Ethics declarations
Ethical approval
Not applicable. Our manuscript does not report results of studies involving humans or animals.
Consent to participate
Not applicable. Our manuscript does not report results of studies involving humans or animals.
Consent for publication
All authors have approved and consented to publish the manuscript.
Conflicts of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wei, C., Li, Y., Ouyang, Y. et al. Deep Reinforcement Learning with Heuristic Corrections for UGV Navigation. J Intell Robot Syst 109, 18 (2023). https://doi.org/10.1007/s10846-023-01950-y