Abstract
Path planning remains challenging for Unmanned Aerial Vehicles (UAVs) operating in dynamic environments with potential threats. In this paper, we propose a Deep Reinforcement Learning (DRL) approach to UAV path planning based on global situation information. The simulation environment is provided by the STAGE Scenario software, in which a situation assessment model is developed that accounts for the UAV's survival probability under enemy radar detection and missile attack. We employ the dueling double deep Q-network (D3QN) algorithm, which takes a set of situation maps as input and approximates the Q-values of all candidate actions. In addition, an ε-greedy strategy combined with heuristic search rules is used to select actions. Experiments under both static and dynamic task settings demonstrate the performance of the proposed method.
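To make the two ideas in the abstract concrete, the sketch below illustrates (a) the dueling head that combines a state-value stream and an advantage stream into per-action Q-values, and (b) ε-greedy selection restricted by a heuristic mask standing in for the paper's heuristic search rules. This is a minimal NumPy illustration, not the authors' implementation: the linear streams replace the convolutional network that would process the situation maps, and all names (`dueling_q_values`, `select_action`, `N_ACTIONS`, the weight arrays) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 8  # hypothetical: e.g. eight candidate heading changes

def dueling_q_values(features, w_value, w_adv):
    """Dueling combination: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    v = features @ w_value        # scalar state value V(s)
    a = features @ w_adv          # advantage A(s,a), one entry per action
    return v + a - a.mean()       # subtract mean advantage for identifiability

def select_action(q_values, epsilon, heuristic_mask=None):
    """Epsilon-greedy over the actions permitted by an optional boolean mask
    (True = allowed), a stand-in for heuristic search rules."""
    allowed = (np.flatnonzero(heuristic_mask) if heuristic_mask is not None
               else np.arange(len(q_values)))
    if rng.random() < epsilon:
        return int(rng.choice(allowed))          # explore among allowed actions
    return int(allowed[np.argmax(q_values[allowed])])  # exploit best allowed

# Toy usage: a flattened "situation map" feature vector
features = rng.standard_normal(16)
w_value = rng.standard_normal(16)               # value-stream weights
w_adv = rng.standard_normal((16, N_ACTIONS))    # advantage-stream weights
q = dueling_q_values(features, w_value, w_adv)
action = select_action(q, epsilon=0.1,
                       heuristic_mask=np.ones(N_ACTIONS, dtype=bool))
```

In D3QN this dueling head is paired with double Q-learning targets during training; the mean-subtraction keeps V and A separately identifiable, since adding a constant to A and subtracting it from V would otherwise leave Q unchanged.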
Cite this article
Yan, C., Xiang, X. & Wang, C. Towards Real-Time Path Planning through Deep Reinforcement Learning for a UAV in Dynamic Environments. J Intell Robot Syst 98, 297–309 (2020). https://doi.org/10.1007/s10846-019-01073-3