
Towards Real-Time Path Planning through Deep Reinforcement Learning for a UAV in Dynamic Environments

Published in: Journal of Intelligent & Robotic Systems

Abstract

Path planning remains a challenge for Unmanned Aerial Vehicles (UAVs) in dynamic environments with potential threats. In this paper, we propose a Deep Reinforcement Learning (DRL) approach for UAV path planning based on global situation information. We use the STAGE Scenario software to provide the simulation environment, in which a situation assessment model is developed that accounts for the UAV's survival probability under enemy radar detection and missile attack. We employ the dueling double deep Q-network (D3QN) algorithm, which takes a set of situation maps as input and approximates the Q-values of all candidate actions. In addition, the ε-greedy strategy is combined with heuristic search rules to select actions. We demonstrate the performance of the proposed method under both static and dynamic task settings.
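
Since this preview exposes only the abstract, the sketch below is a rough illustration, not the authors' code, of how the components named there typically fit together: a dueling double deep Q-network (D3QN) that maps stacked situation maps to Q-values, a double-Q learning target, and an ε-greedy selector that can defer to a heuristic rule. The 84×84 map size, four input channels, eight candidate actions, and every function name here are assumptions made for illustration.

```python
# Minimal D3QN sketch (illustrative assumptions, not the paper's implementation).
import random
import torch
import torch.nn as nn

class D3QN(nn.Module):
    def __init__(self, in_channels: int = 4, n_actions: int = 8):
        super().__init__()
        # Convolutional encoder over the stacked situation maps.
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # Infer the flattened feature size with a dummy forward pass.
        feat_dim = self.features(torch.zeros(1, in_channels, 84, 84)).shape[1]
        # Dueling heads: state value V(s) and action advantages A(s, a).
        self.value = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))
        self.advantage = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, n_actions))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)
        v, a = self.value(h), self.advantage(h)
        # Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a).
        return v + a - a.mean(dim=1, keepdim=True)

def double_q_target(online: D3QN, target: D3QN, reward, next_state, done,
                    gamma: float = 0.99):
    """Double-DQN target: the online net selects the next action,
    the target net evaluates it (reduces overestimation bias)."""
    with torch.no_grad():
        best = online(next_state).argmax(dim=1, keepdim=True)
        q_next = target(next_state).gather(1, best).squeeze(1)
        return reward + gamma * (1.0 - done) * q_next

def select_action(net: D3QN, state, epsilon: float, heuristic_action=None):
    """Epsilon-greedy exploration; with probability epsilon fall back to a
    heuristic action (e.g. one steering toward the goal) rather than a
    purely uniform random one."""
    if random.random() < epsilon:
        if heuristic_action is not None:
            return heuristic_action
        return random.randrange(net(state).shape[1])
    with torch.no_grad():
        return int(net(state).argmax(dim=1).item())
```

In a training loop, `double_q_target` would supply the regression target for the online network's Q(s, a), with the target network's weights periodically copied from the online network; how the paper actually schedules updates or encodes the heuristic rules is not visible in this preview.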



Author information

Corresponding author

Correspondence to Xiaojia Xiang.


Cite this article

Yan, C., Xiang, X. & Wang, C. Towards Real-Time Path Planning through Deep Reinforcement Learning for a UAV in Dynamic Environments. J Intell Robot Syst 98, 297–309 (2020). https://doi.org/10.1007/s10846-019-01073-3

