
DRL-based Path Planner and its Application in Real Quadrotor with LIDAR

  • Regular paper
  • Published:
Journal of Intelligent & Robotic Systems

Abstract

The distribution-mismatch problem has long hindered the deployment of deep reinforcement learning (DRL) algorithms in robotics. This paper proposes a novel DRL-based path planner and a corresponding training method that achieve safe obstacle avoidance on real quadrotors. To this end, we design a randomized environment generation module to bridge the gap between simulation and reality. The map information is then parameterized so that the test data are statistically meaningful. In addition, an instruction filter is proposed to smooth the output of the policy network during testing; its improvement of obstacle-avoidance performance is demonstrated in the experiment section. Finally, real-time flight experiments verify the effectiveness of our algorithm and show that a learning-based path planner can solve practical problems in robotics. Our framework has three advantages: (1) map parameterization, (2) low-cost planning, and (3) real-world validation. The video and code are available at https://github.com/Vinson-sheep/multi_rotor_avoidance_rl.
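The abstract names these components without specifying their implementation on this page, so the two sketches below are illustrative assumptions only. The first assumes the randomized environment generation module behaves like standard domain randomization: each map is drawn from a parameterized distribution (obstacle count, radius, arena size), which is also what would allow a whole family of test maps to be described by the same parameters. All identifiers (MapParams, sample_map) are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class MapParams:
    """Hypothetical parameterization of a training/test map."""
    n_obstacles: int       # number of cylindrical obstacles
    radius_range: tuple    # (min, max) obstacle radius in metres
    arena_size: float      # side length of the square arena in metres

def sample_map(params: MapParams, seed=None):
    """Draw one randomized obstacle layout from the parameter distribution."""
    rng = random.Random(seed)
    return [
        (rng.uniform(0.0, params.arena_size),   # obstacle x
         rng.uniform(0.0, params.arena_size),   # obstacle y
         rng.uniform(*params.radius_range))     # obstacle radius
        for _ in range(params.n_obstacles)
    ]

# One parameter set describes a family of evaluation maps, so results
# can be averaged over many draws from the same distribution.
layout = sample_map(MapParams(n_obstacles=8, radius_range=(0.2, 0.5),
                              arena_size=10.0), seed=0)
```

The instruction filter is likewise only named here; a minimal sketch, assuming it is a first-order low-pass (exponential moving average) filter applied to the velocity commands the policy network emits at test time:

```python
import numpy as np

class InstructionFilter:
    """Assumed first-order low-pass filter over policy actions (illustrative)."""

    def __init__(self, alpha: float = 0.3):
        self.alpha = alpha   # smoothing factor in (0, 1]; smaller = smoother
        self._last = None    # previously issued (filtered) command

    def __call__(self, action) -> np.ndarray:
        action = np.asarray(action, dtype=float)
        if self._last is None:
            self._last = action
        else:
            # Blend the new command with the previous one to suppress
            # jitter before it reaches the flight controller.
            self._last = self.alpha * action + (1.0 - self.alpha) * self._last
        return self._last

# Example: smooth a jittery sequence of 2-D velocity commands.
flt = InstructionFilter(alpha=0.3)
for raw in ([1.0, 0.0], [0.2, 0.9], [1.1, -0.1]):
    print(flt(raw))
```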


Code Availability

The complete simulation data are available from the corresponding author on request.


Funding

This work was supported by the Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515110815).

Author information


Contributions

All authors contributed to the study’s conception and design. Yongsheng Yang, Zhiwei Hou, Hongbo Chen and Peng Lu performed material preparation, data collection and analysis. Yongsheng Yang wrote the first draft of the manuscript and all authors commented on previous versions. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhiwei Hou.

Ethics declarations

Ethics approval

Not applicable. Our manuscript does not report the results of studies involving humans or animals.

Consent to participate

Not applicable. Our manuscript does not report the results of studies involving humans or animals.

Consent for Publication

All authors have approved and consented to publish the manuscript.

Conflict of Interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yang, Y., Hou, Z., Chen, H. et al. DRL-based Path Planner and its Application in Real Quadrotor with LIDAR. J Intell Robot Syst 107, 38 (2023). https://doi.org/10.1007/s10846-023-01819-0


Keywords

Navigation