
A Modified Convergence DDPG Algorithm for Robotic Manipulation

Published in: Neural Processing Letters

Abstract

Robotic arms are now widely used in industry, and reinforcement learning algorithms are frequently employed to control them in complex environments. Deep deterministic policy gradient (DDPG) is a commonly used off-policy, model-free, actor-critic deep reinforcement learning algorithm for continuous action spaces. It has achieved significant results when applied to the control of robotic arms with many degrees of freedom, but it also has limitations: DDPG is prone to instability and divergence in complex tasks because of the high-dimensional continuous action space. In this paper, to increase the reliability and convergence speed of DDPG, a new modified convergence DDPG (MCDDPG) algorithm is presented. By saving and reusing desirable parameters of previous actor and critic networks, the proposed algorithm shows a significant improvement in training time and model stability compared with conventional DDPG. We evaluate our method on the PR2's right arm, a 7-DoF manipulator, and simulations demonstrate that MCDDPG outperforms state-of-the-art algorithms such as DDPG and the normalized advantage function in learning complex robotic tasks.
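The abstract only sketches the core idea of MCDDPG, namely saving and reusing "desirable" actor and critic parameters during training. The code below is a minimal, hypothetical illustration of that idea in PyTorch and is not the authors' implementation: the network sizes, the best-episode-return snapshot criterion, and the restore threshold are assumptions introduced here purely for illustration.

```python
# Minimal sketch (not the authors' code): one way to save and reuse
# "desirable" DDPG actor/critic parameters. Layer widths, the snapshot
# criterion, and the restore rule are illustrative assumptions.
import copy
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),  # actions scaled to [-1, 1]
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),  # Q(s, a)
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

class BestSnapshot:
    """Keep the actor/critic weights with the best episode return seen so far,
    and roll back to them if performance collapses (hypothetical criterion)."""

    def __init__(self, actor, critic):
        self.best_return = float("-inf")
        self.actor_state = copy.deepcopy(actor.state_dict())
        self.critic_state = copy.deepcopy(critic.state_dict())

    def update(self, actor, critic, episode_return):
        # Save the current parameters whenever a new best return is reached.
        if episode_return > self.best_return:
            self.best_return = episode_return
            self.actor_state = copy.deepcopy(actor.state_dict())
            self.critic_state = copy.deepcopy(critic.state_dict())

    def maybe_restore(self, actor, critic, episode_return, tolerance=0.5):
        # Reload the saved parameters if the return falls far below the best
        # one; the 50% tolerance is an arbitrary example value.
        if self.best_return > 0 and episode_return < tolerance * self.best_return:
            actor.load_state_dict(self.actor_state)
            critic.load_state_dict(self.critic_state)
```

In such a setup, one would call `update` at the end of each training episode and `maybe_restore` when the divergence criterion fires; the paper's actual rule for deciding which parameters are "desirable" and when to reuse them may differ from this sketch.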



Funding

This study was not supported by any funding agency.

Author information


Contributions

All authors wrote the main manuscript text and reviewed the manuscript.

Corresponding author

Correspondence to Ghader Karimian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Afzali, S.R., Shoaran, M. & Karimian, G. A Modified Convergence DDPG Algorithm for Robotic Manipulation. Neural Process Lett 55, 11637–11652 (2023). https://doi.org/10.1007/s11063-023-11393-z

