
A Modified Convergence DDPG Algorithm for Robotic Manipulation

Published in: Neural Processing Letters

Abstract

Robotic arms are now widely used in industry, and reinforcement learning algorithms are frequently employed to control them in complex environments. Deep deterministic policy gradient (DDPG) is a commonly used off-policy, model-free, actor-critic deep reinforcement learning algorithm for continuous action spaces. It has achieved significant results when applied to the control of robotic arms with many degrees of freedom, but it also has limitations: DDPG is prone to instability and divergence in complex tasks because of the high-dimensional continuous action space. In this paper, to increase the reliability and convergence speed of DDPG, a new modified convergence DDPG (MCDDPG) algorithm is presented. By saving and reusing desirable parameters of previous actor and critic networks, the proposed algorithm shows a significant improvement in training time and model stability compared with conventional DDPG. We evaluate our method on the PR2's right arm, a 7-DoF manipulator, and simulations demonstrate that MCDDPG outperforms state-of-the-art algorithms such as DDPG and the normalized advantage function in learning complex robotic tasks.
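The abstract only sketches the core idea of MCDDPG, namely saving and reusing "desirable" actor and critic parameters during training. The code below is a minimal, hypothetical illustration of that idea in PyTorch and is not the authors' implementation: the network sizes, the best-episode-return snapshot criterion, and the restore threshold are assumptions introduced here purely for illustration.

```python
# Minimal sketch (not the authors' code): one way to save and reuse
# "desirable" DDPG actor/critic parameters. Layer widths, the snapshot
# criterion, and the restore rule are illustrative assumptions.
import copy
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),  # actions scaled to [-1, 1]
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),  # Q(s, a)
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

class BestSnapshot:
    """Keep the actor/critic weights with the best episode return seen so far,
    and roll back to them if performance collapses (hypothetical criterion)."""

    def __init__(self, actor, critic):
        self.best_return = float("-inf")
        self.actor_state = copy.deepcopy(actor.state_dict())
        self.critic_state = copy.deepcopy(critic.state_dict())

    def update(self, actor, critic, episode_return):
        # Save the current parameters whenever a new best return is reached.
        if episode_return > self.best_return:
            self.best_return = episode_return
            self.actor_state = copy.deepcopy(actor.state_dict())
            self.critic_state = copy.deepcopy(critic.state_dict())

    def maybe_restore(self, actor, critic, episode_return, tolerance=0.5):
        # Reload the saved parameters if the return falls far below the best
        # one; the 50% tolerance is an arbitrary example value.
        if self.best_return > 0 and episode_return < tolerance * self.best_return:
            actor.load_state_dict(self.actor_state)
            critic.load_state_dict(self.critic_state)
```

In such a setup, one would call `update` at the end of each training episode and `maybe_restore` when the divergence criterion fires; the paper's actual rule for deciding which parameters are "desirable" and when to reuse them may differ from this sketch.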



Funding

This study was not supported by any funding agency.

Author information


Contributions

All authors wrote the main manuscript text and reviewed the manuscript.

Corresponding author

Correspondence to Ghader Karimian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Afzali, S.R., Shoaran, M. & Karimian, G. A Modified Convergence DDPG Algorithm for Robotic Manipulation. Neural Process Lett 55, 11637–11652 (2023). https://doi.org/10.1007/s11063-023-11393-z

