Abstract
Designing optimal controllers remains a challenge for modern artificial intelligence systems. Prior research has explored reinforcement learning (RL) algorithms for benchmarking the cart-pole control problem. However, cognitive decision-making models, and their ensembles with RL techniques, remain under-investigated in the context of such dynamical control tasks. The primary objective of this paper is to implement a Deep Q-Network (DQN), an Instance-Based Learning (IBL) model, and an ensemble of DQN and IBL for the cart-pole environment, and to compare these models' ability to match human choices. Forty-two human participants were recruited to play the cart-pole game for ten training trials followed by a test trial, and their experience, comprising the situations encountered, the decisions taken, and the corresponding rewards earned, was recorded. The human experiences collected from gameplay were used to initialize the memory (buffer) of both algorithms, DQN and IBL, rather than having them learn from scratch through environmental interaction. The results indicated that the IBL algorithm initialized with human experience could serve as an alternative to the DQN initialized with human experience. It was also observed that the ensemble model accounted for human choices more accurately than the DQN and IBL models alone.
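The abstract's key methodological step, seeding each model's memory with recorded human experiences instead of starting from an empty buffer, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the buffer capacity, and the toy transitions are all hypothetical, and the state tuple follows the standard cart-pole observation (cart position, cart velocity, pole angle, pole angular velocity).

```python
from collections import deque
import random

def seed_replay_buffer(human_experiences, capacity=10000):
    """Initialize a replay buffer from recorded human transitions
    (state, action, reward, next_state, done) before any training."""
    buffer = deque(maxlen=capacity)
    for transition in human_experiences:
        buffer.append(transition)
    return buffer

def sample_batch(buffer, batch_size):
    """Uniform sampling, as in standard DQN experience replay."""
    return random.sample(list(buffer), min(batch_size, len(buffer)))

# Hypothetical human transitions recorded during gameplay:
# action 1 = push cart right, action 0 = push cart left.
human_experiences = [
    ((0.00, 0.10, 0.02, -0.10), 1, 1.0, (0.01, 0.20, 0.01, -0.05), False),
    ((0.01, 0.20, 0.01, -0.05), 0, 1.0, (0.02, 0.10, 0.00, 0.00), False),
]

buffer = seed_replay_buffer(human_experiences)
batch = sample_batch(buffer, batch_size=2)
```

A DQN would then draw its first minibatches from this pre-filled buffer, and an IBL model would analogously treat each recorded (situation, decision, outcome) triple as a stored instance in its memory.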
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Gupta, A., Dabas, M., Uttrani, S., Sharma, S., Dutt, V. (2023). Modeling Human Actions in the Cart-Pole Game Using Cognitive and Deep Reinforcement Learning Approach. In: Thomson, R., Al-khateeb, S., Burger, A., Park, P., A. Pyke, A. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2023. Lecture Notes in Computer Science, vol 14161. Springer, Cham. https://doi.org/10.1007/978-3-031-43129-6_19
Print ISBN: 978-3-031-43128-9
Online ISBN: 978-3-031-43129-6