Abstract
Designing optimal controllers remains a challenge for modern artificial intelligence systems. Prior research has explored reinforcement learning (RL) algorithms for benchmarking the cart-pole control problem. However, cognitive decision-making models, and their ensembles with RL techniques, remain under-investigated in the context of such dynamical control tasks. The primary objective of this paper is to implement a Deep Q-Network (DQN), an Instance-Based Learning (IBL) model, and an ensemble of DQN and IBL for the cart-pole environment, and to compare these models' ability to match human choices. Forty-two human participants were recruited to play the cart-pole game for ten training trials followed by a test trial, and their experience, comprising the situations encountered, the decisions taken, and the corresponding rewards earned, was recorded. The human experiences collected from gameplay were used to initialize the memory (buffer) of both algorithms, DQN and IBL, rather than having them learn from scratch through environmental interaction. The results indicated that the IBL algorithm initialized with human experience could serve as an alternative to the DQN initialized with human experience. It was also observed that the ensemble model accounted for human choices more accurately than the DQN and IBL models alone.
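The abstract's key methodological step, seeding each model's memory with recorded human experiences instead of starting from an empty buffer, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the buffer capacity, and the toy transitions are all hypothetical, and the state tuple follows the standard cart-pole observation (cart position, cart velocity, pole angle, pole angular velocity).

```python
from collections import deque
import random

def seed_replay_buffer(human_experiences, capacity=10000):
    """Initialize a replay buffer from recorded human transitions
    (state, action, reward, next_state, done) before any training."""
    buffer = deque(maxlen=capacity)
    for transition in human_experiences:
        buffer.append(transition)
    return buffer

def sample_batch(buffer, batch_size):
    """Uniform sampling, as in standard DQN experience replay."""
    return random.sample(list(buffer), min(batch_size, len(buffer)))

# Hypothetical human transitions recorded during gameplay:
# action 1 = push cart right, action 0 = push cart left.
human_experiences = [
    ((0.00, 0.10, 0.02, -0.10), 1, 1.0, (0.01, 0.20, 0.01, -0.05), False),
    ((0.01, 0.20, 0.01, -0.05), 0, 1.0, (0.02, 0.10, 0.00, 0.00), False),
]

buffer = seed_replay_buffer(human_experiences)
batch = sample_batch(buffer, batch_size=2)
```

A DQN would then draw its first minibatches from this pre-filled buffer, and an IBL model would analogously treat each recorded (situation, decision, outcome) triple as a stored instance in its memory.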
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Gupta, A., Dabas, M., Uttrani, S., Sharma, S., Dutt, V. (2023). Modeling Human Actions in the Cart-Pole Game Using Cognitive and Deep Reinforcement Learning Approach. In: Thomson, R., Al-khateeb, S., Burger, A., Park, P., A. Pyke, A. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2023. Lecture Notes in Computer Science, vol 14161. Springer, Cham. https://doi.org/10.1007/978-3-031-43129-6_19
Print ISBN: 978-3-031-43128-9
Online ISBN: 978-3-031-43129-6