Persistent rule-based interactive reinforcement learning

  • S.I.: Human-in-the-loop Machine Learning and its Applications
  • Published in: Neural Computing and Applications

Abstract

Interactive reinforcement learning speeds up the learning process of autonomous agents by including a human trainer who provides extra information to the agent in real time. Current interactive reinforcement learning research has been limited to real-time interactions in which the trainer's advice is relevant only to the current state. Additionally, the information provided by each interaction is not retained; the agent discards it after a single use. In this work, we propose a persistent rule-based interactive reinforcement learning approach, i.e., a method for retaining and reusing provided knowledge that allows trainers to give general advice relevant to more than just the current state. Our experimental results show that persistent advice substantially improves the performance of the agent while reducing the number of interactions required from the trainer. Moreover, rule-based advice shows a performance impact similar to state-based advice, but with a substantially reduced interaction count.
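To make the idea concrete, the sketch below shows one way persistent rule-based advice could wrap a tabular Q-learning agent. This is a minimal illustration, not the authors' implementation: the class name, the representation of rules as (condition, action) pairs, and the first-match retrieval policy are all assumptions introduced here for clarity. The key contrast with single-use, state-based advice is that a stored rule is retained across steps and fires in every state its condition covers.

```python
import random
from collections import defaultdict

# Hypothetical sketch of persistent rule-based advice around tabular
# Q-learning. Names and data structures are illustrative assumptions,
# not the method as published.

class PersistentRuleAdvisedAgent:
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)        # Q-values keyed by (state, action)
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rules = []                    # persistent advice store: (condition, action)

    def add_rule(self, condition, action):
        """Trainer supplies general advice as a predicate over states."""
        self.rules.append((condition, action))

    def advised_action(self, state):
        # Reuse retained advice: return the action of the first rule
        # whose condition matches the current state, if any.
        for condition, action in self.rules:
            if condition(state):
                return action
        return None

    def act(self, state):
        advice = self.advised_action(state)
        if advice is not None:
            return advice                  # follow persistent advice
        if random.random() < self.epsilon:
            return random.choice(self.actions)   # explore
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

Under this sketch, a single trainer interaction such as agent.add_rule(lambda s: s[0] == 0, "right") advises "move right whenever in the leftmost column" and applies to every matching state thereafter, whereas state-based advice would bind to exactly one state and be discarded after one use. This is why rule-based advice can match the performance impact of state-based advice with far fewer interactions.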

Acknowledgments

This work has been partially supported by the Australian Government Research Training Program (RTP) and the RTP Fee-Offset Scholarship through Federation University Australia.

Author information

Corresponding author

Correspondence to Francisco Cruz.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bignold, A., Cruz, F., Dazeley, R. et al. Persistent rule-based interactive reinforcement learning. Neural Comput & Applic 35, 23411–23428 (2023). https://doi.org/10.1007/s00521-021-06466-w
