Dynamic Knowledge Injection for AIXI Agents

Authors

  • Samuel Yang-Zhao, Australian National University
  • Kee Siong Ng, Australian National University
  • Marcus Hutter, Google DeepMind; Australian National University

DOI:

https://doi.org/10.1609/aaai.v38i15.29575

Keywords:

ML: Reinforcement Learning, ML: Bayesian Learning, ML: Online Learning & Bandits, HAI: Human-in-the-loop Machine Learning

Abstract

Prior approximations of AIXI, a Bayesian optimality notion for general reinforcement learning, can only approximate AIXI's Bayesian environment model using a set of models defined a priori. This is a fundamental source of epistemic uncertainty for the agent in settings where systematic bias in the predefined model class cannot be resolved by simply collecting more data from the environment. We address this issue in the context of Human-AI teaming by considering a setup where additional knowledge for the agent, in the form of new candidate models, arrives from a human operator in an online fashion. We introduce a new agent called DynamicHedgeAIXI that maintains an exact Bayesian mixture over dynamically changing sets of models via a time-adaptive prior constructed from a variant of the Hedge algorithm. The DynamicHedgeAIXI agent is the richest direct approximation of AIXI known to date and comes with good performance guarantees. Experimental results on epidemic control on contact networks validate the agent's practical utility.
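
To make the idea of a mixture over a changing model set concrete, the following is a minimal sketch of textbook exponential weights (Hedge) with a growing set of experts. It is not the DynamicHedgeAIXI construction from the paper: the class name `GrowingHedge`, the `share` parameter, and the rule that rescales existing weights when a model arrives are all assumptions made for illustration. With a learning rate of 1 and the loss taken as the negative log-probability a model assigned to the observation, the Hedge update coincides with a Bayesian posterior update.

```python
import math


class GrowingHedge:
    """Illustrative Hedge mixture whose expert set can grow over time.

    Hypothetical sketch only; the weighting rule for newly added experts
    is an assumption, not the time-adaptive prior described in the paper.
    """

    def __init__(self, learning_rate=1.0):
        self.learning_rate = learning_rate
        self.weights = {}  # expert name -> normalised weight

    def add_expert(self, name, share=None):
        """Add an expert, giving it `share` of the total probability mass."""
        if name in self.weights:
            raise ValueError(f"expert {name!r} already present")
        if not self.weights:
            self.weights[name] = 1.0
            return
        if share is None:
            # Assumed default: the newcomer gets a uniform share.
            share = 1.0 / (len(self.weights) + 1)
        # Shrink existing weights so the total mass stays at 1.
        for key in self.weights:
            self.weights[key] *= 1.0 - share
        self.weights[name] = share

    def update(self, losses):
        """Multiply each weight by exp(-learning_rate * loss), then renormalise."""
        for name, loss in losses.items():
            self.weights[name] *= math.exp(-self.learning_rate * loss)
        total = sum(self.weights.values())
        for name in self.weights:
            self.weights[name] /= total


# Usage: two initial models, one observation, then a third model arrives.
mixture = GrowingHedge(learning_rate=1.0)
mixture.add_expert("model_a")
mixture.add_expert("model_b")
# Losses are negative log-probabilities: model_a predicted the observation
# with probability 1, model_b with probability 1/3.
mixture.update({"model_a": 0.0, "model_b": math.log(3)})
mixture.add_expert("model_c", share=0.2)
```

In this sketch the newly injected model takes a fixed share of the mass and the existing models keep their relative ordering, which is one simple way to let late-arriving knowledge compete without discarding what has already been learned.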

Published

2024-03-24

How to Cite

Yang-Zhao, S., Ng, K. S., & Hutter, M. (2024). Dynamic Knowledge Injection for AIXI Agents. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 16388-16397. https://doi.org/10.1609/aaai.v38i15.29575

Section

AAAI Technical Track on Machine Learning VI