ABSTRACT
Calibrating agent-based models (ABMs) in economics and finance typically involves a derivative-free search in a very large parameter space. In this work, we benchmark a number of search methods in the calibration of a well-known macroeconomic ABM on real data, and further assess the performance of "mixed strategies" formed by combining different methods. We find that methods based on random-forest surrogates are particularly efficient, and that combining search methods generally improves performance because the biases of any single method are mitigated. Building on these observations, we propose a reinforcement learning (RL) scheme that automatically selects and combines search methods on the fly during a calibration run. The RL agent keeps exploiting a given method only as long as it continues to perform well, and explores new strategies when that method reaches a performance plateau. The resulting RL search scheme outperforms every other method and method combination tested, and requires neither prior information nor a trial-and-error procedure.
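The selection idea described above can be illustrated with a small bandit-style sketch. This is not the scheme used in the paper: it assumes an epsilon-greedy rule with a recency-weighted value estimate, a reward equal to the fractional improvement of the best loss, and two toy samplers on a toy quadratic loss. All names (`toy_loss`, `random_sampler`, `local_sampler`, `bandit_calibrate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def toy_loss(x):
    """Hypothetical stand-in for an ABM calibration loss (lower is better)."""
    return sum((xi - 0.3) ** 2 for xi in x)


def random_sampler(best, rng):
    """Uniform random search over the unit cube; ignores the incumbent."""
    return [rng.random() for _ in best]


def local_sampler(best, rng):
    """Small Gaussian perturbation of the incumbent, clipped to [0, 1]."""
    return [min(1.0, max(0.0, xi + rng.gauss(0.0, 0.05))) for xi in best]


def bandit_calibrate(samplers, loss, dim=3, n_steps=200,
                     eps=0.1, alpha=0.3, seed=0):
    """Epsilon-greedy choice among search methods (illustrative assumption)."""
    rng = random.Random(seed)
    best = [rng.random() for _ in range(dim)]
    best_loss = loss(best)
    # Optimistic initial values so that every method is tried early on.
    q = [1.0] * len(samplers)
    counts = [0] * len(samplers)
    history = [best_loss]
    for _ in range(n_steps):
        if rng.random() < eps:
            # Explore: pick a method uniformly at random.
            a = rng.randrange(len(samplers))
        else:
            # Exploit: pick the method with the highest current estimate.
            a = max(range(len(samplers)), key=lambda i: q[i])
        candidate = samplers[a](best, rng)
        cand_loss = loss(candidate)
        # Reward: fractional improvement of the best loss, in [0, 1].
        reward = max(0.0, best_loss - cand_loss) / best_loss if best_loss > 0 else 0.0
        if cand_loss < best_loss:
            best, best_loss = candidate, cand_loss
        # Recency-weighted update: a method that plateaus sees its value decay.
        q[a] += alpha * (reward - q[a])
        counts[a] += 1
        history.append(best_loss)
    return best, best_loss, counts, history


best, best_loss, counts, history = bandit_calibrate(
    [random_sampler, local_sampler], toy_loss
)
print("best loss:", best_loss)
print("calls per method:", counts)
```

The recency-weighted update is what ties the sketch to the behaviour described in the abstract: a method that stops improving the loss earns zero reward, its estimate decays, and another method can take over once its own estimate is higher.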
Index Terms
- Reinforcement Learning for Combining Search Methods in the Calibration of Economic ABMs