
Variational inference as an alternative to MCMC for parameter estimation and model selection

Published online by Cambridge University Press: 25 January 2022

Geetakrishnasai Gunapati
Affiliation:
Department of Computer Science and Engineering, IIT Hyderabad, Kandi, Telangana 502285, India
Anirudh Jain
Affiliation:
Department of Computer Science, Aalto University, Espoo 02150, Finland
P. K. Srijith
Affiliation:
Department of Computer Science and Engineering, IIT Hyderabad, Kandi, Telangana 502285, India
Shantanu Desai*
Affiliation:
Department of Physics, IIT Hyderabad, Kandi, Telangana 502285, India
*Author for correspondence: Shantanu Desai, e-mail: shntn05@gmail.com

Abstract

Most applications of Bayesian inference for parameter estimation and model selection in astrophysics rely on Monte Carlo techniques such as Markov Chain Monte Carlo (MCMC) and nested sampling. However, these techniques are time-consuming, and their convergence to the posterior can be difficult to determine. In this study, we advocate variational inference as an alternative that addresses these problems, and demonstrate its usefulness for parameter estimation and model selection in astrophysics. Variational inference converts the inference problem into an optimisation problem by approximating the posterior with a member of a known family of distributions and using the Kullback–Leibler divergence to characterise the difference between the two. It takes advantage of fast optimisation techniques, which make it ideal for large datasets and trivial to parallelise on a multicore platform. We also derive a new approximate evidence estimate based on the variational posterior, together with an importance sampling technique called posterior-weighted importance sampling for calculating the evidence, which is useful for Bayesian model selection. As a proof of principle, we apply variational inference to five different problems in astrophysics where Monte Carlo techniques were previously used. These are: assessing the significance of annual modulation in the COSINE-100 dark matter experiment, measuring exoplanet orbital parameters from radial velocity data, testing for periodicities in measurements of Newton’s constant G, assessing the significance of a turnover in the spectral lag data of GRB 160625B, and estimating the mass of a galaxy cluster using weak gravitational lensing. We find that variational inference is much faster than MCMC and nested sampling for most of these problems while providing competitive results. All our analysis codes have been made publicly available.
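
To make the idea of inference as optimisation concrete, the following is a minimal sketch on a toy problem, not the analysis code of this paper. It assumes data drawn from a Gaussian with unknown mean and known width, a Gaussian prior on the mean, and a Gaussian variational family q(mu) = Normal(m, s^2). The evidence lower bound (ELBO) is maximised by plain gradient ascent, and the fitted q is then reused as the proposal in ordinary importance sampling to estimate the evidence. That last step is a stand-in: the abstract does not specify the posterior-weighted importance sampling estimator, so it is not reproduced here. All function names and parameter values are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of Normal(mean, std^2) at x."""
    return -0.5 * math.log(2.0 * math.pi * std**2) - 0.5 * ((x - mean) / std) ** 2


def log_joint(mu, y, sigma, m0, s0):
    """log p(y | mu) + log p(mu) for the toy Gaussian model (an assumption)."""
    return sum(log_normal_pdf(yi, mu, sigma) for yi in y) + log_normal_pdf(mu, m0, s0)


def elbo(m, s, y, sigma, m0, s0):
    """Closed-form ELBO for q(mu) = Normal(m, s^2)."""
    n = len(y)
    exp_loglik = (-0.5 * n * math.log(2.0 * math.pi * sigma**2)
                  - sum((yi - m) ** 2 + s**2 for yi in y) / (2.0 * sigma**2))
    exp_logprior = (-0.5 * math.log(2.0 * math.pi * s0**2)
                    - ((m - m0) ** 2 + s**2) / (2.0 * s0**2))
    entropy = 0.5 * math.log(2.0 * math.pi * math.e * s**2)
    return exp_loglik + exp_logprior + entropy


def fit_vi(y, sigma, m0, s0, lr=0.01, steps=2000):
    """Gradient ascent on the ELBO in (m, rho), where rho = log s.

    The fixed step size is tuned for a few tens of points with sigma near 1;
    larger datasets would need a smaller lr or an adaptive optimiser.
    """
    n = len(y)
    m, rho = 0.0, 0.0
    for _ in range(steps):
        s = math.exp(rho)
        grad_m = sum(yi - m for yi in y) / sigma**2 - (m - m0) / s0**2
        grad_rho = 1.0 - s**2 * (n / sigma**2 + 1.0 / s0**2)
        m += lr * grad_m
        rho += lr * grad_rho
    return m, math.exp(rho)


def log_evidence_is(m, s, y, sigma, m0, s0, n_samples=2000, seed=0):
    """Plain importance-sampling estimate of log p(y) with q as the proposal."""
    rng = random.Random(seed)
    log_w = []
    for _ in range(n_samples):
        mu = rng.gauss(m, s)
        log_w.append(log_joint(mu, y, sigma, m0, s0) - log_normal_pdf(mu, m, s))
    top = max(log_w)  # log-sum-exp shift for numerical stability
    return top + math.log(sum(math.exp(lw - top) for lw in log_w) / n_samples)


def exact_log_evidence(y, sigma, m0, s0):
    """Analytic log p(y) for this conjugate model, used only as a check."""
    n = len(y)
    r = [yi - m0 for yi in y]
    quad = (sum(ri**2 for ri in r)
            - s0**2 / (sigma**2 + n * s0**2) * sum(r) ** 2) / sigma**2
    return (-0.5 * n * math.log(2.0 * math.pi * sigma**2)
            - 0.5 * math.log(1.0 + n * s0**2 / sigma**2) - 0.5 * quad)


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(2.0, 1.0) for _ in range(20)]  # synthetic data
    m_fit, s_fit = fit_vi(data, sigma=1.0, m0=0.0, s0=10.0)
    print("variational mean, std:", m_fit, s_fit)
    print("ELBO:", elbo(m_fit, s_fit, data, 1.0, 0.0, 10.0))
    print("importance-sampling log evidence:",
          log_evidence_is(m_fit, s_fit, data, 1.0, 0.0, 10.0))
    print("exact log evidence:", exact_log_evidence(data, 1.0, 0.0, 10.0))
```

Because this toy model is conjugate, the Gaussian family contains the exact posterior, so the ELBO and the importance-sampling estimate should both agree with the analytic evidence. In realistic problems the variational family is only an approximation, the ELBO is a strict lower bound, and the quality of an importance-sampling estimate depends on how well q covers the posterior.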

Type
Research Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of the Astronomical Society of Australia

