Abstract
We show how to automatically synthesize probabilistic programs from real-world datasets. Such synthesis is feasible due to a combination of two techniques. (1) We borrow the idea of "sketching" from the synthesis of deterministic programs, and allow the programmer to write a skeleton program with "holes". Sketches enable the programmer to communicate domain-specific intuition about the structure of the desired program and to prune the search space. (2) We design an efficient Markov Chain Monte Carlo (MCMC) based synthesis algorithm that instantiates the holes in the sketch with program fragments. Our algorithm efficiently synthesizes a probabilistic program that is most consistent with the data. A core difficulty in synthesizing probabilistic programs is computing the likelihood L(P | D) of a candidate program P generating data D. We propose an approximate method that computes likelihoods using mixtures of Gaussian distributions, thereby avoiding the expensive computation of integrals. These approximations speed up the evaluation of the likelihood of candidate programs by a factor of 1000 and make MCMC-based search feasible. We have implemented our algorithm in a tool called PSKETCH, and our results are encouraging: PSKETCH is able to automatically synthesize 16 non-trivial real-world probabilistic programs.
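The two ingredients named in the abstract can be illustrated with a small sketch. It is not the PSKETCH implementation: the toy sketch, the hole names `a` and `b`, the integer grid of hole values, and every function name below are assumptions made for illustration. The example represents the output of a candidate program as a mixture of Gaussians, so that L(P | D) becomes a sum of closed-form densities rather than an integral, and it uses a Metropolis-Hastings random walk over hole values. Only operations with exact closed forms (sums of independent Gaussians and Bernoulli branches) are covered here, whereas the abstract describes an approximate method.

```python
import math
import random

# A mixture of Gaussians (MoG) is a list of (weight, mean, std) components.

def gaussian(mean, std):
    return [(1.0, mean, std)]

def add(a, b):
    """Sum of two independent MoG variables: pairwise sums of components."""
    return [(wa * wb, ma + mb, math.hypot(sa, sb))
            for wa, ma, sa in a for wb, mb, sb in b]

def branch(p, then_mog, else_mog):
    """`if Bernoulli(p) then ... else ...`: reweight and concatenate."""
    return ([(p * w, m, s) for w, m, s in then_mog] +
            [((1 - p) * w, m, s) for w, m, s in else_mog])

def log_likelihood(mog, data):
    """log L(P | D): sum of log mixture densities, no integration needed."""
    total = 0.0
    for d in data:
        density = sum(
            w * math.exp(-0.5 * ((d - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
            for w, m, s in mog)
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total

def run_sketch(holes):
    """Hypothetical sketch with two holes ??a and ??b:
         base = Gaussian(??a, 1)
         z = if Bernoulli(0.3) then base + Gaussian(??b, 1) else base
    """
    a, b = holes
    base = gaussian(a, 1.0)
    return branch(0.3, add(base, gaussian(b, 1.0)), base)

def synthesize(data, steps, rng, lo=0, hi=10):
    """Metropolis-Hastings walk over integer hole values in [lo, hi]."""
    current = (lo, lo)
    cur_ll = log_likelihood(run_sketch(current), data)
    best, best_ll = current, cur_ll
    for _ in range(steps):
        i = rng.randrange(2)
        proposal = list(current)
        proposal[i] += rng.choice((-1, 1))
        if not lo <= proposal[i] <= hi:
            continue  # reject out-of-range moves; the proposal stays symmetric
        proposal = tuple(proposal)
        new_ll = log_likelihood(run_sketch(proposal), data)
        # Accept with probability min(1, L(P' | D) / L(P | D)).
        if math.log(1.0 - rng.random()) < new_ll - cur_ll:
            current, cur_ll = proposal, new_ll
            if cur_ll > best_ll:
                best, best_ll = current, cur_ll
    return best, best_ll

# Synthetic data drawn from the sketch with a = 2 and b = 5 (an assumption).
rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) + (rng.gauss(5.0, 1.0) if rng.random() < 0.3 else 0.0)
        for _ in range(100)]
best_holes, best_ll = synthesize(data, steps=1000, rng=rng)
print("best holes:", best_holes, "log-likelihood:", round(best_ll, 2))
```

Because every candidate is scored with closed-form densities, each MCMC step costs only one pass over the data, which is the property that makes a search over many candidate programs practical in this toy setting.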
Efficient synthesis of probabilistic programs
PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation