research-article

Open Access

Optimal approximate sampling from discrete probability distributions

Authors:
Feras A. Saad, Massachusetts Institute of Technology, USA
Cameron E. Freer, Massachusetts Institute of Technology, USA
Martin C. Rinard, Massachusetts Institute of Technology, USA
Vikash K. Mansinghka, Massachusetts Institute of Technology, USA

Proceedings of the ACM on Programming Languages, Volume 4, Issue POPL, Article No. 36, pp. 1–31. https://doi.org/10.1145/3371104

Published: 20 December 2019

Abstract

This paper addresses a fundamental problem in random variate generation: given access to a random source that emits a stream of independent fair bits, what is the most accurate and entropy-efficient algorithm for sampling from a discrete probability distribution (p₁, …, p_n), where the probabilities of the output distribution (p̂₁, …, p̂_n) of the sampling algorithm must be specified using at most k bits of precision? We present a theoretical framework for formulating this problem and provide new techniques for finding sampling algorithms that are optimal both statistically (in the sense of sampling accuracy) and information-theoretically (in the sense of entropy consumption). We leverage these results to build a system that, for a broad family of measures of statistical accuracy, delivers a sampling algorithm whose expected entropy usage is minimal among those that induce the same distribution (i.e., is “entropy-optimal”) and whose output distribution (p̂₁, …, p̂_n) is a closest approximation to the target distribution (p₁, …, p_n) among all entropy-optimal sampling algorithms that operate within the specified k-bit precision. This optimal approximate sampler is also a closer approximation than any (possibly entropy-suboptimal) sampler that consumes a bounded amount of entropy with the specified precision, a class which includes floating-point implementations of inversion sampling and related methods found in many software libraries. We evaluate the accuracy, entropy consumption, precision requirements, and wall-clock runtime of our optimal approximate sampling algorithms on a broad set of distributions, demonstrating the ways that they are superior to existing approximate samplers and establishing that they often consume significantly fewer resources than are needed by exact samplers.
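To make the setting concrete, the following is a minimal sketch, not the authors' implementation. It assumes a target distribution given as exact fractions, rounds it to integer weights that sum to 2^k using a naive largest-remainder rule (an illustrative stand-in, not the paper's optimal approximation procedure), and then samples from the resulting k-bit distribution using only fair bits, in the column-scanning style of the Knuth and Yao method. The function names `round_to_k_bits`, `knuth_yao_sample`, and `exact_output_counts` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import product


def round_to_k_bits(p, k):
    """Round probabilities p (summing to exactly 1) to integer weights summing to 2**k.

    Naive largest-remainder rounding, for illustration only; it is NOT
    the optimal approximation procedure described in the paper.
    """
    total = 1 << k
    scaled = [Fraction(x) * total for x in p]
    weights = [int(s) for s in scaled]  # truncation equals floor for nonnegative values
    leftover = total - sum(weights)
    # Hand the remaining units to the entries with the largest fractional parts.
    order = sorted(range(len(p)), key=lambda i: scaled[i] - weights[i], reverse=True)
    for i in order[:leftover]:
        weights[i] += 1
    return weights


def knuth_yao_sample(weights, k, next_bit):
    """Return index i with probability weights[i] / 2**k, reading fair bits lazily.

    Assumes sum(weights) == 2**k and every weight is strictly below 2**k.
    next_bit is any zero-argument callable returning 0 or 1.
    """
    d = 0
    for col in range(k):
        d = 2 * d + next_bit()
        # Scan the col-th binary digit of each probability, last outcome first.
        for i in range(len(weights) - 1, -1, -1):
            d -= (weights[i] >> (k - 1 - col)) & 1
            if d == -1:
                return i
    raise ValueError("weights must sum to 2**k with each weight below 2**k")


def exact_output_counts(weights, k):
    """Run the sampler on all 2**k equally likely bit strings and tally the outputs."""
    counts = [0] * len(weights)
    for bits in product((0, 1), repeat=k):
        stream = iter(bits)
        counts[knuth_yao_sample(weights, k, lambda: next(stream))] += 1
    return counts
```

For instance, `round_to_k_bits([Fraction(1, 3)] * 3, 4)` yields weights that sum to 16, and `exact_output_counts` on those weights returns the same weights, which confirms that the sampler's output distribution matches the k-bit approximation exactly. In practice `next_bit` could be `lambda: random.getrandbits(1)`.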

Supplemental Material

a36-saad.webm (webm, 94.9 MB)

popl20main-p126-p-aux.zip (zip, 1.3 MB): This file contains the appendices of the main paper.


Index Terms

Optimal approximate sampling from discrete probability distributions
1. Mathematics of computing
2. Theory of computation
  1. Design and analysis of algorithms
    1. Approximation algorithms analysis
      1. Numeric approximation algorithms
  2. Models of computation
    1. Probabilistic computation

Published in

Proceedings of the ACM on Programming Languages Volume 4, Issue POPL
January 2020
1984 pages
EISSN: 2475-1421
DOI: 10.1145/3377388

Copyright © 2019 Owner/Author
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Artifacts Available
- Artifacts Evaluated & Functional
Author Tags
discrete random variables
random variate generation