Abstract
This paper addresses a fundamental problem in random variate generation: given access to a random source that emits a stream of independent fair bits, what is the most accurate and entropy-efficient algorithm for sampling from a discrete probability distribution (p1, …, pn), where the probabilities of the output distribution (p̂1, …, p̂n) of the sampling algorithm must be specified using at most k bits of precision? We present a theoretical framework for formulating this problem and provide new techniques for finding sampling algorithms that are optimal both statistically (in the sense of sampling accuracy) and information-theoretically (in the sense of entropy consumption). We leverage these results to build a system that, for a broad family of measures of statistical accuracy, delivers a sampling algorithm whose expected entropy usage is minimal among those that induce the same distribution (i.e., is “entropy-optimal”) and whose output distribution (p̂1, …, p̂n) is a closest approximation to the target distribution (p1, …, pn) among all entropy-optimal sampling algorithms that operate within the specified k-bit precision. This optimal approximate sampler is also a closer approximation than any (possibly entropy-suboptimal) sampler that consumes a bounded amount of entropy with the specified precision, a class which includes floating-point implementations of inversion sampling and related methods found in many software libraries. We evaluate the accuracy, entropy consumption, precision requirements, and wall-clock runtime of our optimal approximate sampling algorithms on a broad set of distributions, demonstrating the ways that they are superior to existing approximate samplers and establishing that they often consume significantly fewer resources than are needed by exact samplers.
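To make the problem setting concrete, the following is a minimal sketch of sampling from a fixed-precision distribution using only fair bits. It is not the system described above. It uses the classical tree-walking construction of Knuth and Yao, and assumes the simplest case: the output probabilities are dyadic, p̂i = Mi / 2^k, for integer weights Mi that are already given and sum to 2^k. The function name `sample_dyadic`, the `weights` list, and the `next_bit` callable are illustrative names; choosing the weights that best approximate a target (p1, …, pn) is the problem the paper solves and is not shown here.

```python
import random


def sample_dyadic(weights, k, next_bit):
    """Return index i with probability weights[i] / 2**k using fair bits.

    weights: non-negative integers summing to 2**k (an assumption of this sketch).
    next_bit: zero-argument callable returning 0 or 1 (the fair-bit source).
    """
    if sum(weights) != 1 << k or min(weights) < 0:
        raise ValueError("weights must be non-negative integers summing to 2**k")
    # Degenerate case: one outcome has probability 1, so no bits are needed.
    for i, w in enumerate(weights):
        if w == 1 << k:
            return i
    d = 0  # index of the current node among the nodes at this tree level
    for level in range(1, k + 1):
        d = 2 * d + next_bit()  # descend to the left or right child
        for i, w in enumerate(weights):
            # Subtract bit `level` of the binary expansion of weights[i] / 2**k;
            # each set bit corresponds to one leaf labeled i at this level.
            d -= (w >> (k - level)) & 1
            if d < 0:
                return i  # the walk landed on a leaf labeled i
    raise AssertionError("unreachable when weights sum to 2**k")


if __name__ == "__main__":
    rng = random.Random(0)  # pseudo-random stand-in for a fair-bit source
    weights, k = [3, 5, 8], 4  # probabilities 3/16, 5/16, 8/16
    draws = [sample_dyadic(weights, k, lambda: rng.getrandbits(1)) for _ in range(10)]
    print(draws)
```

With weights summing to 2^k, every walk ends within k bits, and frequent outcomes tend to be reached after fewer bits than rare ones; for the weights (3, 5, 8) the outcome with probability 8/16 is returned after a single bit.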