research-article
Open Access
Artifacts Available / v1.1

Parsing randomness

Published: 31 October 2022

Abstract

Random data generators can be thought of as parsers of streams of randomness. This perspective on generators for random data structures is established folklore in the programming languages community, but it has never been formalized, nor have its consequences been deeply explored.
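As a minimal illustration of this perspective (a sketch only, not code from the paper), the Python below writes one list generator against an abstract "choice source". The names `gen_list`, `random_source`, `replay_source`, and `recording` are hypothetical, chosen for this example. Driven by a random source, the function behaves as a generator; driven by a fixed sequence of choices, the same function behaves as a parser of that sequence.

```python
import random
from typing import Callable, List

# A choice source answers the question "which of n options comes next?".
Pick = Callable[[int], int]


def gen_list(pick: Pick) -> List[int]:
    """Build a list of digits from a stream of choices."""
    out: List[int] = []
    while pick(2) == 1:        # 0 = stop, 1 = add another element
        out.append(pick(10))   # choose a digit in 0..9
    return out


def random_source(rng: random.Random) -> Pick:
    """Choices drawn at random: gen_list acts as a generator."""
    return lambda n: rng.randrange(n)


def replay_source(choices: List[int]) -> Pick:
    """Choices read from a fixed sequence: gen_list acts as a parser."""
    it = iter(choices)

    def pick(n: int) -> int:
        c = next(it, None)
        if c is None or not 0 <= c < n:
            raise ValueError("choice sequence does not parse")
        return c

    return pick


def recording(pick: Pick, log: List[int]) -> Pick:
    """Wrap a source so that every choice it makes is appended to log."""
    def wrapped(n: int) -> int:
        c = pick(n)
        log.append(c)
        return c

    return wrapped


# Parsing a hand-written choice sequence.
parsed = gen_list(replay_source([1, 3, 1, 7, 0]))

# Recording a random run, then replaying it as a parse.
log: List[int] = []
generated = gen_list(recording(random_source(random.Random(0)), log))
replayed = gen_list(replay_source(log))
```

Here `parsed` is `[3, 7]`, and `replayed` equals `generated` because the recorded choices fully determine the value that was produced.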

We build on the idea of freer monads to develop free generators, which unify parsing and generation using a common structure that makes the relationship between the two concepts precise. Free generators lead naturally to a proof that a monadic generator can be factored into a parser plus a distribution over choice sequences. Free generators also support a notion of derivative, analogous to the familiar Brzozowski derivatives of formal languages, allowing analysis tools to "preview" the effect of a particular generator choice. This gives rise to a novel algorithm for generating data structures satisfying user-specified preconditions.
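The ideas in this paragraph can be sketched in a few lines of Python. This is a deliberately simplified model under stated assumptions, not the paper's definitions: the names `Pure`, `Pick`, `Void`, `bits`, `generate`, `derivative`, and `parse` are chosen for this example, choice labels are single characters, and sub-generators are wrapped in thunks so that recursive generators stay lazy. One data structure is interpreted both as a generator (sampling labelled choices) and as a parser (consuming a choice string), and the derivative with respect to a choice returns the generator that remains after that choice is made.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Pure:
    """A finished generator holding its result."""
    value: Any


@dataclass
class Pick:
    """A weighted choice: (weight, choice label, thunk for the next generator)."""
    options: List[Tuple[int, str, Callable[[], Any]]]


@dataclass
class Void:
    """A generator that produces nothing (analogue of the empty language)."""


def bits(acc: Tuple[int, ...] = ()) -> Pick:
    """Example: lists of bits; 's' stops, '0' and '1' extend the list."""
    return Pick([
        (1, "s", lambda: Pure(list(acc))),
        (1, "0", lambda: bits(acc + (0,))),
        (1, "1", lambda: bits(acc + (1,))),
    ])


def derivative(g: Any, c: str) -> Any:
    """The generator that remains after committing to choice c."""
    if isinstance(g, Pick):
        for _, label, thunk in g.options:
            if label == c:
                return thunk()
    return Void()  # Pure and Void accept no further choices


def parse(g: Any, choices: str) -> Any:
    """Parser interpretation: take one derivative per choice, then read the result."""
    for c in choices:
        g = derivative(g, c)
    return g.value if isinstance(g, Pure) else None


def generate(g: Any, rng: random.Random) -> Tuple[Any, str]:
    """Generator interpretation: sample choices, returning (value, choice string)."""
    trace = []
    while isinstance(g, Pick):
        weights = [w for w, _, _ in g.options]
        _, label, thunk = rng.choices(g.options, weights=weights)[0]
        trace.append(label)
        g = thunk()
    if isinstance(g, Pure):
        return g.value, "".join(trace)
    raise ValueError("void generator produces no value")


value, trace = generate(bits(), random.Random(1))
```

In this sketch, `parse(bits(), trace)` returns the same `value`, which mirrors the factoring of a generator into a parser plus a distribution over choice sequences. Calling `derivative(bits(), "1")` previews the generator restricted to lists that start with 1, without sampling anything.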
