Abstract
We propose a new language feature for ML-family languages: the ability to selectively unbox certain data constructors, so that their runtime representation is compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation.
We discuss the use case of big numbers, where unboxing makes it possible to write code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, and discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes.
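To make the big-number use case concrete, here is a minimal sketch of the "safe but slow" version in stock OCaml. The names `bignum`, `num`, `add_small`, `add`, and the `slow_add` parameter are hypothetical and chosen for illustration only; the constructor-level `[@unboxed]` annotation appears only in comments, since it depends on the proposed feature.

```ocaml
(* Hypothetical placeholder for an arbitrary-precision integer. A real
   library would use a native big-integer representation instead. *)
type bignum = { digits : string }

(* In stock OCaml, [Small n] allocates a block that holds n. With the
   proposed feature one would annotate the constructor, for example
   [Small of int [@unboxed]], so that [Small n] is represented as the
   immediate integer n itself. *)
type num =
  | Small of int
  | Big of bignum

(* Fast path: machine addition, returning None on overflow. Overflow
   happens exactly when both operands have the same sign and the wrapped
   sum has the opposite sign. *)
let add_small (a : int) (b : int) : int option =
  let s = a + b in
  if (a >= 0) = (b >= 0) && (s >= 0) <> (a >= 0) then None else Some s

(* Safe addition: use the fast path when possible, and otherwise defer to
   [slow_add], a caller-supplied (hypothetical) big-number addition. *)
let add (slow_add : num -> num -> num) (x : num) (y : num) : num =
  match x, y with
  | Small a, Small b ->
    (match add_small a b with
     | Some s -> Small s
     | None -> slow_add x y)
  | _ -> slow_add x y
```

By contrast, a request to unbox both constructors of a type such as `A of int | B of int` is the kind of declaration that would have to be rejected: `A 0` and `B 0` would share the same representation.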
Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in the presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.
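The flavor of on-the-fly loop detection can be sketched on a deliberately simplified model in which type definitions take no parameters. This is not the algorithm of the paper: the type `ty`, the function `head_normalize`, and the association-list encoding of definitions are assumptions made for this example. With parameters, a plain set of visited names would wrongly reject terminating expansions, such as applying an identity abbreviation twice, which is why the general problem needs a finer analysis.

```ocaml
(* Simplified model of type expressions: definitions have no parameters. *)
type ty =
  | Int                 (* a primitive type, already in head normal form *)
  | Name of string      (* a reference to a named type definition *)
  | Pair of ty * ty     (* a head constructor; its arguments are not expanded *)

module S = Set.Make (String)

(* Expand the head of [t] until it is no longer a defined name.
   [defs] maps names to their definitions. [seen] records the names already
   expanded on the current path: meeting one of them again means that head
   expansion would loop forever, so we stop and report that name. *)
let head_normalize (defs : (string * ty) list) (t : ty) : (ty, string) result =
  let rec go seen t =
    match t with
    | Name n when S.mem n seen -> Error n
    | Name n ->
      (match List.assoc_opt n defs with
       | Some body -> go (S.add n seen) body
       | None -> Ok t)  (* abstract type: nothing to expand *)
    | Int | Pair _ -> Ok t
  in
  go S.empty t
```

In this model, a definition such as `list = Pair (Int, Name "list")` is recursive but still head-normalizes, because the recursive occurrence sits under a constructor, whereas `loop = Name "loop"` is reported as non-terminating. Refusing to re-expand a name during its own expansion is also how cpp treats self-referential macros.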
Index Terms
- Unboxed Data Constructors: Or, How cpp Decides a Halting Problem