ABSTRACT
Automatic differentiation (AD) has become a dominant technique in machine learning (ML). AD frameworks were first implemented for imperative languages using tapes. Functional implementations of AD have since been developed, often based on dual numbers, which are close to the formal specification of differentiation and hence easier to prove correct. However, this line of work has focused on correctness rather than efficiency. Recently, it was shown that an approach based on dual numbers can be made efficient with the right optimizations. Optimizations are highly order-dependent, since one optimization can enable another, so fine-grained control over their scheduling is useful. One method expresses compiler optimizations as rewrite rules, whose application can be combined and controlled using strategy languages. Previous work describes the use of term rewriting and strategies to generate high-performance code in a compiler for a functional language. In this work, we implement dual-numbers AD in a functional array programming language, using rewrite rules and strategy combinators for optimization. We aim to combine the elegance of differentiation using dual numbers with a succinct expression of the optimization schedule in a strategy language. We give preliminary evidence suggesting the viability of the approach on a micro-benchmark.
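To make the two ingredients concrete, the following is a minimal Python sketch, not the implementation described in this work. It assumes a toy expression language of nested tuples (not the functional array language used here). The function `dual` pairs every expression with its tangent, in the spirit of dual numbers. The rules `mul_zero`, `mul_one` and `add_zero` and the combinators `choice`, `repeat` and `bottom_up` are hypothetical names chosen for illustration only. They show how an optimization schedule can be written as a composition of small rewrite rules.

```python
# Toy expressions (an assumption of this sketch):
#   ("const", c) | ("var", name) | ("add", a, b) | ("mul", a, b)
ZERO, ONE = ("const", 0.0), ("const", 1.0)

def dual(e, wrt):
    """Return (primal, tangent) expressions of e with respect to variable wrt."""
    tag = e[0]
    if tag == "const":
        return e, ZERO
    if tag == "var":
        return e, (ONE if e[1] == wrt else ZERO)
    a, da = dual(e[1], wrt)
    b, db = dual(e[2], wrt)
    if tag == "add":
        return ("add", a, b), ("add", da, db)
    if tag == "mul":  # product rule
        return ("mul", a, b), ("add", ("mul", da, b), ("mul", a, db))
    raise ValueError(f"unknown node: {tag}")

# Rewrite rules: return the rewritten expression, or None if the rule does not apply.
def mul_zero(e):
    return ZERO if e[0] == "mul" and ZERO in (e[1], e[2]) else None

def mul_one(e):
    if e[0] == "mul":
        if e[1] == ONE:
            return e[2]
        if e[2] == ONE:
            return e[1]
    return None

def add_zero(e):
    if e[0] == "add":
        if e[1] == ZERO:
            return e[2]
        if e[2] == ZERO:
            return e[1]
    return None

# Strategy combinators: build larger strategies from rules.
def choice(*rules):
    """Apply the first rule that succeeds; fail (None) if none does."""
    def strategy(e):
        for rule in rules:
            out = rule(e)
            if out is not None:
                return out
        return None
    return strategy

def repeat(s):
    """Apply s until it fails; always succeeds with the last result."""
    def strategy(e):
        while True:
            out = s(e)
            if out is None:
                return e
            e = out
    return strategy

def bottom_up(s):
    """Rewrite the children first, then apply s at the current node."""
    def strategy(e):
        if e[0] in ("add", "mul"):
            e = (e[0], strategy(e[1]), strategy(e[2]))
        return s(e)
    return strategy

def evaluate(e, env):
    """Numerically evaluate an expression in the environment env."""
    if e[0] == "const":
        return e[1]
    if e[0] == "var":
        return env[e[1]]
    a, b = evaluate(e[1], env), evaluate(e[2], env)
    return a + b if e[0] == "add" else a * b

# The schedule: normalize every node, innermost first.
simplify = bottom_up(repeat(choice(mul_zero, mul_one, add_zero)))

X = ("var", "x")
f = ("add", ("mul", X, X), ("mul", ("const", 3.0), X))  # f(x) = x*x + 3*x
_, tangent = dual(f, "x")
optimized = simplify(tangent)  # ("add", ("add", X, X), ("const", 3.0))
```

The raw tangent contains redundant terms such as multiplications by one and zero, which the schedule removes without changing the value of the derivative. Swapping `bottom_up` for another traversal, or reordering the rules inside `choice`, changes the schedule without touching the rules themselves, which is the kind of control a strategy language is meant to give.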
Index Terms
- Using Rewrite Strategies for Efficient Functional Automatic Differentiation