DOI: 10.1145/3605156.3606456
research-article

Using Rewrite Strategies for Efficient Functional Automatic Differentiation

Published: 18 July 2023

ABSTRACT

Automatic differentiation (AD) has become a dominant technique in machine learning (ML). AD frameworks were first implemented for imperative languages using tapes. Meanwhile, functional implementations of AD have been developed, often based on dual numbers, which are close to the formal specification of differentiation and hence easier to prove correct. However, this functional line of work has focused on correctness rather than efficiency. Recently, it was shown that an approach using dual numbers can be made efficient through the right optimizations. Optimizations are highly dependent on order, as one optimization can enable another, so it can be useful to have fine-grained control over their scheduling. One method expresses compiler optimizations as rewrite rules, whose application can be combined and controlled using strategy languages. Previous work describes the use of term rewriting and strategies to generate high-performance code in a compiler for a functional language. In this work, we implement dual-number AD in a functional array programming language using rewrite rules and strategy combinators for optimization. We aim to combine the elegance of differentiation using dual numbers with a succinct expression of the optimization schedule using a strategy language. We give preliminary evidence suggesting the viability of the approach on a micro-benchmark.
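To make the two ingredients named in the abstract concrete, here is a minimal Python sketch. It is an illustration only and not the paper's implementation, which targets a functional array language. The term encoding and all names (`diff`, `mul_zero`, `mul_one`, `add_zero`, `choice`, `repeat`, `bottomup`) are assumptions made for this example. Each term is differentiated in dual-number style into a (primal, tangent) pair, and the resulting tangent term is then cleaned up by rewrite rules whose application order is fixed by strategy combinators.

```python
# Terms: numbers, variable names (str), or tuples (op, lhs, rhs), op in {"add", "mul"}.

def diff(e, x):
    """Return the (primal, tangent) pair of term e w.r.t. variable x, dual-number style."""
    if isinstance(e, (int, float)):
        return e, 0
    if isinstance(e, str):
        return e, (1 if e == x else 0)
    op, a, b = e
    (pa, ta), (pb, tb) = diff(a, x), diff(b, x)
    if op == "add":
        return ("add", pa, pb), ("add", ta, tb)
    if op == "mul":  # product rule on the tangent component
        return ("mul", pa, pb), ("add", ("mul", ta, pb), ("mul", pa, tb))
    raise ValueError(f"unknown operator: {op}")

# Rewrite rules: return the rewritten term, or None when the rule does not apply.
def is_op(e, op):
    return isinstance(e, tuple) and e[0] == op

def mul_zero(e):
    return 0 if is_op(e, "mul") and (e[1] == 0 or e[2] == 0) else None

def mul_one(e):
    if is_op(e, "mul"):
        if e[1] == 1:
            return e[2]
        if e[2] == 1:
            return e[1]
    return None

def add_zero(e):
    if is_op(e, "add"):
        if e[1] == 0:
            return e[2]
        if e[2] == 0:
            return e[1]
    return None

# Strategy combinators: build larger strategies out of rules.
def choice(*rules):
    """Apply the first rule that succeeds; fail (None) if none does."""
    def strategy(e):
        for rule in rules:
            out = rule(e)
            if out is not None:
                return out
        return None
    return strategy

def repeat(s):
    """Apply s until it fails; always succeeds and returns the last term."""
    def strategy(e):
        while True:
            out = s(e)
            if out is None:
                return e
            e = out
    return strategy

def bottomup(s):
    """Rewrite the children first, then apply s at the current node."""
    def strategy(e):
        if isinstance(e, tuple):
            e = (e[0], strategy(e[1]), strategy(e[2]))
        return s(e)
    return strategy

simplify = bottomup(repeat(choice(mul_zero, mul_one, add_zero)))

# d/dx (x*x + 3*x)
_, tangent = diff(("add", ("mul", "x", "x"), ("mul", 3, "x")), "x")
print(simplify(tangent))  # ('add', ('add', 'x', 'x'), 3)
```

Keeping the rules separate from the traversal means the schedule can be changed, for example by reordering the rules inside `choice` or swapping `bottomup` for another traversal, without touching the differentiation step.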

    • Published in

      FTfJP 2023: Proceedings of the 25th ACM International Workshop on Formal Techniques for Java-like Programs
      July 2023
      64 pages
      ISBN: 9798400702464
      DOI: 10.1145/3605156

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 51 of 75 submissions, 68%