
Exploiting the Sparseness of Control-Flow and Call Graphs for Efficient and On-Demand Algebraic Program Analysis

Published: 16 October 2023

Abstract

Algebraic Program Analysis (APA) is a ubiquitous framework that has been employed as a unifying model for various problems in data-flow analysis, termination analysis, invariant generation, predicate abstraction and a wide variety of other standard static analysis tasks. APA models program summaries as elements of a regular algebra A. Suppose that a summary in A is assigned to every transition of the program and that we aim to compute the effect of running the program starting at line s and ending at line t. APA first computes a regular expression capturing all program paths of interest. In the intraprocedural case, this expression models all paths from s to t, whereas in the interprocedural case it models all interprocedurally-valid paths, i.e. paths that return to the correct caller function when a callee returns. This regular expression is then interpreted over the algebra to obtain the desired result. Suppose the program has n lines of code and each evaluation of an operation in the regular algebra takes O(k) time. It is well-known that a single APA query, or a set of queries with the same starting point s, can be answered in O(n · α(n) · k) time, where α is the inverse Ackermann function. In this work, we consider an on-demand setting for APA: the program is given as input and can be preprocessed. The analysis must then answer a large number of online queries, each providing a pair (s, t) of program lines that are the start and end points of the query, respectively. The goal is to avoid the significant cost of running a fresh APA instance for each query. Our main contribution is a series of algorithms that, after a lightweight preprocessing of O(n · lg n · k) time, answer each query in O(k) time. In other words, our preprocessing has almost the same asymptotic complexity as a single APA query, except for a sub-logarithmic factor, and every subsequent query is then answered instantly, i.e. by a constant number of operations in the algebra.
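As an illustration of the step in which a path expression is "interpreted over the algebra", the following minimal Python sketch evaluates a small regular expression over one possible regular algebra. Everything in it is an assumption made for illustration and is not taken from the paper: the class names (`Sym`, `Alt`, `Seq`, `Star`, `Algebra`), the function `interpret`, the edge names `e1` to `e5`, and the choice of the (min, +) algebra over non-negative integers, in which a summary is the cheapest cost of a path.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


# Hypothetical AST for regular expressions over transition names.
@dataclass(frozen=True)
class Sym:
    name: str          # a single program transition


@dataclass(frozen=True)
class Alt:
    left: "Regex"      # choice between two sets of paths
    right: "Regex"


@dataclass(frozen=True)
class Seq:
    left: "Regex"      # paths of `left` followed by paths of `right`
    right: "Regex"


@dataclass(frozen=True)
class Star:
    body: "Regex"      # zero or more repetitions (a loop)


Regex = Union[Sym, Alt, Seq, Star]


@dataclass(frozen=True)
class Algebra:
    """A regular algebra given by its three operations (illustrative)."""
    plus: Callable[[int, int], int]    # combines alternative paths
    times: Callable[[int, int], int]   # composes consecutive paths
    star: Callable[[int], int]         # summarizes iteration


def interpret(expr: Regex, algebra: Algebra, summary: Dict[str, int]) -> int:
    """Evaluate `expr` bottom-up, one algebra operation per AST node."""
    if isinstance(expr, Sym):
        return summary[expr.name]
    if isinstance(expr, Alt):
        return algebra.plus(interpret(expr.left, algebra, summary),
                            interpret(expr.right, algebra, summary))
    if isinstance(expr, Seq):
        return algebra.times(interpret(expr.left, algebra, summary),
                             interpret(expr.right, algebra, summary))
    if isinstance(expr, Star):
        return algebra.star(interpret(expr.body, algebra, summary))
    raise TypeError(f"not a regular expression: {expr!r}")


# Assumed example algebra: (min, +) over non-negative costs.
# star(a) = min(0, a, 2a, ...) = 0 because a is assumed non-negative.
TROPICAL = Algebra(plus=min, times=lambda a, b: a + b, star=lambda a: 0)

# Path expression e1 . (e2)* . (e3 + e4 . e5): a prefix, a loop, two exits.
expr = Seq(Seq(Sym("e1"), Star(Sym("e2"))),
           Alt(Sym("e3"), Seq(Sym("e4"), Sym("e5"))))
summary = {"e1": 1, "e2": 3, "e3": 5, "e4": 1, "e5": 2}

result = interpret(expr, TROPICAL, summary)
print(result)  # prints 4: e1 (1), zero loop iterations (0), then e4.e5 (3)
```

The evaluation performs one algebra operation per node of the expression, which is why the cost of each operation, the O(k) above, appears as a multiplicative factor in the running times.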
We achieve this remarkable speedup by relying on certain structural sparsity properties of control-flow and call graphs (CFGs and CGs). Specifically, we exploit the fact that control-flow graphs of real-world programs have a tree-like structure with bounded treewidth and nesting depth, and that their call graphs have small treedepth relative to the size of the program. Finally, we provide experimental results demonstrating the effectiveness and efficiency of our approach and showing that it beats the runtime of classical APA by several orders of magnitude.
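The trade-off stated in the abstract can be made explicit with a short calculation. The symbol q (the number of queries, assumed here to have pairwise distinct starting points so that no classical run can be shared) is introduced only for illustration; the bounds themselves are the ones quoted above.

```latex
\begin{align*}
T_{\text{classical}}(q) &= O\bigl(q \cdot n \cdot \alpha(n) \cdot k\bigr), \\
T_{\text{on-demand}}(q) &= O\bigl(n \cdot \lg n \cdot k + q \cdot k\bigr).
\end{align*}
```

Ignoring the constants hidden by the asymptotic notation, the on-demand total is the smaller of the two once q exceeds roughly lg n / α(n), which is the sub-logarithmic factor mentioned in the abstract. This is only a rough indicator, since the hidden constants depend on the treewidth, nesting depth and treedepth of the program's graphs.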
