ABSTRACT
Inclusion-based alias analysis for C can be formulated as a context-free language (CFL) reachability problem. It is well known that the traditional cubic CFL-reachability algorithm does not scale well in practice. We present a highly scalable and efficient CFL-reachability-based alias analysis for C. The key novelty of our algorithm is to propagate reachability information along only original graph edges and bypass a large portion of summary edges, whereas the traditional CFL-reachability algorithm propagates along all summary edges. We also utilize the Four Russians' Trick, a key enabling technique in the subcubic CFL-reachability algorithm, in our alias analysis. We have implemented our subcubic alias analysis and conducted extensive experiments on widely used C programs from the pointer analysis literature. The results demonstrate that our alias analysis scales extremely well in practice. In particular, it can analyze the recent Linux kernel (which consists of 10M SLOC) in about 30 seconds.
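For orientation, the traditional worklist formulation that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the textbook cubic baseline, not the algorithm presented in the paper: it propagates along every summary edge it discovers, which is the behavior the paper aims to avoid. The function name `cfl_reachability`, its parameters, and the toy balanced-parentheses grammar are assumptions made for this example, and the grammar is assumed to be normalized so that every production has at most two symbols on its right-hand side.

```python
from collections import deque


def cfl_reachability(nodes, edges, unary, binary, epsilon=()):
    """Textbook worklist CFL-reachability (baseline sketch, hypothetical API).

    nodes:   iterable of graph nodes
    edges:   iterable of (src, label, dst) original graph edges
    unary:   dict mapping B to a list of A, one per production A -> B
    binary:  list of (A, B, C) triples, one per production A -> B C
    epsilon: nonterminals A that have a production A -> epsilon
    Returns the set of all (src, label, dst) edges, original plus summary.
    """
    out_edges = {}  # (node, label) -> set of successor nodes
    in_edges = {}   # (node, label) -> set of predecessor nodes
    found = set()
    work = deque()

    def add(u, label, v):
        edge = (u, label, v)
        if edge not in found:
            found.add(edge)
            out_edges.setdefault((u, label), set()).add(v)
            in_edges.setdefault((v, label), set()).add(u)
            work.append(edge)

    for u, label, v in edges:
        add(u, label, v)
    for a in epsilon:
        for n in nodes:
            add(n, a, n)  # A -> epsilon gives a self-loop on every node

    # Index binary productions by their left and right operands.
    by_left, by_right = {}, {}
    for a, b, c in binary:
        by_left.setdefault(b, []).append((a, c))
        by_right.setdefault(c, []).append((a, b))

    while work:
        u, x, v = work.popleft()
        for a in unary.get(x, ()):
            add(u, a, v)
        # Edge is the left operand: u -x-> v -c-> w yields u -a-> w.
        for a, c in by_left.get(x, ()):
            for w in list(out_edges.get((v, c), ())):
                add(u, a, w)
        # Edge is the right operand: w -b-> u -x-> v yields w -a-> v.
        for a, b in by_right.get(x, ()):
            for w in list(in_edges.get((u, b), ())):
                add(w, a, v)
    return found


# Toy grammar for balanced parentheses, normalized:
#   S -> epsilon | S S | open T      T -> S close
nodes = range(5)
edges = [(0, "open", 1), (1, "open", 2), (2, "close", 3), (3, "close", 4)]
binary = [("S", "S", "S"), ("S", "open", "T"), ("T", "S", "close")]
result = cfl_reachability(nodes, edges, {}, binary, epsilon=["S"])
```

In this toy graph the path from node 0 to node 4 spells a balanced string, so `result` contains the summary edge `(0, "S", 4)`, while `(0, "S", 3)` is absent because the string up to node 3 is unbalanced. Every summary edge is pushed onto the worklist and re-examined, which is the cost the abstract says its algorithm largely bypasses.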
Efficient subcubic alias analysis for C, OOPSLA '14.