ABSTRACT
Most compiler optimizations and software productivity tools rely on information about the effects of pointer dereferences in a program. The purpose of points-to analysis is to compute this information safely, and as accurately as is practical. Unfortunately, accurate points-to information is difficult to obtain for large programs, because the time and space requirements of the analysis become prohibitive.
We consider the problem of scaling flow- and context-insensitive points-to analysis to large programs, perhaps containing hundreds of thousands of lines of code. Our approach is based on a variable substitution transformation, which is performed off-line, i.e., before a standard points-to analysis is performed. The general idea of variable substitution is that a set of variables in a program can be replaced by a single representative variable, thereby reducing the input size of the problem. Our main contribution is a linear-time algorithm which finds a particular variable substitution that maintains the precision of the standard analysis, and is also very effective in reducing the size of the problem.
We report our experience in performing points-to analysis on large C programs, including some industrial-sized ones. Experiments show that our algorithm can reduce the cost of Andersen's points-to analysis substantially: on average, it reduced the running time by 53% and the memory cost by 59%, relative to an efficient baseline implementation of the analysis.
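To make the idea of variable substitution concrete, the following is a minimal sketch of the rewriting step only: given a map from variables to representatives, it rewrites a list of points-to constraints and drops the ones that become redundant. It does not show the linear-time algorithm that finds a precision-preserving substitution. The constraint encoding, the function name `apply_substitution`, and the example map `rep` are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical constraint encoding (kind, lhs, rhs):
#   "addr"  : lhs = &rhs
#   "copy"  : lhs = rhs
#   "load"  : lhs = *rhs
#   "store" : *lhs = rhs
Constraint = Tuple[str, str, str]


def apply_substitution(
    constraints: List[Constraint], rep: Dict[str, str]
) -> List[Constraint]:
    """Rewrite constraints so each variable in `rep` is replaced by its
    representative, then drop duplicates and trivial self-copies.

    Assumption: `rep` only merges variables that are known to have equal
    points-to sets and whose address is never taken. Finding such a map
    is the hard part and is not done here.
    """
    seen: Set[Constraint] = set()
    reduced: List[Constraint] = []
    for kind, lhs, rhs in constraints:
        new = (kind, rep.get(lhs, lhs), rep.get(rhs, rhs))
        if new[0] == "copy" and new[1] == new[2]:
            continue  # p = p carries no information
        if new not in seen:
            seen.add(new)
            reduced.append(new)
    return reduced


# Example: q, r, s only ever receive their value from p through a chain
# of copies, so each has the same points-to set as p.
constraints: List[Constraint] = [
    ("addr", "p", "x"),  # p = &x
    ("copy", "q", "p"),  # q = p
    ("copy", "r", "q"),  # r = q
    ("copy", "s", "r"),  # s = r
    ("load", "t", "s"),  # t = *s
]
rep = {"q": "p", "r": "p", "s": "p"}  # hand-picked for illustration
reduced = apply_substitution(constraints, rep)
print(reduced)  # [('addr', 'p', 'x'), ('load', 't', 'p')]
```

In this toy input, five constraints shrink to two before any points-to analysis runs, and the solution for `q`, `r`, and `s` can be read off from the solution for `p` afterwards.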