Abstract
This paper describes Automatic Pool Allocation, a transformation framework that segregates distinct instances of heap-based data structures into separate memory pools and allows heuristics to be used to partially control the internal layout of those data structures. The primary goal of this work is performance improvement, not automatic memory management, and the paper makes several new contributions. The key contribution is a new compiler algorithm for partitioning heap objects in imperative programs based on a context-sensitive pointer analysis, including a novel strategy for correct handling of indirect (and potentially unsafe) function calls. The transformation does not require type-safe programs and works for the full generality of C and C++. Second, the paper describes several optimizations that exploit data structure partitioning to further improve program performance. Third, the paper evaluates how memory hierarchy behavior and overall program performance are impacted by the new transformations. Using a number of benchmarks and a few applications, we find that compilation times are extremely low, and overall running times for heap-intensive programs speed up by 10-25% in many cases, about 2x in two cases, and more than 10x in two small benchmarks. Overall, we believe this work provides a new framework for optimizing pointer-intensive programs by segregating and controlling the layout of heap-based data structures.
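As a rough illustration of what segregating data structure instances into separate pools can look like, the C sketch below gives each linked list its own pool. The `Pool` type, the `POOL_ALIGN` constant, and the `pool_init`, `pool_alloc`, `pool_destroy`, `pool_contains`, and `list_push` names are hypothetical and are not taken from the paper. The bump-pointer strategy is an assumption chosen for brevity, and the pool calls are written by hand here, whereas the transformation described above is performed by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bump-pointer pool; illustrative only, not the paper's runtime. */
enum { POOL_ALIGN = 16 };  /* assumed to be enough alignment for Node */

typedef struct Pool {
    char  *base;  /* start of the pool's backing block */
    size_t used;  /* bytes handed out so far */
    size_t cap;   /* total bytes in the backing block */
} Pool;

/* Reserve one contiguous block for the pool; returns 1 on success. */
int pool_init(Pool *p, size_t cap) {
    assert(cap > 0);
    p->base = malloc(cap);
    p->used = 0;
    p->cap  = p->base ? cap : 0;
    return p->base != NULL;
}

/* Hand out the next n bytes (rounded up to POOL_ALIGN), or NULL if full. */
void *pool_alloc(Pool *p, size_t n) {
    if (n > p->cap - p->used) return NULL;
    size_t rounded = (n + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    if (rounded > p->cap - p->used) return NULL;
    void *result = p->base + p->used;
    p->used += rounded;
    return result;
}

/* Release every object in the pool at once. */
void pool_destroy(Pool *p) {
    free(p->base);
    p->base = NULL;
    p->used = 0;
    p->cap  = 0;
}

/* True if ptr lies inside the part of the pool already handed out. */
int pool_contains(const Pool *p, const void *ptr) {
    uintptr_t lo = (uintptr_t)p->base;
    uintptr_t x  = (uintptr_t)ptr;
    return x >= lo && x < lo + p->used;
}

typedef struct Node {
    int          value;
    struct Node *next;
} Node;

/* Prepend a node allocated from the given pool; returns NULL if the pool is full. */
Node *list_push(Pool *p, Node *head, int value) {
    Node *n = pool_alloc(p, sizeof *n);
    if (n == NULL) return NULL;
    n->value = value;
    n->next  = head;
    return n;
}
```

If two lists are built with interleaved `list_push` calls, each against its own `Pool`, the nodes of each list still sit next to one another inside that list's pool instead of being interleaved in memory, and a whole list can be released with a single `pool_destroy` call.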
Automatic pool allocation: improving performance by controlling data structure layout in the heap
PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation