skip to main content
10.1145/1542431.1542446acmconferencesArticle/Chapter ViewAbstractPublication PagesismmConference Proceedingsconference-collections
research-article

A component model of spatial locality

Published:19 June 2009Publication History

ABSTRACT

Good spatial locality alleviates both the latency and bandwidth problem of memory by boosting the effect of prefetching and improving the utilization of cache. However, conventional definitions of spatial locality are inadequate for a programmer to precisely quantify the quality of a program, to identify causes of poor locality, and to estimate the potential by which spatial locality can be improved.

This paper describes a new, component-based model for spatial locality. It is based on measuring the change of reuse distances as a function of the data-block size. It divides spatial locality into components at program and behavior levels. While the base model is costly because it requires the tracking of the locality of every memory access, the overhead can be reduced by using small inputs and by extending a sampling-based tool. The paper presents the result of the analysis for a large set of benchmarks, the cost of the analysis, and the experience of a user study, in which the analysis helped to locate a data-layout problem and improve performance by 7% with a 6-line change in an application with over 2,000 lines.

References

  1. G. Ammons, T. Ball, and J. R. Larus. Exploiting hardware performance counters with flow and context sensitive profiling. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 85--96, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. M. Arnold and B. G. Ryder. A framework for reducing the cost of instrumented code. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Snowbird, Utah, June 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. E. Berg and E. Hagersten. Fast data-locality profiling of native execution. In Proceedings of the International Conference on Measurement and Modeling of Computer Systems, pages 169--180, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. K. Beyls and E. D'Hollander. Discovery of locality-improving refactoring by reuse path analysis. In Proceedings of HPCC. Springer. Lecture Notes in Computer Science Vol. 4208, pages 220--229, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. R. B. Bunt and J. M. Murphy. Measurement of locality and the behaviour of programs. The Computer Journal, 27(3):238--245, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. B. Calder, C. Krintz, S. John, and T. Austin. Cache-conscious data placement. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), San Jose, Oct 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Cascaval, E. Duesterwald, P. F. Sweeney, and R. W. Wisniewski. Multiple page size modeling and optimization. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, St. Louis, MO, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. T. M. Chilimbi. Efficient representations and abstractions for quantifying and exploiting data reference locality. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Snowbird, Utah, June 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. Ding and Y. Zhong. Predicting whole-program locality with reuse distance analysis. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, San Diego, CA, June 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. C. Fang, S. Carr, S. Onder, and Z. Wang. Instruction based memory distance analysis and its application to optimization. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, St. Louis, MO, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. H. Han and C.-W. Tseng. Exploiting locality for irregular scientific codes. IEEE Transactions on Parallel and Distributed Systems, 17(7):606--618, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. M. Hirzel and T. M. Chilimbi. Bursty tracing: A framework for low-overhead temporal profiling. In Proceedings of ACM Workshop on Feedback-Directed and Dynamic Optimization, Dallas, Texas, 2001.Google ScholarGoogle Scholar
  13. K. Kelsey, T. Bai, and C. Ding. Fast track: a software system for speculative optimization. In Proceedings of the International Symposium on Code Generation and Optimization, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. K.-H. Li. Reservoir-sampling algorithms of time complexity o(n(1+log(n/n))). ACM Transactions on Mathematical Software, 20(4):481--493, December 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. G. Marin and J. Mellor-Crummey. Cross architecture performance predictions for scientific applications using parameterized models. In Proceedings of the International Conference on Measurement and Modeling of Computer Systems, New York City, NY, June 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. G. Marin and J. Mellor-Crummey. Scalable cross-architecture predictions of memory hierarchy response for scientific applications. In Proceedings of the Symposium of the Las Alamos Computer Science Institute, Sante Fe, New Mexico, 2005.Google ScholarGoogle Scholar
  17. R. L. Mattson, J. Gecsei, D. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM System Journal, 9(2):78--117, 1970.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. K. S. McKinley, S. Carr, and C.-W. Tseng. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems, 18(4):424--453, July 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. R. C. Murphy and P. M. Kogge. On the memory access patterns of supercomputer applications: Benchmark selection and its implications. IEEE Transactions on Computers, 56(7):937--945, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. N. Nethercote and J. Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 89--100, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. E. Petrank and D. Rawitz. The hardness of cache conscious data placement. In Proceedings of ACM Symposium on Principles of Programming Languages, Portland, Oregon, January 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. M. L. Seidl and B. G. Zorn. Segregating heap objects by reference behavior and lifetime. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), San Jose, Oct 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. X. Shen, Y. Gao, C. Ding, and R. Archambault. Lightweight reference affinity analysis. In Proceedings of the 19th ACM International Conference on Supercomputing, pages 131--140, Cambridge, MA, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. X. Shen and J. Shaw. Scalable implementation of efficient locality approximation. In J. N. Amaral, editor, Proceedings of the Workshop on Languages and Compilers for Parallel Computing, pages 202--216, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. X. Shen, J. Shaw, B. Meeker, and C. Ding. Locality approximation using time. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 55--61, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. X. Shen, Y. Zhong, and C. Ding. Regression-based multi-model prediction of data reuse signature. In Proceedings of the 4th Annual Symposium of the Las Alamos Computer Science Institute, Sante Fe, New Mexico, November 2003.Google ScholarGoogle Scholar
  27. A. J. Smith. On the effectiveness of set associative page mapping and its applications in main memory management. In Proceedings of the 2nd International Conference on Software Engineering, 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Spec cpu benchmarks. http://www.spec.org/benchmarks.html\#cpu.Google ScholarGoogle Scholar
  29. M. M. Strout, L. Carter, and J. Ferrante. Compile-time composition of run-time data and iteration reorderings. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 245--257, San Diego, CA, June 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. J. Weinberg, M. O. McCracken, E. Strohmaier, and A. Snavely. Quantifying locality in the memory access patterns of hpc applications. In Proceedings of Supercomputing, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. B. S. White, S. A. McKee, B. R. de Supinski, B. Miller, D. Quinlan, and M. Schulz. Improving the computational intensity of unstructured mesh applications. In Proceedings of the 19th ACM International Conference on Supercomputing, pages 341--350, Cambridge, MA, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. C. Zhang, C. Ding, M. Ogihara, Y. Zhong, and Y. Wu. A hierarchical model of data locality. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, January 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. H. Zhang and D. Gildea. Stochastic lexicalized inversion transduction grammar for alignment. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 475--482, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Y. Zhong, S. G. Dropsho, X. Shen, A. Studer, and C. Ding. Miss rate prediction across program inputs and cache configurations. IEEE Transactions on Computers, 56(3):328--343, March 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Y. Zhong, M. Orlovich, X. Shen, and C. Ding. Array regrouping and structure splitting using whole-program reference affinity. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. A component model of spatial locality

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        ISMM '09: Proceedings of the 2009 international symposium on Memory management
        June 2009
        158 pages
        ISBN:9781605583471
        DOI:10.1145/1542431

        Copyright © 2009 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 19 June 2009

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        ISMM '09 Paper Acceptance Rate15of32submissions,47%Overall Acceptance Rate72of156submissions,46%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader