DOI: 10.1145/2370816.2370837
research-article

Making data prefetch smarter: adaptive prefetching on POWER7

Published: 19 September 2012

ABSTRACT

Hardware data prefetch engines are integral parts of many general-purpose server-class microprocessors in the field today. Some prefetch engines allow the user to change some of their parameters. The prefetcher, however, is usually enabled in a default configuration during system bring-up, and dynamic reconfiguration of the prefetch engine is not an autonomic feature of current machines. Conceptually, however, it is easy to infer that commonly used prefetch algorithms, when applied in a fixed mode, will not help performance in many cases. In fact, they may degrade performance through wasted bus bandwidth and cache pollution. In this paper, we present an adaptive prefetch scheme that dynamically modifies the prefetch settings to adapt to the workload requirements. We implement and evaluate adaptive prefetching in the context of an existing commercial processor, the IBM POWER7. Our adaptive prefetch mechanism improves performance over the default prefetch setting by up to 2.7X and 30% for single-threaded and multiprogrammed workloads, respectively.

Published in

PACT '12: Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques
September 2012
512 pages
ISBN: 9781450311823
DOI: 10.1145/2370816

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 121 of 471 submissions, 26%
