skip to main content
research-article

WCET analysis with MRU cache: Challenging LRU for predictability

Authors Info & Claims
Published:01 April 2014Publication History
Skip Abstract Section

Abstract

Most previous work on cache analysis for WCET estimation assumes a particular replacement policy called LRU. In contrast, much less work has been done for non-LRU policies, since they are generally considered to be very unpredictable. However, most commercial processors are actually equipped with these non-LRU policies, since they are more efficient in terms of hardware cost, power consumption and thermal output, while still maintaining almost as good average-case performance as LRU.

In this work, we study the analysis of MRU, a non-LRU replacement policy employed in mainstream processor architectures like Intel Nehalem. Our work shows that the predictability of MRU has been significantly underestimated before, mainly because the existing cache analysis techniques and metrics do not match MRU well. As our main technical contribution, we propose a new cache hit/miss classification, k-Miss, to better capture the MRU behavior, and develop formal conditions and efficient techniques to decide k-Miss memory accesses. A remarkable feature of our analysis is that the k-Miss classifications under MRU are derived by the analysis result of the same program under LRU. Therefore, our approach inherits the advantages in efficiency and precision of the state-of-the-art LRU analysis techniques based on abstract interpretation. Experiments with instruction caches show that our proposed MRU analysis has both good precision and high efficiency, and the obtained estimated WCET is rather close to (typically 1%∼8% more than) that obtained by the state-of-the-art LRU analysis, which indicates that MRU is also a good candidate for cache replacement policies in real-time systems.

References

  1. Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers: Principles, Techniques, and Tools. Addison-Wesley Longman Publishing Co., Inc., Boston, MA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Hussein Al-Zoubi, Aleksandar Milenkovic, and Milena Milenkovic. 2004. Performance evaluation of cache replacement policies for the spec cpu2000 benchmark suite. In Proceedings of the 42nd Annual Southeast Regional Conference. ACM-SE 42. ACM, New York, 267--272. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Frances E. Allen. 1970. Control flow analysis. In Proceedings of the Symposium on Compiler Optimization. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Sebastian Altmeyer, Claire Maiza, and Jan Reineke. 2010. Resilience analysis: tightening the crpd bound for set-associative caches. In Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'10). ACM, New York, 153--162. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. R. Arnold, F. Mueller, D. B. Whalley, and M. G. Harmon. 1994. Bounding worst-case instruction cache performance. In Proceedings of RTSS.Google ScholarGoogle Scholar
  6. Todd Austin, Eric Larson, and Dan Ernst. 2002. Simplescalar: An infrastructure for computer system modeling. Computer 35, 2, 59--67. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Clément Ballabriga and Hugues Casse. 2008. Improving the first-miss computation in set-associative instruction caches. In Proceedings of the Euromicro Conference on Real-Time Systems (ECRTS'08). IEEE Computer Society, Los Alamitos, CA, 341--350. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Berkelaar. lp_solve: (Mixed Integer) linear programming problem solver. ftp://ftp.es.ele.tue.nl/pub/lp_solve.Google ScholarGoogle Scholar
  9. Sudipta Chattopadhyay, Abhik Roychoudhury, and Tulika Mitra. 2010. Modeling shared cache and bus in multi-cores for timing analysis. In Proceedings of the 13th International Workshop on Software Compilers for Embedded Systems (SCOPES'10). ACM, New York, 6:1--6:10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Christoph Cullmann. 2011. Cache persistence analysis: A novel approachtheory and practice. In Proceedings of the 2011 SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (LCTES'11). ACM, New York, 121--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. David Eklov, Nikos Nikoleris, David Black-Schaffer, and Erik Hagersten. 2011. Cache pirating: Measuring the curse of the shared cache. In Proceedings of the International Conference on Parallel Processing (ICPP'11). IEEE Computer Society, Los Alamitos, CA, 165--175. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. C. Ferdinand. 1997. Cache behavior prediction for real-time systems. Ph.D. Thesis, Universitat des Saarlandes.Google ScholarGoogle Scholar
  13. Christian Ferdinand and Reinhard Wilhelm. 1998. On predicting data cache behavior for real-time systems. In Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES'98), Frank Mueller and Azer Bestavros, Eds., Lecture Notes in Computer Science, vol. 1474, Springer, 16--30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Daniel Grund and Jan Reineke. 2009. Abstract interpretation of FIFO replacement. In Proceedings of the 16th International Symposium on Static Analysis (SAS'09). Springer-Verlag, Berlin, 120--136. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Daniel Grund and Jan Reineke. 2010a. Precise and efficient FIFO-replacement analysis based on static phase detection. In Proceedings of the 2010 22nd Euromicro Conference on Real-Time Systems, (ECRTS'10). IEEE Computer Society, Los Alamitos, CA, 155--164. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Daniel Grund and Jan Reineke. 2010b. Toward precise PLRU cache analysis. In Proceedings of 10th International Workshop on Worst-Case Execution Time (WCET) Analysis. B. Lisper, Ed., 28--39.Google ScholarGoogle Scholar
  17. Nan Guan, Mingsong Lv, Wang Yi, and Ge Yu. 2012. WCET analysis with MRU caches: Challenging LRU for predictability. In Proceedings of the IEEE 18th Real Time and Embedded Technology and Applications Symposium (RTAS'12). IEEE Computer Society, Los Alamitos, CA, 55--64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Jan Gustafsson, Adam Betts, Andreas Ermedahl, and Björn Lisper. 2010. The mälardalen WCET benchmarks: Past, present and future. In Proceedings of the 10th International Workshop on Worst-Case Execution Time Analysis (WCET'10). B. Lisper, Ed., OASICS Series, vol. 15, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany, 136--146.Google ScholarGoogle Scholar
  19. Damien Hardy and Isabelle Puaut. 2008. WCET analysis of multi-level non-inclusive set-associative instruction caches. In Proceedings of the 2008 Real-Time Systems Symposium (RTSS'08). IEEE Computer Society, Los Alamitos, CA, 456--466. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Reinhold Heckmann, Marc Langenbach, Stephan Thesing, and Reinhard Wilhelm. 2003. The influence of processor architecture on the design and the results of WCET tools. Proc. IEEE 91, 7, 1038--1054.Google ScholarGoogle ScholarCross RefCross Ref
  21. John L. Hennessy and David A. Patterson. 2006. Computer Architecture: A Quantitative Approach. 4th Ed. Morgan Kaufmann Publishers Inc., San Francisco, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Bach Khoa Huynh, Lei Ju, and Abhik Roychoudhury. 2011. Scope-aware data cache analysis for WCET estimation. In Proceedings of the 17th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'11). IEEE Computer Society, Los Alamitos, CA, 203--212. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Poonacha Kongetira, Kathirgamar Aingaran, and Kunle Olukotun. 2005. Niagara: A 32-way multithreaded SPARC processor. IEEE Micro 25, 2, 21--29. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Xianfeng Li, Yun Liang, Tulika Mitra, and Abhik Roychoudhury. 2007. Chronos: A timing analyzer for embedded software. Sci. Comput. Program. 69, 1--3, 56--67. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Y. S. Li, S. Malik, and A. Wolfe. 1996. Cache modeling for real-time software: Beyond direct mapped instruction caches. In Proceedings of the 1996 Real-Time Systems Symposium (NTSS'96). IEEE Computer Society, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Yau-Tsun Steven Li and Sharad Malik. 1995. Performance analysis of embedded software using implicit path enumeration. In Proceedings of the 32nd Annual ACM/IEEE Design Automation Conference (DAC'95). ACM, New York, 456--461. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, and Vivy Suhendra. 2012. Timing analysis of concurrent programs running on shared cache multi-cores. Real-Time Syst. 48, 6, 638--680. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. M. Lv. 2012. CATE: A simulator for Cache Analysis Technique Evaluation in WCET estimation. http://faculty.neu.edu.cn/ise/lvmingsong/cate/.Google ScholarGoogle Scholar
  29. A. Malamy, R. Patel, and N. Hayes. 1994. Methods and apparatus for implementing a pseudo-LRU cache memory replacement scheme with a locking feature. United States Patent 5029072.Google ScholarGoogle Scholar
  30. F. Mueller. 1994. Static cache simulation and its applications. Ph.D. thesis, Florida State University. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Frank Mueller. 2000. Timing analysis for instruction caches. Real-Time Syst. 18, 2/3, 217--247. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Peter P. Puschner and Alan Burns. 2000. Guest editorial: A review of worst-case execution-time analysis. Real-Time Syst. 18, 2/3, 115--128. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. J. Reineke. 2008. Caches in WCET analysis - predictability, competitiveness, sensitivity. In Ph.D. thesis, Saarland University.Google ScholarGoogle Scholar
  34. Jan Reineke and Daniel Grund. 2008. Relative competitive analysis of cache replacement policies. In Proceedings of the ACM SIGPLAN-SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'08). ACM, New York, 51--60. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Jan Reineke and Daniel Grund. 2013. Sensitivity of cache replacement policies. ACM Trans. Embed. Comput. Syst. 12, 1s, 42:1--42:18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Jan Reineke, Daniel Grund, Christoph Berg, and Reinhard Wilhelm. 2007. Timing predictability of cache replacement policies. Real-Time Syst. 37, 2, 99--122. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. R. Sen and Y. N. Srikant. 2007. WCET estimation for executables in the presence of data caches. In Proceedings of the 7th ACM & IEEE International Conference on Embedded Software (EMSOFT'07). ACM, New York, 203--212. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Tyler Sondag and Hridesh Rajan. 2010. A more precise abstract domain for multi-level caches for tighter WCET analysis. In Proceedings of the 31st IEEE Real-Time Systems Symposium (RTSS'10). IEEE Computer Society, Los Alamitos, CA, 395--404. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Jan Staschulat and Rolf Ernst. 2007. Scalable precision cache analysis for real-time software. ACM Trans. Embed. Comput. Syst. 6, 4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Andrew S. Tanenbaum. 2007. Modern Operating Systems 3rd Ed. Prentice-Hall. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Henrik Theiling, Christian Ferdinand, and Reinhard Wilhelm. 2000. Fast and precise WCET prediction by separated cache and path analyses. Real-Time Syst. 18, 2/3, 157--179. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter Puschner, Jan Staschulat, and Per Stenström. 2008. The worst-case execution-time problem - Overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7, 3, 36:1--36:53. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Reinhard Wilhelm, Daniel Grund, Jan Reineke, Marc Schlickling, Markus Pister, and Christian Ferdinand. 2009. Memory hierarchies, pipelines, and buses for future architectures in time-critical embedded systems. Trans. Comput.-Aided Des. Integ. Cir. Sys. 28, 7, 966--978. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. WCET analysis with MRU cache: Challenging LRU for predictability

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Embedded Computing Systems
        ACM Transactions on Embedded Computing Systems  Volume 13, Issue 4s
        Special Issue on Real-Time and Embedded Technology and Applications, Domain-Specific Multicore Computing, Cross-Layer Dependable Embedded Systems, and Application of Concurrency to System Design (ACSD'13)
        July 2014
        571 pages
        ISSN:1539-9087
        EISSN:1558-3465
        DOI:10.1145/2601432
        Issue’s Table of Contents

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 April 2014
        • Accepted: 1 April 2013
        • Revised: 1 January 2013
        • Received: 1 June 2012
        Published in tecs Volume 13, Issue 4s

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader