skip to main content
article
Free Access

Improving WCET by applying a WC code-positioning optimization

Published:01 December 2005Publication History
Skip Abstract Section

Abstract

Applications in embedded systems often need to meet specified timing constraints. It is advantageous to not only calculate the worst-case execution time (WCET) of an application, but to also perform transformation, which reduce the WCET, since an application with a lower WCET will be less likely to violate its timing constraints. Some processors incur a pipeline delay whenever an instruction transfers control to a target that is not the next sequential instruction. Code-positioning optimizations attempt to reduce these delays by positioning the basic blocks to minimize the number of unconditional jumps and taken conditional branches that occur. Traditional code-positioning algorithms use profile data to find the frequently executed edges between basic blocks, then minimize the transfers of control along these edges to reduce the average case execution time (ACET). This paper introduces a WCET code-positioning optimization, driven by the worst-case (WC) path information from a timing analyzer, to reduce the WCET instead of ACET. This WCET optimization changes the layout of the code in memory to reduce the branch penalties along the WC paths. Unlike the frequency of edges in traditional profile-driven code positioning, the WC path may change after code-positioning decisions are made. Thus, WCET code positioning is inherently more challenging than ACET code positioning. The experimental results show that this optimization typically finds the optimal layout of the basic blocks with the minimal WCET. The results show over a 7% reduction in WCET is achieved after code positioning is performed.

References

  1. Arnold, R., Mueller, F., and Whalley, D. 1994. Bounding worst-case instruction cache performance. In Proceedings of the Fifteenth IEEE Real-time Systems Symposium, San Juan. IEEE Computer Society Press. 172--181.Google ScholarGoogle Scholar
  2. Benitez, M. 1994. Retargetable register allocation. Ph.D. thesis, University of Virginia, Char-lottesville, VA.Google ScholarGoogle Scholar
  3. Benitez, M. E. and Davidson, J. W. 1988. A portable global optimizer and linker. In Proceedings of the SIGPLAN'88 conference on Programming Language design and Implementation, Atlanta, GA. ACM Press, New York. 329--338. Google ScholarGoogle Scholar
  4. Benitez, M. E. and Davidson, J. W. 1994. The advantages of machine-dependent global optimization. In Proceedings of the 1994 International Conference on Programming Languages and Architectures, 105--124. Google ScholarGoogle Scholar
  5. Calder, B. and Grunwald, D. 1994. Reducing branch costs via branch alignment. In Proceeding of ASPLOS'94, San Jose, CA. ACM Press, New York. 242--251. Google ScholarGoogle Scholar
  6. Engblom, J. and Ermedahl, A. 2000. Modeling complex flows for worst-case execution time analysis. In Proceedings of the 21st IEEE Real-time System Symposium, Orlando, FL. IEEE Computer Society Press, 875--889. Google ScholarGoogle ScholarCross RefCross Ref
  7. Eyre, J. and Bier, J. 1998. Dsp processors hit the mainsteam. IEEE Computer 31, 8 (Aug.), 51--59. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Harmon, M., Baker, T., and Whalley, D. 1994. A retargetable technique for prediction execution time of code segments. Real-Time Systems. 159--182. Google ScholarGoogle Scholar
  9. Healy, C. and Whalley, D. 1999. Tighter timing predictions by automatic detection and exploitation of value-dependent constraints. In Proceedings of the IEEE Real-Time Technology and Applications Symposium, Vancouver. IEEE Computer Society Press. 79--99. Google ScholarGoogle Scholar
  10. Healy, C. and Whalley, D. 2000. Automatic detection and exploitation of branch constraints for timing analysis. IEEE Transaction on Software Engineering 28, 8 (Aug.), 763--781. Google ScholarGoogle Scholar
  11. Healy, C., Whalley, D., and Harmon, M. 1995. Integrating the timing analysis of pipelining and instruction caching. In Proceedings of the Sixteenth IEEE Real-Time Systems Symposium, Pisa. IEEE Computer Society Press. 288--297. Google ScholarGoogle Scholar
  12. Healy, C., Arnold, R., Mueller, F., Whalley, D., and Harmon, M. 1999. Bounding pipeline and instruction cache performance. IEEE Transactions on Computers 48, 1 (Jan.), 53--70. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Healy, C., Sjodin, M., Rustagi, V., Whalley, D., and van engelen, R. 2000a. Supporting timing analysis by automatic bounding of loop iterations. Real-Time Systems 18, 2 (May), 121--148. Google ScholarGoogle ScholarCross RefCross Ref
  14. Healy, C., Whalley, D., and Van engelen, R. 2000b. A general approach for tight timing predictions of non-rectangular loops. In WIP Proceedings of the IEEE Real-Time Technology and Applications Symposium, Washington, DC. IEEE Computer Society Press. 11--14.Google ScholarGoogle Scholar
  15. Hong, S. and Gerber, R. 1993. Compiling real-time programs into schedulable code. In Proceedings of the SIGPLAN'93, Albuquerque, NM. ACM Press, New York. 166--176. Google ScholarGoogle Scholar
  16. Ko, L., Healy, C., Ratliff, E., Arnold, R., Whalley, D., and Harmon, M. 1996. Supporting the specification and analysis of timing constraints. In Proceeding of the IEEE Real-Time Technology and Application Symposium, Boston, MA. IEEE Computer Society Press. 170--178. Google ScholarGoogle Scholar
  17. Ko, L., Al-Yaqoubi, N., Healy, C., Ratliff, E., Arnold, R., Whalley, D., and Harmon, M. 1999. Timing constraint specification and analysis. Software Practice & Experience 29, 1 (Jan.), 77--98. Google ScholarGoogle ScholarCross RefCross Ref
  18. Kulkarni, P., Zhao, W., Moon, H., Cho, K., Whalley, D., Davidson, J., Bailey, M., Paek, Y., and Gallivan, K. 2003. Finding effective optimization phase sequences. In ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems, San Diego, CA. ACM Press, New York. 12--23. Google ScholarGoogle Scholar
  19. Lee, S., Lee, J., Park, C., and Min, S. 2004. A flexible tradeoff between code size and wcet using a dual instruction set processor. In International Workshop on Software and Compilers for Embedded Systems, Amsterdam. Springer, New york. 244--258.Google ScholarGoogle Scholar
  20. Li, Y., Malik, S., and Wolfe, A. 1995. Efficient microarchitecture modeling and path analysis for real-time software. In Proceedings of the Sixteenth IEEE Real-time Systems Symposium, Pisa. IEEE Computer Society Press. 298--307. Google ScholarGoogle Scholar
  21. Lim, S., Bae, Y., Jang, G., Rhee, B., Min, S., Park, C., Shin, H., Park, K., and Kim, C. 1994. An accurate worst case timing analysis technique for risc processors. In Proceedings of the Fifteenth IEEE Real-Time Systems Symposium, San Juan. IEEE Computer Society Press. 875--889. Google ScholarGoogle Scholar
  22. Lundqvist, T. and Stenstrom, P. 1998. Integrating path and timing analysis using instruction-level simulation techniques. In Proceedings of SIGPLAN Workshop on Languages, Compilers and Tools for Embedded Systems(LCTES'98), Montreal. IEEE Computer Society Press. 1--15. Google ScholarGoogle Scholar
  23. Mcfarling, S. and Hennessy, J. 1986. Reducing the cost of branches. In 13th Annual International Symposium of Computer Architecture, Tokyo, Japan. 396--403. Google ScholarGoogle Scholar
  24. Mueller, F. 1997. Timing predictions for multi-level caches. In ACM SIGPLAN Workshop on Language, Compiler and Tool Support for Real-time Systems, Las Vegas, NV. ACM Press, New York. 29--36.Google ScholarGoogle Scholar
  25. Mueller, F. 2000. Timing analysis for instruction caches. Real-Time Systems 18, 2 (May), 209--239. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Pettis, K. and hansen, R. 1990. Profile guided code position. In Proceeding of the ACM SIGPLAN'90 Conference on Programming Language Design and Implementation, ACM Press, New York. 16--27. Google ScholarGoogle Scholar
  27. Shaw, A. C. 1989. Reasoning about time in higher-level language software. IEEE Transactions on Software Engineering 15, 7, 875--889. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Star Core, I. 2001a. Sc100 simulator reference manual.Google ScholarGoogle Scholar
  29. Star Core, I. 2001b. Sc110 dsp core reference manual.Google ScholarGoogle Scholar
  30. T. Marlowe, S. M. 1992. Safe optimization for hard real-time programming. In Special Session on Real-Time Programming, Second International Conference on Systems Integration. 438--446.Google ScholarGoogle Scholar
  31. Vivancos, E., Healy, C., Mueller, F., and Whalley, D. 2001. Parametric timing analysis. In Proceedings of the ACM SIGPLAN Workshop on Language, Compilers, and Tools for Embedded Systems, Snowbird, UT. ACM Press, New York. 83--93. Google ScholarGoogle Scholar
  32. White, R. T., Mueller, F., Healy, C., Whalley, D., and Harmon, M. 1997. Timing analysis for data caches and set-associative caches. In Proceedings of the IEEE Real-Time Technology and Application Symposium, Montreal. IEEE Computer Society Press. 192--202. Google ScholarGoogle Scholar
  33. White, R., Mueller, F., Healy, C., Whalley, D., and Harmon, M. 1999. Timing analysis for data caches and wrap-around-fill caches. Real-Time Systems 17, 1 (Nov.), 209--233. Google ScholarGoogle ScholarCross RefCross Ref
  34. Zhao, W., Cai, B., Whalley, D., Bailey, M., van Engelen, R., Yuan, X., Hiser, J., Davidson, J., Gallivan, K., and Jones, D. 2002. Vista: A system for interactive code improvement. In ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems, Berlin. ACM Press, New York. 155--164. Google ScholarGoogle Scholar

Index Terms

  1. Improving WCET by applying a WC code-positioning optimization

              Recommendations

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in

              Full Access

              • Published in

                cover image ACM Transactions on Architecture and Code Optimization
                ACM Transactions on Architecture and Code Optimization  Volume 2, Issue 4
                December 2005
                116 pages
                ISSN:1544-3566
                EISSN:1544-3973
                DOI:10.1145/1113841
                Issue’s Table of Contents

                Copyright © 2005 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 1 December 2005
                Published in taco Volume 2, Issue 4

                Permissions

                Request permissions about this article.

                Request Permissions

                Check for updates

                Qualifiers

                • article

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader