Unconstrained speculative execution with predicated state buffering

Published: 01 May 1995

Abstract

Speculative execution is the execution of instructions before it is known whether they should be executed. Compiler-based speculative execution has the potential to achieve both a high instructions-per-cycle rate and a high clock rate. Pure compiler-based approaches, however, greatly limit instruction scheduling because they have only a limited ability to handle the side effects of speculative execution. Significant performance improvement is thus difficult to achieve in non-numerical applications. This paper proposes a new architectural mechanism, called predicating, which provides unconstrained speculative execution. Predicating removes the restrictions that limit the compiler's ability to schedule instructions. With our hardware support, the compiler can move instructions past multiple basic block boundaries from any succeeding control path. Predicating buffers the side effects of each speculative instruction together with its predicate, and the buffered predicate efficiently commits or squashes those side effects. The mechanism also provides a speculative exception handling scheme. The scheme, called the future condition, properly postpones speculative exceptions and efficiently restarts the process. We show that our mechanism can be implemented with a modest amount of hardware and little complexity. The evaluation results show that our mechanism significantly improves performance, achieving a 2.45x speedup over scalar machines.
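
The commit-or-squash idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the names `PredicatedBuffer`, `speculate`, and `resolve` are hypothetical, a predicate is modeled as a set of branch conditions with their required outcomes, and a postponed exception is modeled as a string that is reported only if its instruction turns out to be on the taken path.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BufferedWrite:
    """One speculative side effect held back until its predicate resolves."""
    reg: str                     # destination register (hypothetical naming)
    value: int                   # speculatively computed result
    predicate: Dict[str, bool]   # branch conditions that must hold to commit
    fault: Optional[str] = None  # postponed exception, if the instruction faulted


class PredicatedBuffer:
    """Toy model: buffer speculative writes, then commit or squash them."""

    def __init__(self) -> None:
        self.regs: Dict[str, int] = {}          # committed architectural state
        self.pending: List[BufferedWrite] = []  # buffered speculative state

    def speculate(self, reg, value, predicate, fault=None) -> None:
        # The side effect is recorded together with its predicate
        # instead of being written to the register file.
        self.pending.append(BufferedWrite(reg, value, dict(predicate), fault))

    def resolve(self, cond: str, outcome: bool) -> List[str]:
        """Resolve one branch condition; return faults that became real."""
        raised: List[str] = []
        still_pending: List[BufferedWrite] = []
        for w in self.pending:
            if cond not in w.predicate:
                still_pending.append(w)      # unaffected by this branch
                continue
            if w.predicate[cond] != outcome:
                continue                     # squash: drop value and any fault
            remaining = {k: v for k, v in w.predicate.items() if k != cond}
            if remaining:                    # still depends on later branches
                still_pending.append(
                    BufferedWrite(w.reg, w.value, remaining, w.fault))
            elif w.fault is not None:
                raised.append(w.fault)       # postponed exception is now reported
            else:
                self.regs[w.reg] = w.value   # commit the buffered side effect
        self.pending = still_pending
        return raised


buf = PredicatedBuffer()
buf.speculate("r1", 10, {"c1": True})                  # hoisted above one branch
buf.speculate("r2", 20, {"c1": False})                 # from the other path
buf.speculate("r3", 30, {"c1": True, "c2": True})      # hoisted above two branches
buf.speculate("r4", 0, {"c2": False}, fault="divide by zero")
faults_after_c1 = buf.resolve("c1", True)
faults_after_c2 = buf.resolve("c2", True)
```

In this model an instruction hoisted above several branches stays buffered until every condition in its predicate has resolved, and a fault raised by a squashed instruction is never reported, which is the behavior the abstract attributes to postponing speculative exceptions.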

Published in

ACM SIGARCH Computer Architecture News, Volume 23, Issue 2
Special Issue: Proceedings of the 22nd annual international symposium on Computer architecture (ISCA '95)
May 1995
412 pages
ISSN: 0163-5964
DOI: 10.1145/225830

ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture
July 1995
426 pages
ISBN: 0897916980
DOI: 10.1145/223982

Copyright © 1995 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
