Unconstrained speculative execution with predicated state buffering

Authors:
Hideki Ando, Chikako Nakanishi, Tetsuya Hara, Masao Nakaya

System LSI Laboratory, Mitsubishi Electric Corporation, 4-1 Mizuhara, Itami, Hyogo, 664 Japan

ACM SIGARCH Computer Architecture News, Volume 23, Issue 2, May 1995, pp 126–137
https://doi.org/10.1145/225830.224367

Published: 01 May 1995

Abstract

Speculative execution is the execution of instructions before it is known whether they should be executed. Compiler-based speculative execution has the potential to achieve both a high instructions-per-cycle rate and a high clock rate. Pure compiler-based approaches, however, greatly limit instruction scheduling because of their limited ability to handle the side effects of speculative execution. Significant performance improvement is, thus, difficult in non-numerical applications. This paper proposes a new architectural mechanism, called predicating, which provides unconstrained speculative execution. Predicating removes the restrictions that limit the compiler's ability to schedule instructions. Through our hardware support, the compiler is allowed to move instructions past multiple basic block boundaries from any succeeding control path. Predicating buffers the side effects of speculative execution together with their predicate, and the buffered predicate efficiently commits or squashes the side effects. The mechanism also provides a speculative exception handling scheme. The scheme, called the future condition, properly postpones speculative exceptions and efficiently restarts the process. We show that our mechanism can be implemented through a modest amount of hardware with little complexity. The evaluation results show that our mechanism significantly improves performance, and achieves a 2.45x speedup over scalar machines.
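
The commit-or-squash idea in the abstract can be illustrated with a small software model. The sketch below is only a toy under stated assumptions, not the hardware design from the paper: the class names, the dictionary encoding of a predicate (branch id mapped to the outcome required for commit), and the `resolve` method are all hypothetical and chosen for illustration. It also leaves out the future condition exception scheme entirely.

```python
from dataclasses import dataclass, field


@dataclass
class BufferedWrite:
    reg: str
    value: int
    # Hypothetical predicate encoding: branch id -> outcome required to commit.
    predicate: dict


@dataclass
class PredicatedBuffer:
    registers: dict = field(default_factory=dict)  # committed state
    pending: list = field(default_factory=list)    # buffered speculative writes

    def speculative_write(self, reg, value, predicate):
        """Buffer a side effect together with the condition that validates it."""
        self.pending.append(BufferedWrite(reg, value, dict(predicate)))

    def resolve(self, branch, taken):
        """A branch outcome is known: commit or squash the affected entries."""
        still_pending = []
        for w in self.pending:
            if branch in w.predicate:
                if w.predicate[branch] != taken:
                    continue  # predicate is false: squash this side effect
                del w.predicate[branch]  # this part of the predicate holds
            if w.predicate:
                still_pending.append(w)  # still waiting on other branches
            else:
                self.registers[w.reg] = w.value  # predicate fully true: commit
        self.pending = still_pending


buf = PredicatedBuffer(registers={"r1": 0, "r2": 0})
buf.speculative_write("r1", 10, {"b0": True})               # moved above one branch
buf.speculative_write("r2", 20, {"b0": True, "b1": False})  # moved above two branches
buf.speculative_write("r1", 99, {"b0": False})              # from the other control path

buf.resolve("b0", True)
state_after_b0 = dict(buf.registers)
buf.resolve("b1", False)
state_after_b1 = dict(buf.registers)
```

In this model, a write hoisted above two branches stays buffered until both outcomes are known, while a write from the path not taken is discarded without ever touching the committed registers.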

Published in
ACM SIGARCH Computer Architecture News, Volume 23, Issue 2
Special Issue: Proceedings of the 22nd annual international symposium on Computer architecture (ISCA '95)
May 1995
412 pages
ISSN: 0163-5964
DOI: 10.1145/225830
Chairman: David A. Patterson, Univ. of California, Berkeley

ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture
July 1995
426 pages
ISBN: 0897916980
DOI: 10.1145/223982
Chairman: David A. Patterson, Univ. of California, Berkeley
Copyright © 1995 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- article