Skip to main content
Log in

Backtracking-Based Instruction Scheduling to Fill Branch Delay Slots

  • Published:
International Journal of Parallel Programming Aims and scope Submit manuscript

Abstract

Conventional schedulers schedule operations in dependence order and never revisit or undo a scheduling decision on any operation. In contrast, backtracking schedulers may unschedule operations and can often generate better schedules. This paper develops and evaluates the backtracking approach to fill branch delay slots. We first present the structure of a generic backtracking scheduling algorithm and prove that it terminates. We then describe two more aggressive backtracking schedulers and evaluate their effectiveness. We conclude that aggressive backtracking-based instruction schedulers can effectively improve schedule quality by eliminating branch delay slots with a small amount of additional computation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

REFERENCES

  1. M. S. Schlansker and B. R. Rau, EPIC: Explicitly Parallel Instruction Computing, Computer, 33(2):37–45 (2000).

    Google Scholar 

  2. S. Muchnick, Advanced Compiler Design and Implementation, Morgan Kaufmann (1997).

  3. P. B. Gibbons and S. S. Muchnick. Efficient Instruction Scheduling for a Pipelined Architecture, in ACM SIGPLAN Symposium on Compiler Construction (1986).

  4. H. S. Warren, Instruction Scheduling for the IBM RISC System/6000 Processor, IBM Journal of Research and Development, 34(1):85–92 (1990).

    Google Scholar 

  5. J. C. Dehnert and R. A. Toole, Compiling for the Cydra-5, J. Supercomput., 7(1/2): 181–227 (1993).

    Google Scholar 

  6. M. Rim and R. Jain, Lower-bound Performance Estimation for the High-Level Synthesis Scheduling Problem, IEEE Transactions on CAD of ICS, 13(4)</del>:452–459 (1994).

    Google Scholar 

  7. M. Langevin and E. Cherny, A Recursive Technique for Computing Lower-Bound Performance of Schedules, ACM Transactions on Design Automaton of Electronic Systems, 1(4):443–455 (1996).

    Google Scholar 

  8. A. Eichenberger and W. M. Meleis, Balance Scheduling: Weighting Branch Tradeoffs in Superblocks, in International Symposium on Microachitecture Haifa, Israel (1999).

  9. S. G. Abraham, Efficient Backtracking Instruction Schedulers, Technical Report HPL-2000-56, Hewlett-Packard Laboratories (2000). www.hpl.hp.com/techreports/2000/HPL-2000-56.html.

  10. B. R. Rau, Iterative Modulo Scheduling, Int. J. Parallel Prog., 24(1):3–64 (1996).

    Google Scholar 

  11. V. Kathail, M. S. Schlansker, and B. R. Rau, HPL PlayDoh Architecture Specification: Version 1.0, Technical Report, Hewlett-Packard Laboratories (1991).

  12. The Trimaran Compilation Infrastructure (1999). www.trimaran.org.

  13. W. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann, R. G. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G. Holm, and D. M. Lavery, The Superblock: An Effective Technique for VLIW and Superscalar Compilation, J. Supercomput., 7(1/2):229–248 (1993).

    Google Scholar 

  14. J. A. Fisher, Global Code Generation for Instruction-Level Parallelism: Trace Scheduling-2, Technical Report, Hewlett-Packard Laboratories (1993).

  15. S. Davidson, D. Landskov, B. D. Shriver, and P. W. Mallet, Some Experiments in Local Microcode Compaction for Horizontal Machines, IEEE Transactions on Computers, C-30(7): 460–477 (1981).

    Google Scholar 

  16. B. R. Rau and J. A. Fisher, Instruction_Level Parallel Processing: History, Overview and Perspective, J. Supercomput., 7(1/2):9–50 (1993).

    Google Scholar 

  17. J. L. Hennessy and T. Gross, Postpass Code Optimization of Pipeline Constraints, ACM Transactions on Programming Languages and Systems, 5(3):422–448 (1983).

    Google Scholar 

  18. V. Bala and N. Rubin, Efficient Instruction Scheduling Using Finite State Automata, Int. J. Parallel Prog., 25(2):53–82 (1997).

    Google Scholar 

  19. J. Fisher, Trace Scheduling: A Technique for Global Microcode Compaction, IEEE Transactions on Computers, C-30(7):478–490 (1981).

    Google Scholar 

  20. P. G. Lowney, S. M. Freudenberger, T. J. Karzes, W. D. Lichtenstein, R. P. Nix, J. S. O'Donnell, and J. C. Ruttenberg, The Multiflow Trace Scheduling Compiler, J. Supercomput., 7(1/2):51–142 (1993).

    Google Scholar 

  21. J. Ferrante, K. J. Ottenstein, and J. D. Warren, The Program Dependence Graph and Its Use in Optimization, ACM Transactions on Programming Languages and Systems, 9(3):319–349 (1987).

    Google Scholar 

  22. D. Bernstein and M. Rodeh, Global Instruction Scheduling for Superscalar Processor, in ACM SIGPLAN Symposium on Programming Languages Design and Implementation, Toronto, Canada (1991).

  23. S. M. Moon and K. Ebcioglu, An Efficient Resource_Constrained Global Scheduling Method for Superscalar and VLIW Processors, in International Symposium on Microachitecture, Portland, Oregon (1992).

  24. P. P. Chang, S. A. Mahlke, and W. W. Hwu, Using Profile Information to Assist Classic Code Optimization, Software Practice and Experience, 21(12):1301–1321 (1991).

    Google Scholar 

  25. S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank, and R. A. Bringmann, Effective Compiler Support for Predicated Execution Using the Hyperblock, in International Symposium on Microachitecture, Portland, Oregon (1992).

  26. J. Bharadwaj, K. Menezes, and C. McKinsey, Wavefront Scheduling: Path Based Data Representation and Scheduling of Subgraphs, in International Symposium on Microarchitecture, Haifa, Israel (1999).

  27. S. G. Abraham, V. Kathail, and B. L. Dietrich, Meld Scheduling: A Technique for Relaxing Scheduling Constraints, Int. J. Parallel Prog., 26(4):349–381 (1998).

    Google Scholar 

  28. Y. Wang, N. Amato, and D. Friesen, Hindsight Helps: Deterministic Task Scheduling with Backtracking, in International Conference on Parallel Processing, Bloomington, Illinois (1997).

  29. J. Hoogerbrugge and L. Augusteijn, Instruction Scheduling for TriMedia, J. Instruction-Level Parallelism (1) (1999).

  30. A. E. Charlesworth, An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family, Computer, 14(9):18–27 (1981).

    Google Scholar 

  31. B. R. Rau, D. W. L. Yen, W. Yen, and R. A. Towle, The Cydra 5 Departmental Supercomputer: Design Philosophies, Decisions and Trade_Offs, Computer, 22(1):12–34 (1989).

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Baev, I.D., Meleis, W.M. & Abraham, S.G. Backtracking-Based Instruction Scheduling to Fill Branch Delay Slots. International Journal of Parallel Programming 30, 397–418 (2002). https://doi.org/10.1023/A:1020601110391

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1020601110391

Navigation