DOI: 10.1145/231379.231386

A reduced multipipeline machine description that preserves scheduling constraints

Published: 01 May 1996

ABSTRACT

High-performance compilers increasingly rely on accurate modeling of machine resources to efficiently exploit the instruction-level parallelism of an application. In this paper, we propose a reduced machine description that results in faster detection of resource contentions while preserving the scheduling constraints present in the original machine description. The proposed approach reduces a machine description in an automated, error-free, and efficient fashion. Moreover, it fully supports schedulers that backtrack and process operations in arbitrary order. Reduced descriptions for the DEC Alpha 21064, MIPS R3000/R3010, and Cydra 5 result in 4 to 7 times faster detection of resource contentions and require 22 to 90% of the memory storage used by the original machine descriptions. Precise measurement for the Cydra 5 indicates that reducing the machine description results in a 2.9 times faster contention query module.
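
To make the idea of a contention query and of "preserving scheduling constraints" concrete, here is a minimal sketch. It assumes machine descriptions are written as reservation tables (each operation lists the cycles, relative to issue, in which it holds each resource) and that two descriptions impose the same constraints when every pair of operations has the same set of conflicting issue distances. The abstract does not state its equivalence criterion, so that is an assumption, and the tables, operation names, and function names below are all hypothetical. This is not the paper's reduction algorithm; the reduced table was obtained by hand.

```python
from itertools import product

# Hypothetical reservation tables: operation -> resource -> cycles (relative
# to the issue cycle) in which the operation occupies that resource.
ORIGINAL = {
    "add": {"alu": {0}, "bus": {0}, "wb": {1}},
    "mul": {"mul_unit": {0, 1}, "bus": {0}, "wb": {2}},
}

# Hand-reduced table (assumption: "alu" is redundant because every conflict
# it causes is already caused by "bus").
REDUCED = {
    "add": {"bus": {0}, "wb": {1}},
    "mul": {"mul_unit": {0, 1}, "bus": {0}, "wb": {2}},
}


def conflicts(desc, op_a, t_a, op_b, t_b):
    """Contention query: do op_a issued at t_a and op_b issued at t_b
    occupy the same resource in the same absolute cycle?"""
    for res, cycles_a in desc[op_a].items():
        cycles_b = desc[op_b].get(res, set())
        if any(t_a + ca == t_b + cb for ca in cycles_a for cb in cycles_b):
            return True
    return False


def forbidden_latencies(desc):
    """For each ordered pair (a, b), collect every issue distance
    d = t_b - t_a at which a and b contend for some resource."""
    result = {}
    for a, b in product(desc, repeat=2):
        distances = set()
        for res, cycles_a in desc[a].items():
            for ca in cycles_a:
                for cb in desc[b].get(res, ()):
                    distances.add(ca - cb)  # a at ca collides with b at d + cb
        result[(a, b)] = distances
    return result


def usage_count(desc):
    """Number of (operation, resource, cycle) entries a query may inspect."""
    return sum(len(c) for table in desc.values() for c in table.values())


if __name__ == "__main__":
    same = forbidden_latencies(ORIGINAL) == forbidden_latencies(REDUCED)
    print("constraints preserved:", same)
    print("entries:", usage_count(ORIGINAL), "->", usage_count(REDUCED))
```

Because the check compares conflicting distances for every ordered pair, including negative distances, it does not depend on the order in which a scheduler places operations, which is the property a backtracking scheduler would need. The reduced table answers the same queries while storing fewer entries.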

Published in

PLDI '96: Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
May 1996
300 pages
ISBN: 0897917952
DOI: 10.1145/231379

Copyright © 1996 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance rates: PLDI '96 accepted 28 of 112 submissions (25%). Overall acceptance rate: 406 of 2,067 submissions (20%).
