The impact of synchronization and granularity on parallel systems

Published: 01 May 1990

Abstract

In this paper, we study the impact of synchronization and granularity on the performance of parallel systems using an execution-driven simulation technique. We find that even though a great deal of parallelism can exist at the fine-grain level, synchronization and scheduling strategies determine the ultimate performance of the system. Loop-iteration-level parallelism appears to be a more appropriate granularity when those factors are considered. We also study barrier synchronization and data synchronization at the loop-iteration level and find that both schemes are needed for better performance.
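To make the two loop-level schemes concrete, the sketch below contrasts them in C with OpenMP. This is a minimal illustration written for this summary, not code from the paper, which evaluates the schemes through execution-driven simulation; the names barrier_version, doacross_version, and the post/wait flag array done are hypothetical. Barrier synchronization separates loop phases with a global barrier, so every iteration of one phase completes before any iteration of the next starts; data synchronization lets iteration i wait only for the specific value it depends on, doacross-style, so the independent part of later iterations can overlap with the dependence chain.

```c
/*
 * Minimal sketch (not from the paper): contrasts barrier synchronization
 * with data (doacross-style) synchronization at the loop-iteration level.
 * OpenMP is used only as a convenient vehicle for illustration.
 */
#include <math.h>
#include <omp.h>
#include <stdio.h>

#define N 4096

/* Independent per-iteration work that could overlap across iterations. */
static double independent_work(int i)
{
    return sin((double)i) * cos((double)i);
}

/* Barrier style: split the loop into phases; the implicit barrier at the
 * end of each "omp for" guarantees phase 1 is globally complete before
 * any thread starts phase 2. */
static void barrier_version(double *a, double *b)
{
    #pragma omp parallel
    {
        #pragma omp for                      /* phase 1: fully parallel   */
        for (int i = 0; i < N; i++)
            b[i] = independent_work(i);

        #pragma omp for                      /* phase 2: may read any b[j] */
        for (int i = 1; i < N; i++)
            a[i] = b[i] + b[i - 1];
    }
}

/* Data-synchronization (doacross) style: iteration i waits only for the
 * specific value it needs, a[i-1], via a post/wait flag, so the
 * independent work of later iterations overlaps with the serial chain. */
static void doacross_version(double *a, volatile int *done)
{
    a[0] = 1.0;
    done[0] = 1;

    #pragma omp parallel for schedule(static, 1)
    for (int i = 1; i < N; i++) {
        double w = independent_work(i);      /* overlappable part          */

        while (!done[i - 1])                 /* wait: spin until a[i-1] is */
            ;                                /* posted (assumes enough cores) */
        #pragma omp flush                    /* make a[i-1] visible         */

        a[i] = a[i - 1] + w;                 /* loop-carried dependence     */

        #pragma omp flush                    /* make a[i] visible, then post */
        done[i] = 1;
    }
}

int main(void)
{
    static double a[N], b[N];
    static volatile int done[N];

    barrier_version(a, b);
    doacross_version(a, done);
    printf("a[N-1] = %g\n", a[N - 1]);
    return 0;
}
```

The trade-off the abstract points to is visible here: the barrier version pays one global wait per phase regardless of the actual dependences, while the doacross version pays a per-iteration flag wait but exposes overlap across iterations, which is why neither scheme alone suffices for all loops.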

Published in

ACM SIGARCH Computer Architecture News, Volume 18, Issue 2SI
Special Issue: Proceedings of the 17th Annual International Symposium on Computer Architecture
June 1990, 356 pages
ISSN: 0163-5964
DOI: 10.1145/325096
Copyright © 1990 Authors

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 1 May 1990
