Research Note
Low-Cost, High-Performance Barrier Synchronization on Networks of Workstations

https://doi.org/10.1006/jpdc.1996.1273

Abstract

Circulating active barrier (CAB) is a new low-cost, high-performance hardware mechanism for synchronizing multiple processing elements (PEs) in networks of workstations at fine-grained programmed barriers. CAB is significantly less complex than other hardware barrier synchronization mechanisms with equivalent performance, using only a single conductor, such as a wire or copper run on a printed-circuit board, to circulate barrier packets between PEs. When a PE checks in at a barrier, the CAB hardware decrements the count associated with that barrier in a bit-serial fashion as a barrier packet passes through, and then monitors the packets until all PEs have checked in at the barrier. The ring has no clocked sequential logic in the serial loop. A cluster controller (CC) generates packets for active barriers, removes packets when they are no longer needed, and resets counters when all PEs have seen the zero-count. A hierarchy of PEs can be achieved by connecting the CCs in intercluster rings. When using conservative timing assumptions, the expected synchronization times with optimal clustering are shown to be under 1 μs for as many as 4096 PEs in multiprocessor workstations or 1024 single-processor workstations. The ideal number of clusters for a two-dimensional hierarchy of N PEs is shown to be $[N(D+G)/(I+G)]^{1/2}$, where G is the gate propagation delay, D is the inter-PE delay, and I is the intercluster transmission time. CAB allows rapid, contention-free check-in and proceed-from-barrier and is applicable to a wide variety of system architectures and topologies.
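
The bit-serial decrement and the cluster-count expression described above can be illustrated with a minimal software sketch. It is not the CAB hardware design: it assumes the count travels least-significant bit first (the abstract does not state the bit order), and the function names and the delay values in the usage note are hypothetical.

```python
import math


def to_bits_lsb_first(value, width):
    """Hypothetical helper: encode an integer as a list of bits, LSB first."""
    return [(value >> k) & 1 for k in range(width)]


def from_bits_lsb_first(bits):
    """Hypothetical helper: decode an LSB-first bit list back to an integer."""
    return sum(bit << k for k, bit in enumerate(bits))


def serial_decrement(bits_lsb_first):
    """Subtract 1 from a count while seeing one bit at a time.

    Assumption: bits arrive least-significant first, so a single bit of
    state (the pending borrow) is enough to decide each output bit.
    """
    borrow = True  # subtracting 1 starts as a borrow into the LSB
    out = []
    for bit in bits_lsb_first:
        if borrow:
            out.append(bit ^ 1)   # flip the bit while the borrow is pending
            borrow = (bit == 0)   # the borrow propagates only through 0 bits
        else:
            out.append(bit)       # later bits pass through unchanged
    return out


def ideal_clusters(n, d, g, i):
    """Evaluate [N(D+G)/(I+G)]^(1/2) with the symbols used in the abstract."""
    return math.sqrt(n * (d + g) / (i + g))
```

For example, `from_bits_lsb_first(serial_decrement(to_bits_lsb_first(4, 3)))` yields 3, and `ideal_clusters(4096, 1.0, 1.0, 1.0)` yields 64.0 when D, G, and I are set to equal placeholder values. In this sketch a count that is already zero wraps around to all ones; how the real hardware treats that case is not described in the abstract.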
