ABSTRACT
We consider the problem of high-speed I/O for a single application running on multiple nodes of a distributed-memory parallel computer. Our model is that the parallel system is connected to an I/O system that provides the interface between the internal connections of the parallel system and one or more external connections, such as HIPPI links. We identify two primary operations for this I/O system: scattering data from a high-speed link across several lower-speed links, and gathering data from multiple links onto a single high-speed link. We show that these two operations form the core of the I/O system, independent of the relative speeds of the internal and external connections.
We identify several architectural features that are critical for supporting high-speed scatter and gather operations. These include flexible routing methods in the parallel system, low-overhead communication, and the ability to support multiple data streams in and out of the memory on the I/O node.
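The scatter and gather operations described above can be sketched in miniature. The sketch below is illustrative only, not the paper's design: it assumes a simple round-robin striping policy and a fixed chunk size, and it models the lower-speed links as in-memory queues.

```python
# Illustrative sketch (assumption, not from the paper): stripe a byte stream
# from one fast link across several slower links (scatter), then reassemble
# the original stream from those links (gather). Round-robin policy and the
# chunk size are arbitrary choices for the example.

def scatter(stream: bytes, num_links: int, chunk: int = 4):
    """Split the stream into per-link queues, round-robin by chunk."""
    links = [[] for _ in range(num_links)]
    for i in range(0, len(stream), chunk):
        links[(i // chunk) % num_links].append(stream[i:i + chunk])
    return links

def gather(links):
    """Reassemble the stream by draining the links in the same round-robin
    order used by scatter, so chunk order is preserved."""
    out = []
    i = 0
    while any(links):          # stop once every per-link queue is empty
        queue = links[i % len(links)]
        if queue:
            out.append(queue.pop(0))
        i += 1
    return b"".join(out)
```

For example, a 20-byte stream scattered across three links yields queues of 2, 2, and 1 chunks, and `gather` restores the original byte order. A real I/O node would overlap these transfers rather than run them sequentially, which is why the paper's requirement for multiple concurrent data streams through I/O-node memory matters.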
Index Terms
- Architecture implications of high-speed I/O for distributed-memory computers