A Low-Cost Fault-Tolerant Structure for the Hypercube

Wang, Dajin

doi:10.1023/A:1011636631661

Published: November 2001

Volume 20, pages 203–216 (2001)

The Journal of Supercomputing

Dajin Wang^1,2

Abstract

We propose a new, low-cost fault-tolerant structure for the hypercube that employs spare processors and extra links. The structure is designed to fully tolerate the first faulty node, wherever it occurs, and to “almost fully” tolerate the second: the underlying hypercube topology can be restored if the second faulty node occurs at most locations (an expected 92% of locations). The structure has two distinguishing features: (1) it uses the otherwise unused link-ports in the processor nodes of the hypercube to build the proposed topology, so that minimal extra hardware is needed, and (2) its node degrees are low: the primary and spare nodes all have node degree n + 2 for an n-dimensional hypercube. The number of spare nodes is one fourth of the number of primary nodes. The reconfiguration algorithm in the presence of faults is elegant and efficient. The proposed structure also enhances the diagnosability of the hypercube system: the diagnosability of the structure increases to n + 2, whereas an ordinary n-dimensional hypercube has diagnosability n.
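To make the cost figures in the abstract concrete, the following Python sketch tabulates them for a given dimension n. It is only an illustration of the stated arithmetic: the names `hypercube_neighbors`, `CostSummary`, and `cost_summary` are hypothetical, and the sketch does not reproduce the paper's spare placement, extra links, or reconfiguration algorithm, none of which are specified in the abstract.

```python
from dataclasses import dataclass


def hypercube_neighbors(node: int, n: int) -> list[int]:
    """Neighbors of `node` in an ordinary n-dimensional hypercube.

    Two nodes are adjacent when their n-bit addresses differ in
    exactly one bit, so each node has n neighbors.
    """
    return [node ^ (1 << d) for d in range(n)]


@dataclass(frozen=True)
class CostSummary:
    primary_nodes: int          # 2^n nodes of the ordinary hypercube
    spare_nodes: int            # one fourth of the primary nodes (per the abstract)
    plain_degree: int           # node degree of the ordinary hypercube: n
    ft_degree: int              # node degree in the proposed structure: n + 2
    plain_diagnosability: int   # n for the ordinary hypercube (per the abstract)
    ft_diagnosability: int      # n + 2 for the proposed structure (per the abstract)


def cost_summary(n: int) -> CostSummary:
    """Tabulate the figures quoted in the abstract for dimension n."""
    if n < 2:
        # Assumption of this sketch: 2^n must be divisible by 4.
        raise ValueError("n must be at least 2 for a whole number of spares")
    primary = 1 << n
    return CostSummary(
        primary_nodes=primary,
        spare_nodes=primary // 4,
        plain_degree=n,
        ft_degree=n + 2,
        plain_diagnosability=n,
        ft_diagnosability=n + 2,
    )


if __name__ == "__main__":
    print(hypercube_neighbors(5, 3))
    print(cost_summary(4))
```

For example, under these figures a 4-dimensional hypercube has 16 primary nodes and 4 spares, and every node in the fault-tolerant structure uses 6 link-ports instead of 4.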

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology at Nanjing University, Nanjing, 210093, China
Dajin Wang
Department of Computer Science, Montclair State University, Upper Montclair, New Jersey, 07043
Dajin Wang

About this article

Cite this article

Wang, D. A Low-Cost Fault-Tolerant Structure for the Hypercube. The Journal of Supercomputing 20, 203–216 (2001). https://doi.org/10.1023/A:1011636631661

Issue Date: November 2001
DOI: https://doi.org/10.1023/A:1011636631661
