Introduction

Fault-Tolerant Design

Abstract

Fault tolerance is the ability of a system to continue performing its intended functions in the presence of faults. Fault tolerance is necessary because it is practically impossible to build a perfect system. As the complexity of a system grows, its reliability decreases drastically unless compensatory measures are taken. In this chapter, we formally define fault tolerance and discuss its importance for designing a dependable system. We show the relation between fault tolerance and redundancy and consider different types of redundancy. We briefly cover the history of fault-tolerant computing and describe its main application areas.
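
The claim that reliability falls as complexity grows can be illustrated with a minimal sketch. It is not taken from the chapter and rests on simplifying assumptions: components fail independently, the system works only if every component works, and the redundant variant uses three replicas with a perfect majority voter. The function names and the component reliability of 0.999 are hypothetical and chosen only for illustration.

```python
def series_reliability(r: float, n: int) -> float:
    """Reliability of a system of n independent components, each with
    reliability r, that fails as soon as any one component fails."""
    return r ** n


def tmr_reliability(r: float) -> float:
    """Reliability of three independent replicas behind a majority voter.

    The voter is assumed to be perfect (an illustrative assumption).
    The system works if all three replicas work (r^3) or exactly two
    work (3 * r^2 * (1 - r)), which simplifies to 3r^2 - 2r^3.
    """
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    r = 0.999  # assumed per-component reliability, illustration only
    for n in (10, 100, 1000):
        print(f"n={n:5d}  series reliability={series_reliability(r, n):.4f}")
    print(f"single module r=0.9 -> triplicated: {tmr_reliability(0.9):.4f}")
```

Under these assumptions, a thousand components that are each 99.9% reliable give a system that works only about a third of the time, while triplicating a module with voting raises its reliability whenever the module reliability exceeds 0.5. Real systems violate both assumptions to some degree, so the numbers indicate a trend rather than a prediction.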

“If anything can go wrong, it will.” Murphy’s law.

Author information

Corresponding author

Correspondence to Elena Dubrova.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Dubrova, E. (2013). Introduction. In: Fault-Tolerant Design. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-2113-9_1

  • DOI: https://doi.org/10.1007/978-1-4614-2113-9_1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-2112-2

  • Online ISBN: 978-1-4614-2113-9

  • eBook Packages: Engineering, Engineering (R0)
