Abstract
Fault tolerance is the ability of a system to continue performing its intended functions in the presence of faults. Fault tolerance is necessary because it is practically impossible to build a perfect system. As the complexity of a system grows, its reliability decreases drastically unless compensatory measures are taken. In this chapter, we formally define fault tolerance and discuss its importance for designing a dependable system. We show the relation between fault tolerance and redundancy and consider different types of redundancy. We briefly cover the history of fault-tolerant computing and describe its main application areas.
“If anything can go wrong, it will.” Murphy’s law.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Dubrova, E. (2013). Introduction. In: Fault-Tolerant Design. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-2113-9_1
DOI: https://doi.org/10.1007/978-1-4614-2113-9_1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-2112-2
Online ISBN: 978-1-4614-2113-9
eBook Packages: Engineering, Engineering (R0)