Multiagent-Based Fault Tolerance Management for Robustness

Gutierrez, Rosa Laura Zavala; Huhns, Michael

doi:10.1007/978-1-84800-261-6_2

Abstract

Despite the use of software engineering best practices and tools, it would be very risky to assume that the software that is developed today is fault-free. Moreover, we have to consider the fact that the software could face unexpected situations not considered during its design. Robustness is a highly desirable and sometimes indispensable software requirement, especially for critical systems, where the consequences of a system failure can be catastrophic. This chapter outlines existing fault tolerance techniques, followed by a discussion of the potential that multiagent systems have to enhance the design of robust, fault-tolerant systems, thereby improving large-scale, critical, and complex system reliability.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of South Carolina, 301 Main Street, Columbia, SC, 29208, USA
Rosa Laura Zavala Gutierrez
Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, 29208, USA
Michael Huhns

Corresponding author

Correspondence to Rosa Laura Zavala Gutierrez.

Editor information

Editors and Affiliations

School of Computing and Mathematics, University of Ulster at Jordanstown, Jordanstown, Northern Ireland, UK
Alfons Schuster

About this chapter

Cite this chapter

Gutierrez, R.L.Z., Huhns, M. (2008). Multiagent-Based Fault Tolerance Management for Robustness. In: Schuster, A. (eds) Robust Intelligent Systems. Springer, London. https://doi.org/10.1007/978-1-84800-261-6_2

DOI: https://doi.org/10.1007/978-1-84800-261-6_2
Publisher Name: Springer, London
Print ISBN: 978-1-84800-260-9
Online ISBN: 978-1-84800-261-6
eBook Packages: Computer Science, Computer Science (R0)
