Computing the performability of layered distributed systems with a management architecture

Published: 01 January 2004

Abstract

This paper analyzes the performability of client-server applications that use a separate fault management architecture to monitor and control the status of the application software and hardware. The analysis considers how the management components and connections, and their reliability, affect performability. The approach combines minpath algorithms, layered queueing analysis, and non-coherent fault tree analysis techniques to compute the expected reward rate of the application efficiently.
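To make the quantity named in the abstract concrete, the following is a minimal sketch of an expected reward rate: the probability-weighted sum of a reward (here, throughput) over all up/down states of the system's components. It is a brute-force enumeration assuming independent component failures, not the minpath, layered queueing, or fault tree procedure the paper describes. The component names, availabilities, reward values, and the function `expected_reward_rate` are all hypothetical, chosen only for illustration.

```python
from itertools import product


def expected_reward_rate(availability, reward):
    """Return the sum of P(state) * reward(state) over all up/down states.

    availability: dict mapping component name -> probability it is up
                  (components are assumed to fail independently).
    reward:       function taking a dict {name: bool} and returning a number.
    """
    names = list(availability)
    total = 0.0
    for ups in product([True, False], repeat=len(names)):
        state = dict(zip(names, ups))
        prob = 1.0
        for name, up in state.items():
            prob *= availability[name] if up else 1.0 - availability[name]
        total += prob * reward(state)
    return total


# Hypothetical system: a primary server, a backup server, and a manager
# whose knowledge of the failure is needed before the backup can take over.
availability = {"primary": 0.9, "backup": 0.9, "manager": 0.95}


def reward(state):
    # Hypothetical throughput values in requests per second.
    if state["primary"]:
        return 10.0
    if state["backup"] and state["manager"]:
        return 8.0  # degraded service after a managed failover
    return 0.0      # no usable configuration


result = expected_reward_rate(availability, reward)
print(round(result, 3))  # 9.684
```

Enumeration visits 2^n states for n components, so it is practical only for very small models. The sketch also shows why the manager's reliability matters: lowering the hypothetical `manager` availability reduces the reward contributed by the failover state, even though the manager serves no requests itself.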

      • Published in

        ACM SIGSOFT Software Engineering Notes, Volume 29, Issue 1
        January 2004
        300 pages
        ISSN: 0163-5948
        DOI: 10.1145/974043

        WOSP '04: Proceedings of the 4th international workshop on Software and performance
        January 2004
        313 pages
        ISBN: 1581136730
        DOI: 10.1145/974044

        Copyright © 2004 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • article
