Abstract
This paper analyzes the performability of client-server applications that use a separate fault management architecture to monitor and control the status of the application software and hardware. The analysis considers the impact of the management components and connections, and of their reliability, on performability. The approach combines minpath algorithms, layered queueing analysis, and non-coherent fault tree analysis techniques to compute the expected reward rate of the application efficiently.
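As a rough illustration of the quantity being computed, the expected reward rate can be read as a probability-weighted sum of the reward earned in each operational configuration. The sketch below is a brute-force enumeration over component up/down states, not the minpath and fault-tree method the paper describes. The component names, availabilities, and reward values are hypothetical; in the paper's setting the per-configuration rewards would come from a layered queueing model rather than from constants.

```python
from itertools import product


def expected_reward_rate(availability, reward_of):
    """Sum P(state) * reward(state) over every up/down combination.

    availability: dict mapping component name -> probability it is up
                  (components are assumed to fail independently).
    reward_of:    function taking a dict name -> bool (True = up) and
                  returning the reward rate earned in that state.
    """
    names = sorted(availability)
    total = 0.0
    for bits in product([True, False], repeat=len(names)):
        state = dict(zip(names, bits))
        prob = 1.0
        for name in names:
            up = availability[name]
            prob *= up if state[name] else 1.0 - up
        total += prob * reward_of(state)
    return total


# Hypothetical system: a primary server, a backup server, and a
# management component that must be up for failover to happen.
availability = {"primary": 0.9, "backup": 0.9, "manager": 0.95}


def reward_of(state):
    if state["primary"]:
        return 10.0  # assumed throughput with the primary in service
    if state["backup"] and state["manager"]:
        return 8.0   # assumed throughput after a managed failover
    return 0.0       # no failover possible: the application is down


result = expected_reward_rate(availability, reward_of)
print(round(result, 4))
```

Enumeration grows as 2^n in the number of components, which is why an approach based on minpaths and fault trees is attractive for larger systems. Even this toy case shows the effect the abstract refers to: the manager's reliability changes the result, because the backup contributes reward only when the manager is up.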
Computing the performability of layered distributed systems with a management architecture
WOSP '04: Proceedings of the 4th International Workshop on Software and Performance