On Designing Dependable Services with Diverse Off-the-Shelf SQL Servers

  • Conference paper
Architecting Dependable Systems II

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3069)

Abstract

The most important non-functional requirements for an SQL server are performance and dependability. This paper argues, based on empirical results from our on-going research with diverse SQL servers, in favour of diverse redundancy as a way of improving both. We show evidence that current data replication solutions are insufficient to protect against the range of faults documented for database servers; outline possible fault-tolerant architectures using diverse servers; discuss the design problems involved; and offer evidence of the potential for performance improvement through diverse redundancy.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gashi, I., Popov, P., Stankovic, V., Strigini, L. (2004). On Designing Dependable Services with Diverse Off-the-Shelf SQL Servers. In: de Lemos, R., Gacek, C., Romanovsky, A. (eds) Architecting Dependable Systems II. Lecture Notes in Computer Science, vol 3069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25939-8_9

  • DOI: https://doi.org/10.1007/978-3-540-25939-8_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23168-4

  • Online ISBN: 978-3-540-25939-8

  • eBook Packages: Springer Book Archive
