Design and Verification of Distributed Recovery Blocks with CSP

Yeung, W.L.; Schneider, S.A.

doi:10.1023/A:1022997110855

Published: May 2003

Volume 22, pages 225–248, (2003)

Formal Methods in System Design

W.L. Yeung¹ & S.A. Schneider²

Abstract

A case study on the application of Communicating Sequential Processes (CSP) to the design and verification of fault-tolerant real-time systems is presented. The distributed recovery block (DRB) scheme is a design technique for the uniform treatment of hardware and software faults in real-time systems. Through a simple fault-tolerant real-time system design using the DRB scheme, the case study illustrates a paradigm for specifying fault-tolerant software and demonstrates how the different behavioural aspects of a fault-tolerant real-time system design can be separately and systematically specified, formulated, and verified using an integrated set of formal techniques based on CSP.

Author information

Authors and Affiliations

Lingnan University, Hong Kong, People's Republic of China
W.L. Yeung
Royal Holloway, University of London, Egham, Surrey, TW20 0EX, UK
S.A. Schneider

About this article

Cite this article

Yeung, W., Schneider, S. Design and Verification of Distributed Recovery Blocks with CSP. Formal Methods in System Design 22, 225–248 (2003). https://doi.org/10.1023/A:1022997110855

Issue Date: May 2003
DOI: https://doi.org/10.1023/A:1022997110855
