An Integrated Record&Replay Mechanism for Nondeterministic Message Passing Programs

  • Conference paper
Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI 2001)

Abstract

Nondeterminism is a characteristic of many parallel programs that needs dedicated support from analysis tools and programming environments. In order to allow cyclic debugging of such programs, record&replay mechanisms are used most frequently. Such techniques operate in two phases, where the record phase traces a program’s execution that can be arbitrarily repeated during subsequent replay phases. In contrast to most existing approaches, this paper describes a mechanism that is transparently integrated in the underlying message passing interface. The main advantage of this approach is its omnipresence, such that a program’s execution can be repeated immediately after it has been observed. Other benefits are the lack of instrumentation and a corresponding simplification of the whole technique for inexperienced users. The difficulties addressed by this approach are concerned with the amount of monitor overhead, which must neither perturb the program’s execution nor generate huge amounts of trace data.
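The two-phase idea in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the mechanism of the paper: it models a single receiver that accepts messages from any sender (a wildcard receive), records only the identity of the sender chosen at each receive, and during replay forces the same choices. The names `run_receiver`, `trace`, and the message strings are hypothetical and do not come from any message passing library.

```python
import random
from collections import deque


def run_receiver(num_senders, msgs_per_sender, seed=None, trace=None):
    """Simulate one receiver draining messages from several senders.

    Record phase (trace is None): each receive is a wildcard receive, so
    any sender with a pending message may be chosen; the race is modelled
    with a seeded random choice (an assumption of this sketch).
    Replay phase (trace is given): each receive accepts only the sender
    that was recorded for that step.
    Returns (received_messages, recorded_sender_order).
    """
    rng = random.Random(seed)
    # One FIFO queue per sender: messages from the same sender stay ordered.
    queues = {
        s: deque(f"msg{i} from P{s}" for i in range(msgs_per_sender))
        for s in range(1, num_senders + 1)
    }
    received, recorded = [], []
    for step in range(num_senders * msgs_per_sender):
        if trace is None:
            # Nondeterministic choice among senders that still have messages.
            sender = rng.choice([s for s, q in queues.items() if q])
        else:
            # Replay: the recorded sender replaces the wildcard.
            sender = trace[step]
        received.append(queues[sender].popleft())
        recorded.append(sender)
    return received, recorded


# Record one execution, then replay it under different "timing" (seed).
first_run, trace = run_receiver(3, 2, seed=1)
replayed, _ = run_receiver(3, 2, seed=99, trace=trace)
assert replayed == first_run
```

In this sketch the trace holds one small integer per wildcard receive rather than the message contents, which hints at why the volume of trace data and the monitor overhead are the central concerns named in the abstract.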

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kranzlmüller, D., Schaubschläger, C., Volkert, J. (2001). An Integrated Record&Replay Mechanism for Nondeterministic Message Passing Programs. In: Cotronis, Y., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2001. Lecture Notes in Computer Science, vol 2131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45417-9_28

  • DOI: https://doi.org/10.1007/3-540-45417-9_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42609-7

  • Online ISBN: 978-3-540-45417-5

  • eBook Packages: Springer Book Archive
