DOI: 10.1145/1375527.1375547
research-article

CprFS: a user-level file system to support consistent file states for checkpoint and restart

Published: 7 June 2008

ABSTRACT

Checkpoint and Restart (CPR) is becoming critical for large-scale parallel computers, whose Mean Time Between Failures (MTBF) may be much shorter than the execution times of their applications. A CPR mechanism should be able to save and restore the states of virtual memory, communication, and files for these applications in a consistent way.

However, many CPR tools ignore file states, which may cause errors on recovery for applications that perform file operations. Some CPR tools adopt library-based approaches or kernel-level file systems to deal with file states, but they support only a limited set of file operations, which is insufficient for some applications. Moreover, many library-based approaches are not transparent to user applications because they wrap file APIs. Kernel-level file systems are difficult to deploy in production systems because of the unnecessary overhead they may impose on applications that do not need CPR.
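The recovery error described above can be illustrated with a small simulation. This is a minimal sketch, not CprFS or any tool named in this abstract: the function names (`run_steps`, `demo`), the record format, and the "remember the file size and truncate on restart" repair are all hypothetical choices made for illustration, and the repair only covers append-only files.

```python
import os
import tempfile


def run_steps(path, start, stop):
    """Append one record per step, as a long-running application might."""
    with open(path, "a") as f:
        for step in range(start, stop):
            f.write(f"step {step}\n")


def read_records(path):
    """Return the file's records as a list of lines."""
    with open(path) as f:
        return f.read().splitlines()


def demo(restore_file_state):
    """Run 5 steps with a checkpoint after step 2 and a failure after step 4.

    restore_file_state is a hypothetical switch: when True, the restart
    also rolls the file back to its size at checkpoint time.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.log")

        run_steps(path, 0, 3)
        # Checkpoint: process state (next step) plus, optionally used later,
        # the file state (here just the file size).
        checkpoint = {"next_step": 3, "file_size": os.path.getsize(path)}

        # Work done after the checkpoint, followed by a simulated failure.
        run_steps(path, 3, 5)

        # Restart from the checkpoint.
        if restore_file_state:
            os.truncate(path, checkpoint["file_size"])
        run_steps(path, checkpoint["next_step"], 5)

        return read_records(path)


inconsistent = demo(restore_file_state=False)
consistent = demo(restore_file_state=True)

if __name__ == "__main__":
    print("file state ignored: ", inconsistent)
    print("file state restored:", consistent)
```

When the file state is ignored, the records written between the checkpoint and the failure appear twice after the restart. Handling overwrites, deletions, renames, and other operations needs more than a saved file size, which is the gap the abstract attributes to approaches that support only limited types of file operations.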

In this paper we propose a user-level file system, CprFS, to address these problems. As a file system, CprFS guarantees transparency to user applications and can conveniently support arbitrary file operations. It can be deployed on demand for the applications that need it, which avoids interfering with other applications. Experimental results on micro-benchmarks and real-world applications show that CprFS introduces acceptable overhead and has little impact on checkpointing systems.


      • Published in

        ICS '08: Proceedings of the 22nd annual international conference on Supercomputing
        June 2008
        390 pages
        ISBN: 9781605581583
        DOI: 10.1145/1375527

        Copyright © 2008 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 584 of 2,055 submissions, 28%
