
IOPro: a parallel I/O profiling and visualization framework for high-performance storage systems

Published in The Journal of Supercomputing.

Abstract

Efficient execution of large-scale scientific applications requires high-performance computing systems designed to meet their I/O requirements. To achieve high performance, such data-intensive parallel applications use a multi-layer I/O software stack, which consists of high-level I/O libraries such as PnetCDF and HDF5, the MPI library, and parallel file systems. To design efficient parallel scientific applications, it is essential to understand the complicated flow of I/O operations and the interactions among the libraries involved. Such comprehension helps identify I/O bottlenecks and thus exploit the performance potential of the different layers of the storage hierarchy. To profile the performance of individual components in the I/O stack and to understand the complex interactions among them, we have implemented a GUI-based integrated profiling and analysis framework, IOPro. IOPro automatically generates an instrumented I/O stack, runs applications on it, and visualizes detailed statistics based on user-specified metrics of interest. We present experimental results from two different real-life applications and show how our framework can be used in practice. By generating an end-to-end trace of the whole I/O stack and pinpointing I/O interference, IOPro aids in understanding I/O behavior and improving I/O performance significantly.
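The end-to-end tracing idea the abstract describes can be illustrated with a minimal sketch. IOPro itself instruments the C-level I/O stack (PnetCDF/HDF5, MPI-IO, the parallel file system client); the stand-in layer functions, their names, and the log format below are purely hypothetical stdlib-Python illustrations of how entry/exit events recorded at each layer yield a cross-layer trace of one I/O request.

```python
# Minimal sketch (not IOPro) of layer-aware I/O call tracing: each
# "layer" function is wrapped so entry/exit events and elapsed time
# are logged, mimicking an instrumented PnetCDF -> MPI-IO -> file
# system call chain for a single write request.
import functools
import time

trace_log = []  # (layer, event, function[, elapsed_seconds]) records


def traced(layer):
    """Decorator that logs entry/exit of a call on behalf of `layer`."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            trace_log.append((layer, "enter", fn.__name__))
            t0 = time.perf_counter()
            result = fn(*args, **kwargs)
            trace_log.append((layer, "exit", fn.__name__,
                              time.perf_counter() - t0))
            return result
        return wrapper
    return decorate


# Three stand-in layers; the real tool instruments the actual
# high-level library, MPI library, and parallel file system code.
@traced("filesystem")
def fs_write(buf):
    return len(buf)          # pretend the file system wrote the bytes


@traced("mpi-io")
def mpi_file_write(buf):
    return fs_write(buf)     # MPI-IO forwards to the file system layer


@traced("pnetcdf")
def ncmpi_put_vara(buf):
    return mpi_file_write(buf)  # high-level library calls MPI-IO


ncmpi_put_vara(b"x" * 1024)
for rec in trace_log:
    print(rec[0], rec[1], rec[2])
```

Nesting the enter/exit records this way is what lets a profiler attribute the latency of one request to each layer it passed through.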


Notes

  1. PVFS2 also supports an optional kernel module that allows a file system to be mounted as in other file systems.
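As a hedged illustration of the note above, a PVFS2 volume exposed through the optional kernel module can be mounted like any other file system; the server hostname, port, file system name, and mount point below are placeholders, not values from the paper.

```shell
# Load the PVFS2 kernel module, then mount the volume like a
# conventional file system (placeholder server/port/fs/mountpoint).
insmod pvfs2.ko
mount -t pvfs2 tcp://ioserver:3334/pvfs2-fs /mnt/pvfs2
```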


Acknowledgments

We would like to thank the anonymous reviewers for their comments in improving this paper.


Corresponding author

Correspondence to Seong Jo Kim.


Cite this article

Kim, S.J., Zhang, Y., Son, S.W. et al. IOPro: a parallel I/O profiling and visualization framework for high-performance storage systems. J Supercomput 71, 840–870 (2015). https://doi.org/10.1007/s11227-014-1329-0
