A Distributed Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing

Cluster Computing

Abstract

I/O-intensive applications pose great challenges to computational scientists. A major problem is that, in a conventional computing environment, users must sacrifice performance in order to satisfy storage capacity requirements. Even when state-of-the-art I/O optimizations are employed, further performance improvement is impeded by the physical nature of the storage media.

In this paper, we present a distributed multi-storage resource architecture that satisfies both performance and capacity requirements by employing multiple storage resources. Compared to a traditional single-storage-resource architecture, it provides a more flexible and reliable computing environment. The architecture opens new opportunities for high-performance computing while inheriting the state-of-the-art I/O optimization approaches that have already been developed. It gives application users high-performance storage access even when they do not have a single large local storage archive at their disposal. We also develop an Application Programming Interface (API) that provides transparent management of, and access to, the various storage resources in our computing environment. Because I/O usually dominates performance in I/O-intensive applications, we establish an I/O performance prediction mechanism, consisting of a performance database and a prediction algorithm, to help users better evaluate and schedule their applications. A tool is also developed to help users automatically generate the performance data stored in the database. Experiments show that our multi-storage resource architecture is a promising platform for high-performance distributed computing.
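The abstract does not spell out the prediction algorithm or the database layout, so the following is only a minimal sketch of how a performance database and a prediction step could fit together. It assumes the database stores measured times for a few request sizes per storage resource, and it predicts by linear interpolation between the nearest recorded sizes. The names `PERF_DB`, `predict_time`, `predict_run` and `fastest_resource`, the resource labels, and every number are hypothetical placeholders, not the paper's method or measurements.

```python
from bisect import bisect_left

# Hypothetical performance database: (resource, operation) -> list of
# (request size in bytes, measured seconds), sorted by size.
# All values are invented placeholders, not measurements from the paper.
PERF_DB = {
    ("local_disk", "read"): [(1_000_000, 0.02), (10_000_000, 0.15), (100_000_000, 1.40)],
    ("remote_disk", "read"): [(1_000_000, 0.30), (10_000_000, 1.20), (100_000_000, 10.50)],
    ("tape", "read"): [(1_000_000, 25.0), (10_000_000, 27.0), (100_000_000, 45.0)],
}


def predict_time(resource, operation, size):
    """Estimate one I/O call by linear interpolation between the two
    nearest recorded sizes. Sizes outside the table are clamped to the
    first or last entry, which is a crude assumption of this sketch."""
    samples = PERF_DB[(resource, operation)]
    sizes = [s for s, _ in samples]
    i = bisect_left(sizes, size)
    if i == 0:
        return samples[0][1]
    if i == len(samples):
        return samples[-1][1]
    (s0, t0), (s1, t1) = samples[i - 1], samples[i]
    return t0 + (t1 - t0) * (size - s0) / (s1 - s0)


def predict_run(resource, calls):
    """Sum the predicted times over a list of (operation, size) calls."""
    return sum(predict_time(resource, op, size) for op, size in calls)


def fastest_resource(calls, resources):
    """Pick the resource with the smallest predicted total I/O time."""
    return min(resources, key=lambda r: predict_run(r, calls))
```

A scheduler built on such a table could call `fastest_resource` with an application's expected I/O calls before a run. A real system would also have to account for capacity limits, contention, and network variability, none of which this sketch models.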

Author information

Corresponding author

Correspondence to X. Shen.

About this article

Cite this article

Shen, X., Choudhary, A., Matarazzo, C. et al. A Distributed Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing. Cluster Computing 6, 189–200 (2003). https://doi.org/10.1023/A:1023584319229

  • DOI: https://doi.org/10.1023/A:1023584319229
