ABSTRACT
We present the Active Storage Fabrics (ASF) model for storage-embedded parallel processing as a way to address petascale data-intensive challenges. ASF is aimed at emerging scalable system-on-a-chip, storage-class memory architectures, but may be realized in prototype form on current parallel systems. ASF can transparently accelerate host workloads through close integration at the middleware data/storage boundary, or it can be used directly by data-intensive applications. We provide an overview of the major components involved in accelerating a parallel file system and a relational database management system, describe some early results, and outline our current research directions.
Index Terms
- Using the Active Storage Fabrics model to address petascale storage challenges