Abstract
Modern large-scale scientific simulations running on HPC systems generate data on the order of terabytes during a single run. To lessen the I/O load during a simulation run, scientists are forced to capture data infrequently, thereby making data collection an inherently lossy process. Yet lossless compression techniques are poorly suited to scientific data because of its inherently random nature; for the applications used here, they offer a compression rate of less than 10%. They also impose significant overhead during decompression, making them unsuitable for data analysis and visualization, which require repeated data access.
To address this problem, we propose an effective method for In-situ Sort-And-B-spline Error-bounded Lossy Abatement (ISABELA) of scientific data that is widely regarded as effectively incompressible. With ISABELA, we apply a preconditioner to seemingly random and noisy data along spatial resolution to achieve an accurate fitting model that guarantees a ≥ 0.99 correlation with the original data. We further take advantage of temporal patterns in scientific data to compress data by ≈ 85%, while introducing only a negligible runtime overhead on simulations. ISABELA significantly outperforms existing lossy compression methods, such as wavelet compression. Moreover, besides being a communication-free and scalable compression technique, ISABELA decompresses locally: it does not need to decode the entire data set, which makes it attractive for random access.
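The effect of the sort preconditioner can be illustrated with a minimal sketch. It is not the authors' implementation: the window size of 1024, the 30 retained samples, the synthetic Gaussian data, and the helper names `fit_window` and `pearson` are all assumptions made for illustration, and a piecewise-linear fit (a degree-1 spline) stands in for the cubic B-spline fit that the method's name refers to. The sketch compares how well a small model reproduces a noisy window with and without sorting first.

```python
import numpy as np

def fit_window(values, n_knots=30, presort=True):
    """Approximate one window from n_knots samples; return (approximation, permutation).

    Hypothetical helper: a linear spline stands in for a cubic B-spline fit.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    # Sort preconditioner: the permutation must be kept to undo the sort later.
    perm = np.argsort(values, kind="stable") if presort else np.arange(n)
    curve = values[perm]  # monotone when presort=True
    # Keep only n_knots evenly spaced samples of the curve (the "model").
    knot_pos = np.linspace(0, n - 1, n_knots).round().astype(int)
    knot_val = curve[knot_pos]
    # Rebuild every point by linear interpolation between the kept samples.
    approx_curve = np.interp(np.arange(n), knot_pos, knot_val)
    # Scatter the rebuilt values back to their original positions.
    approx = np.empty(n)
    approx[perm] = approx_curve
    return approx, perm

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# Assumed test data: one window of pure Gaussian noise.
rng = np.random.default_rng(0)
window = rng.normal(size=1024)

approx_sorted, perm = fit_window(window, presort=True)
approx_raw, _ = fit_window(window, presort=False)

corr_sorted = pearson(window, approx_sorted)  # fit on the sorted curve
corr_raw = pearson(window, approx_raw)        # same model size, no sorting
print(f"sorted: {corr_sorted:.4f}  unsorted: {corr_raw:.4f}")
```

In this sketch the sorted window becomes a smooth monotone curve that a handful of samples can follow closely, whereas the same number of samples taken from the unsorted noise carries almost no information about the points in between. The price is the permutation, which has to be stored alongside the model; for a 1024-point window each index needs 10 bits. Because each window is fitted independently, a single window can be rebuilt without touching the others.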
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lakshminarasimhan, S. et al. (2011). Compressing the Incompressible with ISABELA: In-situ Reduction of Spatio-temporal Data. In: Jeannot, E., Namyst, R., Roman, J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6852. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23400-2_34
DOI: https://doi.org/10.1007/978-3-642-23400-2_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23399-9
Online ISBN: 978-3-642-23400-2
eBook Packages: Computer Science, Computer Science (R0)