Permuting data on random-access block storage

Authors:
Risi Thonangi (Duke University), Jun Yang (Duke University)

Proceedings of the VLDB Endowment, Volume 6, Issue 9, pp. 721–732. https://doi.org/10.14778/2536360.2536371

Published: 01 July 2013

Abstract

Permutation is a fundamental operator for array data, with applications in, for example, changing matrix layouts and reorganizing data cubes. We consider the problem of permuting large quantities of data stored on secondary storage that supports fast random block accesses, such as solid state drives and distributed key-value stores. Faster random accesses open up interesting new opportunities for permutation. While external merge sort has often been used for permutation, it is overkill that fails to fully exploit the properties of permutation and carries unnecessary overhead in storing and comparing keys. We propose faster algorithms with lower memory requirements for a large, useful class of permutations. We also tackle practical challenges that traditional permutation algorithms have not dealt with, such as exploiting random block accesses more aggressively, considering the cost asymmetry between reads and writes, and handling arbitrary data dimension sizes (as opposed to the perfect powers often assumed by previous work). As a result, our algorithms are faster and more broadly applicable.
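
As a rough illustration of why sorting is more machinery than a permutation needs, the Python sketch below reorders the dimensions of a row-major array in two ways: by attaching a key to every element and sorting, and by computing each element's destination address directly. This is not the algorithm proposed in the paper. It runs entirely in memory, ignores block boundaries and read/write cost asymmetry, and the function names `permute_by_sort` and `permute_direct` are hypothetical.

```python
from itertools import product


def permute_by_sort(data, shape, order):
    """Baseline: tag each element with its permuted index tuple, then sort.

    Every element carries a key that must be stored and compared,
    which is the overhead the abstract attributes to sort-based permutation.
    """
    keyed = []
    # product() enumerates index tuples in row-major order (last index fastest),
    # so src is the row-major position of idx in the input layout.
    for src, idx in enumerate(product(*(range(n) for n in shape))):
        key = tuple(idx[d] for d in order)
        keyed.append((key, data[src]))
    keyed.sort(key=lambda kv: kv[0])
    return [value for _, value in keyed]


def permute_direct(data, shape, order):
    """Compute each destination address arithmetically; no keys, no comparisons.

    Dimension order[k] of the input becomes dimension k of the output.
    Dimension sizes are arbitrary (they need not be powers of two).
    """
    new_shape = [shape[d] for d in order]
    out = [None] * len(data)
    for src, idx in enumerate(product(*(range(n) for n in shape))):
        dst = 0
        for d, n in zip(order, new_shape):
            dst = dst * n + idx[d]  # row-major address in the output layout
        out[dst] = data[src]
    return out


# Transposing a 2 x 3 matrix stored in row-major order.
matrix = [0, 1, 2, 3, 4, 5]
transposed = permute_direct(matrix, (2, 3), (1, 0))
```

Both functions produce the same layout. The direct version does one random write per element, a pattern that is only attractive when the storage handles random accesses well, which is the setting the abstract describes.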

Published in

Proceedings of the VLDB Endowment, Volume 6, Issue 9
July 2013
180 pages
ISSN: 2150-8097
Publisher: VLDB Endowment
