An experimental evaluation and analysis of database cracking

Schuhknecht, Felix Martin; Jindal, Alekh; Dittrich, Jens

doi:10.1007/s00778-015-0397-y

Special Issue Paper
Published: 22 August 2015

Volume 25, pages 27–52 (2016)

The VLDB Journal

Abstract

Database cracking has been an area of active research in recent years. The core idea of database cracking is to create indexes adaptively and incrementally as a side product of query processing. Several works have proposed different cracking techniques for different aspects including updates, tuple reconstruction, convergence, concurrency control, and robustness. Our 2014 VLDB paper “The Uncracked Pieces in Database Cracking” (PVLDB 7:97–108, 2013/VLDB 2014) was the first comparative study of these different methods by an independent group. In this article, we extend our published experimental study on database cracking and bring it up to date. Our goal is to critically review several aspects, identify the potential, and propose promising directions in database cracking. With this study, we hope to expand the scope of database cracking and possibly leverage cracking in database engines other than MonetDB. We repeat several prior database cracking works, including the core cracking algorithms as well as three other works on convergence (hybrid cracking), tuple reconstruction (sideways cracking), and robustness (stochastic cracking), respectively. In addition to the material of our conference paper, we now also look at a recently published study about CPU efficiency (predication cracking). We evaluate these works and show possible directions to do even better. As a further extension, we evaluate the whole class of parallel cracking algorithms that were proposed in three recent works. Altogether, in this work we revisit 8 papers on database cracking and evaluate in total 18 cracking methods, 6 sorting algorithms, and 3 full index structures. Additionally, we test cracking under a variety of experimental settings, including high selectivity queries, low selectivity queries, varying selectivity, and multiple query access patterns. (Here, low selectivity means that many entries qualify; consequently, high selectivity means that only few entries qualify.) Finally, we compare cracking against different sorting algorithms as well as against different main memory optimized indexes, including the recently proposed adaptive radix tree (ART). Our results show that: (1) the previously proposed cracking algorithms are repeatable, (2) there is still enough room to significantly improve the previously proposed cracking algorithms, (3) parallelizing cracking algorithms efficiently is a hard task, (4) cracking depends heavily on query selectivity, (5) cracking needs to catch up with modern indexing trends, and (6) different indexing algorithms have different indexing signatures.
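
To make the core idea concrete, the following is a minimal Python sketch of how a range query can build an index as a side product. It is an illustration only, not the implementation evaluated in the article: the class name `CrackerColumn`, its methods, and the use of two parallel sorted lists as the cracker index are assumptions made for this example. Each query bound partitions just the one piece of the column that contains it (a crack-in-two step), and the resulting split position is remembered for later queries.

```python
import bisect


class CrackerColumn:
    """Toy cracker column (hypothetical): a copy of the column plus its cracks."""

    def __init__(self, values):
        self.data = list(values)  # copy that is physically reorganized by queries
        self.pivots = []          # sorted keys at which the column has been cracked
        self.positions = []       # positions[i]: first index with a value >= pivots[i]

    def _crack(self, key):
        """Ensure a crack exists at `key` and return its position."""
        i = bisect.bisect_left(self.pivots, key)
        if i < len(self.pivots) and self.pivots[i] == key:
            return self.positions[i]  # this key was cracked by an earlier query
        # Only the piece that contains `key` has to be partitioned.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)
        left, right = lo, hi - 1
        while left <= right:  # crack-in-two: values < key to the left, rest to the right
            if self.data[left] < key:
                left += 1
            else:
                self.data[left], self.data[right] = self.data[right], self.data[left]
                right -= 1
        self.pivots.insert(i, key)
        self.positions.insert(i, left)
        return left

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
first = column.range_query(5, 10)   # partitions the whole column around 5, then 10
second = column.range_query(7, 15)  # only touches the pieces containing 7 and 15
```

In this sketch the qualifying values always end up in one contiguous slice, so answering a query never requires a full sort, and every additional query leaves the column slightly more ordered than before.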

Notes

Note that the query time of a full scan varies by as much as a factor of 4. This is because of lazy evaluation in the filtering, whose cost depends on the position of the low key and the high key in the value domain.
Measured with Intel VTune Amplifier 2015.
After the first few queries, cracking mostly performs a pair of crack-in-two operations, as the likelihood that the two splits fall into two different partitions increases with the number of applied queries.
Please note that our current implementation relies on a uniform key distribution to create equal-sized partitions. Handling skewed distributions would require the generation of equi-depth partitions.
The available ART implementation does not support bulk loading.
In contrast to [23], we do not merge the chunks after each query as this results in overhead.
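
The note on equal-sized partitions can be illustrated with a short Python sketch. It is not the authors' code: the function names, the key domain [0, 1000), and the skewed toy data are assumptions chosen for this example. It contrasts split keys that cut the key domain into equal-width ranges, which only yield equal-sized partitions if keys are uniformly distributed, with equi-depth split keys taken at the quantiles of the data.

```python
import bisect
import random


def equi_width_bounds(lo, hi, k):
    """k-1 split keys that cut the key domain [lo, hi) into equal-width ranges."""
    return [lo + (hi - lo) * i // k for i in range(1, k)]


def equi_depth_bounds(sample, k):
    """k-1 split keys taken at the i/k quantiles of a sample of the keys."""
    s = sorted(sample)
    return [s[len(s) * i // k] for i in range(1, k)]


def partition_sizes(keys, bounds):
    """Count how many keys fall into each range defined by the split keys."""
    sizes = [0] * (len(bounds) + 1)
    for key in keys:
        sizes[bisect.bisect_right(bounds, key)] += 1
    return sizes


random.seed(42)
# Skewed toy keys in [0, 1000): most values are small.
keys = [int(1000 * random.random() ** 4) for _ in range(10_000)]

width_sizes = partition_sizes(keys, equi_width_bounds(0, 1000, 4))
depth_sizes = partition_sizes(keys, equi_depth_bounds(keys, 4))
print("equi-width partition sizes:", width_sizes)
print("equi-depth partition sizes:", depth_sizes)
```

With skewed keys, the equi-width split puts most of the data into the first partition, while the equi-depth split keeps the partitions roughly balanced. In a real system the quantiles would typically be estimated from a sample or a histogram rather than from a full sort of the keys.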

References

1. Adelson-Velsky, G., et al.: An algorithm for the organization of information. In: USSR Academy of Sciences, pp. 263–266 (1962)
2. Alvarez, V., Schuhknecht, F.M., Dittrich, J., Richter, S.: Main memory adaptive indexing for multi-core systems. In: DaMoN, Snowbird, UT, USA, pp. 3:1–3:10 (2014)
3. Bayer, R., McCreight, E.M.: Organization and maintenance of large ordered indices. Acta Inf. 1, 173–189 (1972)
4. Birkeland, O.R.: Searching large data volumes with MISD processing. Ph.D. Thesis (2008)
5. DeWitt, D.J., Naughton, J.F., et al.: Practical skew handling in parallel joins. In: VLDB, pp. 27–40 (1992)
6. Finch, T.: Incremental Calculation of Weighted Mean and Variance. University of Cambridge Computing Service, Cambridge (2009)
7. Generalized Heap Impl. https://github.com/valyala/gheap
8. Graefe, G., Halim, F., Idreos, S., et al.: Concurrency control for adaptive indexing. PVLDB 5, 656–667 (2012)
9. Graefe, G., Halim, F., Idreos, S., et al.: Transactional support for adaptive indexing. VLDB J. 23(2), 303–328 (2014)
10. Graefe, G., Kuno, H.: Self-selecting, self-tuning, incrementally optimized indexes. In: EDBT, pp. 371–381 (2010)
11. Halim, F., Idreos, S., et al.: Stochastic database cracking: towards robust adaptive indexing in main-memory column-stores. PVLDB 5, 502–513 (2012)
12. Hildebrandt, P., Isbitz, H.: Radix exchange: an internal sorting method for digital computers. J. ACM 6(2), 156–163 (1959)
13. Hoare, C.A.R.: Quicksort. Commun. ACM 4(7), 321 (1961)
14. Idreos, S., Kersten, M., Manegold, S.: Updating a cracked database. In: SIGMOD, pp. 413–424 (2007)
15. Idreos, S., Kersten, M., Manegold, S.: Self-organizing tuple reconstruction in column-stores. In: SIGMOD, pp. 297–308 (2009)
16. Idreos, S., Manegold, S., et al.: Merging what’s cracked, cracking what’s merged. PVLDB 4, 586–597 (2011)
17. Idreos, S., et al.: Database cracking. In: CIDR, pp. 68–78 (2007)
18. Kersten, M., et al.: Cracking the database store. In: CIDR, pp. 213–224 (2005)
19. Kim, C., et al.: FAST: Fast architecture sensitive tree search on modern CPUs and GPUs. In: SIGMOD, pp. 339–350 (2010)
20. Leis, V., et al.: The adaptive radix tree: ARTful indexing for main-memory databases. In: ICDE, pp. 38–49 (2013)
21. Martinez-Palau, X., Dominguez-Sal, D., et al.: Two-way replacement selection. PVLDB 3, 871–881 (2010)
22. McCalpin, J.D.: STREAM benchmark, version from January 17, 2013. https://www.cs.virginia.edu/stream/FTP/Code/stream.c
23. Pirk, H., Petraki, E., Idreos, S., Manegold, S., Kersten, M.L.: Database cracking: fancy scan, not poor man’s sort! In: DaMoN, Snowbird, UT, USA, pp. 4:1–4:8 (2014)
24. Rao, J., Ross, K.A.: Making B+-trees cache conscious in main memory. In: SIGMOD, pp. 475–486 (2000)
25. Schuhknecht, F.M., Jindal, A., Dittrich, J.: The uncracked pieces in database cracking. PVLDB 7, 97–108 (2013)
26. Schuhknecht, F.M., Khanchandani, P., Dittrich, J.: On the surprising difficulty of simple things: the case of radix partitioning. PVLDB 8, 934–937 (2015)

Acknowledgments

Special thanks to Stratos Idreos for helping us understand the hybrid methods. This work was partially supported by BMBF.

Author information

Authors and Affiliations

Information Systems Group, Saarland University, Saarbrücken, Germany
Felix Martin Schuhknecht & Jens Dittrich
CSAIL, MIT, Cambridge, MA, USA
Alekh Jindal

Corresponding author

Correspondence to Felix Martin Schuhknecht.

Ethics declarations

Competing interests

As we re-evaluate research, there are potential competing interests with CWI Amsterdam and the authors of [8].

About this article

Cite this article

Schuhknecht, F.M., Jindal, A. & Dittrich, J. An experimental evaluation and analysis of database cracking. The VLDB Journal 25, 27–52 (2016). https://doi.org/10.1007/s00778-015-0397-y

Received: 19 December 2014
Revised: 18 May 2015
Accepted: 13 July 2015
Published: 22 August 2015
Issue Date: February 2016
DOI: https://doi.org/10.1007/s00778-015-0397-y
