Adaptive partitioning and indexing for in situ query processing

The VLDB Journal

Abstract

The constant flux of data and queries alike has been pushing the boundaries of data analysis systems. The increasing size of raw data files has made data loading an expensive operation that delays the data-to-insight time. To alleviate the loading cost, in situ query processing systems operate directly over raw data and offer instant access to data. At the same time, analytical workloads contain an increasing number of queries. Typically, each query focuses on a constantly shifting, yet small, range. As a result, minimizing the workload latency requires bringing the benefits of indexing to in situ query processing. In this paper, we present an online partitioning and indexing scheme, along with a partitioning and indexing tuner tailored for in situ querying engines. The proposed system design improves query execution time by taking user query patterns into account to (i) partition raw data files logically and (ii) build lightweight partition-specific indexes for each partition. We build an in situ query engine called Slalom to showcase the impact of our design. Slalom employs adaptive partitioning and builds non-obtrusive indexes in different partitions on-the-fly based on lightweight query access pattern monitoring. As a result of its lightweight nature, Slalom achieves efficient query processing over raw data with minimal memory consumption. Our experimentation with both microbenchmarks and real-life workloads shows that Slalom outperforms state-of-the-art in situ engines and achieves query response times comparable to those of a fully indexed DBMS, offering lower cumulative query execution times for query workloads with increasing size and unpredictable access patterns.

Notes

  1. Details on how this formula is derived are found in “Appendix.”

Acknowledgements

We would like to thank the reviewers for their valuable comments. This work is partially funded by the EU FP7 programme (ERC-2013-CoG), Grant No. 617508 (ViDa), the EU FP7 Collaborative project Grant No. 317858 (BigFoot), NSF under Grant No. IIS-1850202, and EU Horizon 2020 research and innovation programme Grant No. 650003 (Human Brain project).

Author information

Corresponding author

Correspondence to Matthaios Olma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Derivation of index construction probability formula

This section provides a detailed description of how we derive the probability function for deciding whether to build an index over a logical partition. We expect this section to be useful for achieving a deeper understanding of the tuning decisions of Slalom. The derivation begins with the expected cost formula.

$$\begin{aligned} E = \sum_{i=1}^{T} \bigg( \sum_{j=1}^{i-1} p_{j} \cdot C_{use,idx} + \bigg( 1 - \sum_{j=1}^{i-1} p_{j} \bigg) \cdot \bigg( p_{i} \cdot C_{build,idx} + \left( 1 - p_{i} \right) \cdot C_{use,fs} \bigg) \bigg) \end{aligned}$$

We substitute \(C_{use,fs} + \delta \) for \(C_{build,idx}\), with \(\delta \ge 0\), because building the index costs at least as much as a full scan.
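As an intermediate step (spelled out here for clarity; it follows by direct expansion), the substitution collapses the inner term of the expected cost:

$$\begin{aligned} p_{i} \cdot (C_{use,fs} + \delta ) + (1 - p_{i}) \cdot C_{use,fs} = C_{use,fs} + \delta \cdot p_{i} \end{aligned}$$

Each summand of \(E\) therefore becomes \(C_{use,fs} - (C_{use,fs} - C_{use,idx}) \cdot \sum_{j=1}^{i-1} p_{j} + \delta \cdot p_{i} \cdot \big(1 - \sum_{j=1}^{i-1} p_{j}\big)\), and summing over \(i = 1, \ldots , T\) gives the following form.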

$$\begin{aligned} E ={}& T \cdot C_{use,fs} - \big( C_{use,fs} - C_{use,idx} \big) \cdot \bigg( \sum_{i=1}^{T}\sum_{j=1}^{i-1} p_{j} \bigg) \\ &+ \delta \cdot \bigg( \sum_{i=1}^{T} p_{i} - \sum_{i=1}^{T} p_{i} \cdot \sum_{j=1}^{i-1} p_{j} \bigg) \end{aligned} \tag{6}$$
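The rearrangement in Eq. (6) can be checked numerically. The sketch below is only an illustration, not part of Slalom: the function names and the cost and probability values are hypothetical, chosen to compare the term-by-term sum against the rearranged form.

```python
def expected_cost(p, c_use_idx, c_use_fs, c_build_idx):
    """Expected cost E, evaluated term by term from the original sum."""
    total = 0.0
    built = 0.0  # sum of p_j for j < i: probability the index was built earlier
    for p_i in p:
        total += built * c_use_idx + (1.0 - built) * (
            p_i * c_build_idx + (1.0 - p_i) * c_use_fs
        )
        built += p_i
    return total


def expected_cost_eq6(p, c_use_idx, c_use_fs, delta):
    """Expected cost E in the rearranged form of Eq. (6)."""
    T = len(p)
    # prefix[i] holds the sum of all probabilities that precede position i.
    prefix = [sum(p[:i]) for i in range(T)]
    double_sum = sum(prefix)
    cross_sum = sum(p_i * s for p_i, s in zip(p, prefix))
    return (
        T * c_use_fs
        - (c_use_fs - c_use_idx) * double_sum
        + delta * (sum(p) - cross_sum)
    )


if __name__ == "__main__":
    # Hypothetical inputs: four queries, arbitrary cost units.
    p = [0.1, 0.2, 0.3, 0.15]
    c_use_idx, c_use_fs, delta = 1.0, 10.0, 2.5
    direct = expected_cost(p, c_use_idx, c_use_fs, c_use_fs + delta)
    rearranged = expected_cost_eq6(p, c_use_idx, c_use_fs, delta)
    print(direct, rearranged)
```

The two printed values should agree up to floating-point rounding, since the second function only regroups the terms of the first after setting the build cost to the full-scan cost plus \(\delta \).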

We take the first partial derivative of this formula with respect to \(p_{i}\).

$$\begin{aligned} \frac{\partial E}{\partial p_{i}} ={}& - \big( C_{use,fs} - C_{use,idx} \big) \cdot \frac{\partial \Big( \sum_{i=1}^{T}\sum_{j=1}^{i-1} p_{j} \Big)}{\partial p_{i}} \\ &+ \delta \cdot \bigg( \frac{\partial \sum_{i=1}^{T} p_{i}}{\partial p_{i}} - \frac{\partial \Big( \sum_{i=1}^{T} p_{i} \cdot \sum_{j=1}^{i-1} p_{j} \Big)}{\partial p_{i}} \bigg) \end{aligned} \tag{7}$$

We calculate that:

$$\begin{aligned} \frac{\partial \Big( \sum_{i=1}^{T}\sum_{j=1}^{i-1} p_{j} \Big)}{\partial p_{i}} = T - i \end{aligned} \tag{8}$$

and

$$\begin{aligned} \frac{\partial \Big( \sum_{i=1}^{T} p_{i} \cdot \sum_{j=1}^{i-1} p_{j} \Big)}{\partial p_{i}} = \sum_{j=1}^{T-1} p_{j} - p_{i} \end{aligned} \tag{9}$$
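As a concrete check of Eq. (8) (an illustrative instance, not taken from the original derivation), consider \(T = 3\):

$$\begin{aligned} \sum_{i=1}^{3}\sum_{j=1}^{i-1} p_{j} = 0 + p_{1} + (p_{1} + p_{2}) = 2 p_{1} + p_{2} \end{aligned}$$

The partial derivatives with respect to \(p_{1}\), \(p_{2}\), and \(p_{3}\) are 2, 1, and 0, which equals \(T - i\) in each case: \(p_{i}\) enters the running sum of each of the \(T - i\) queries that follow query \(i\).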

Thus, the final derivative becomes:

$$\begin{aligned} \frac{\partial E}{\partial p_{i}} ={}& - \big( C_{use,fs} - C_{use,idx} \big) \cdot ( T - i ) \\ &+ \delta \cdot \bigg( 1 - \sum_{j=1}^{T-1} p_{j} + p_{i} \bigg) \end{aligned} \tag{10}$$

To minimize the expected cost, we set the derivative to zero and solve for \(p_{i}\).

$$\begin{aligned} \frac{\partial E}{\partial p_{i}} = 0 \;\Rightarrow \; p_{i} = \frac{C_{use,fs} - C_{use,idx}}{\delta } \cdot (T - i) - \bigg( 1 - \sum_{j=1}^{T-1} p_{j} \bigg) \end{aligned} \tag{11}$$

About this article

Cite this article

Olma, M., Karpathiotakis, M., Alagiannis, I. et al. Adaptive partitioning and indexing for in situ query processing. The VLDB Journal 29, 569–591 (2020). https://doi.org/10.1007/s00778-019-00580-x

  • DOI: https://doi.org/10.1007/s00778-019-00580-x
