Abstract
Optimization of queries with conjunctive predicates for main memory databases remains a challenging task. The traditional way of optimizing this class of queries relies on predicate ordering based on selectivities or ranks. However, optimizing these queries well requires a holistic approach that combines (1) an accurate cost model that is aware of CPU architectural characteristics such as branch (mis)prediction, (2) a storage layer that allows for streamlined query execution, (3) a common subexpression elimination technique that minimizes column access costs, and (4) an optimization algorithm able to pick the optimal plan even in the presence of a small (bounded) estimation error. In this work, we embrace the holistic approach and show its superiority experimentally.
Current approaches typically base their optimization algorithms on at least one of two assumptions: (1) the predicate selectivities are independent, and (2) the predicate costs are constant. Our approach relies on neither assumption, since in general they do not hold.
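For orientation, the traditional rank-based ordering mentioned above can be sketched as follows. This is a minimal illustration of the classical baseline, not the approach of this work: it deliberately makes both assumptions the abstract criticizes (independent selectivities, constant per-tuple predicate costs). The names `Predicate`, `rank`, `order_by_rank`, and `expected_cost` are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    name: str
    selectivity: float  # fraction of tuples that pass, in (0, 1]
    cost: float         # per-tuple evaluation cost, ASSUMED constant


def rank(p: Predicate) -> float:
    # Classical rank: (selectivity - 1) / cost.
    # More negative means "filters a lot per unit of cost".
    return (p.selectivity - 1.0) / p.cost


def order_by_rank(preds: list[Predicate]) -> list[Predicate]:
    # Evaluate predicates in ascending rank order.
    return sorted(preds, key=rank)


def expected_cost(order: list[Predicate]) -> float:
    # Expected per-tuple cost of evaluating the conjunction in this order,
    # ASSUMING independent selectivities: a predicate is only evaluated
    # on the fraction of tuples that survived all earlier predicates.
    total = 0.0
    surviving = 1.0
    for p in order:
        total += surviving * p.cost
        surviving *= p.selectivity
    return total


preds = [
    Predicate("a", selectivity=0.5, cost=1.0),
    Predicate("b", selectivity=0.1, cost=4.0),
    Predicate("c", selectivity=0.9, cost=0.5),
]
plan = order_by_rank(preds)
print([p.name for p in plan], expected_cost(plan))
```

Under these two assumptions, sorting by rank yields a cheapest evaluation order. When selectivities are correlated or predicate costs vary (for example, through branch misprediction or shared column accesses), neither the cost formula nor the resulting order is guaranteed to be accurate.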
Index Terms
- Optimization of conjunctive predicates for main memory column stores