ABSTRACT
State-of-the-art commercial query optimizers employ cost-based optimization and exploit dynamic programming (DP) to find the optimal query execution plan (QEP) without evaluating redundant sub-plans. The number of alternative QEPs enumerated by a DP query optimizer can grow exponentially as the number of joins in the query increases. Recently, to exploit the coming wave of multi-core processor architectures, a state-of-the-art parallel optimization algorithm [14], referred to as PDPsva, was proposed to parallelize the time-consuming DP query optimization process itself. While PDPsva significantly extends the practical use of DP to queries having up to 20-25 tables, it has several limitations: 1) it supports only the size-driven DP enumerator, 2) it allocates the search space statically, and 3) it does not fully exploit parallelism. In this paper, we propose the first generic solution for parallelizing any type of bottom-up optimizer, including the graph-traversal-driven type, and for supporting dynamic search space allocation and full parallelism. This is a challenging problem, since recently developed, state-of-the-art DP optimizers such as DPccp [21] and DPhyp [22] are very difficult to parallelize due to tangled dependencies in the join pairs they generate. Unless the solution is very carefully devised, many synchronization conflicts are bound to occur. By viewing a serial bottom-up optimizer as one that generates a totally ordered sequence of join pairs in a streaming fashion, we propose a novel concept of dependency-aware reordering, which minimizes the waiting time caused by dependencies among join pairs. To maximize parallelism, we also introduce a series of novel performance optimization techniques: 1) pipelining of join pair generation and plan generation; 2) a synchronization-free global MEMO; and 3) threading across dependencies.
Through extensive experiments with various query topologies, we show that our solution supports any type of bottom-up optimization and achieves linear speedup for each type. Although our solution is generic, its optimization techniques allow it to outperform PDPsva, which is tailored to size-driven enumeration. Experimental results also show that our solution is much more robust than PDPsva with respect to search space allocation.
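To make the idea of a "totally ordered sequence of join pairs" concrete, the following is a minimal Python sketch of a serial size-driven bottom-up enumerator and of the dependency that a parallel consumer would have to respect. It is an illustration under stated assumptions, not the algorithm of the paper: all function names (`size_driven_join_pairs`, `respects_dependencies`, `group_by_result`) are hypothetical, the join graph is a plain edge list over table indices, connected subsets are found by brute force, and no cost model or MEMO is included. The assumed dependency rule is that a join pair may only be processed after every pair producing one of its operands has been processed.

```python
from collections import defaultdict
from itertools import combinations


def is_connected(tables, edges):
    """Return True if `tables` induces a connected subgraph of the join graph."""
    tables = set(tables)
    start = next(iter(tables))
    seen, stack = {start}, [start]
    while stack:
        t = stack.pop()
        for a, b in edges:
            for src, dst in ((a, b), (b, a)):
                if src == t and dst in tables and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
    return seen == tables


def size_driven_join_pairs(n, edges):
    """Yield (left, right) join pairs in size-driven bottom-up order.

    Pairs producing 2-table results come first, then 3-table results, and
    so on. Cross products are skipped. Brute force; small n only.
    """
    by_size = defaultdict(list)
    for size in range(1, n + 1):
        for combo in combinations(range(n), size):
            if is_connected(combo, edges):
                by_size[size].append(frozenset(combo))
    for size in range(2, n + 1):
        for left_size in range(1, size // 2 + 1):
            for left in by_size[left_size]:
                for right in by_size[size - left_size]:
                    if left & right:
                        continue  # operands must be disjoint
                    # Equal-sized operands: emit each unordered pair once.
                    if 2 * left_size == size and min(left) > min(right):
                        continue
                    if any((a in left and b in right) or
                           (a in right and b in left) for a, b in edges):
                        yield left, right


def respects_dependencies(pairs):
    """Check that each multi-table operand is final before it is used.

    An operand is treated as final once the last pair producing it has
    been processed (an assumption of this sketch).
    """
    last_producer = {}
    for i, (left, right) in enumerate(pairs):
        last_producer[left | right] = i
    for i, (left, right) in enumerate(pairs):
        for operand in (left, right):
            if len(operand) > 1 and last_producer.get(operand, i) >= i:
                return False
    return True


def group_by_result(pairs):
    """Group join pairs by the table set they produce."""
    groups = defaultdict(list)
    for left, right in pairs:
        groups[left | right].append((left, right))
    return groups


if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3)]  # 4-table chain query
    pairs = list(size_driven_join_pairs(4, chain))
    print(len(pairs), "join pairs; dependencies respected:",
          respects_dependencies(pairs))
    for result, group in group_by_result(pairs).items():
        print(sorted(result), "<-", len(group), "pair(s)")
```

Grouping pairs by result set shows which pairs write to the same MEMO entry and which results must be complete before later pairs can start. A reordering scheme is free to permute the stream as long as a check like `respects_dependencies` still holds.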
REFERENCES
- A. Agarwal, D. A. Kranz, and V. Natarajan. Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors. IEEE TPDS, 6(9), 1995.
- K. Agrawal, Y. He, and C. E. Leiserson. An empirical evaluation of work stealing with parallelism feedback. In ICDCS, 2006.
- C. E. R. Alves, E. Caceres, and F. K. H. A. Dehne. Parallel dynamic programming for solving the string editing problem on a CGM/BSP. In SPAA, 2002.
- K. P. Bennett, M. C. Ferris, and Y. E. Ioannidis. A genetic algorithm for database query optimization. In ICGA, 1991.
- G. E. Blelloch, P. B. Gibbons, and Y. Matias. Provably efficient scheduling for languages with fine-grained parallelism. J. ACM, 46(2):281--321, 1999.
- R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM, 46(5):720--748, 1999.
- C. Chekuri, W. Hasan, and R. Motwani. Scheduling problems in parallel query optimization. In PODS, 1995.
- T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 2nd edition, 2001.
- M. Elkihel and D. E. Baz. Load balancing in a parallel dynamic programming multi-method applied to the 0-1 knapsack problem. In PDP, pages 127--132, 2006.
- S. Englert, R. Glasstone, and W. Hasan. Parallelism and its price: A case study of NonStop SQL/MP. SIGMOD Record, 24(4):61--71, 1995.
- J. Erickson. Multicore and GPUs: One tool, two processors. Dr. Dobb's Journal, 2007, http://www.ddj.com/hpc-highperformance-computing/199501192.
- M. B. et al. Pam: a novel performance/power aware meta-scheduler for multi-core systems. In SC, 2008.
- A. Grama, A. Gupta, G. Karypis, and V. Kumar. Introduction to Parallel Computing: Design and Analysis of Algorithms. McGraw-Hill, 3rd edition, 1994.
- W.-S. Han, W. Kwak, J. Lee, G. M. Lohman, and V. Markl. Parallelizing query optimization. In VLDB, 2008.
- W.-S. Han and J. Lee. Dependency-Aware Reordering for Parallelizing Query Optimization in Multi-Core CPUs. http://wshan.org/HanL09Reordering.pdf.
- W. Hong and M. Stonebraker. Optimization of parallel query execution plans in XPRS. Distrib. Parallel Databases, 1(1):9--32, 1993.
- S.-H. S. Huang, H. Liu, and V. Viswanathan. Parallel dynamic programming. IEEE TPDS, 5(3), 1994.
- I. F. Ilyas, J. Rao, G. M. Lohman, D. Gao, and E. T. Lin. Estimating compilation time of a query optimizer. In SIGMOD, 2003.
- Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. In SIGMOD, 1990.
- R. S. G. Lanzelotte, P. Valduriez, M. Zait, and M. Ziane. Industrial-strength parallel query optimization: issues and lessons. Inf. Syst., 19(4), 1994.
- G. Moerkotte and T. Neumann. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees without cross products. In VLDB, 2006.
- G. Moerkotte and T. Neumann. Dynamic programming strikes back. In SIGMOD, 2008.
- T. Morzy, M. Matysiak, and S. Salza. Tabu search optimization of large join queries. In M. Jarke, J. A. B. Jr., and K. G. Jeffery, editors, EDBT, volume 779, pages 309--322, 1994.
- K. Ono and G. M. Lohman. Measuring the complexity of join enumeration in query optimization. In VLDB, 1990.
- N. W. Paton, V. Raman, G. Swart, and I. Narang. Autonomic query parallelization using non-dedicated computers: An evaluation of adaptivity options. In ICAC, 2006.
- PostgreSQL version 8.3. http://www.postgresql.org.
- F. Rastello and Y. Robert. Automatic partitioning of parallel loops with parallelepiped-shaped tiles. IEEE TPDS, 13(5):460--470, 2002.
- J. Reinders. Intel Threading Building Blocks. O'Reilly Media, Inc, Sebastopol, 2007.
- P. Stenstrom. IPDPS panel: Is the multi-core roadmap going to live up to its promises? IPDPS, 2007.
- H. Sutter and J. Larus. Software and the concurrency revolution. Queue, 3(7):54--62, 2005.
- A. N. Swami. Optimization of large join queries: Combining heuristic and combinatorial techniques. In SIGMOD, 1989.
- A. N. Swami and A. Gupta. Optimization of large join queries. In SIGMOD, 1988.
- G. Tan, S. Feng, and N. Sun. Biology - locality and parallelism optimization for dynamic programming algorithm in bioinformatics. In SC, 2006.
- G. Tan, N. Sun, and G. R. Gao. A parallel dynamic programming algorithm on a multi-core architecture. In SPAA, 2007.
- D. Wentzlaff and A. Agarwal. The Case for a Factored Operating System (fos). http://hdl.handle.net/1721.1/42894