DOI: 10.1145/1559845.1559853
research-article

Dependency-aware reordering for parallelizing query optimization in multi-core CPUs

Published: 29 June 2009

ABSTRACT

State-of-the-art commercial query optimizers employ cost-based optimization and exploit dynamic programming (DP) to find the optimal query execution plan (QEP) without evaluating redundant sub-plans. The number of alternative QEPs enumerated by a DP query optimizer can grow exponentially with the number of joins in the query. Recently, by exploiting the coming wave of multi-core processor architectures, a state-of-the-art parallel optimization algorithm [14], referred to as PDPsva, was proposed to parallelize the time-consuming DP query optimization process itself. While PDPsva significantly extends the practical use of DP to queries having up to 20-25 tables, it has several limitations: 1) it supports only the size-driven DP enumerator, 2) it allocates the search space statically, and 3) it does not fully exploit parallelism. In this paper, we propose the first generic solution for parallelizing any type of bottom-up optimizer, including the graph-traversal-driven type, and for supporting dynamic search space allocation and full parallelism. This is a challenging problem, since recently developed state-of-the-art DP optimizers such as DPccp [21] and DPhyp [22] are very difficult to parallelize due to tangled dependencies among the join pairs they generate. Unless the solution is very carefully devised, many synchronization conflicts are bound to occur. By viewing a serial bottom-up optimizer as one that generates a totally ordered sequence of join pairs in a streaming fashion, we propose a novel concept of dependency-aware reordering, which minimizes the waiting time caused by dependencies between join pairs. To maximize parallelism, we also introduce a series of novel performance optimization techniques: 1) pipelining of join pair generation and plan generation; 2) the synchronization-free global MEMO; and 3) threading across dependencies.
Through extensive experiments with various query topologies, we show that our solution supports any type of bottom-up optimization, achieving linear speedup for each type. Although our solution is generic, its sophisticated optimization techniques allow our parallel optimizer to outperform PDPsva, which is tailored to size-driven enumeration. Experimental results also show that our solution is much more robust than PDPsva with respect to search space allocation.
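
The idea of treating a serial optimizer as a stream of join pairs and regrouping that stream by dependency can be illustrated with a small sketch. This is not the paper's algorithm: the enumerator `serial_join_pairs`, the grouping function `reorder_by_dependency`, the bitmask encoding of table sets, and the choice to level pairs by result-set size are all assumptions made for illustration. The sketch relies only on the fact that a join pair (left, right) needs the plans of its two inputs, and both inputs are strictly smaller than the result.

```python
from collections import defaultdict


def serial_join_pairs(n):
    """Hypothetical serial enumerator (an assumption, not DPccp or DPhyp).

    Yields join pairs (left, right) as bitmasks of disjoint, non-empty table
    sets for a query over n tables in which every pair of tables is joinable.
    Each unordered pair is emitted once.
    """
    for result in range(1, 1 << n):
        # Walk over all proper, non-empty subsets of `result`.
        left = (result - 1) & result
        while left:
            right = result ^ left
            if left < right:  # skip the mirrored duplicate
                yield left, right
            left = (left - 1) & result


def reorder_by_dependency(pairs):
    """Regroup a stream of join pairs into dependency-free levels.

    Level k holds every pair whose result covers k tables. Inputs of such a
    pair cover fewer than k tables, so all pairs in one level depend only on
    earlier levels. Inside a level, pairs are bucketed by result set so that
    one worker could own one result entry (an illustrative choice).
    """
    levels = defaultdict(lambda: defaultdict(list))
    for left, right in pairs:
        result = left | right
        levels[bin(result).count("1")][result].append((left, right))
    return [levels[size] for size in sorted(levels)]


if __name__ == "__main__":
    for level in reorder_by_dependency(serial_join_pairs(3)):
        for result, bucket in sorted(level.items()):
            print(f"result {result:03b}: {bucket}")
```

In this sketch, the buckets of one level could be handed to separate threads, with a barrier between levels. The abstract describes a finer-grained scheme (pipelining, a synchronization-free global MEMO, and threading across dependencies) that this example does not attempt to reproduce.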

References

  1. A. Agarwal, D. A. Kranz, and V. Natarajan. Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors. IEEE TPDS, 6(9), 1995.
  2. K. Agrawal, Y. He, and C. E. Leiserson. An empirical evaluation of work stealing with parallelism feedback. In ICDCS, 2006.
  3. C. E. R. Alves, E. Caceres, and F. K. H. A. Dehne. Parallel dynamic programming for solving the string editing problem on a cgm/bsp. In SPAA, 2002.
  4. K. P. Bennett, M. C. Ferris, and Y. E. Ioannidis. A genetic algorithm for database query optimization. In ICGA, 1991.
  5. G. E. Blelloch, P. B. Gibbons, and Y. Matias. Provably efficient scheduling for languages with fine-grained parallelism. J. ACM, 46(2):281--321, 1999.
  6. R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM, 46(5):720--748, 1999.
  7. C. Chekuri, W. Hasan, and R. Motwani. Scheduling problems in parallel query optimization. In PODS, 1995.
  8. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 2nd edition, 2001.
  9. M. Elkihel and D. E. Baz. Load balancing in a parallel dynamic programming multi-method applied to the 0-1 knapsack problem. In PDP, pages 127--132, 2006.
  10. S. Englert, R. Glasstone, and W. Hasan. Parallelism and its price: A case study of nonstop sql/mp. SIGMOD Record, 24(4):61--71, 1995.
  11. J. Erickson. Multicore and gpus: One tool, two processors. Dr. Dobb's Journal, 2007, http://www.ddj.com/hpc-highperformance-computing/199501192.
  12. M. B. et al. Pam: a novel performance/power aware meta-scheduler for multi-core systems. In SC, 2008.
  13. A. Grama, A. Gupta, G. Karypis, and V. Kumar. Introduction to Parallel Computing: Design and Analysis of Algorithms. McGraw-Hill, 3rd edition, 1994.
  14. W.-S. Han, W. Kwak, J. Lee, G. M. Lohman, and V. Markl. Parallelizing query optimization. In VLDB, 2008.
  15. W.-S. Han and J. Lee. Dependency-Aware Reordering for Parallelizing Query Optimization in Multi-Core CPUs. http://wshan.org/HanL09Reordering.pdf.
  16. W. Hong and M. Stonebraker. Optimization of parallel query execution plans in xprs. Distrib. Parallel Databases, 1(1):9--32, 1993.
  17. S.-H. S. Huang, H. Liu, and V. Viswanathan. Parallel dynamic programming. IEEE TPDS, 5(3), 1994.
  18. I. F. Ilyas, J. Rao, G. M. Lohman, D. Gao, and E. T. Lin. Estimating compilation time of a query optimizer. In SIGMOD, 2003.
  19. Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. In SIGMOD, 1990.
  20. R. S. G. Lanzelotte, P. Valduriez, M. Zait, and M. Ziane. Industrial-strength parallel query optimization: issues and lessons. Inf. Syst., 19(4), 1994.
  21. G. Moerkotte and T. Neumann. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees without cross products. In VLDB, 2006.
  22. G. Moerkotte and T. Neumann. Dynamic programming strikes back. In SIGMOD, 2008.
  23. T. Morzy, M. Matysiak, and S. Salza. Tabu search optimization of large join queries. In M. Jarke, J. A. B. Jr., and K. G. Jeffery, editors, EDBT, volume 779, pages 309--322, 1994.
  24. K. Ono and G. M. Lohman. Measuring the complexity of join enumeration in query optimization. In VLDB, 1990.
  25. N. W. Paton, V. Raman, G. Swart, and I. Narang. Autonomic query parallelization using non-dedicated computers: An evaluation of adaptivity options. In ICAC, 2006.
  26. Postgresql version 8.3. http://www.postgresql.org.
  27. F. Rastello and Y. Robert. Automatic partitioning of parallel loops with parallelepiped-shaped tiles. IEEE TPDS, 13(5):460--470, 2002.
  28. J. Reinders. Intel Threading Building Blocks. O'Reilly Media, Inc, Sebastopol, 2007.
  29. P. Stenstrom. Ipdps panel: Is the multi-core roadmap going to live up to its promises? IPDPS, 2007.
  30. H. Sutter and J. Larus. Software and the concurrency revolution. Queue, 3(7):54--62, 2005.
  31. A. N. Swami. Optimization of large join queries: Combining heuristic and combinatorial techniques. In SIGMOD, 1989.
  32. A. N. Swami and A. Gupta. Optimization of large join queries. In SIGMOD, 1988.
  33. G. Tan, S. Feng, and N. Sun. Biology - locality and parallelism optimization for dynamic programming algorithm in bioinformatics. In SC, 2006.
  34. G. Tan, N. Sun, and G. R. Gao. A parallel dynamic programming algorithm on a multi-core architecture. In SPAA, 2007.
  35. D. Wentzlaff and A. Agarwal. The Case for a Factored Operating System (fos). http://hdl.handle.net/1721.1/42894.

Published in

SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data
June 2009, 1168 pages
ISBN: 9781605585512
DOI: 10.1145/1559845

Copyright © 2009 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
