skip to main content
research-article
Artifacts Available / v1.1

An Efficient Transfer Learning Based Configuration Adviser for Database Tuning

Published:01 November 2023Publication History
Skip Abstract Section

Abstract

In recent years, a wide spectrum of database tuning systems have emerged to automatically optimize database performance. However, these systems require a significant number of workload runs to deliver a satisfactory level of database performance, which is time-consuming and resource-intensive. While many attempts have been made to address this issue by using advanced search optimizers, empirical studies have shown that no single optimizer can dominate the rest across tuning tasks with different characteristics. Choosing an inferior optimizer may significantly increase the tuning cost. Unfortunately, current practices typically adopt a single optimizer or follow simple heuristics without considering the task characteristics. Consequently, they fail to choose the most suitable optimizer for a specific task. Furthermore, constructing a compact search space can significantly improve the tuning efficiency. However, current practices neglect the setting of the value range for each knob and rely on a large number of workload runs to select important knobs, resulting in a considerable amount of unnecessary exploration in ineffective regions.

To pursue efficient database tuning, in this paper, we argue that it is imperative to have an approach that can judiciously determine a precise space and search optimizer for an arbitrary tuning task. To this end, we propose OpAdviser, which exploits the information learned from historical tuning tasks to guide the search space construction and search optimizer selection. Our design can greatly accelerate the tuning process and further reduce the required workload runs. Given a tuning task, OpAdviser learns the geometries of search space, including important knobs and their effective regions, from relevant previous tasks. It then constructs the target search space from the geometries according to the on-the-fly task similarity, which allows for adaptive adjustment of the target space. OpAdviser also employs a pairwise ranking model to capture the relationship from task characteristics to optimizer rankings. This ranking model is invoked during tuning and predicts the best optimizer to be used for the current iteration. We conduct extensive evaluations across a diverse set of workloads, where OpAdviser achieves 9.2% higher throughput and significantly reduces the number of workload runs with an average speedup of ~3.4x compared to state-of-the-art tuning systems.

References

  1. 2015. TPC-H benchmark. http://www.tpc.org/tpch/.Google ScholarGoogle Scholar
  2. 2022. InnoDB Startup Options and System Variables. https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html.Google ScholarGoogle Scholar
  3. 2022. Server System Variables. https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html.Google ScholarGoogle Scholar
  4. Sanjay Agrawal, Surajit Chaudhuri, Lubor Kollár, Arunprasad P. Marathe, Vivek R. Narasayya, and Manoj Syamala. 2004. Database Tuning Advisor for Microsoft SQL Server 2005. In VLDB. Morgan Kaufmann, 1110--1121.Google ScholarGoogle Scholar
  5. Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, and Bohan Zhang. 2017. Automatic Database Management System Tuning Through Large-scale Machine Learning. In SIGMOD Conference. ACM, 1009--1024.Google ScholarGoogle Scholar
  6. Dana Van Aken, Dongsheng Yang, Sebastien Brillard, Ari Fiorino, Bohan Zhang, Christian Billian, and Andrew Pavlo. 2021. An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems. Proc. VLDB Endow. 14, 7 (2021), 1241--1253.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Tianyi Bai, Yang Li, Yu Shen, Xinyi Zhang, Wentao Zhang, and Bin Cui. 2023. Transfer Learning for Bayesian Optimization: A Survey. CoRR abs/2302.05927 (2023).Google ScholarGoogle Scholar
  8. Christopher JC Burges. 2010. From ranknet to lambdarank to lambdamart: An overview. Learning 11, 23--581 (2010), 81.Google ScholarGoogle Scholar
  9. Baoqing Cai, Yu Liu, Ce Zhang, Guangyu Zhang, Ke Zhou, Li Liu, Chunhua Li, Bin Cheng, Jie Yang, and Jiashu Xing. 2022. HUNTER: An Online Cloud Database Hybrid Tuning System for Personalized Requirements. In SIGMOD Conference. ACM, 646--659.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Stefano Cereda, Stefano Valladares, Paolo Cremonesi, and Stefano Doni. 2021. CGPTuner: a Contextual Gaussian Process Bandit Approach for the Automatic Tuning of IT Configurations Under Varying Workload Conditions. Proc. VLDB Endow. 14, 8 (2021), 1401--1413.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Surajit Chaudhuri and Gerhard Weikum. 2006. Foundations of Automated Database Tuning. In VLDB. ACM, 1265.Google ScholarGoogle Scholar
  12. Mansheng Chen, Jia-Qi Lin, Xiang-Long Li, Bao-Yu Liu, Chang-Dong Wang, Dong Huang, and Jian-Huang Lai. 2022. Representation Learning in Multi-view Clustering: A Literature Review. Data Sci. Eng. 7, 3 (2022), 225--241.Google ScholarGoogle ScholarCross RefCross Ref
  13. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT (1). Association for Computational Linguistics, 4171--4186.Google ScholarGoogle Scholar
  14. Djellel Eddine Difallah, Andrew Pavlo, Carlo Curino, and Philippe Cudré-Mauroux. 2013. OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases. Proc. VLDB Endow. 7, 4 (2013), 277--288.Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Songyun Duan, Vamsidhar Thummala, and Shivnath Babu. 2009. Tuning Database Configuration Parameters with iTuned. Proc. VLDB Endow. 2, 1 (2009), 1246--1257.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Yubin Duan, Ning Wang, and Jie Wu. 2022. Accelerating DAG-Style Job Execution via Optimizing Resource Pipeline Scheduling. J. Comput. Sci. Technol. 37, 4 (2022), 852--868.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Ayat Fekry, Lucian Carata, Thomas F. J.-M. Pasquier, Andrew Rice, and Andy Hopper. 2020. To Tune or Not to Tune?: In Search of Optimal Configurations for Data Analytics. In KDD. ACM, 2494--2504.Google ScholarGoogle Scholar
  18. Ralf Herbrich, Thore Graepel, and Klaus Obermayer. 1999. Support vector learning for ordinal regression. (1999).Google ScholarGoogle Scholar
  19. Chun Kit Jeffery Hou and Kamran Behdinan. 2022. Dimensionality Reduction in Surrogate Modeling: A Review of Combined Methods. Data Sci. Eng. 7, 4 (2022), 402--427.Google ScholarGoogle ScholarCross RefCross Ref
  20. Shiyue Huang, Yanzhao Qin, Xinyi Zhang, Yaofeng Tu, Zhongliang Li, and Bin Cui. 2023. Survey on performance optimization for database systems. Sci. China Inf. Sci. 66, 2 (2023).Google ScholarGoogle Scholar
  21. Shiyue Huang, Ziwei Wang, Xinyi Zhang, Yaofeng Tu, Zhongliang Li, and Bin Cui. 2023. DBPA: A Benchmark for Transactional Database Performance Anomalies. Proc. ACM Manag. Data 1, 1 (2023), 72:1--72:26.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. 2011. Sequential Model-Based Optimization for General Algorithm Configuration. In LION (Lecture Notes in Computer Science), Vol. 6683. Springer, 507--523.Google ScholarGoogle Scholar
  23. Konstantinos Kanellis, Ramnatthan Alagappan, and Shivaram Venkataraman. 2020. Too Many Knobs to Tune? Towards Faster Database Tuning by Pre-selecting Important Knobs. In HotStorage. USENIX Association.Google ScholarGoogle Scholar
  24. Konstantinos Kanellis, Cong Ding, Brian Kroth, Andreas Müller, Carlo Curino, and Shivaram Venkataraman. 2022. LlamaTune: Sample-Efficient DBMS Configuration Tuning. Proc. VLDB Endow. 15, 11 (2022), 2953--2965.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Aaron Klein. 2017. RoBO : A Flexible and Robust Bayesian Optimization Framework in Python.Google ScholarGoogle Scholar
  26. Jan Kossmann and Rainer Schlosser. 2020. Self-driving database systems: a conceptual approach. Distributed Parallel Databases 38, 4 (2020), 795--817.Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Mayuresh Kunjir and Shivnath Babu. 2020. Black or White? How to Develop an AutoTuner for Memory-based Analytics. In SIGMOD Conference. ACM, 1667--1683.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter A. Boncz, Alfons Kemper, and Thomas Neumann. 2015. How Good Are Query Optimizers, Really? Proc. VLDB Endow. 9, 3 (2015), 204--215.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Stefan Lessmann, Robert Stahlbock, and Sven F Crone. 2005. Optimizing hyper-parameters of support vector machines by genetic algorithms.. In IC-AI. 74--82.Google ScholarGoogle Scholar
  30. David D. Lewis and Jason Catlett. 1994. Heterogeneous Uncertainty Sampling for Supervised Learning. In ICML. Morgan Kaufmann, 148--156.Google ScholarGoogle Scholar
  31. Guoliang Li, Xuanhe Zhou, Shifu Li, and Bo Gao. 2019. QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. Proc. VLDB Endow. 12, 12 (2019), 2118--2130.Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Yang Li, Huaijun Jiang, Yu Shen, Yide Fang, Xiaofeng Yang, Danqing Huang, Xinyi Zhang, Wentao Zhang, Ce Zhang, Peng Chen, and Bin Cui. 2023. Towards General and Efficient Online Tuning for Spark. Proc. VLDB Endow. 16, 12 (2023), 3570--3583.Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Jinqing Lian, Xinyi Zhang, Yingxia Shao, Zenglin Pu, Qingfeng Xiang, Yawen Li, and Bin Cui. 2023. ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems. CoRR abs/2309.12239 (2023).Google ScholarGoogle Scholar
  34. Hao Liao, Qi-Xin Liu, Ze-cheng Huang, Ke-Zhong Lu, Chi Ho Yeung, and Yi-Cheng Zhang. 2022. Accumulative Time Based Ranking Method to Reputation Evaluation in Information Networks. J. Comput. Sci. Technol. 37, 4 (2022), 960--974.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2016. Continuous control with deep reinforcement learning. In ICLR (Poster).Google ScholarGoogle Scholar
  36. Scott M. Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In NIPS. 4765--4774.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Lin Ma, Dana Van Aken, Ahmed Hefny, Gustavo Mezerhane, Andrew Pavlo, and Geoffrey J. Gordon. 2018. Query-based Workload Forecasting for Self-Driving Database Management Systems. In SIGMOD Conference. ACM, 631--645.Google ScholarGoogle Scholar
  38. Michael D. McKay. 1992. Latin Hypercube Sampling as a Tool in Uncertainty Analysis of Computer Models. In WSC. ACM Press, 557--564.Google ScholarGoogle Scholar
  39. Amin Nayebi, Alexander Munteanu, and Matthias Poloczek. 2019. A Framework for Bayesian Optimization in Embedded Subspaces. In ICML (Proceedings of Machine Learning Research), Vol. 97. PMLR, 4752--4761.Google ScholarGoogle Scholar
  40. Valerio Perrone and Huibin Shen. 2019. Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning. In NeurIPS. 12751--12761.Google ScholarGoogle Scholar
  41. Omer Sagi and Lior Rokach. 2018. Ensemble learning: A survey. WIREs Data Mining Knowl. Discov. 8, 4 (2018).Google ScholarGoogle Scholar
  42. Dennis E. Shasha and Philippe Bonnet. 2002. Database Tuning: Principles, Experiments, and Troubleshooting Techniques. In VLDB. Morgan Kaufmann.Google ScholarGoogle Scholar
  43. Dennis E. Shasha and Steve Rozen. 1992. Database Tuning. In VLDB. Morgan Kaufmann, 313.Google ScholarGoogle Scholar
  44. Adam J. Storm, Christian Garcia-Arellano, Sam Lightstone, Yixin Diao, and Maheswaran Surendra. 2006. Adaptive Self-tuning Memory in DB2. In VLDB. ACM, 1081--1092.Google ScholarGoogle Scholar
  45. Immanuel Trummer. 2022. DB-BERT: A Database Tuning Tool that "Reads the Manual". In SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022, Zachary G. Ives, Angela Bonifati, and Amr El Abbadi (Eds.). ACM, 190--203. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Bing Wei, Limin Xiao, Yao Song, Guangjun Qin, Jinbin Zhu, Baicheng Yan, Chaobo Wang, and Zhisheng Huo. 2022. A self-tuning client-side metadata prefetching scheme for wide area network file systems. Sci. China Inf. Sci. 65, 3 (2022).Google ScholarGoogle Scholar
  47. Gerhard Weikum, Axel Monkeberg, Christof Hasse, and Peter Zabback. 2002. Self-tuning Database Technology and Information Services: from Wishful Thinking to Viable Engineering. In VLDB. Morgan Kaufmann, 20--31.Google ScholarGoogle Scholar
  48. David H. Wolpert and William G. Macready. 1997. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1, 1 (1997), 67--82. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Huan Zhang, Liangxiao Jiang, and Chaoqun Li. 2022. Attribute augmented and weighted naive Bayes. Sci. China Inf. Sci. 65, 12 (2022).Google ScholarGoogle Scholar
  50. Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng, Li Liu, Minwei Ran, and Zekang Li. 2019. An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning. In SIGMOD Conference. ACM, 415--432.Google ScholarGoogle Scholar
  51. Xinyi Zhang, Zhuo Chang, Yang Li, Hong Wu, Jian Tan, Feifei Li, and Bin Cui. 2022. Facilitating Database Tuning with Hyper-Parameter Optimization: A Comprehensive Experimental Evaluation. Proc. VLDB Endow. 15, 9 (2022), 1808--1821.Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Xinyi Zhang, Zhuo Chang, Hong Wu, Yang Li, Jia Chen, Jian Tan, Feifei Li, and Bin Cui. 2023. A Unified and Efficient Coordinating Framework for Autonomous DBMS Tuning. Proc. ACM Manag. Data 1, 2 (2023), 186:1--186:26.Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Xinyi Zhang, Hong Wu, Zhuo Chang, Shuowei Jin, Jian Tan, Feifei Li, Tieying Zhang, and Bin Cui. 2021. ResTune: Resource Oriented Tuning Boosted by Meta-Learning for Cloud Databases. In SIGMOD Conference. ACM, 2102--2114.Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Xinyi Zhang, Hong Wu, Yang Li, Jian Tan, Feifei Li, and Bin Cui. 2022. Towards Dynamic and Safe Configuration Tuning for Cloud Databases. CoRR abs/2203.14473 (2022).Google ScholarGoogle Scholar
  55. Xinyi Zhang, Hong Wu, Yang Li, Jian Tan, Feifei Li, and Bin Cui. 2022. Towards Dynamic and Safe Configuration Tuning for Cloud Databases. In SIGMOD Conference. ACM, 631--645.Google ScholarGoogle Scholar
  56. Yuqing Zhu, Jianxun Liu, Mengying Guo, Yungang Bao, Wenlong Ma, Zhuoyue Liu, Kunpeng Song, and Yingchun Yang. 2017. BestConfig: tapping the performance potential of systems via automatic configuration tuning. In SoCC. ACM, 338--350.Google ScholarGoogle Scholar
  57. Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Hui Xiong, and Qing He. 2021. A Comprehensive Survey on Transfer Learning. Proc. IEEE 109, 1 (2021), 43--76.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. An Efficient Transfer Learning Based Configuration Adviser for Database Tuning
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image Proceedings of the VLDB Endowment
          Proceedings of the VLDB Endowment  Volume 17, Issue 3
          November 2023
          353 pages
          ISSN:2150-8097
          Issue’s Table of Contents

          Publisher

          VLDB Endowment

          Publication History

          • Published: 1 November 2023
          Published in pvldb Volume 17, Issue 3

          Check for updates

          Qualifiers

          • research-article
        • Article Metrics

          • Downloads (Last 12 months)107
          • Downloads (Last 6 weeks)49

          Other Metrics

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader