research-article
Artifacts Available / v1.1

HyBench: A New Benchmark for HTAP Databases

Published: 02 May 2024

Abstract

In this paper, we propose HyBench, a new benchmark for HTAP databases. First, we generate the testing data by simulating a representative HTAP application. In particular, we develop a time-dependent generation phase and an anomaly generation phase for testing HTAP with large cardinality and various anomalies. Second, we propose a set of hybrid workloads. Specifically, we design 18 read/write transactions, 13 analytical queries, and a mixed workload of 6 analytical transactions and 6 interactive queries. We also develop a graph-based parameter curation method to control the access patterns of the hybrid workload, including skewed access and data contention. Third, we propose a unified metric for quantifying the overall HTAP performance. In particular, we introduce a query-driven method that evaluates data freshness (the lag time between analytics and transactions). We then introduce a three-phase execution rule to compute a unified metric that combines data freshness with the performance of OLTP (TPS), OLAP (QPS), and OLXP (XPS). To verify the effectiveness of HyBench and to debunk the myths surrounding different HTAP architectures, we conduct extensive experiments over five HTAP databases.
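
As a rough illustration of the freshness notion mentioned above (lag time between analytics and transactions), the following minimal sketch computes the lag from two timestamps. It is not the query-driven method of the paper, which the abstract does not detail; the function name `freshness_lag` and both parameters are hypothetical and chosen only for this example.

```python
from datetime import datetime, timedelta


def freshness_lag(latest_committed: datetime, latest_visible: datetime) -> timedelta:
    """Hypothetical helper: lag between the newest committed transaction
    and the newest transaction that an analytical query can already see."""
    lag = latest_committed - latest_visible
    # Assumption: a negative difference only reflects clock jitter, so clamp it to zero.
    return max(lag, timedelta(0))


# Example values (made up for illustration only).
committed = datetime(2024, 1, 1, 12, 0, 0, 500000)
visible = datetime(2024, 1, 1, 12, 0, 0, 200000)
print(freshness_lag(committed, visible).total_seconds())
```

A lag of zero under this sketch would mean the analytical side sees every committed transaction; larger values mean staler analytical results.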


Published in

Proceedings of the VLDB Endowment, Volume 17, Issue 5
January 2024
233 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Publication History

Published: 2 May 2024