research-article

HyBench: A New Benchmark for HTAP Databases

Authors:
Chao Zhang (Tsinghua University),
Guoliang Li (Tsinghua University),
Tao Lv (China Software Testing Center)
Proceedings of the VLDB Endowment, Volume 17, Issue 5, pp. 939–951. https://doi.org/10.14778/3641204.3641206

Published: 02 May 2024

Abstract

In this paper, we propose HyBench, a new benchmark for HTAP databases. First, we generate the testing data by simulating a representative HTAP application. In particular, we develop a time-dependent generation phase and an anomaly generation phase for testing HTAP databases with large cardinality and various anomalies. Second, we propose a set of hybrid workloads. Specifically, we design 18 read/write transactions, 13 analytical queries, and a mixed workload of 6 analytical transactions and 6 interactive queries. We also develop a graph-based parameter curation method to control the access patterns of the hybrid workload, including skewed access and data contention. Third, we propose a unified metric for quantifying overall HTAP performance. In particular, we introduce a query-driven method that evaluates data freshness (the lag time between analytics and transactions). We then introduce a three-phase execution rule to compute a unified metric that combines the performance of OLTP (TPS), OLAP (QPS), and OLXP (XPS) with data freshness. To verify the effectiveness of HyBench and to debunk myths about different HTAP architectures, we conduct extensive experiments on five HTAP databases.
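To make the notion of data freshness concrete, the following is a minimal sketch in Python. It assumes freshness is measured as the gap between the newest transaction committed on the transactional side and the newest transaction visible to an analytical query, which matches the abstract's description of a lag time. The abstract does not state how the unified metric is computed, so `illustrative_score` (a geometric mean of the three throughputs, discounted by the lag) is purely an assumption for illustration and should not be read as the HyBench formula. All names here (`FreshnessProbe`, `freshness_lag`, `mean_freshness`, `illustrative_score`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FreshnessProbe:
    """One paired observation (hypothetical structure, not taken from the paper)."""
    latest_commit_ts: float   # commit time of the newest transaction on the OLTP side (seconds)
    latest_visible_ts: float  # commit time of the newest transaction the analytical query can see


def freshness_lag(probe: FreshnessProbe) -> float:
    """Lag in seconds between transactions and analytics; clamped at zero."""
    return max(0.0, probe.latest_commit_ts - probe.latest_visible_ts)


def mean_freshness(probes: Sequence[FreshnessProbe]) -> float:
    """Average lag over a set of probes."""
    if not probes:
        raise ValueError("at least one probe is required")
    return sum(freshness_lag(p) for p in probes) / len(probes)


def illustrative_score(tps: float, qps: float, xps: float, freshness_s: float) -> float:
    """ASSUMPTION: an illustrative way to fold four numbers into one score.

    Takes the geometric mean of the three throughputs and divides it by
    (1 + lag), so that staler analytics lower the score. This is not
    claimed to be the formula used by HyBench.
    """
    throughput = (tps * qps * xps) ** (1.0 / 3.0)
    return throughput / (1.0 + freshness_s)


if __name__ == "__main__":
    probes = [
        FreshnessProbe(latest_commit_ts=10.0, latest_visible_ts=9.5),
        FreshnessProbe(latest_commit_ts=20.0, latest_visible_ts=20.0),
        FreshnessProbe(latest_commit_ts=30.0, latest_visible_ts=29.0),
    ]
    lag = mean_freshness(probes)
    print(f"mean freshness lag: {lag:.3f} s")
    print(f"illustrative score: {illustrative_score(8.0, 8.0, 8.0, lag):.3f}")
```

Under this sketch, a system whose analytical side sees every committed transaction immediately has a lag of zero and keeps its full throughput score, while any replication or merge delay reduces it.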

Published in
Proceedings of the VLDB Endowment, Volume 17, Issue 5 (January 2024), 233 pages
ISSN: 2150-8097
Editors: Meihui Zhang (Beijing Institute of Technology), Cyrus Shahabi (University of Southern California)
Publisher: VLDB Endowment
Badges: Artifacts Available / v1.1