Hector: A Framework to Design and Evaluate Scheduling Strategies in Persistent Key-Value Stores

ABSTRACT
Key-value stores distribute data across several storage nodes to handle large numbers of parallel requests. Proper scheduling of these requests impacts the quality of service, as measured by achievable throughput and (tail) latencies. Beyond scheduling, performance also depends heavily on the nature of the workload and the deployment environment. It is, unfortunately, difficult to evaluate different scheduling strategies consistently under the same operational conditions. Moreover, such strategies are often hard-coded in the system, limiting flexibility. We present Hector, a modular framework for implementing and evaluating scheduling policies in Apache Cassandra. Hector enables users to select among several options for the key components of the scheduling workflow, from request propagation via replica selection to the local ordering of incoming requests at each storage node. We demonstrate the capabilities of Hector by comparing strategies in various settings. For example, we find that leveraging cache locality effects may be of particular interest: we propose a new replica selection strategy, called Popularity-Aware, that sustains 6 times the maximum throughput of the default algorithm under specific key access patterns. We also show that local scheduling policies have a significant effect when parallelism at each storage node is limited.
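To make the pluggable-component idea concrete, the Java sketch below shows what a replica-selection policy in the spirit of Popularity-Aware might look like. This is a minimal sketch under stated assumptions: the `ReplicaSelector` interface, the class names, and the hot-key heuristic are hypothetical illustrations, not Hector's or Cassandra's actual APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical plug-in point: given the replicas holding a key, choose the
// one that should serve a read. This interface is an illustrative assumption,
// not Hector's actual API.
interface ReplicaSelector {
    String pickReplica(String key, List<String> replicas);
}

// Sketch of a popularity-aware policy: once a key becomes "hot", pin its
// reads to a single replica so the value stays warm in that node's cache;
// cold keys are rotated across replicas. The threshold and data structures
// are illustrative assumptions.
class PopularityAwareSelector implements ReplicaSelector {
    private final Map<String, Long> accessCounts = new ConcurrentHashMap<>();
    private final Map<String, String> pinnedReplica = new ConcurrentHashMap<>();
    private final long hotThreshold;

    PopularityAwareSelector(long hotThreshold) {
        this.hotThreshold = hotThreshold;
    }

    @Override
    public String pickReplica(String key, List<String> replicas) {
        long count = accessCounts.merge(key, 1L, Long::sum);
        if (count >= hotThreshold) {
            // Hot key: always route to the same replica to exploit cache locality.
            return pinnedReplica.computeIfAbsent(key,
                    k -> replicas.get(Math.floorMod(k.hashCode(), replicas.size())));
        }
        // Cold key: rotate across replicas to balance load.
        return replicas.get((int) (count % replicas.size()));
    }
}
```

Behind such an interface, a policy can be swapped by configuration rather than by modifying the data store itself, which is the flexibility the abstract highlights.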