NVHT: An efficient key–value storage library for non-volatile memory

https://doi.org/10.1016/j.jpdc.2018.02.013

Highlights

  • An efficient NVM-optimized key–value store named NVHT is proposed.

  • A non-volatile pointer is designed to address the invalid pointer problem.

  • A wear-out-aware pointer-free buddy allocator is developed.

  • A log-based mechanism is adopted to ensure data consistency.

  • NVHT outperforms existing key-value stores by 1.3x-5x for two operations.

Abstract

Non-Volatile Memory (NVM) promises persistence, byte-addressability and DRAM-like read/write latency. These properties indicate that NVM has the potential to be combined with key–value stores to achieve high performance and durability simultaneously. Specifically, data can be stored in NVM inherently without DRAM buffering, which eliminates expensive disk I/O and the cost of data format transformation. However, several challenges, such as data inconsistency and limited write endurance, arise along with the benefits. We propose a library named NVHT that provides APIs for NVM-based key–value store operations. In NVHT, we introduce a non-volatile pointer to solve the dynamic address mapping problem and design a wear-out-aware memory allocator for NVM. The core of NVHT is a novel NVM-friendly hash table structure. NVHT guarantees consistency using a log-based mechanism. The experimental results show that, compared with LevelDB and BerkeleyDB running on a DRAM-based file system, NVHT achieves more than 2x and 4x speedup for insert and search operations, respectively. Compared with the in-memory key–value store Redis, NVHT still achieves higher transaction performance in terms of random update throughput (up to 1.5x and 2.5x for the RDB scheme and the AOF scheme, respectively).

Introduction

Key–value storage systems have received substantial attention for today’s data-intensive applications [9]. To achieve low latency and high performance, most well-known key–value stores, including Memcached [26] and Redis [29], are implemented as in-memory hash tables. Meanwhile, backups (i.e., checkpoints) of in-memory data are maintained on disk for durability and failure recovery. However, such an implementation incurs expensive disk I/O for transferring data between disk and memory. Furthermore, in-memory hash tables have to be converted into a disk-oriented data representation when they are written out, and converted back when they are loaded [38]. Both the disk I/O cost and the data format transformation overhead become significant bottlenecks in the overall performance of key–value stores [12].

Fortunately, non-volatile memory (NVM) has been gaining popularity in both the research community and industry. Examples of advanced NVM technologies include Spin Transfer Torque RAM [16], Magnetic RAM [10], Memristors [39], Phase Change Memory [19] and Intel’s 3D XPoint [17], which promise non-volatility and byte-addressability, with read and write latency comparable to DRAM. These features lead us to believe that NVM will be able to substitute for both DRAM and disk in the near future [[4], [5], [36]], and hence encourage the incorporation of NVM into key–value stores. However, the memory cells of NVM wear out with every program-and-erase operation. Therefore, wear-leveling techniques should also be considered when using NVM as main memory.

An intuitive approach to NVM-based key–value storage is to replace disk with NVM directly, without additional modifications to existing key–value stores. Specifically, we can leverage NVM-based in-memory file systems such as PMFS [11], SCMFS [40] and BPFS [7] to maintain backups of the in-memory hash tables in NVM. Notably, the byte-addressability of NVM makes it possible to eliminate the high data transformation cost involved in the original key–value stores. However, the DRAM–NVM data redundancy problem remains. In other words, both DRAM and NVM contain a copy of the hash table, which not only wastes memory but also degrades system performance.

To address the problem, an alternative solution is to treat NVM as a memory device and persist the hash table in NVM itself, so that all operations are applied directly to the in-NVM hash table without DRAM buffering. On the one hand, many studies have designed NVM-based key–value stores [[4], [5], [36]], but they all use a B-tree rather than a hash table as the core data structure. On the other hand, several NVM-optimized persistent hash tables have been proposed. Schwalb [33] presented NVC-Hashmap and addressed the challenges of the volatile CPU cache and reordering. Debnath [8] proposed PFHT, a PCM-friendly hash table that focuses on the read–write asymmetry of NVM and improves cuckoo hashing to avoid cascading writes. While these designs are efficient and NVM-optimized, they are still insufficient to be adopted directly in NVM-based key–value stores, for the following three reasons.

First, most studies access NVM directly via the memory mapping system call mmap. Each call to the traditional mmap function may map the same physical addresses to virtual addresses under a new base address. Therefore, pointers, which are stored as virtual addresses, may refer to different (and possibly invalid) physical addresses when mmap is called again after a reboot. Specifically, if hash table entries refer to keys or values through pointers, the persistent data cannot be retrieved after a system failure. Some work [37] addressed the issue by establishing a static address mapping rather than mapping NVM data dynamically. Unfortunately, this method falls short in the presence of address space randomization in modern operating systems [[21], [34]].
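The invalid pointer problem can be illustrated with a small C sketch. It stores a position-independent offset from the region base instead of a raw virtual address, which is one common way to survive a change of base address. This is only an illustration under stated assumptions, not the non-volatile pointer design of NVHT: the names `region_off_t`, `to_offset`, `from_offset` and `survives_remap` are hypothetical, and a second buffer stands in for a fresh mmap call that returns a different base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration: a "pointer" stored as an offset from the
 * start of the mapped region instead of as a raw virtual address. */
typedef uint64_t region_off_t;

#define REGION_SIZE 256u
#define OFF_NULL UINT64_MAX

/* Convert a live pointer inside the region into an offset. */
static region_off_t to_offset(const void *base, const void *p)
{
    if (p == NULL)
        return OFF_NULL;
    ptrdiff_t d = (const char *)p - (const char *)base;
    assert(d >= 0 && (size_t)d < REGION_SIZE);
    return (region_off_t)d;
}

/* Resolve an offset against whatever base address the region has now. */
static void *from_offset(void *base, region_off_t off)
{
    if (off == OFF_NULL)
        return NULL;
    assert(off < REGION_SIZE);
    return (char *)base + off;
}

/* An entry stored inside the region that refers to its value by offset. */
struct entry {
    region_off_t value_off;
};

/* Simulate a restart: the same bytes reappear at a different base address.
 * Returns 1 if the value is still reachable through the stored offset. */
static int survives_remap(void)
{
    static unsigned char first[REGION_SIZE];
    static unsigned char second[REGION_SIZE];

    /* Place a value at byte 64 and an entry that refers to it at byte 0. */
    char *value = (char *)first + 64;
    strcpy(value, "hello");
    struct entry e = { to_offset(first, value) };
    memcpy(first, &e, sizeof e);

    memcpy(second, first, REGION_SIZE);  /* same bytes, new base address */
    memset(first, 0, REGION_SIZE);       /* the old addresses are now stale */

    struct entry e2;
    memcpy(&e2, second, sizeof e2);
    const char *v2 = from_offset(second, e2.value_off);
    return strcmp(v2, "hello") == 0;
}
```

Had the entry stored the raw address of `value`, it would still point into the first buffer after the copy, which is the situation the paragraph above describes for pointers that outlive an mmap call.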

Second, key–value stores may maintain small keys and values, e.g., several bytes each, in some cases. However, current operating systems allocate memory at page-size granularity (4 KB or more) to accommodate block storage devices. Using page-size memory allocation directly results in severe internal fragmentation for NVM-based key–value stores. Therefore, a memory allocator with finer granularity is of great importance for improving NVM memory utilization. Moreover, NVM’s life expectancy is affected by its wear-leveling mechanism, which means the memory allocator should also take the wear-out of NVM memory into consideration.
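The cost of page-size granularity can be made concrete with a little arithmetic. The sketch below compares whole-page allocation with power-of-two size classes, which is how buddy-style allocators typically round requests. The 16-byte minimum block and the function names are assumptions made for illustration; they are not parameters of the allocator proposed in the paper.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* page granularity mentioned in the text */
#define MIN_BLOCK 16u    /* assumed smallest block, for illustration only */

/* Bytes handed out when every request is rounded up to whole pages. */
static size_t page_alloc_size(size_t request)
{
    assert(request > 0);
    return ((request + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE;
}

/* Bytes handed out by a power-of-two (buddy-style) size class. */
static size_t buddy_alloc_size(size_t request)
{
    assert(request > 0);
    size_t block = MIN_BLOCK;
    while (block < request)
        block <<= 1;
    return block;
}

/* Internal fragmentation: allocated bytes that the caller never asked for. */
static size_t wasted(size_t allocated, size_t request)
{
    assert(allocated >= request);
    return allocated - request;
}
```

For a 24-byte value, `page_alloc_size` returns 4096, leaving 4072 bytes unused, while `buddy_alloc_size` returns 32, leaving 8 bytes unused.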

Finally, prior work on NVM-based hash tables rarely addresses the consistency problem. In particular, operations on the hash tables typically consist of multiple dependent units of work and hence cannot be performed atomically. Upon a system failure, these operations may be interrupted and left half-completed, resulting in an inconsistent state of the key–value store.
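A minimal sketch of how a log can protect such a multi-step operation follows. It uses a generic undo log in ordinary memory: the old contents are recorded before the dependent writes, and recovery rolls back any update that never committed. This section does not say whether NVHT uses undo or redo logging, so the record layout and the names `undo_log`, `update_slot` and `recover` are assumptions. On real NVM, cache-line flushes and memory fences would also be needed; they are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A slot whose key and value must change together. */
struct slot {
    uint64_t key;
    uint64_t value;
};

/* Hypothetical undo-log record: the old contents plus a flag that says
 * whether an update is in flight. */
struct undo_log {
    bool active;
    struct slot old;
};

/* Update a slot in two dependent steps. `crash_after_key` simulates a
 * failure between the two writes. Returns true if the update committed. */
static bool update_slot(struct slot *s, struct undo_log *log,
                        uint64_t key, uint64_t value, bool crash_after_key)
{
    assert(s != NULL && log != NULL);

    log->old = *s;        /* 1. record the old state */
    log->active = true;   /* 2. mark the update as in flight */
    /* On real NVM, the log would have to be flushed and fenced here. */

    s->key = key;         /* 3. first dependent write */
    if (crash_after_key)
        return false;     /* simulated failure: the slot is half-updated */
    s->value = value;     /* 4. second dependent write */

    log->active = false;  /* 5. commit: the log record is no longer needed */
    return true;
}

/* Run after a restart: roll back any update that never committed. */
static void recover(struct slot *s, struct undo_log *log)
{
    assert(s != NULL && log != NULL);
    if (log->active) {
        *s = log->old;
        log->active = false;
    }
}
```

If the simulated failure occurs after step 3, the slot holds the new key with the old value. Calling `recover` restores the pair recorded in step 1, so the store returns to its last consistent state.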

In this paper, we develop an efficient key–value storage library named NVHT (standing for Non-Volatile Hash Table). The ultimate goal of NVHT is to allow various applications to persist key–value data in NVM efficiently. To achieve this, NVHT provides easy-to-use interfaces for user applications. The core of NVHT is a novel NVM-friendly hash table. We adopt the MurmurHash3 hash function [2] and provide linear probing with a maximum probing length to resolve hash collisions. NVHT relies on a wear-out-aware memory allocator and guarantees consistency using a log-based mechanism.
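Linear probing with a maximum probing length can be sketched as follows. This is a toy in-memory table, not the NVHT hash table: the capacity, the probe limit, the 32-bit integer keys and the names `table_put` and `table_get` are assumptions chosen for illustration. The 32-bit finalizer from MurmurHash3 (`fmix32`) stands in for hashing a full byte-string key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAPACITY  8u  /* assumed table size (power of two), illustration only */
#define MAX_PROBE 4u  /* assumed maximum probing length */

struct bucket {
    bool used;
    uint32_t key;
    uint32_t value;
};

struct table {
    struct bucket b[CAPACITY];
};

/* 32-bit finalizer (fmix32) from MurmurHash3, used here as a stand-in
 * for hashing a full key. */
static uint32_t fmix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Insert or overwrite. Gives up after MAX_PROBE slots so that the caller
 * can react, for example by growing the table. */
static bool table_put(struct table *t, uint32_t key, uint32_t value)
{
    assert(t != NULL);
    uint32_t home = fmix32(key) & (CAPACITY - 1);
    for (uint32_t i = 0; i < MAX_PROBE; i++) {
        struct bucket *bk = &t->b[(home + i) & (CAPACITY - 1)];
        if (!bk->used || bk->key == key) {
            bk->used = true;
            bk->key = key;
            bk->value = value;
            return true;
        }
    }
    return false;
}

/* Look up a key, examining at most MAX_PROBE slots. */
static bool table_get(const struct table *t, uint32_t key, uint32_t *out)
{
    assert(t != NULL && out != NULL);
    uint32_t home = fmix32(key) & (CAPACITY - 1);
    for (uint32_t i = 0; i < MAX_PROBE; i++) {
        const struct bucket *bk = &t->b[(home + i) & (CAPACITY - 1)];
        if (!bk->used)
            return false;  /* an empty slot ends the probe run */
        if (bk->key == key) {
            *out = bk->value;
            return true;
        }
    }
    return false;
}
```

Bounding the probe length caps the number of slots any lookup or insert touches, at the price of inserts that can fail before the table is full. The sketch has no delete operation, which is why an empty slot can safely end a lookup.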

Rather than evaluating on DRAM alone, we employ NVDIMM to simulate NVM and compare the performance of NVHT with two advanced storage libraries, LevelDB and BerkeleyDB, running on a DRAM-based file system. The experimental results show that NVHT achieves 1.3x–4x speedup for the insert operation and 4x–5x speedup for the search operation compared with LevelDB and BerkeleyDB. We also evaluate NVHT in a multi-threaded environment and observe that NVHT outperforms LevelDB and BerkeleyDB in a hybrid insert-search workload, achieving 1.3x–4.9x speedup. The source code of NVHT is available on GitHub.

We summarize the contributions of this paper as follows.

  • We propose NVHT, an efficient key–value storage library for NVM. The core of NVHT is an NVM-friendly hash table structure. We adopt the MurmurHash3 hash function and provide linear probing with a maximum probing length to resolve hash collisions.

  • In NVHT, we introduce a novel structure called non-volatile pointer (NVP) to solve the invalid pointer problem in the case of dynamic address mapping, which guarantees that NVHT can work well with both static and dynamic implementations of mmap.

  • To address the internal fragmentation problem and wear-leveling problem, we develop a wear-out-aware memory allocator for NVM called pointer-free buddy allocator (PFBA).

  • We introduce a log-based consistency mechanism for NVHT and present recovery approaches for the various inconsistency scenarios that can arise when using our library.

  • We conduct a comprehensive performance study using NVDIMM [27] to simulate NVM and validate the effectiveness and efficiency of NVHT.

The rest of the paper is organized as follows. Section 2 introduces the principle of hybrid DRAM–NVM memory management systems. Our library is built on top of such systems. Section 3 presents the architecture of NVHT. We provide implementation details of NVHT in Section 4 and evaluate its performance in Section 5. We survey related work in Section 6 and conclude this paper in Section 7.

Background

In this section, we first describe hybrid DRAM–NVM memory management. We then summarize a set of functions provided by the underlying memory management systems, which are the basis of NVHT.

NVHT overview

In this section, we provide an overview of NVHT. We first list our design goals and then show the architecture of NVHT. Lastly, we discuss the application programming interfaces provided by NVHT.

NVHT implementation

In this section, we provide details about how we design and implement NVHT. We first introduce several important modules in NVHT, including the non-volatile pointer, the pointer-free buddy allocator and the core hash table. We then describe our consistency mechanism and present how NVHT performs failure recovery. Finally, we introduce the optimizations for NVHT in cache design and wear-leveling.

Evaluation

In this section, we present our analysis of our NVHT implementation. We evaluate NVHT in two dimensions. One dimension is the effect of different parameters in NVHT itself, including maximum probing length, load factor and the initial capacity of hash table. The other dimension is the single-thread and multi-thread I/O performance of NVHT, compared with three popular key–value storage systems, Redis, LevelDB and BerkeleyDB. To make the comparison experiment fair and reasonable, we test the

Related work

Non-volatile memory system. Previous research has discussed different abstractions for the usage of NVM. Some studies treat NVM as a memory store and design specific interfaces to access it. Others develop NVM-based in-memory file systems. Mnemosyne [37] provides a set of primitives for operating on data in persistent regions. For transactions, it uses write-ahead logging at word granularity. NV-heaps [6] proposes a persistent object heap and an object-based interface to NVM.

Conclusion

Emerging non-volatile memory presents new opportunities for key–value data storage. In this paper, we propose NVHT, a high-performance key–value storage library for NVM. In addition to the simple APIs provided to programmers, NVHT uses a non-volatile pointer to support dynamic memory mapping. To reduce internal fragmentation, NVHT introduces a pointer-free buddy allocator. We propose an NVM-friendly hash table as the core implementation of NVHT and design a log-based consistency mechanism to

Acknowledgments

The work was supported by the National High-tech R&D Program of China (863 Program) [grant number 2015AA015303]; and the National Natural Science Foundation of China [grant number 61472241].

Kaixin Huang is currently a Ph.D. student at Shanghai Jiao Tong University, China. He received his bachelor's degree from University of Electronic Science and Technology of China in 2016. His research interests include non-volatile memory computing and in-memory database systems.

References (42)

  • Zheng S. et al., Hmvfs: A hybrid memory versioning file system
  • Adl-Tabatabai A.-R. et al., Compiler and runtime support for efficient software transactional memory, SIGPLAN Not. (2006)
  • A. Appleby, Murmurhash 2.0,...
  • Chen S. et al., TPC-E vs. TPC-C: Characterizing the new TPC-E benchmark via an I/O comparison study, SIGMOD Rec. (2011)
  • Chen S. et al., Persistent B+-trees in non-volatile main memory, Proc. VLDB Endow. (2015)
  • Chi P. et al., Making B+-tree efficient in PCM-based main memory
  • Coburn J. et al., NV-heaps: Making persistent objects fast and safe with next-generation, non-volatile memories, SIGPLAN Not. (2011)
  • Condit J. et al., Better I/O through byte-addressable, persistent memory
  • Debnath B. et al., Revisiting hash table design for phase change memory
  • DeCandia G. et al., Dynamo: Amazon’s highly available key-value store, SIGOPS Oper. Syst. Rev. (2007)
  • X. Dong, X. Wu, G. Sun, Y. Xie, H. Li, Y. Chen, Circuit and microarchitecture evaluation of 3D stacking magnetic RAM...
  • Dulloor S.R. et al., System software for persistent memory
  • Fang R. et al., High performance database logging using storage class memory
  • Free list,...
  • Free space bitmap,...
  • Harris T. et al., Optimizing Memory Transactions, Vol. 41 (2006)
  • Hosomi M. et al., A novel nonvolatile memory with spin torque transfer magnetization switching: spin-ram
  • Intel, micron reveal xpoint, a new memory architecture that could outclass DDR4 and NAND,...
  • Kirsch A. et al., More robust hashing: Cuckoo hashing with a stash, SIAM J. Comput. (2009)
  • Lee B.C. et al., Phase-Change technology and the future of main memory, IEEE Micro (2010)
  • Lee K. et al., A 90 nm 1.8 V 512 Mb Diode-switch PRAM with 226 MB/s read throughput, IEEE J. Solid-State Circ. (2007)
    Jie Zhou is currently a master's student at Shanghai Jiao Tong University, China. He received his bachelor's degree from University of Science and Technology of China in 2014. His research interests include in-memory computing, memory management and key–value store systems.

    Linpeng Huang received his M.S. and Ph.D. degrees in computer science from Shanghai Jiao Tong University in 1989 and 1992, respectively. He is a professor of computer science in the department of computer science and engineering, Shanghai Jiao Tong University. His research interests lie in the area of distributed systems and service oriented computing.

    Yanyan Shen is currently an assistant professor at Shanghai Jiao Tong University, China. She received her B.Sc. degree from Peking University, China, in 2010 and her Ph.D. degree in computer science from National University of Singapore in 2015. Her research interests include distributed systems, efficient data processing techniques and data integration.
