Abstract
Data-intensive applications executing on NVM-based storage systems experience serious bottlenecks when moving data between DRAM and NVM. We advocate for the use of the long-existing but recently neglected on-chip DMA to expedite data movement with three contributions. First, we explore new latency-oriented optimization directions, driven by a comprehensive DMA study, to design a high-performance DMA module, which significantly lowers the I/O size threshold to observe benefits. Second, we propose a new data movement engine,
Index Terms
- Fastmove: A Comprehensive Study of On-Chip DMA and its Demonstration for Accelerating Data Movement in NVM-based Storage Systems