ABSTRACT
The main memory in modern heterogeneous high-performance computing (HPC) systems accounts for a significant part of a node's total power consumption, which motivates energy-efficient solutions that target the memory domain as well. Practitioners need reliable energy measurement techniques to analyze the energy and power consumption of applications and of performance optimizations. Running Average Power Limit (RAPL) is a common choice because it provides uncomplicated access to energy measurements. While RAPL's accuracy has been studied and validated on homogeneous memory platforms, no work we are aware of has investigated its accuracy on heterogeneous memory platforms, specifically those with high-capacity memory (HCM). This paper describes in detail the process of measuring memory power consumption externally using riser cards. We validate RAPL's accuracy by comparing its readings with these external reference measurements on an Intel Ice Lake-SP system equipped with DDR4 DRAM and Intel Optane Persistent Memory Modules (PMM). In addition, we verify the accuracy of our instrumentation setup by comparing results from an older Broadwell system with results in the literature. We show that RAPL values on a heterogeneous memory system have a higher offset from the reference measurements. The difference is more pronounced at lower memory load for all memory types. We also find that RAPL readings are inconsistent between sockets and over time. Based on the evaluated scenarios, we conclude that RAPL overestimates the actual power consumption on heterogeneous memory systems, and we discuss possible causes of this effect.
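The comparison described above comes down to turning two cumulative energy readings into an average power and relating it to an external reference. The following is a minimal sketch of that arithmetic, not the paper's tooling. It assumes readings in microjoules and a known counter range, as exposed for example by the Linux powercap files `energy_uj` and `max_energy_range_uj`. The function names and sample values are hypothetical, and the wraparound handling assumes at most one wrap between readings.

```python
def average_power_w(e_start_uj, e_end_uj, dt_s, max_range_uj):
    """Average power in watts from two cumulative energy readings (microjoules)."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    delta_uj = e_end_uj - e_start_uj
    if delta_uj < 0:
        # Assumption: the counter wrapped exactly once between the two readings.
        delta_uj += max_range_uj
    # Microjoules -> joules, then joules per second = watts.
    return delta_uj / 1e6 / dt_s


def relative_offset(rapl_w, reference_w):
    """Signed offset of the RAPL value relative to the external reference."""
    if reference_w <= 0:
        raise ValueError("reference_w must be positive")
    return (rapl_w - reference_w) / reference_w


if __name__ == "__main__":
    # Hypothetical sample values, not measurements from the paper.
    rapl = average_power_w(1_000_000, 3_000_000, dt_s=2.0, max_range_uj=10_000_000)
    print(f"RAPL average power: {rapl:.2f} W")
    print(f"Offset vs. 0.8 W reference: {relative_offset(rapl, 0.8):+.1%}")
```

A positive offset means the counter-derived power lies above the reference, which is the direction of the overestimation reported in the abstract.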