ABSTRACT
Rising energy consumption is of growing concern for cloud data center providers. Modern processors try to counteract this problem through low-power idle states that save energy in phases with little demand for compute resources. Making proper use of this feature, however, requires knowledge about the properties of these states for the very processors used in a specific setup; most importantly, the energy consumed in each idle state and the latency for resuming normal operation. Unfortunately, hardware vendors usually do not provide this critical information.
In this paper, we propose a scheme for automatically analyzing the idle states of modern Intel processors. Our open-source implementation uses an extensible Linux kernel module to measure the energy and latency implications of the idle states of a system's processor without any manual intervention or external equipment. We demonstrate the practical applicability of our approach by analyzing two Intel processors from the Haswell and Skylake generations: an Intel Core i7-4790 and an Intel Core i7-6700K, respectively. The results show that our implementation yields reliable, precise, and reproducible measurements for the energy and latency implications of each processor's various idle states.
Index Terms
- Sleep Well: Pragmatic Analysis of the Idle States of Intel Processors