ABSTRACT
As the core count of processor chips grows, so do on-die shared resources such as the on-chip communication fabric and the shared cache, which are of paramount importance for chip performance and power. This paper presents a method for dynamic voltage/frequency scaling (DVFS) of networks-on-chip and last-level caches in multicore processor designs, where the shared resources form a single voltage/frequency domain. Several new monitoring and control techniques are developed and validated through full-system simulations on the PARSEC benchmarks. These techniques reduce the energy-delay product by 56% compared with state-of-the-art prior work.
Index Terms
- Dynamic voltage and frequency scaling for shared resources in multicore processor designs