ABSTRACT
In this paper, we treat multi-core processor design space exploration as an application-driven machine learning problem. We develop two machine learning-based techniques for efficiently exploring the processor design space. We observe that these techniques yield multi-core processors whose performance is within 1% of that of a design found by exhaustive exploration of the design space. They often take orders of magnitude less time (a factor of at least 3800) to arrive at these processors. The benefits are up to 13% over intelligent search techniques that have been adapted to do multi-core design space exploration.
We leverage the knowledge gained in this research to develop Magellan -- a framework for accelerating multi-core design space exploration and optimization. Magellan can be used to find the highest-throughput processors of a given type for a given area, power, or time budget. It can also aid experienced processor designers who prefer to rely on intuition, by allowing fast refinements to an input design.
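To make budget-constrained exploration concrete, the following is a minimal sketch of a generic hill-climbing search over a toy design space, checked against exhaustive search. It is not Magellan's algorithm and not either of the two techniques from the paper. The design parameters, the `area` and `throughput` models, and all function names are hypothetical stand-ins, with `throughput` taking the place of what would in practice be a slow simulation.

```python
from itertools import product

# Hypothetical toy design space: (core count, L2 size in KB, issue width).
CORE_COUNTS = [1, 2, 4, 8]
L2_SIZES_KB = [256, 512, 1024, 2048]
ISSUE_WIDTHS = [1, 2, 4]
AXES = (CORE_COUNTS, L2_SIZES_KB, ISSUE_WIDTHS)


def area(design):
    """Made-up area model in arbitrary units (an assumption, not from the paper)."""
    cores, l2_kb, width = design
    return cores * (2.0 + 1.5 * width) + l2_kb / 128.0


def throughput(design):
    """Made-up throughput model standing in for an expensive simulation."""
    cores, l2_kb, width = design
    per_core = width ** 0.5 * (1.0 + 0.1 * (l2_kb / 256.0) ** 0.5)
    return cores ** 0.9 * per_core


def exhaustive(budget):
    """Evaluate every design that fits the area budget and return the best one."""
    feasible = [d for d in product(*AXES) if area(d) <= budget]
    return max(feasible, key=throughput)


def neighbors(design):
    """Yield designs that differ from `design` by one step along one axis."""
    for i, axis in enumerate(AXES):
        idx = axis.index(design[i])
        for step in (-1, 1):
            j = idx + step
            if 0 <= j < len(axis):
                yield design[:i] + (axis[j],) + design[i + 1:]


def hill_climb(start, budget):
    """Move to the first feasible neighbor that improves throughput; stop when none does."""
    best = start
    evaluations = 1  # counts throughput evaluations of candidate designs
    improved = True
    while improved:
        improved = False
        for cand in neighbors(best):
            if area(cand) > budget:
                continue
            evaluations += 1
            if throughput(cand) > throughput(best):
                best, improved = cand, True
                break
    return best, evaluations


if __name__ == "__main__":
    budget = 40.0
    found, n_evals = hill_climb((1, 256, 1), budget)
    optimum = exhaustive(budget)
    print("hill climbing:", found, "after", n_evals, "candidate evaluations")
    print("exhaustive:   ", optimum, "out of", len(list(product(*AXES))), "designs")
```

Local search of this kind can stop at a local optimum, so its result is only guaranteed to be no better than the exhaustive one. The trade-off it illustrates is the one the abstract describes: far fewer design evaluations in exchange for a possibly small loss in throughput.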
Index Terms
- Magellan: a search and machine learning-based framework for fast multi-core design space exploration and optimization