Abstract
The marketplace for general-purpose microprocessors offers hundreds of functionally similar models, differing by traits like frequency, core count, cache size, memory bandwidth, and power consumption. Their performance depends not only on microarchitecture, but also on the nature of the workloads being executed. Given a set of intended workloads, the consumer needs both performance and price information to make rational buying decisions. Many benchmark suites have been developed to measure processor performance, and their results for large collections of CPUs are often publicly available. However, repositories of benchmark results are not always helpful when consumers need performance data for new processors or new workloads. Moreover, the aggregate scores for benchmark suites designed to cover a broad spectrum of workload types can be misleading. To address these problems, we have developed a deep neural network (DNN) model, and we have used it to learn the relationship between the specifications of Intel CPUs and their performance on the SPEC CPU2006 and Geekbench 3 benchmark suites. We show that we can generate useful predictions for new processors and new workloads. We also cross-predict the two benchmark suites and compare their performance scores. The results quantify the self-similarity of these suites for the first time in the literature. This work should discourage consumers from basing purchasing decisions exclusively on Geekbench 3, and it should encourage academics to evaluate research using more diverse workloads than the SPEC CPU suites alone.
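The core idea of the abstract can be sketched in code: train a small neural-network regressor that maps published CPU specifications to a benchmark score. This is a minimal illustration only, not the paper's actual model: the feature set (frequency, core count, cache size, TDP), the network size, and the synthetic "benchmark score" target are all assumptions made for the example.

```python
import numpy as np

# Hypothetical features: frequency (GHz), core count, cache (MB), TDP (W).
# Target: a synthetic nonlinear "benchmark score" standing in for a real
# SPEC CPU2006 or Geekbench 3 result. All of this is illustrative.
rng = np.random.default_rng(0)
X = rng.uniform([1.0, 2, 2, 35], [4.0, 18, 40, 165], size=(256, 4))
y = X[:, 0] * np.sqrt(X[:, 1]) + 0.05 * X[:, 2] + rng.normal(0, 0.1, 256)

# Standardize inputs and target so plain gradient descent behaves well.
Xs = (X - X.mean(0)) / X.std(0)
ys = (y - y.mean()) / y.std()

# One-hidden-layer MLP with ReLU, trained by full-batch gradient descent.
W1 = rng.normal(0, 0.5, (4, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)
lr = 0.05

def forward(X):
    h = np.maximum(0, X @ W1 + b1)   # ReLU hidden layer
    return h, h @ W2 + b2            # linear output head (regression)

_, pred0 = forward(Xs)
loss0 = float(np.mean((pred0.ravel() - ys) ** 2))

for _ in range(500):
    h, pred = forward(Xs)
    err = (pred.ravel() - ys)[:, None]     # d(MSE)/d(pred), up to a constant
    gW2 = h.T @ err / len(Xs); gb2 = err.mean(0)
    dh = (err @ W2.T) * (h > 0)            # backprop through the ReLU
    gW1 = Xs.T @ dh / len(Xs); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, pred1 = forward(Xs)
loss1 = float(np.mean((pred1.ravel() - ys) ** 2))
print(f"MSE before: {loss0:.3f}  after: {loss1:.3f}")
```

Once trained on scores for known processors, the same forward pass yields a predicted score for a new processor from its specification sheet alone, which is the consumer-facing use case the abstract describes.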
Index Terms
- Predicting New Workload or CPU Performance by Analyzing Public Datasets