Abstract
The marketplace for general-purpose microprocessors offers hundreds of functionally similar models, differing by traits like frequency, core count, cache size, memory bandwidth, and power consumption. Their performance depends not only on microarchitecture, but also on the nature of the workloads being executed. Given a set of intended workloads, the consumer needs both performance and price information to make rational buying decisions. Many benchmark suites have been developed to measure processor performance, and their results for large collections of CPUs are often publicly available. However, repositories of benchmark results are not always helpful when consumers need performance data for new processors or new workloads. Moreover, the aggregate scores for benchmark suites designed to cover a broad spectrum of workload types can be misleading. To address these problems, we have developed a deep neural network (DNN) model, and we have used it to learn the relationship between the specifications of Intel CPUs and their performance on the SPEC CPU2006 and Geekbench 3 benchmark suites. We show that we can generate useful predictions for new processors and new workloads. We also cross-predict the two benchmark suites and compare their performance scores. The results quantify the self-similarity of these suites for the first time in the literature. This work should discourage consumers from basing purchasing decisions exclusively on Geekbench 3, and it should encourage academics to evaluate research using more diverse workloads than the SPEC CPU suites alone.
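The core idea of the abstract can be sketched in code: train a small neural-network regressor that maps published CPU specifications to a benchmark score. This is a minimal illustration only, not the paper's actual model: the feature set (frequency, core count, cache size, TDP), the network size, and the synthetic "benchmark score" target are all assumptions made for the example.

```python
import numpy as np

# Hypothetical features: frequency (GHz), core count, cache (MB), TDP (W).
# Target: a synthetic nonlinear "benchmark score" standing in for a real
# SPEC CPU2006 or Geekbench 3 result. All of this is illustrative.
rng = np.random.default_rng(0)
X = rng.uniform([1.0, 2, 2, 35], [4.0, 18, 40, 165], size=(256, 4))
y = X[:, 0] * np.sqrt(X[:, 1]) + 0.05 * X[:, 2] + rng.normal(0, 0.1, 256)

# Standardize inputs and target so plain gradient descent behaves well.
Xs = (X - X.mean(0)) / X.std(0)
ys = (y - y.mean()) / y.std()

# One-hidden-layer MLP with ReLU, trained by full-batch gradient descent.
W1 = rng.normal(0, 0.5, (4, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)
lr = 0.05

def forward(X):
    h = np.maximum(0, X @ W1 + b1)   # ReLU hidden layer
    return h, h @ W2 + b2            # linear output head (regression)

_, pred0 = forward(Xs)
loss0 = float(np.mean((pred0.ravel() - ys) ** 2))

for _ in range(500):
    h, pred = forward(Xs)
    err = (pred.ravel() - ys)[:, None]     # d(MSE)/d(pred), up to a constant
    gW2 = h.T @ err / len(Xs); gb2 = err.mean(0)
    dh = (err @ W2.T) * (h > 0)            # backprop through the ReLU
    gW1 = Xs.T @ dh / len(Xs); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, pred1 = forward(Xs)
loss1 = float(np.mean((pred1.ravel() - ys) ** 2))
print(f"MSE before: {loss0:.3f}  after: {loss1:.3f}")
```

Once trained on scores for known processors, the same forward pass yields a predicted score for a new processor from its specification sheet alone, which is the consumer-facing use case the abstract describes.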
Index Terms
- Predicting New Workload or CPU Performance by Analyzing Public Datasets