Abstract
In this research, we explore defragmenting allocated compute resources to conserve energy on an IBM Blue Gene/Q. We examine a real trace from a new four-rack system and use simulation to evaluate three heuristics that minimise energy waste through defragmentation. We describe heuristics for detecting when it is desirable, from an energy standpoint, to defragment the computing resource through checkpoint/restart. Using these heuristics, we obtained a simulated saving of 4.36% of total system power. Applied to all Blue Gene/Q systems on the Top500 list, this saving amounts each year to the energy needed to run the average US household for 698.5 years.
Cite this article
Lynar, T.M., Nelson, M.D. Blue Gene/Q defragmentation for energy waste minimisation. J Supercomput 71, 202–216 (2015). https://doi.org/10.1007/s11227-014-1293-8
DOI: https://doi.org/10.1007/s11227-014-1293-8