Abstract
Over the last decade, Grid computing has paved the way for a new level of large scale distributed systems. This infrastructure made it possible to securely and reliably take advantage of widely separated computational resources belonging to several different organizations. Resources can be incorporated into the Grid, building a theoretical virtual supercomputer. In time, cloud computing emerged as a new type of large scale distributed system, inheriting and expanding the expertise and knowledge obtained so far. Some of the main characteristics of Grids naturally evolved into clouds, others were modified and adapted, and others were simply discarded or postponed. Regardless of these technical specifics, Grids and clouds together can be considered one of the most important advances in large scale distributed computing of the past ten years; however, this step in distributed computing has come with a completely new level of complexity. Grid and cloud management mechanisms play a key role, and correct analysis and understanding of system behavior are needed. Large scale distributed systems must be able to self-manage, incorporating autonomic features capable of controlling and optimizing all resources and services. Traditional distributed computing management mechanisms analyze each resource separately and adjust specific parameters of each one. When the same procedures are adapted to Grid and cloud computing, the vast complexity of these systems can make this task extremely complicated. But the complexity of large scale distributed systems could be only a matter of perspective. It could be possible to understand Grid or cloud behavior as that of a single entity, instead of a set of resources. This abstraction could provide a different understanding of the system, describing large scale behavior and global events that would probably not be detected by analyzing each resource separately.
In this work we define a theoretical framework that combines both ideas, multiple resources and single entity, to develop management techniques for large scale distributed systems aimed at optimizing system performance and increasing dependability and Quality of Service (QoS). The resulting synergy could be the key to addressing the most important difficulties of Grid and cloud management.
Cite this article
Montes, J., Sánchez, A. & Pérez, M.S. Riding Out the Storm: How to Deal with the Complexity of Grid and Cloud Management. J Grid Computing 10, 349–366 (2012). https://doi.org/10.1007/s10723-012-9225-4
DOI: https://doi.org/10.1007/s10723-012-9225-4