Riding Out the Storm: How to Deal with the Complexity of Grid and Cloud Management

Journal of Grid Computing

Abstract

Over the last decade, Grid computing paved the way for a new level of large scale distributed systems. This infrastructure made it possible to securely and reliably take advantage of widely separated computational resources that belong to several different organizations. Resources can be incorporated into the Grid, building a theoretical virtual supercomputer. In time, cloud computing emerged as a new type of large scale distributed system, inheriting and expanding the expertise and knowledge obtained so far. Some of the main characteristics of Grids naturally evolved into clouds, others were modified and adapted, and others were simply discarded or postponed. Regardless of these technical specifics, Grids and clouds together can be considered one of the most important advances in large scale distributed computing of the past ten years; however, this step in distributed computing has come along with a completely new level of complexity. Grid and cloud management mechanisms play a key role, and correct analysis and understanding of the system behavior are needed. Large scale distributed systems must be able to self-manage, incorporating autonomic features capable of controlling and optimizing all resources and services. Traditional distributed computing management mechanisms analyze each resource separately and adjust specific parameters of each one. When the same procedures are applied to Grid and cloud computing, the vast complexity of these systems can make this task extremely complicated. However, the complexity of large scale distributed systems may be only a matter of perspective. It could be possible to understand the behavior of a Grid or cloud as a single entity, instead of as a set of resources. This abstraction could provide a different understanding of the system, describing large scale behavior and global events that would probably not be detected by analyzing each resource separately.
In this work we define a theoretical framework that combines both ideas, multiple resources and single entity, to develop large scale distributed systems management techniques aimed at system performance optimization, increased dependability and Quality of Service (QoS). The resulting synergy could be the key to addressing the most important difficulties of Grid and cloud management.

Author information

Correspondence to Jesús Montes.

About this article

Cite this article

Montes, J., Sánchez, A. & Pérez, M.S. Riding Out the Storm: How to Deal with the Complexity of Grid and Cloud Management. J Grid Computing 10, 349–366 (2012). https://doi.org/10.1007/s10723-012-9225-4

  • DOI: https://doi.org/10.1007/s10723-012-9225-4
