Resilient Techniques Against Disruptions of Volatile Cloud Resources

  • Chapter
Guide to Disaster-Resilient Communication Networks

Abstract

On-Demand cloud resources are highly available and reliable because most major cloud service providers organize their clouds as a network of several regions (data centres), each with multiple availability zones. This redundant and highly distributed resource pool offers users high availability and reliability, even in the event of a disaster. To increase revenue, cloud service providers offer their unused computing capacity as volatile cloud resources at much lower prices than On-Demand resources. The trade-off for the large discount is volatility, i.e. lower availability and lower reliability: a user can lose some or all volatile resources at any time, much like in a large-scale technology-related failure (disaster). This chapter introduces volatile cloud resources, their life cycle, and their pros and cons. It also presents several techniques for resilience against disruptions and multiple failures of volatile cloud resources.

Acknowledgements

This chapter is based on work from COST Action CA15127 (“Resilient communication services protecting end-user applications from disaster-based failures—RECODIS”) supported by COST (European Cooperation in Science and Technology).

Author information

Corresponding author

Correspondence to Sasko Ristov.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ristov, S., Fahringer, T., Peer, D., Pham, TP., Gusev, M., Mas-Machuca, C. (2020). Resilient Techniques Against Disruptions of Volatile Cloud Resources. In: Rak, J., Hutchison, D. (eds) Guide to Disaster-Resilient Communication Networks. Computer Communications and Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-44685-7_15

  • DOI: https://doi.org/10.1007/978-3-030-44685-7_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-44684-0

  • Online ISBN: 978-3-030-44685-7

  • eBook Packages: Computer Science, Computer Science (R0)
