Abstract
On-Demand cloud resources are highly available and reliable because most major cloud service providers organize their clouds as a network of several regions (data centres), each with multiple availability zones. This redundant and highly distributed resource pool guarantees users high availability and reliability, even in case of disasters. To increase revenue, cloud service providers offer their unused computing resources at much lower prices than On-Demand resources, in the form of volatile cloud resources. The trade-off for the high discount is their volatility, i.e. lower availability and lower reliability. A user can therefore lose some or all volatile resources at any time, much as in a large-scale technology-related failure (disaster). This chapter introduces volatile cloud resources, their life cycle, and their pros and cons. It also presents several resilient techniques against disruptions and multiple failures of volatile cloud resources.
Acknowledgements
This chapter is based on work from COST Action CA15127 (“Resilient communication services protecting end-user applications from disaster-based failures—RECODIS”) supported by COST (European Cooperation in Science and Technology).
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Ristov, S., Fahringer, T., Peer, D., Pham, TP., Gusev, M., Mas-Machuca, C. (2020). Resilient Techniques Against Disruptions of Volatile Cloud Resources. In: Rak, J., Hutchison, D. (eds) Guide to Disaster-Resilient Communication Networks. Computer Communications and Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-44685-7_15
DOI: https://doi.org/10.1007/978-3-030-44685-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44684-0
Online ISBN: 978-3-030-44685-7
eBook Packages: Computer Science, Computer Science (R0)