Abstract
Performance perturbations are a natural phenomenon in volunteer computing systems, and scheduling parallel applications with precedence constraints is emerging as a new challenge in these systems. In this paper, we propose two novel robust task scheduling heuristics that identify the best task-resource matches in terms of makespan and robustness. Both heuristics are based on a proactive reallocation (or schedule expansion) scheme that enables output schedules to tolerate a certain degree of performance degradation. Schedules are initially generated with a focus on makespan. They are then scrutinized for possible rescheduling onto additional volunteer computing resources to increase their robustness. Specifically, robustness is improved by maximizing either the total allowable delay time or the minimum relative allowable delay time over all allocated volunteer resources. Allowable delay times can arise from precedence constraints. The two proposed heuristics are evaluated with an extensive set of simulations. The simulation results indicate that our approach significantly improves the robustness of the resulting schedules.
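To make the two robustness objectives named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a task's allowable delay is the gap between its finish time and the earliest start of any successor (in the task graph or on the same resource), bounded by the makespan, and that a resource's relative allowable delay is its summed delay divided by its busy time. These definitions, the toy schedule, and the function names `allowable_delays` and `robustness_metrics` are all illustrative assumptions. Communication costs are ignored.

```python
from collections import defaultdict

# Hypothetical toy schedule: task -> (resource, start, finish).
schedule = {
    "A": ("r1", 0, 2),
    "B": ("r1", 2, 5),
    "C": ("r2", 2, 4),
    "D": ("r1", 6, 8),
}
# Precedence constraints as (parent, child) pairs; no communication costs.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]


def allowable_delays(schedule, edges):
    """Return each task's slack (assumed definition of allowable delay)."""
    makespan = max(finish for _, _, finish in schedule.values())
    # Every task is bounded by the makespan to begin with.
    bound = {task: makespan for task in schedule}
    # A task must finish before each of its graph successors starts.
    for parent, child in edges:
        bound[parent] = min(bound[parent], schedule[child][1])
    # A task must also finish before the next task on its resource starts.
    by_resource = defaultdict(list)
    for task, (res, start, _) in schedule.items():
        by_resource[res].append((start, task))
    for tasks in by_resource.values():
        tasks.sort()
        for (_, cur), (next_start, _) in zip(tasks, tasks[1:]):
            bound[cur] = min(bound[cur], next_start)
    return {task: bound[task] - schedule[task][2] for task in schedule}


def robustness_metrics(schedule, edges):
    """Return (total allowable delay, minimum relative allowable delay)."""
    slack = allowable_delays(schedule, edges)
    delay = defaultdict(float)  # summed slack per resource
    busy = defaultdict(float)   # summed execution time per resource
    for task, (res, start, finish) in schedule.items():
        delay[res] += slack[task]
        busy[res] += finish - start
    total = sum(delay.values())
    min_relative = min(delay[res] / busy[res] for res in delay)
    return total, min_relative


total, min_relative = robustness_metrics(schedule, edges)
print(total, min_relative)
```

Under these assumptions, a rescheduling step would compare candidate schedules by one of the two returned values and keep the candidate with the larger one, subject to whatever makespan bound is in force.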
Cite this article
Lee, Y.C., Zomaya, A.Y. & Siegel, H.J. Robust task scheduling for volunteer computing systems. J Supercomput 53, 163–181 (2010). https://doi.org/10.1007/s11227-009-0326-1
DOI: https://doi.org/10.1007/s11227-009-0326-1