Neural network based multi-objective evolutionary algorithm for dynamic workflow scheduling in cloud computing
Introduction
Cloud computing is a large-scale heterogeneous and distributed computing infrastructure for the scientific and commercial communities, which provides high quality and low cost services with minimal hardware investments. Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) are among the most popular service layers that cloud computing delivers over the internet. In this paper, we will mostly refer to IaaS, where the customers can access hardware resources, on which the applications can be deployed.
Workflows are a common technique for constructing large-scale compute- and data-intensive applications from different research domains. An application workflow is modelled as a directed acyclic graph whose nodes are tasks that are interconnected via compute or data resources. The workflow scheduling problem in cloud computing aims to map the tasks of a given application onto available resources [1], [2], [3], [4]. It is an NP-complete problem [1], in which the main concern is orchestrating task executions so as to optimize the objectives specified in the quality-of-service (QoS) requirements.
Periodic workflow scheduling mostly involves applications that are submitted to the system at regular intervals, and such applications are increasingly encountered in different domains. They arise yearly in physics for gravitational wave analysis [5], monthly in business for fiscal analysis, and daily or hourly in meteorology for weather forecasting and storm surge prediction [6]. The workloads of the tasks in these applications may vary from period to period according to the amount of data they collect. These unpredictable fluctuations lead to period-based scheduling, where the tasks must be rescheduled in every period with respect to their latest workloads.
The concept of dynamism in workflow scheduling that we consider in this study is twofold. The first type of dynamism is transient resource failures over time, where resources may dynamically join and leave the cloud. Such failures may arise from several events, such as software faults (bugs, overflows, etc.) or hardware faults (irregular electric power, hard disk failures, etc.). The other source of dynamism in the workflow scheduling problem is a changing number of objectives during the execution of workflows. Cloud computing faces real-life scenarios in which the number of objectives may change over time [7]. For instance, the makespan of a workflow may not be taken into consideration until a workflow with a tighter deadline is submitted for execution. It is also emphasized in [7] that cloud computing is a domain in which a wider variety of changing objectives remains open for consideration. Objectives can be ignored or considered depending on several external factors, including the level of power consumption, the level of workload imbalance among resources, or changing QoS requirements. This dynamism affects the selection of optimal workflow scheduling solutions in different periods.
To the best of our knowledge, our paper is among the first attempts to model the workflow scheduling problem in cloud computing as a dynamic multi-objective optimization problem (DMOP) by considering resource failures and changes in the number of objectives as the main sources of dynamism. In this paper, we present a novel prediction-based dynamic multi-objective evolutionary algorithm, called the NN-DNSGA-II algorithm, that combines an artificial neural network with the non-dominated sorting genetic algorithm (NSGA-II). The NN-DNSGA-II algorithm couples the strength of the neural network with dynamic NSGA-II to reveal the change patterns in the optimization environments. It exploits the correlation between task–resource pairs and the correlation between two successive optimization environments in order to estimate the future positions of optimal solutions.
Additionally, we adapt five dynamic multi-objective algorithms from the literature that do not require prediction: the DNSGA-II-A, DNSGA-II-B, DNSGA-II-HM and DNSGA-II-RI algorithms, and a dynamic variant of the particle swarm optimization algorithm called the DMOPSO algorithm. They are included for scenarios in which the changes are not predictable, and they differ in the response mechanisms they use to cope with the changes. While DNSGA-II-A, DNSGA-II-RI and DMOPSO essentially insert random solutions to re-diversify the population, DNSGA-II-B inserts mutated versions of existing solutions in the population. Finally, DNSGA-II-HM adapts its mutation rates to the environment.
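The two insertion-based response mechanisms described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the schedule encoding (a list that maps each task index to a resource index), the replaced fraction, the mutation rate and all function names are assumptions made for illustration.

```python
import random

def random_schedule(num_tasks, num_resources, rng):
    """Assumed encoding: position i holds the resource index assigned to task i."""
    return [rng.randrange(num_resources) for _ in range(num_tasks)]

def mutate(schedule, num_resources, rate, rng):
    """Reassign each task to a random resource with probability `rate`."""
    return [rng.randrange(num_resources) if rng.random() < rate else r
            for r in schedule]

def respond_to_change(population, num_resources, fraction, strategy, rng, rate=0.2):
    """Replace a fraction of the population after an environment change.

    strategy="random":  insert freshly generated random schedules.
    strategy="mutated": insert mutated copies of existing schedules.
    """
    k = int(len(population) * fraction)
    replaced = rng.sample(range(len(population)), k)
    new_population = [list(s) for s in population]
    for idx in replaced:
        if strategy == "random":
            new_population[idx] = random_schedule(len(population[idx]), num_resources, rng)
        elif strategy == "mutated":
            donor = population[rng.randrange(len(population))]
            new_population[idx] = mutate(donor, num_resources, rate, rng)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return new_population

rng = random.Random(0)
population = [random_schedule(6, 3, rng) for _ in range(10)]
after_random = respond_to_change(population, 3, 0.3, "random", rng)
after_mutated = respond_to_change(population, 3, 0.3, "mutated", rng)
```

Random insertion restores diversity regardless of the current population, whereas mutated insertion keeps the new individuals close to solutions that were already good before the change.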
An extensive empirical study is carried out to demonstrate the performance of our algorithm, where the optimization objectives include the minimization of makespan, cost, energy and imbalance, and the maximization of reliability and utilization. The resource specifications are based on Amazon EC2, and workflows from the Pegasus Workflow Management System serve as the test beds. The algorithms are compared using three metrics from multi-objective optimization: the number of non-dominated solutions, Schott’s spacing and hypervolume. The experimental evaluation shows that our NN-DNSGA-II algorithm outperforms the adapted algorithms in up to 24 out of 30 cases under varying frequency of changes and in up to 35 out of 45 cases under varying severity of changes.
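As a rough illustration of two of these metrics, the sketch below computes Schott's spacing (using the usual Manhattan nearest-neighbour distance) and a hypervolume restricted to two objectives. It assumes that every objective has been converted to minimization and that the reference point is chosen by the user; the function names are hypothetical, and the two-objective restriction is a simplification of the six-objective setting studied here.

```python
import math

def schott_spacing(front):
    """Schott's spacing: spread of nearest-neighbour distances (0 means evenly spaced)."""
    n = len(front)
    if n < 2:
        return 0.0
    nearest = []
    for i, a in enumerate(front):
        # Manhattan distance from point i to its closest other point on the front.
        nearest.append(min(sum(abs(x - y) for x, y in zip(a, b))
                           for j, b in enumerate(front) if j != i))
    mean = sum(nearest) / n
    return math.sqrt(sum((mean - d) ** 2 for d in nearest) / (n - 1))

def hypervolume_2d(front, ref):
    """Area dominated by a two-objective minimization front, bounded by `ref`."""
    points = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in points:
        if f2 < best_f2:  # dominated points add no new area and are skipped
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

even_front = [(1, 3), (2, 2), (3, 1)]
uneven_front = [(0, 4), (1, 3), (4, 0)]
spacing_even = schott_spacing(even_front)      # evenly spaced front
spacing_uneven = schott_spacing(uneven_front)  # unevenly spaced front
area = hypervolume_2d(even_front, ref=(4, 4))
```

A lower spacing value indicates a more uniformly distributed front, while a larger hypervolume indicates a front that is both closer to the optimum and better spread.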
The rest of the paper is organized as follows. Section 2 reviews the related work. In Section 3, we present the workflow application model, the architecture for workflow execution and the objectives considered in this study. The dynamic workflow scheduling problem is presented in Section 4, followed by the proposed NN-DNSGA-II algorithm and the adaptations of five dynamic multi-objective algorithms from the literature. Section 5 presents the experimental setup and the definitions of the metrics considered in this study. The results and discussion of our empirical study are presented in Section 6, and Section 7 concludes the paper.
Related work
Workflow scheduling on distributed resources is an extensively studied NP-complete problem [1]. Although a few studies address single-objective workflow scheduling problems [8], [9], most of the existing research has multi-objective characteristics [1], [10], [11]. Some of these studies linearly combine multiple objectives [12], [13], but this may not be appropriate to the nature of the real-world problem under certain conditions. One limitation is the difficulty of setting the weight parameters to correct values.
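The sensitivity of a linear combination to its weights can be seen in a small sketch. The candidate schedules, their (makespan, cost) values and the weights below are made-up numbers for illustration only, and the function names are hypothetical.

```python
def weighted_sum(objectives, weights):
    """Linear scalarization of objective values that are all to be minimized."""
    return sum(w * f for w, f in zip(weights, objectives))

def best_by_weighted_sum(solutions, weights):
    """Return the name of the solution with the smallest weighted sum."""
    return min(solutions, key=lambda name: weighted_sum(solutions[name], weights))

# Hypothetical schedules with (makespan, cost) values; none dominates another.
solutions = {"A": (10, 100), "B": (20, 40), "C": (40, 10)}

choice_equal = best_by_weighted_sum(solutions, (0.5, 0.5))
choice_makespan = best_by_weighted_sum(solutions, (0.9, 0.1))
```

With equal weights the cheapest schedule C is selected, while weights of 0.9 and 0.1 select the fastest schedule A. The compromise schedule B is never selected under these two settings, although it is not dominated by the others.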
Application model
A workflow application is modelled through a directed acyclic graph whose nodes form the set of tasks and whose edges form the set of links among precedence-constrained tasks. The weight of a task is defined by its reference execution time in seconds and the weight of an edge is defined by its output data size in bytes. Each task has a set of immediate predecessors and a set of immediate successors. A task with no predecessor is
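The application model above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name, the method names, the terms "entry" and "exit" task, and the example workflow are not taken from the paper.

```python
from collections import deque

class Workflow:
    """Directed acyclic graph of tasks (hypothetical helper class)."""

    def __init__(self, exec_time, edges):
        # exec_time: task -> reference execution time in seconds
        # edges: (parent, child) -> output data size in bytes
        self.exec_time = dict(exec_time)
        self.edges = dict(edges)
        self.pred = {t: set() for t in self.exec_time}
        self.succ = {t: set() for t in self.exec_time}
        for parent, child in self.edges:
            self.succ[parent].add(child)
            self.pred[child].add(parent)

    def entry_tasks(self):
        """Tasks with no immediate predecessor."""
        return sorted(t for t in self.pred if not self.pred[t])

    def exit_tasks(self):
        """Tasks with no immediate successor."""
        return sorted(t for t in self.succ if not self.succ[t])

    def topological_order(self):
        """Kahn's algorithm; raises ValueError if the graph has a cycle."""
        indegree = {t: len(self.pred[t]) for t in self.pred}
        ready = deque(sorted(t for t, d in indegree.items() if d == 0))
        order = []
        while ready:
            task = ready.popleft()
            order.append(task)
            for child in sorted(self.succ[task]):
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
        if len(order) != len(indegree):
            raise ValueError("graph contains a cycle")
        return order

# A diamond-shaped example workflow with made-up weights.
wf = Workflow(
    exec_time={"t1": 5.0, "t2": 12.0, "t3": 8.0, "t4": 3.0},
    edges={("t1", "t2"): 2048, ("t1", "t3"): 4096,
           ("t2", "t4"): 1024, ("t3", "t4"): 512},
)
order = wf.topological_order()
```

Any valid schedule must respect such a topological order, since a task cannot start before all of its immediate predecessors have finished and delivered their output data.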
Dynamic workflow scheduling problem
A multi-objective optimization problem (MOP) is an optimization problem that has two or more objectives, where at least two objectives may be in conflict with one another. For a MOP, the set of solutions that are not dominated by any other solution is called the Pareto-optimal set (POS) in decision space and the Pareto-optimal front (POF) in objective space. The true POF is not known a priori for this problem. On the other hand, a dynamic multi-objective optimization problem (DMOP) is a MOP, in which
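The notion of dominance behind the POS and POF can be made concrete with a short sketch. It assumes that every objective is to be minimized (maximization objectives would be negated first); the function names and the sample points are illustrative only.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]

# Hypothetical (makespan, cost) values of four schedules.
points = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = non_dominated(points)
```

Here (3, 3) is dominated by (2, 2), so the remaining three points form the non-dominated front of this small set.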
Experimental setup
The algorithms are evaluated using ten real-world workflows with approximately 100 and 1000 tasks from the Pegasus Workflow Management System [39]. They include Montage from astronomy, CyberShake from geology, Epigenomics from biology, Inspiral from physics and Sipht from bioinformatics. Fig. 2 symbolically represents their topological structures, such as the task pipelines, data distributions and aggregations. The specifications of the resources in Table 2 are compatible with Amazon EC2, US
Results and discussion
The results of our experimental evaluation are given in two parts. The first subsection presents the performance evaluation of the algorithms under resource failures. The second subsection validates the performance of our algorithms under a changing number of objectives. In order to indicate the significance of the differences between the results of the algorithms, the Wilcoxon rank-sum test [47] is carried out at the 0.05 significance level for each table that considers workflows with 100 tasks (specifically, Table 4
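A minimal sketch of such a significance check is given below. It implements the two-sided Wilcoxon rank-sum test with the normal approximation and average ranks for ties, which is an assumption; the source does not state which variant or software was used. The function name and the sample values are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation); returns (z, p)."""
    combined = sorted((value, group) for group, sample in enumerate((x, y))
                      for value in sample)
    n = len(combined)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        average_rank = (i + 1 + j) / 2.0
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    expected = n1 * (n1 + n2 + 1) / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / std
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of the standard normal
    return z, p

# Made-up metric values from repeated runs of two algorithms.
runs_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
runs_b = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
z, p = rank_sum_test(runs_a, runs_b)
significant = p < 0.05
```

The normal approximation is reasonable for moderately large samples; for very small samples an exact test would be preferable.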
Conclusions
This paper is one of the first systematic attempts to model the dynamic workflow scheduling problem as a dynamic multi-objective optimization problem (DMOP), where the sources of dynamism are driven by changing resources due to failures and changing number of objectives due to a set of real-world scenarios encountered during workflow executions. The minimization of makespan, cost, energy and imbalance, and the maximization of reliability and utilization are the optimization goals considered in
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgement
A preliminary version of this paper was presented at the CloudAM workshop of the International Conference on Utility and Cloud Computing (UCC 2018) [46]. The workshop paper includes only a rudimentary model of the problem and the results of a few preliminary tests obtained by adapting a set of dynamic multi-objective evolutionary algorithms from the literature.
References (46)
- et al., A hyper-heuristic cost optimisation approach for scientific workflow scheduling in cloud computing, Future Gener. Comput. Syst. (2018)
- et al., Minimizing cost and makespan for workflow scheduling in cloud using fuzzy dominance sort based HEFT, Future Gener. Comput. Syst. (2019)
- et al., Towards workflow scheduling in cloud computing: a comprehensive analysis, J. Netw. Comput. Appl. (2016)
- et al., A methodological framework for cloud resource provisioning and scheduling of data parallel applications under uncertainty, Future Gener. Comput. Syst. (2019)
- et al., Evolutionary multi-objective workflow scheduling in cloud, IEEE Trans. Parallel Distrib. Syst. (2016)
- D. Klusácek, B. Parák, G. Podolníková, A. Ürge, Scheduling scientific workloads in private cloud: problems and...
- et al., Business and Scientific Workflows: A Web Service-Oriented Approach (2013)
- et al.
- et al., Towards an integrated gis-based coastal forecast workflow, Concurr. Comput.: Pract. Exper. (2008)
- et al., Dynamic multiobjectives optimization with a changing number of objectives, IEEE Trans. Evol. Comput. (2018)
- Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds, IEEE Trans. Cloud Comput.
- Deadline-constrained coevolutionary genetic algorithm for scientific workflow scheduling in cloud computing, Concurr. Comput.: Pract. Exper.
- An adaptive multi-objective evolutionary algorithm for constrained workflow scheduling in clouds, Distrib. Parallel Databases
- A truthful dynamic workflow scheduling mechanism for commercial multicloud environments, IEEE Trans. Parallel Distrib. Syst.
- A taxonomy and survey on scheduling algorithms for scientific workflows in IaaS cloud computing environments, Concurr. Comput.: Pract. Exper.
- Fault-tolerant scheduling for real-time scientific workflows with elastic resource provisioning in virtualized clouds, IEEE Trans. Parallel Distrib. Syst.
- Meeting soft deadlines in scientific workflows using resubmission impact, IEEE Trans. Parallel Distrib. Syst.
Goshgar Ismayilov is currently a Master of Science student in the Computer Engineering Department at Marmara University, Istanbul, Turkey. He received the B.S. degree in Computer Engineering from Marmara University in 2016. His research interests include dynamic optimization problems, evolutionary computation, cloud computing and machine learning.
Haluk Rahmi Topcuoglu is currently a Professor in the Computer Engineering Department at Marmara University, Turkey. He received the B.S. and M.S. degrees in Computer Engineering from Bogazici University, Istanbul, Turkey in 1991 and 1993, respectively. He received the Ph.D. degree in Computer Science from Syracuse University, Syracuse, NY in 1999. His research interests mainly include dynamic optimization problems, workflow scheduling in cloud computing, task scheduling and mapping for multicore architectures, software-based hardware reliability and parallel programming. He is a member of the IEEE, the IEEE Computer Society and the ACM.