Cost-Effective Resource Provisioning for Real-Time Workflow in Cloud

In the era of big data, mining and analysis of enormous amounts of data have been widely used to support decision-making. This complex process, which includes huge-volume data collection, storage, transmission, and analysis, can be modeled as a workflow. Meanwhile, the cloud environment provides sufficient computing and storage resources for big data management and analytics. Because clouds adopt a pay-as-you-go pricing scheme, executing a workflow in clouds incurs a charge for the provisioned resources. Thus, cost-effective resource provisioning for workflows in clouds is still a critical challenge. Also, the responses of the complex data management process are usually required in real time. Therefore, the deadline is the most crucial constraint for workflow execution. To address the challenge of cost-effective resource provisioning while meeting the real-time requirements of workflow execution, a resource provisioning strategy based on dynamic programming is proposed to achieve cost-effectiveness of workflow execution in clouds, and a critical-path based workflow partition algorithm is presented to guarantee that the workflow can be completed before the deadline. Our approach is evaluated by simulation experiments with real-time workflows of different sizes and different structures. The results demonstrate that our algorithm outperforms the existing classical algorithms.


Introduction
Nowadays, big data technology is used in a wide range of applications, including complex systems, to support decision-making [1,2]. Along with the enormous commercial benefits, scientific advances, management efficiency, and analytical accuracy brought by big data, this technology raises many challenging problems such as the high cost and latency of big data storage, transmission, and processing [3][4][5]. To tackle these problems, cloud computing environments and workflow modeling methods are recognized as effective approaches.
Many large-scale scientific applications in areas such as astronomy, bioinformatics, and meteorology generate and process large amounts of data; such applications consist of a large number of data processing tasks that are frequently modeled as workflows [6]. Normally, a workflow is represented by a Directed Acyclic Graph (DAG) with nodes and edges, where nodes represent tasks and edges represent data/control dependencies between tasks in a complex application system. Meanwhile, once a time-critical or real-time application system [7,8] is modeled as a real-time workflow, the deadline constraint can be used to ensure that the tasks complete on time.
Cloud computing is being investigated as an effective platform that delivers hardware infrastructure and software applications as services for the tasks of big data management and analytics. Real-time workflows can also be executed in such a high-performance computing environment. There are various cloud providers offering a large number of services with different quality of service (QoS) [9][10][11] as well as different prices. A lot of effort has been made in the area of service recommendation [12,13] based on QoS from the perspective of service providers [14][15][16], but little work has focused on how to select the services in a cloud platform based on QoS from the perspective of users. In particular, cloud computing provides a flexible pricing model (i.e., pay-as-you-go and on-demand services); users are charged based on their consumption of various resources with different QoS. Therefore, one of the most challenging problems with real-time workflows in cloud computing is to find a cost-effective way to complete the workflow within the deadline [17].
In reality, there are two main stages when executing a workflow in a cloud computing environment [18]. The first one is the resource provisioning stage: during this phase, computing resources from the clouds are selected and reserved to prepare for the workflow's execution. The second one is the task scheduling stage: during this phase, a schedule is generated and each task is mapped onto the best-suited resource. That is, the second stage determines where and when each task of a workflow will be executed, while the first stage decides what types and how many resources will be leased from the cloud service providers; therefore, the total cost of the workflow is mainly decided at the first stage. Given this distinction between task scheduling and resource provisioning, we propose in this paper a novel cost optimization algorithm that focuses only on resource provisioning.
In order to satisfy the deadline of a real-time workflow, our proposal partitions the original workflow, based on the critical path methodology, into sub-workflows that can be executed in parallel. Then we use the dynamic programming knapsack algorithm to provision resources in clouds for these sub-workflows so as to minimize the total cost. Our approach is evaluated by simulation experiments with real-time workflows of different sizes and different structures. The results demonstrate that our algorithm outperforms the existing classical algorithms. The major contributions of this paper are as follows:
(i) A global resource provisioning strategy for real-time workflows is proposed for cost optimization under a deadline constraint.
(ii) A workflow partition algorithm based on the critical path technique is proposed to obtain sub-workflows that are executed in parallel to ensure the deadline constraint.
(iii) A dynamic programming algorithm is used to provision the most cost-effective resources to the sub-workflows.
(iv) We perform extensive simulations and show the efficacy of our strategy over two existing algorithms, namely, simple DPK and IC-PCP.
The rest of the paper is organized as follows. Section 2 introduces related work, followed by the problem specification in Section 3. Section 4 explains the proposed algorithm, while Section 5 presents the evaluation of the algorithm's performance. Finally, conclusions and future work are summarized in Section 6.

Related Work
Cost optimization for workflow execution has been widely studied over the years. Efficient resource utilization in parallel and distributed computing environments is a key issue for cost-effectiveness. To resolve this issue, numerous studies have proposed a variety of workflow scheduling methods for clouds. Accordingly, a significant number of real-time workflow scheduling algorithms focusing on reducing the overall execution cost of real-time workflows have been proposed. Alkhanak et al. [19] classified the cost optimization methods for workflows in cloud computing environments into two categories: heuristic methods and meta-heuristic methods.
Abrishami et al. [17] presented a static heuristic algorithm, IC-PCP (IaaS Clouds-Partial Critical Path), for scheduling a single workflow instance on an IaaS cloud. This algorithm considers cloud features such as VM heterogeneity, pay-as-you-go, and time interval pricing. They try to minimize the execution cost by scheduling all tasks in a partial critical path on a single machine that can finish the tasks before their latest finishing time (which is calculated based on the application's deadline and the fastest available instance). In [20], Verma and Kaushal proposed a greedy algorithm based on the classical HEFT for providing a suitable trade-off between execution cost and time. Zheng et al. [21] presented three novel scheduling heuristic algorithms to help users schedule their big data processing workflow applications on clouds so that the cost can be minimized and the deadline constraints can be satisfied; different configurations of CPU frequency were considered in their work. Meta-heuristic approaches such as Genetic Algorithm (GA) based [22], Ant Colony Optimization (ACO) based [23], Particle Swarm Optimization (PSO) based [24], and symbiotic organism search algorithms were used to address the same objective of minimizing the cost of workflow execution while considering the deadline. Wu et al. [25] proposed a meta-heuristic algorithm, L-ACO, as well as a simple heuristic, ProLiS, to minimize the execution cost of a workflow in clouds under a deadline constraint; experimental results show that the meta-heuristic L-ACO performs better in terms of execution costs and success ratios of meeting deadlines, but the heuristic ProLiS is more efficient. That is to say, the meta-heuristic based algorithm achieves improved performance on cost optimization but with some compromise on the execution time.
Task scheduling assigns tasks to the most cost-efficient resource to optimize the execution cost of a workflow. But task scheduling is a well-known NP-hard problem: no known scheduling algorithm can obtain an optimal solution in polynomial time [17]. Moreover, to tackle deadline-constrained workflow scheduling, the deadline-distribution method is the most widely used method in both meta-heuristic based and heuristic based algorithms. The deadline-distribution method always distributes the deadline to each task in proportion to its minimum execution time [25]. In most cases, this kind of method ignores the tasks of the workflow that can be executed in parallel, so the sub-deadline of each single task may not be appropriate. Based on these sub-deadlines, task scheduling algorithms cannot obtain a globally optimal result, because the optimization is performed at the task level and hence fails to utilize the whole workflow's structure and characteristics. Resource provisioning can obtain a minimum cost by selecting an optimal assembly of resources for the global workflow execution. Our proposal minimizes the total cost of workflow execution from the perspective of resource provisioning to the whole workflow (or sub-workflows) without considering each single task's resource mapping, which can further improve resource utilization. With respect to cost optimization, resource provisioning is more efficient and effective than task scheduling for a real-time workflow in a cloud computing environment [26]. Much research has been done to minimize the overall cost of a workflow by resource provisioning [27,28]. Both static and dynamic methods are used to provision the cloud resources [29]. The static method assumes that accurate information about workflow and cloud resource performance can be obtained before scheduling. Dynamic provisioning adopts no such assumption. Most scientists run the same workflows repeatedly, thereby enabling the collection of such information through several trial runs. Therefore, the static method is an appropriate method for workflows [30]. Static scheduling is also eased by the fact that cloud providers usually declare the performance specifications of their resources. We use the static method for resource provisioning in this paper.

Problem Specification
In order to facilitate the reading of the paper, all of the symbols used in this section are listed in Table 1.

Workflow Model.
A real-time workflow refers to the use of workflow technology for modeling a real-time application system. That is, temporal constraints are added to the original workflow model based on DAGs.

Definition 1.
A workflow application is represented by a Directed Acyclic Graph (DAG) $G = (T, W, E)$, where $T = \{t_1, t_2, \ldots, t_n\}$ is a set of $n$ tasks, $W = \{w_1, w_2, \ldots, w_n\}$ is the computational workload of each task in $T$, and $E = \{e_{ij} \mid 0 \le i \ne j \le n\}$ is the set of directed edges between two tasks. An edge $e_{ij}$ of the form $(t_i, t_j)$ denotes that there is a data dependency between $t_i$ and $t_j$; $t_i$ is said to be the parent task of $t_j$ and $t_j$ is said to be the child task of $t_i$. This relationship between $t_i$ and $t_j$ can be represented by $t_i = Parent(t_j)$ and $t_j = Child(t_i)$. The computational workload in $W$ is given by the number of mega-floating point operations that need to be executed. The temporal constraint for a real-time workflow is the deadline, which is denoted by $D$.
Based on this definition, a child task cannot be executed until all of its parent tasks are completed. Figure 1 shows an example workflow with six tasks. $t_{entry}$ and $t_{exit}$ are additional dummy nodes with computational workload 0 that give the whole workflow a single entry and a single exit. The task $t_1$ must be finished before tasks $t_3$ and $t_4$ can start. Also, both $t_3$ and $t_4$ must be finished before $t_5$ can start. However, there is no path between $t_3$ and $t_4$, so $t_3$ and $t_4$ do not need to be executed in any particular sequence. Therefore, the tasks $\{t_1, t_3, t_4, t_5\}$ can be executed as $t_1 \to t_3\,(t_4) \to t_5$. This means that the workflow segment $t_1, t_3, t_5$ or $t_1, t_4, t_5$ must be executed sequentially, but $t_3$ and $t_4$ can be executed in parallel.
There are two classic types of workflow processes in a large real-time workflow, as shown in Figure 2. One is the sequential workflow process (Figure 2(a)), in which all tasks must be executed one after the other in a certain order. The other is the parallel workflow process (Figure 2(b)), in which all tasks can be executed simultaneously without any particular order.
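The dependency rules of Definition 1 can be sketched in a few lines of Python. This is only an illustrative sketch: it uses the four edges that the text states for $t_1$, $t_3$, $t_4$, and $t_5$, and the helper names (`build`, `reachable`, `can_run_in_parallel`) are hypothetical, not part of the paper.

```python
from collections import defaultdict

# Edges stated in the text for tasks t1, t3, t4, t5 of the sample workflow.
EDGES = [("t1", "t3"), ("t1", "t4"), ("t3", "t5"), ("t4", "t5")]

def build(edges):
    """Return (parents, children) adjacency maps of the DAG."""
    parents, children = defaultdict(set), defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
        children[parent].add(child)
    return parents, children

def reachable(children, src, dst):
    """True if a directed path leads from src to dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return False

def can_run_in_parallel(children, a, b):
    """Two tasks may overlap only if neither depends on the other."""
    return not reachable(children, a, b) and not reachable(children, b, a)

parents, children = build(EDGES)
print(can_run_in_parallel(children, "t3", "t4"))  # True: no path between them
print(can_run_in_parallel(children, "t1", "t5"))  # False: t1 precedes t5
```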

Resource Model.
This paper focuses on Infrastructure as a Service (IaaS) clouds, such as Amazon's EC2 (Elastic Compute Cloud) [31], which offer the user a virtual machine (VM) pool of unlimited and heterogeneous resources with different computational performance, memory capacity, and price that can be accessed on demand. Better computational performance or more memory implies a higher price. We assume that there is no limitation on using each resource; i.e., the workflow can order any number of resources from each cloud provider at any time. Based on the profiling results about workflows given in [1] and the VM types offered by Amazon EC2, we assume that the VMs have sufficient memory to execute the workflow tasks. So, the memory capacity of VMs will no longer be considered in this paper. We define $VM_j$ as a VM type in terms of its computational performance $P_j$ and its price $PRICE_j$ per billing cycle of $\tau$ time units.
For each VM j of a certain type, we assume that the computational performance in terms of mega-floating point operations per second (MFLOPS) either is available from the provider or can be estimated [32].
This information is used to deduce the computational performance by $P_j = MFLOPS_j \times \tau$, where $\tau$ denotes the billing cycle of $VM_j$, which is specified by the cloud provider. The user's payment for the usage of each $VM_j$ is based on the billing cycle $\tau$. Any partial utilization of the leased VM is charged as if the full billing cycle was consumed. For example, if $\tau = 60$ seconds and a VM is used for 61 seconds, then the user will pay for two cycles of 60 seconds, that is, 120 seconds. Also, we assume that there is no limitation on the number of billing cycles of VMs that can be leased from the provider.

Definition 2.
Assume there are $m$ types of VMs that can be provided by cloud providers; the collection of available VMs is denoted as $VMs = \{VM_j \mid 1 \le j \le m\}$. Each $VM_j$ has two parameters: $MFLOPS_j$ and $PRICE_j$. $MFLOPS_j$ is defined as the mega-floating point operations per second of $VM_j$, and $PRICE_j$ is the price per $\tau$ seconds of $VM_j$, where $\tau$ is the billing cycle of $VM_j$. $P_j = MFLOPS_j \times \tau$ is the computational performance of $VM_j$ per billing cycle.
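The billing rule of the resource model is simple enough to check numerically. The following sketch assumes time is measured in seconds; the function names are hypothetical and the price used in the last call is an arbitrary illustration, not a value from the paper.

```python
import math

def capacity_per_cycle(mflops, tau):
    """P_j = MFLOPS_j * tau: workload one VM can process per billing cycle."""
    return mflops * tau

def billed_cycles(seconds_used, tau):
    """A partially used cycle is charged in full, so round up."""
    return math.ceil(seconds_used / tau)

def lease_cost(seconds_used, tau, price_per_cycle):
    """Charge for one VM: number of billed cycles times PRICE_j."""
    return billed_cycles(seconds_used, tau) * price_per_cycle

# The example from the text: tau = 60 s and 61 s of use.
print(billed_cycles(61, 60))       # 2 cycles, i.e. 120 seconds are charged
print(lease_cost(61, 60, 0.5))     # 1.0 with an assumed price of 0.5 per cycle
```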

Cost Model.
Normally, the final cost is based not only on the utilization of computational resources, but also on the data transfer between a parent task and its child task. If both parent and child tasks are in the same VM, there is no data transfer fee because the tasks share the same data center in a single VM. When the parent task and the child task are executed on different VMs that belong to the same cloud provider, there will not be any data transfer fee either, because most public cloud providers, such as Amazon EC2, do not charge for internal data transfers among their computational resources. So, when we use these public cloud computing resources, we can ignore the data transfer fee. For the rest of this paper, we assume that we utilize computing resources from one cloud provider only. The cost of using the computational resources is calculated as the resource occupation time multiplied by the price of the resource. The resource occupation time, denoted by $RT$, covers not only the task execution time ($ET$), but also the data transfer time ($TT$) and the initial boot time ($BT$) of each resource provisioned by the workflow.
Thus, $RT = BT + TT + ET$. However, with the development of Docker technology, the initial boot time of virtual resources is of the order of seconds or milliseconds. It is so small compared with the execution time of a workflow that it can be ignored. That is, we can assume that $BT = 0$. When a data transfer occurs between a parent task $t_i$ and its child task $t_j$, the transfer time depends only on the amount of data to be transferred and the bandwidth provided by the cloud provider. Both of these values are fixed, regardless of the type of VM that is provisioned by the workflow. Therefore, the data transfer time of each task $t_i$ can be denoted by $TT_i$, which is VM-independent. For a $VM_j$, the resource occupation time, and hence its cost $C_j$, is determined by $ET_{ij}$, $a_{ij}$, and $TT_i$. In order to optimize the total cost, we need to minimize each $C_j$. From the formula of $C_j$, we see that $ET_{ij}$ and $a_{ij}$ are the two variables that depend on both $t_i$ and $VM_j$, while $TT_i$ depends only on $t_i$ but not on $VM_j$. Thus, $TT_i$ can be regarded as a constant from the cost optimization point of view. For the rest of the paper, we ignore $TT_i$.
Actually, a task does not have to be executed entirely by one VM. It can execute sequentially on several VMs as long as each switch coincides with a billing-cycle boundary. For example, suppose $t_i$ has executed on a VM for one billing cycle and has not yet finished; then it has the option of staying with the same VM in the next billing cycle or switching to a different VM. Let $ET_i$ denote the execution time of task $t_i$. Then, $t_i$ will be charged $\lceil ET_i \rceil = \sum_{j=1}^{m} x_{ij}$ ($x_{ij} \ge 0$ and $x_{ij}$ is an integer) billing cycles, where the ceiling of $ET_i$ is due to the last billing cycle of $VM_j$ (assuming that $VM_j$ is the last VM provisioned by $t_i$). Therefore, the cost of resource provisioning to task $t_i$ can be denoted as $\sum_{j=1}^{m} PRICE_j \times x_{ij}$. The total cost of a workflow can be defined as follows: for a workflow $G$ with $n$ tasks as depicted in Definition 1 and $m$ VM types $VMs = \{VM_1, VM_2, \ldots, VM_m\}$ as defined in Definition 2 that can be provisioned to $G$, the total cost of resource provisioning to the workflow $G$ is $TC = \sum_{i=1}^{n} \sum_{j=1}^{m} PRICE_j \times x_{ij}$.
Note that, in the above definition, we have assumed that no two tasks are combined together. For example, suppose there are two tasks $t_1$ and $t_2$, each requiring 1.5 billing cycles of execution. If we combine the two tasks together, they require only 3.0 billing cycles. However, in the above definition, they require a total of 4 billing cycles, two billing cycles for each task. We will discuss how to combine two tasks together in later sections.
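The effect of rounding each task up to whole billing cycles, versus rounding once for a combined block of tasks, can be illustrated with the two-task example above. The function names are hypothetical; execution times are expressed in billing cycles.

```python
import math

def cycles_billed_separately(exec_times):
    """Each task is rounded up to whole billing cycles on its own."""
    return sum(math.ceil(t) for t in exec_times)

def cycles_billed_combined(exec_times):
    """Tasks share VMs back to back, so only the total is rounded up."""
    return math.ceil(sum(exec_times))

times = [1.5, 1.5]  # two tasks of 1.5 billing cycles each, as in the text
print(cycles_billed_separately(times))  # 4
print(cycles_billed_combined(times))    # 3
```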

Table 1: Symbols used in this section.
Symbol                              Meaning
$G(T, W, E)$                        A workflow represented by a DAG
$T = \{t_1, t_2, \ldots, t_n\}$     A set of $n$ tasks
$W = \{w_1, w_2, \ldots, w_n\}$     The related workloads of the $n$ tasks in $T$

Here, $TW = \sum_{i=1}^{n} w_i$ is the sum of all tasks' workloads, $x_j$ is the number of billing cycles used by $VM_j$ for all the tasks, $D$ is the deadline of the workflow $G$ as defined in Definition 1, and $ET(G)$ is the total execution time of the workflow $G$. $ET(G)$ is affected not only by each task's execution time but also by the execution sequence of all tasks in the workflow. That is, the execution time of a sequential workflow as shown in Figure 2(a) is the sum of each task's execution time, while a parallel workflow as shown in Figure 2(b) takes the longest-running task's run time as its execution time. So the execution time of a workflow is defined as follows: for a workflow $G$ with $n$ tasks as depicted in Definition 1, if the workflow is a sequential process, the execution time can be expressed as $ET(G) = \sum_{t_i \in G} ET_i$; if the workflow is a parallel process, the execution time can be expressed as $ET(G) = \max_{t_i \in G} ET_i$. $ET_i$ in both formulas denotes the execution time of each task in the workflow. So, for the same set of tasks, the execution time of a parallel workflow cannot exceed that of a sequential workflow.
Table 2 shows an example of three types of VMs with computational performance $P_j$ and price $PRICE_j$ per billing cycle. In this example, we assume the billing cycle is one hour.
When we provision these three types of VMs to the tasks in the sample workflow shown in Figure 1, the cost of each task executing on each VM is shown in Table 3. As can be seen from this table, when $w_1 = 3$, provisioning $VM_3$ would take 1 hour and is charged 0.1, while provisioning $VM_1$ or $VM_2$ would take less than 1 hour but is charged 0.8 or 0.4, respectively. So, the optimal provisioning for $t_1$ is one $VM_3$ with a charge of 0.1. In Table 3, the optimal single-VM provisioning of each task is given in bold. Notice that for $t_1$, $t_2$, $t_3$, and $t_5$, the optimal provisioning for each of them is to use one type of VM. For $t_4$, the optimal provisioning is to use one $VM_1$ and one $VM_2$ for a total charge of 1.2. For $t_6$, the optimal provisioning is to use one $VM_2$ and one $VM_3$ for a total charge of 0.5. The total workload of all the tasks in the workflow shown in Figure 1 is 146. The optimal provisioning for the whole workflow is to use five $VM_1$ for a total charge of 4.0. Note that this is less than the sum of each task's optimal provisioning, which is 4.2. This is because each individual optimal provisioning might have idle time. For example, for $t_5$, we use one $VM_1$, which can process a workload of 30 in one hour, but $t_5$ has a workload of only 26. So, a capacity of 4 is wasted. The columns CSW and NSW will be explained later.
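The comparison between per-task and whole-workflow provisioning can be checked by hand. The worked arithmetic below uses only figures quoted in the text; the per-task charge of 0.8 for $t_2$ is an inference from the stated sum of 4.2 and is not read from Table 3.

```latex
\begin{align*}
\text{per-task total} &= \underbrace{0.1}_{t_1} + \underbrace{0.8}_{t_2} + \underbrace{0.8}_{t_3}
  + \underbrace{1.2}_{t_4} + \underbrace{0.8}_{t_5} + \underbrace{0.5}_{t_6} = 4.2, \\
\text{whole workflow} &= 5 \times PRICE_1 = 5 \times 0.8 = 4.0,
  \qquad 5 \times P_1 = 150 \ge TW = 146, \\
\text{idle capacity for } t_5 &= P_1 - w_5 = 30 - 26 = 4 .
\end{align*}
```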

Cost Optimization Strategy.
$W$ is assumed to be the computational workload that needs to be provisioned. The decision variable $x_k$ is the number of billing cycles of $VM_k$ that are leased from the IaaS provider. Let the VMs be sorted in nonincreasing order of performance versus price; i.e., $P_1/PRICE_1 \ge P_2/PRICE_2 \ge \ldots \ge P_m/PRICE_m$. The VMs are considered in this order. That is, $VM_1$ is considered first, followed by $VM_2$, and so on. The optimal value function when restricted to using the first $k$ VMs, $f_k(W)$, is defined by equation (2).
The optimal value function is given by $f_m(W)$. To solve equation (2), we use dynamic programming by considering $VM_1$ first. If there is some idle time left in using $VM_1$ (because it does not use a full billing cycle), $VM_2$ will then be considered for the residual workload left by $VM_1$. If $VM_2$ again has idle time left, $VM_3$ will be considered next. This process is repeated until we reach $VM_m$. The following is the dynamic programming algorithm to solve equation (2).
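One standard way to write a value function that matches this description is as a covering knapsack. The form below is a sketch under that assumption; the boundary case $f_0$ and the recurrence are illustrative and need not coincide with the authors' exact notation.

```latex
\begin{align*}
f_k(W) &= \min \Big\{ \sum_{j=1}^{k} PRICE_j \, x_j \;:\;
          \sum_{j=1}^{k} P_j \, x_j \ge W,\; x_j \in \mathbb{Z}_{\ge 0} \Big\}, \\
f_k(W) &= \min_{0 \le x_k \le \lceil W / P_k \rceil}
          \Big\{ PRICE_k \, x_k + f_{k-1}\big(\max(0,\, W - P_k x_k)\big) \Big\}, \\
f_0(W) &= \begin{cases} 0, & W \le 0, \\ +\infty, & W > 0 . \end{cases}
\end{align*}
```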
The final $f_m(W)$ is the minimum cost of provisioning all $m$ types of VMs to achieve the computational workload $W$. Provisioning VMs to an individual task causes idle time unless the computational workload of the task exactly matches the computational performance of the VMs provisioned for it, and the probability of such a match is very low. If we assemble some tasks together to provision VMs, the efficiency of resource utilization can be significantly increased. As shown in Table 3, the optimal costs for $w_1$ and $w_3$ are 0.1 and 0.8, respectively. However, if we combine $w_1$ and $w_3$, the total workload is exactly 30. Therefore, the optimal result is to use one $VM_1$ at a cost of 0.8. This cost is lower than the total cost of 0.9 for provisioning VMs to $w_1$ and $w_3$ separately. As a result, the more tasks we can assemble as a whole for provisioning, the more we can save in leasing resources. In order to minimize the total cost of a workflow execution, it would be better to provision all tasks in the workflow as a whole. If we assemble all tasks as a single task, however, all tasks must be executed sequentially, and the execution time may exceed the temporal constraint of the workflow. According to Definition 4, some tasks should be partitioned from this sequential process so that they can be executed in parallel with the others. In order to reduce the total execution time so that it satisfies the temporal constraint, a workflow partition strategy is discussed in the next section.
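The dynamic programming step and the benefit of combining workloads can be sketched as an unbounded covering knapsack. This is a minimal sketch, not the authors' Algorithm 4: the per-cycle performances of $VM_2$ and $VM_3$ (15 and 3) are assumptions chosen to be consistent with the charges quoted in the text, prices are stored as integer hundredths to avoid floating-point comparisons, and `min_cost` is a hypothetical name.

```python
def min_cost(workload, vms):
    """Cheapest multiset of billing cycles whose capacity covers `workload`.

    vms: list of (capacity_per_cycle, price_per_cycle) with integer entries.
    Returns (cost, counts), where counts[j] is the number of cycles of VM j.
    """
    inf = float("inf")
    best = [0] + [inf] * workload        # best[w]: min cost to cover w units
    choice = [None] * (workload + 1)     # VM added last to reach best[w]
    for w in range(1, workload + 1):
        for j, (cap, price) in enumerate(vms):
            rest = max(0, w - cap)       # workload still uncovered after VM j
            if best[rest] + price < best[w]:
                best[w] = best[rest] + price
                choice[w] = j
    counts = [0] * len(vms)
    w = workload
    while w > 0:                         # walk back through the choices
        j = choice[w]
        counts[j] += 1
        w = max(0, w - vms[j][0])
    return best[workload], counts

# (P_j, PRICE_j in hundredths). P_1 = 30 and the prices come from the text;
# P_2 = 15 and P_3 = 3 are assumed values.
VMS = [(30, 80), (15, 40), (3, 10)]

print(min_cost(3, VMS)[0])    # 10  -> 0.1 for w_1 alone
print(min_cost(27, VMS)[0])   # 80  -> 0.8 for w_3 alone
print(min_cost(30, VMS)[0])   # 80  -> 0.8 for w_1 and w_3 combined
print(min_cost(146, VMS)[0])  # 400 -> 4.0 for the whole workflow
```

Under these assumed VM parameters the sketch reproduces the charges quoted in the text: provisioning $w_1$ and $w_3$ separately costs 0.9, while the combined workload of 30 costs 0.8.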

Deadline Assurance Method.
When $ET(G)$ is greater than the temporal constraint of the workflow $G$, the workflow cannot be finished before the deadline. In real life, meeting the deadline constraint is very important because many real-time workflows belong to urgent computing [33], which supports emergency computations such as severe weather prediction for hurricanes, flooding, and earthquakes. Therefore, the temporal constraint is a very important dimension of workflow quality of service (QoS), and it must be considered when we optimize the cost. To guarantee that the workflow can be finished before the deadline, the workflow can be partitioned into several sub-workflows, where the sub-workflows are executed in parallel and all tasks within one sub-workflow are executed sequentially without exceeding the temporal constraint. If each sub-workflow executes on time, the whole workflow finishes before its deadline. So a series of temporal constraints must be defined for every sub-workflow.
The critical path of a workflow is the longest execution path between the entry and exit tasks of the workflow [34]. All the tasks that belong to the critical path are called critical tasks. The sum of the computational workloads of the critical tasks is the maximum over all paths in the workflow. The critical tasks are executed sequentially because of the parent-child relationship between them. The execution time of the critical path on the best performing VM must not exceed the temporal constraint $D$. Otherwise, there is no solution that can meet the deadline constraint. In the rest of this section, we assume that the execution time of the critical path does not exceed $D$.
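Finding the critical path amounts to a longest-path computation on the DAG, weighted by task workload. The sketch below is illustrative only: the edge $t_2 \to t_4$ is inferred from the critical path given later in the text, the workloads of $t_2$ and $t_4$ (30 and 44) are assumed values, $t_6$ is omitted because its position in Figure 1 is not stated, and `critical_path` is a hypothetical name.

```python
from functools import lru_cache

# Workloads: w1 = 3, w3 = 27 and w5 = 26 follow from the text; 30 and 44 are assumed.
WORKLOAD = {"t1": 3, "t2": 30, "t3": 27, "t4": 44, "t5": 26}
CHILDREN = {"t1": ["t3", "t4"], "t2": ["t4"], "t3": ["t5"], "t4": ["t5"], "t5": []}

def critical_path(workload, children):
    """Return (total workload, path) of the heaviest path in the DAG."""
    @lru_cache(maxsize=None)
    def best_from(task):
        # Heaviest path that starts at `task`.
        tails = [best_from(child) for child in children[task]]
        tail_load, tail_path = max(tails, default=(0, ()))
        return workload[task] + tail_load, (task,) + tail_path
    return max(best_from(task) for task in workload)

load, path = critical_path(WORKLOAD, CHILDREN)
print(load, path)

# Feasibility check on the best performing VM (P_1 = 30 per cycle, D = 4 cycles).
print(load / 30 <= 4)
```

With these assumed workloads the heaviest path is $t_2 \to t_4 \to t_5$, which agrees with the critical path used in the worked example of this section.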
Suppose the workflow $G$ is partitioned into a number of sub-workflows. One of the sub-workflows, $G_{CSW}$, contains all of the critical tasks and some other (possibly none) noncritical tasks. This sub-workflow is called the critical sub-workflow (CSW). The sub-workflows that contain no critical tasks are called noncritical sub-workflows (NSWs). There can only be one CSW but $r$ ($r \ge 0$) NSWs. Suppose there is a CSW $G_{CSW}$. If we provision the best performing VM to $G_{CSW}$ with a resulting total execution time $ET(G_{CSW})$ and if $ET(G_{CSW}) \le D$, then there is a successful provisioning that guarantees the workflow is completed by the deadline, because the NSWs can be executed in parallel with the CSW. On the other hand, if $ET(G_{CSW}) > D$, it is impossible to obtain a successful provisioning of VMs to the CSW because the deadline is violated. In this case, we have to partition this CSW into a new NSW and a new CSW that includes fewer tasks. The new CSW includes all critical tasks but fewer noncritical tasks. This procedure can be repeated until the CSW can be completed before the deadline. Also, the execution time of every NSW must satisfy the temporal constraints derived from the layered approach.
Every workflow has a structure that determines the execution order of its tasks: all parent tasks must be finished before their child tasks can begin. We can use a layered approach to ensure such a sequence. A workflow can be divided into several levels (layers) that are executed sequentially. At the same time, tasks within one level do not depend on each other, so they can execute in parallel.

Definition 5.
Assume there is a workflow $G$ as defined in Definition 1. Define the exit task to be Layer 0. Layer $d$ is composed of all tasks whose longest path to Layer 0 in the workflow $G$ has $d$ edges. So, Layer $d$ is defined as $\{t_i \mid BD(t_i) = d,\ t_i \in G\}$, where $BD(t_i)$ is the number of edges on the longest path from $t_i$ to the exit task.
For example, the workflow shown in Figure 1 can be divided into five levels as shown in Figure 3. As long as tasks are executed from the highest level to the lowest level, all parent tasks finish before their child tasks start. There is no required execution order among the tasks in the same level. So, the sample workflow shown in Figure 3 can be arranged as a sequence of its layers, from the highest level to the lowest.
In order to guarantee that the deadline can be met, we must have $ET(G_{CSW}) \le D$, where $G_{CSW}$ is the critical sub-workflow. Also, the total execution time of all tasks in each layer of each NSW must not be more than the execution time of all the tasks in the same layer of the CSW. For an NSW, if $ET(BL(d)) \le LMET(d)$, where $BL(d) = \{t_i \mid BD(t_i) = d,\ t_i \in NSW\}$ and $LMET(d)$ is the Layer Minimum Execution Time of layer $d$, then the temporal constraint of the NSW can be satisfied. Otherwise, a partial NSW should be partitioned from the original NSW. This process is repeated until the inequality $ET(BL(d)) \le LMET(d)$ is met.
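The layering of Definition 5 can be sketched as a longest-path depth computation toward the exit task. The DAG below is an assumed fragment of the sample workflow (the edge $t_2 \to t_4$ is inferred from the critical path, $t_6$ is omitted because its position is not stated), and `layers` is a hypothetical name rather than the authors' WorkflowLayer procedure.

```python
from collections import defaultdict

# Assumed fragment of the sample workflow, with dummy entry and exit nodes.
CHILDREN = {
    "entry": ["t1", "t2"],
    "t1": ["t3", "t4"],
    "t2": ["t4"],
    "t3": ["t5"],
    "t4": ["t5"],
    "t5": ["exit"],
    "exit": [],
}

def layers(children, exit_task):
    """Group tasks by BD(t): the edge count of the longest path to the exit."""
    depth = {}

    def bd(task):
        # Every task other than the exit is assumed to have at least one child.
        if task not in depth:
            if task == exit_task:
                depth[task] = 0
            else:
                depth[task] = 1 + max(bd(child) for child in children[task])
        return depth[task]

    grouped = defaultdict(list)
    for task in children:
        grouped[bd(task)].append(task)
    return dict(grouped)

for d, tasks in sorted(layers(CHILDREN, "exit").items(), reverse=True):
    print(d, tasks)
```

Executing the layers from the highest index down to Layer 0 respects every parent-child dependency, and tasks that share a layer have no path between them. For this assumed fragment the sketch yields five layers, in line with the five levels mentioned for Figure 3.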
Take the sample workflow $G$ shown in Figure 1 as an instance; we assume that the deadline is four hours. Using the best performing VM in Table 2, $VM_1$, to execute this workflow, the result (shown in the TW column of Table 3) is that one $VM_1$ with five billing cycles gives $ET(G) = 4.87 > D$. So the workflow has to be partitioned into two sub-workflows. Based on the critical path $t_2 \to t_4 \to t_5$, we obtain a CSW and an NSW as shown in Figure 4. After this partition, $ET(G_{CSW}) = 3.87 < D$, while $ET(t_1) < LMET(3) = 1$ and $ET(t_3) < LMET(2) = 1.47$. So, the temporal constraint of the NSW shown in Figure 4 can be met. The cost of each sub-workflow is also shown in Table 3 (the CSW and NSW columns). The total cost of the CSW and the NSW together is 4, which is equal to the optimal cost. So the workflow partition shown in Figure 4 is a successful cost optimization scheme.
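The figures in this example follow from dividing each workload by the per-cycle performance $P_1 = 30$ of $VM_1$. In the worked arithmetic below, $w_3 = 27$ is derived from $w_1 = 3$ and $w_1 + w_3 = 30$, and the CSW workload of 116 is inferred as $146 - (w_1 + w_3)$ rather than quoted from Table 3.

```latex
\begin{align*}
ET(G) &= 146 / 30 \approx 4.87 > D = 4, \\
ET(G_{CSW}) &= (146 - 30) / 30 = 116 / 30 \approx 3.87 < D, \\
ET(t_1) &= 3 / 30 = 0.1 < LMET(3) = 1, \qquad
ET(t_3) = 27 / 30 = 0.9 < LMET(2) = 1.47, \\
TC &= \underbrace{4 \times 0.8}_{CSW:\ 4 \times 30 \,\ge\, 116}
    + \underbrace{1 \times 0.8}_{NSW:\ 30 \,\ge\, 30} = 4.0 .
\end{align*}
```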

Cost-Effective Resource Provisioning Algorithm

Algorithm Design.
The basic idea of the global resource provisioning algorithm is that, for a given workflow $G$ as defined in Definition 1, a dynamic programming method is used to optimize the cost of resource provisioning, and a critical-path based workflow partition technique is used to guarantee the deadline requirement of the real-time workflow. Specifically, whether the deadline can be met is determined by the Resource Provisioning in Parallel Algorithm before the dynamic programming procedure is employed. If the deadline cannot be satisfied, a workflow partition procedure divides the workflow into sub-workflows, each of which can meet its own temporal constraint; the Dynamic Programming Knapsack Algorithm is then applied to these sub-workflows to obtain the most cost-effective resource provisioning scheme. All of these algorithms build on a Workflow Layer Algorithm. The whole process of our strategy is presented in detail as follows.
We first divide the workflow into $l$ layers according to Definition 5. Second, using the dynamic programming algorithm given in the previous section, we provision the $m$ types of VMs to the workflow, partitioning it into a CSW and NSWs whenever the deadline cannot be met. Now we apply the dynamic programming algorithm to provision VMs to the CSW and all of the NSWs. The set $X = \{x_j \mid x_j \text{ is the number of } VM_j,\ j = 1, 2, \ldots, m\}$ obtained by this algorithm is the optimal resource provisioning result, and the minimum cost is $TC = \sum_{j=1}^{m} PRICE_j \times x_j$. As this algorithm achieves the global optimization of cost, it is called the Global Resource Provisioning for Real-time Workflow Algorithm (GRP4RW), which is shown in Algorithm 1. The procedure WorkflowLayer(G) called in GRP4RW is an algorithm that layers the workflow $G$ according to Definition 5. The pseudocode of the WorkflowLayer algorithm is shown in Algorithm 2.
First, the longest path from each task $t_i$ to $t_{exit}$ is calculated as $BD(t_i)$ (Lines 2-8). Next, tasks with the same value of $BD(t_i)$ are assigned to the same layer (Lines 9-13). Then, the tasks in the same layer are sorted in nonincreasing order of their workload (Line 12). This has the effect of partitioning the larger tasks into a sub-workflow first, which minimizes the number of sub-workflows. In this way the cost optimization is more efficient. Using this algorithm, the workflow $G$ can be divided into $l$ layers, where each layer contains all the tasks with the same path length to $t_{exit}$. The next procedure called in GRP4RW, ParallelProvision(G, VMs, D), is an algorithm that provisions resources to multiple sub-workflows in parallel in order to guarantee the cost-effectiveness of each sub-workflow on the premise of meeting the deadline. The pseudocode of the Resource Provisioning in Parallel Algorithm is shown in Algorithm 3.
This algorithm first attempts the Dynamic Programming Knapsack Algorithm (DPK), which is shown in Algorithm 4, to provision VMs to the original workflow (Lines 2-3). If successful, it returns the optimized provision X (Lines 4-5). Otherwise, it calls the Partition Workflow Algorithm (PartitionPath), which is shown in Algorithm 5, to partition the workflow into $G_{CSW}$ and $G_{NSW}$ (Line 7). It then recursively calls the ParallelProvision procedure for CSW and NSW separately (Lines 8-12) until VMs have been provisioned successfully for all sub-workflows. For $G_{CSW}$, the temporal constraint is always the deadline. The temporal constraint for a $G_{NSW}$ is obtained from the Layer Minimum Execution Time based on Definition 6.
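The recursive control flow just described can be sketched as follows. This is only a structural illustration, not Algorithm 3 itself: the callables `dpk`, `partition`, and `nsw_deadline` are hypothetical parameters standing in for the DPK procedure, PartitionPath, and the Layer Minimum Execution Time computation, and the toy stand-ins at the bottom exist solely to make the sketch runnable.

```python
def parallel_provision(workflow, deadline, dpk, partition, nsw_deadline):
    """Try DPK on the whole (sub-)workflow; on failure, split and recurse.

    dpk(workflow, deadline) -> a provisioning plan, or None on failure
    partition(workflow)     -> (csw, nsw), both smaller than workflow
    nsw_deadline(nsw)       -> temporal constraint for the non-critical part
    Returns a list of (sub_workflow, plan) pairs.
    """
    plan = dpk(workflow, deadline)
    if plan is not None:
        return [(workflow, plan)]
    if len(workflow) <= 1:
        # Guard added for the sketch so the recursion always terminates.
        raise ValueError("a single task cannot meet its temporal constraint")
    csw, nsw = partition(workflow)
    # The critical part keeps the deadline; the other part gets its own constraint.
    return (
        parallel_provision(csw, deadline, dpk, partition, nsw_deadline)
        + parallel_provision(nsw, nsw_deadline(nsw), dpk, partition, nsw_deadline)
    )


# Toy stand-ins, purely illustrative: a workflow is a tuple of task workloads.
def toy_dpk(workflow, deadline):
    return "ok" if sum(workflow) <= deadline else None


def toy_partition(workflow):
    half = len(workflow) // 2
    return workflow[:half], workflow[half:]


result = parallel_provision((4, 3, 2, 1), 5, toy_dpk, toy_partition, lambda nsw: 5)
```

Passing the three procedures in as arguments keeps the sketch independent of how the real DPK and partition steps are implemented; only the "attempt, then split and recurse" shape is taken from the text.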
This Dynamic Programming Knapsack Algorithm is based on the idea of dynamic programming elaborated in equation (3). First, the algorithm judges whether the workflow G can be completed within the fixed temporal constraint D (Line 2). If executing the workflow on the VM with the best computational performance does not satisfy the deadline constraint, the procedure returns failure (Line 3). Otherwise, it uses dynamic programming to find the VM provisioning vector X and then returns it (Lines 5-20). Note that D is a real number expressed in terms of the number of billing cycles. The mission of the Partition Workflow Algorithm is to partition the current workflow or sub-workflow into two parts: $G_{CSW}$ and $G_{NSW}$. $G_{CSW}$ includes, but is not limited to, the critical tasks. The other part, $G_{NSW}$, does not contain any critical tasks. We use a Boolean variable $t_i$.assigned to denote the feasibility of putting $t_i$ in $G_{NSW}$. Because no critical task can be put into $G_{NSW}$, the initial value of $t_i$.assigned is set to "false" for all tasks except the critical tasks (Lines 2-9). We choose the noncritical path between two critical tasks as the basis of $G_{NSW}$ (Lines 10-31). Once a task t belongs to $G_{NSW}$, the parent tasks of t that have not yet been assigned must also belong to $G_{NSW}$ (Lines 22-29). This principle makes the sub-workflow as large as possible so that the resources with the best cost performance can be assigned, as explained in Section 3.2.2. Meanwhile, keeping parent and child tasks in a single sub-workflow guarantees the correct execution order of the tasks (parent tasks finish before child tasks).
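To make the "feasibility check, then dynamic programming" structure of the DPK step concrete, here is a small Python sketch. It does not reproduce equation (3), which lies outside this section. Instead it assumes a simplified aggregate model: the workflow is summarized by an integer `total_work` and the work on its longest chain `chain_work`, each VM type is a hypothetical `(performance, price)` pair with performance in work units per billing cycle, and one VM of type j can absorb `floor(performance * deadline)` units before the deadline.

```python
import math


def dpk(total_work, chain_work, deadline, vm_types):
    """Min-cost VM counts covering total_work within the deadline, or None.

    Simplified model (an assumption, not the paper's equation (3)):
    choose counts x_j minimizing sum(price_j * x_j) such that the summed
    per-VM capacity floor(performance_j * deadline) covers total_work.
    """
    fastest = max(perf for perf, _ in vm_types)
    if chain_work / fastest > deadline:
        return None  # even the fastest VM cannot finish the longest chain in time

    caps = [math.floor(perf * deadline) for perf, _ in vm_types]
    inf = float("inf")
    best = [0.0] + [inf] * total_work  # best[c] = min cost to cover c work units
    pick = [-1] * (total_work + 1)     # VM type chosen last for best[c]
    for c in range(1, total_work + 1):
        for j, (_, price) in enumerate(vm_types):
            if caps[j] <= 0:
                continue
            candidate = best[max(0, c - caps[j])] + price
            if candidate < best[c]:
                best[c], pick[c] = candidate, j
    if best[total_work] == inf:
        return None  # no VM type has usable capacity within the deadline

    # Walk the recorded choices back to recover the provisioning vector X.
    x = [0] * len(vm_types)
    c = total_work
    while c > 0:
        j = pick[c]
        x[j] += 1
        c = max(0, c - caps[j])
    return x, best[total_work]


vm_types = [(1, 1.0), (4, 3.0)]  # hypothetical (performance, price) pairs
plan = dpk(total_work=10, chain_work=4, deadline=2, vm_types=vm_types)
```

With these made-up numbers the cheapest cover combines one VM of each type at a total price of 4.0, which beats two fast VMs (6.0) or five slow ones (5.0). The table has `total_work + 1` entries, so the sketch is only practical when workloads are expressed in reasonably coarse integer units.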

Algorithm Analysis.
We now consider the time complexity of our algorithm. Suppose the number of tasks in the workflow G is n and the maximum number of VM types is m. The first part of the WorkflowLayer algorithm (Lines 1-4) is a counting loop whose time complexity is O(n). Another counting loop (Lines 6-9) repeats l times, where l is the number of layers into which the workflow is divided. l is no more than n because there must be at least one task in each layer. So the total time complexity of the WorkflowLayer procedure is O(n).
The ParallelProvision procedure is a recursive procedure. First, it calls the DPK procedure, which provisions each type of VM only once. So, the time complexity of the DPK procedure is O(m).
Then, another procedure, PartitionPath, will likely be called, which has time complexity O(n). This is because each task belongs to either $G_{CSW}$ or $G_{NSW}$, and each task is assigned only once. Because each call to PartitionPath partitions the workflow into two sub-workflows, the maximum number of times the ParallelProvision procedure is called is $2^{\log_2 n} = n$. Therefore, the time complexity of the ParallelProvision procedure is $O(m \cdot n + n \cdot n) = O(n \cdot (m + n))$. Consequently, the overall time complexity of the Global Resource Provisioning for Real-time Workflow Algorithm is $O(n) + O(n \cdot (m + n)) = O(n \cdot (m + n))$.

Input:
1. A workflow $G = (T, W, E)$.
2. A collection of virtual machine types $VMs = \{VM_j \mid P_j, PRICE_j,\ j = 1, 2, \ldots, m\}$.
3. A fixed deadline D.
Output:
1. Optimized provision X.
2. Minimum cost TC.

Performance Evaluation
In this section, we present our empirical studies of the Global Resource Provisioning for Real-time Workflow Algorithm.

Simulation Settings.
To evaluate a resource provisioning algorithm, we measure its performance on some sample workflows. Deelman et al. developed a workflow generator, which can create synthetic workflows of arbitrary size that are similar to real-world real-time workflows. Using this workflow generator, they created four different sizes for each workflow application in terms of the number of tasks. These workflows are available in DAX (Directed Acyclic Graph in XML) format from their website [35]. For our experiments, we chose three sizes: small (about 50 tasks), medium (about 300 tasks), and large (about 1000 tasks). Meanwhile, in order to explore the influence of different workflow structures on the performance of our algorithm, three different critical path lengths are considered in our simulation. However, the critical path length of each kind of workflow generated by this workflow generator is fixed and less than 10 tasks, so we have to chain several workflows together to construct the longer critical paths we need. For example, we can connect 5 critical paths with a length of 8 and 1 critical path with a length of 10 end-to-end to get a critical path with a length of 5 × 8 + 10 = 50.
For our experiments, we modeled an IaaS provider that offers a single data center and six different types of VMs with different processor speeds and different prices.
The VM configurations are based on the current Amazon EC2 offerings. We used the work of Ostermann et al. [32] to estimate the computational performance in MFLOPS based on the number of EC2 compute units. Another important parameter for the experiment is the time unit for one billing cycle. Most of the current commercial clouds, such as Amazon, charge users based on a billing cycle of one hour. So, we used a VM billing cycle of one hour.
As a large number of workflows with different attributes are used for our experiments, it is important to normalize the total cost of each workflow execution in order to clearly analyze the cost optimization issue. For this reason, we first take all the tasks of a workflow as a whole and use the dynamic programming method to solve it. We define this cost as Cheapest Cost (CC). Note that the Cheapest Cost is obtained by ignoring the deadline constraint.
The Normalized Cost (NC) of a workflow execution is defined by $NC = (\text{total cost of provisioned VMs}) / CC$, where CC is the Cheapest Cost of provisioning the VMs for the same workflow. Note that the total cost in the numerator is obtained under the deadline constraint. Because the deadline constraint cannot lower the cost below the unconstrained optimum, the total cost is at least CC, and therefore NC ≥ 1.
To evaluate the proposed algorithm, we need to assign a deadline to each workflow. In order to set a series of proper deadlines for our experiments, we define the upper bound of the deadline as the makespan of the Cheapest Cost (CC) situation and the lower bound as the makespan of the critical path executed sequentially on the best-performing VM. $D_U$ and $D_L$ denote the upper and lower bounds of the deadline, respectively. To set the deadlines for workflows, we use the deadline factor α and set the deadline of a workflow to $D_L + \alpha \cdot (D_U - D_L)$, where 0 ≤ α ≤ 1. We let α increase in steps of 0.1; i.e., α = 0, 0.1, 0.2, ..., 0.9, 1.0.
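The deadline sweep and the normalized cost defined above amount to a few lines of arithmetic, sketched here in Python. The function names and the numeric bounds in the example are hypothetical; only the two formulas come from the text.

```python
def deadline_for(alpha, d_lower, d_upper):
    """Deadline D_L + alpha * (D_U - D_L) for a deadline factor in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return d_lower + alpha * (d_upper - d_lower)


def deadline_sweep(d_lower, d_upper, steps=10):
    """Deadlines for alpha = 0, 1/steps, ..., 1 (steps=10 gives 0, 0.1, ..., 1.0)."""
    # alpha is computed as k / steps to avoid accumulating rounding error.
    return [deadline_for(k / steps, d_lower, d_upper) for k in range(steps + 1)]


def normalized_cost(total_cost, cheapest_cost):
    """NC = total cost under the deadline / Cheapest Cost (CC)."""
    return total_cost / cheapest_cost


# Made-up bounds, in billing cycles, purely for illustration.
deadlines = deadline_sweep(d_lower=10, d_upper=30)
nc = normalized_cost(total_cost=12.0, cheapest_cost=8.0)
```

With these illustrative bounds, α = 0 yields the tightest deadline (10), α = 1 the loosest (30), and a provisioning that costs 12 against a Cheapest Cost of 8 has a normalized cost of 1.5.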
To the best of our knowledge, many works have combined resource provisioning and task scheduling to minimize the cost of workflow execution, and the IC-PCP algorithm is the classical one among them. Therefore, we adopted the IC-PCP algorithm as a baseline to evaluate our algorithm. As mentioned in Section 2, the IC-PCP algorithm was designed to minimize the cost of workflow execution while meeting a user-defined deadline [17].
This algorithm begins by calculating the EST (Earliest Start Time), EFT (Earliest Finish Time), and LFT (Latest Finish Time) of each task. It then finds the partial critical paths associated with the EST, EFT, and LFT. The tasks on each path are scheduled on the same VM and are preferably assigned to an already leased instance that can meet the LFT of the tasks, while the values of EST, EFT, and LFT of each unassigned task are updated. Finally, each unassigned task on the scheduled path is calculated and the process is repeated until all tasks have been scheduled. At the end of this process, each task has been assigned a VM and has start and end times associated with it. The IC-PCP algorithm also aims at cost optimization but focuses on task scheduling. The time complexity of the IC-PCP algorithm is higher and it lacks flexibility. Additionally, we use the Simply DPK algorithm, which applies DPK to each task of the workflow individually, as another comparative reference to demonstrate the importance of the global idea for cost optimization.

Figure 5 shows the relationship between the cost optimization of resource provisioning by the GRP4RW algorithm and the deadline. Figure 5(a) shows the results for different sizes: small (50 tasks), medium (300 tasks), and large (1000 tasks). The different sizes show similar trends in normalized cost as the deadline varies. The more tasks a workflow has, the more gradually its normalized cost changes with the deadline, while the normalized cost of the smallest workflow fluctuates visibly. Figure 5(a) thus shows that the cost optimization performance of our proposed algorithm becomes more stable as the workflow grows larger. Figure 5(b) shows how different workflow structures influence the variation of normalized costs. Because a large workflow shows a more stable normalized cost, we use three large workflows (1000 tasks each) with different critical path (CP) lengths.
The lengths of the critical paths are 50 tasks, 100 tasks, and 200 tasks. The results show that, for the workflow with the longest critical path, the normalized cost decreases nearly linearly as the deadline increases. However, the workflow with the shortest critical path converges fastest to the optimum as the deadline grows. Therefore, our proposed cost optimization algorithm can be more efficient when the deadline is set to a low level for the workflow with a shorter critical path.

Simulation Results.
Since the performance of cost optimization is better and more stable for larger sizes and longer critical paths, we take a workflow with 1000 tasks and a critical path of length 200 as the simulation instance to compare the performance of three cost optimization algorithms: GRP4RW, IC-PCP, and Simply DPK. Figure 6 shows the result of the comparison. The normalized cost of the Simply DPK algorithm does not vary much with the deadline factor and always remains high. The normalized cost of the IC-PCP algorithm decreases as the deadline increases. The normalized cost of our GRP4RW algorithm also decreases as the deadline changes, and its costs are generally lower than those of the IC-PCP and Simply DPK algorithms at each deadline. There are three reasons for these results. First, the Simply DPK algorithm applies the dynamic programming knapsack method to each single task of a workflow, so its optimization effect is modest and deadline-invariant. Second, the IC-PCP algorithm, which embodies the combined idea, uses the partial critical path as the optimization unit.
This improves the optimization effect significantly, but because the dynamic programming method is not used in this algorithm, its normalized costs are suboptimal for all deadlines. Finally, both the global idea and the dynamic programming knapsack method are used in the GRP4RW algorithm, so it achieves the best optimization effect of the three. In conclusion, the global idea is the most important factor in the cost optimization of a resource provisioning algorithm for real-time workflows. At the same time, the dynamic programming knapsack method also plays a positive role in the whole process.
Considering that the length of billing cycles could affect the performance of the cost optimization algorithms, we used simulation experiments to study how the relationship between the length of billing cycles and the execution times of tasks influences the performance of the algorithms. As the execution time of a task depends heavily on its workload, we used three workflows with different task workloads: a Workflow with Small Workload, in which more than 90% of task workloads are less than 1 MFLOPS; a Workflow with Medium Workload, in which more than 90% of task workloads are between 1 and 5 MFLOPS; and a Workflow with Large Workload, in which more than 90% of task workloads are more than 5 MFLOPS. Meanwhile, we use 30 seconds, 60 seconds, 120 seconds, 300 seconds, 600 seconds, and 1200 seconds as the different lengths of billing cycles. The experimental results are shown in Figure 7. They demonstrate that the length of billing cycles affects the performance of the algorithms: as the billing cycle becomes shorter, the cost optimization effect of all algorithms improves to some extent. When the length of the billing cycle is 30 seconds, the performance of all these algorithms is optimal. When the length of the billing cycle becomes 3600 seconds, the performance of the Simply DPK algorithm decreases significantly, while that of the IC-PCP and GRP4RW algorithms remains acceptable. That is to say, when the execution times of tasks are much smaller than the billing cycle, as for the workflow with small workload under a billing cycle of 3600 seconds (see Figure 7(a)), the impact of the billing cycle length on the performance of the algorithms is more significant, and our proposed GRP4RW algorithm has the greatest advantage under that condition.
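One way to see why short tasks suffer under long billing cycles is to round each lease up to whole cycles. The sketch below is an illustration under stated assumptions rather than part of the reported experiments: it assumes usage is billed in whole cycles, compares leasing a separate VM per task with running all tasks back-to-back on one VM while ignoring any deadline, and uses made-up task durations.

```python
import math


def billed_seconds(run_seconds, cycle_seconds):
    """Seconds actually paid for when usage is rounded up to whole billing cycles."""
    return math.ceil(run_seconds / cycle_seconds) * cycle_seconds


def per_task_vs_packed(task_seconds, cycle_seconds):
    """Billed time when every task leases its own VM vs. one shared VM.

    Assumption: the shared VM runs the tasks sequentially, ignoring deadlines.
    """
    per_task = sum(billed_seconds(t, cycle_seconds) for t in task_seconds)
    packed = billed_seconds(sum(task_seconds), cycle_seconds)
    return per_task, packed


tasks = [20, 45, 70]  # hypothetical task execution times in seconds
short_cycle = per_task_vs_packed(tasks, 30)
long_cycle = per_task_vs_packed(tasks, 3600)
```

For these numbers the per-task leases are billed 180 seconds against 150 seconds for the shared VM under a 30-second cycle, but 10800 seconds against 3600 seconds under a 3600-second cycle. The gap widens as the cycle grows relative to the task durations, which is consistent with the observation that per-task provisioning degrades most for small workloads and long billing cycles.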

Conclusion
In this paper, we presented a global resource provisioning algorithm for executing real-time workflows in the cloud. We modeled the problem as an optimization problem that aims to obtain a cost-effective resource provisioning to execute a workflow while meeting a deadline constraint. The problem was solved by using the dynamic programming knapsack algorithm. Our approach embodies basic IaaS cloud properties such as the heterogeneity and elasticity of resource performance and the pay-as-you-go pricing model.
We performed simulation experiments using three different sizes and three different structures of real-time workflows. Our results show that the cost optimization of our proposed solution improves as the deadline increases. We compared the performance of our algorithm against two other algorithms (Simply DPK and IC-PCP). Our results show that our proposed GRP4RW algorithm performs better overall than both the Simply DPK and IC-PCP algorithms under every deadline constraint.
For future work, we intend to improve our strategy to optimize both the data storage and the data transfer cost of a real-time workflow execution. In addition, a growing body of research focuses on optimization objectives related to the security or energy consumption properties of QoS [36,37]; the most important remaining issue is realizing multiobjective optimization.

Data Availability
The workflows used in this study can be accessed via the website http://confluence.pegasus.isi.edu/display/pegasus/WorkflowGenerator.

Conflicts of Interest
The authors declare that they have no conflicts of interest.