Improved Grid Scheduling Using Hybrid Heuristic Algorithms with Enhanced Initial Solutions

In this study, we propose a novel approach to grid scheduling that aims to decrease the makespan of submitted jobs and increase the utilization of the resources involved. Grid scheduling maps jobs to grid resources at specific time intervals. Efficient scheduling is crucial to achieving high performance in grid computation. Because grid scheduling is an NP-complete problem, meta-heuristic techniques are used. The literature proposes genetic-algorithm-based heuristics and swarm-based optimizations for grid scheduling. This study applies meta-heuristic techniques to the scheduling problem to reduce the makespan of tasks submitted to the grid. Artificial Bee Colony (ABC) is selected for optimizing the schedule because of its simplicity, flexibility and robustness. We propose Cluster Heterogeneous Min-Min Artificial Bee Colony (CHMM-ABC) and a hybrid ABC algorithm with Reactive Tabu Search (RTS) for efficient grid scheduling. The relationship between the initial population and the final outcome of ABC is also investigated. Simulation confirms the efficiency of the proposed approach. The proposed method reaches a low makespan in the first run because the initial swarm is created by the new CHEFT and Min-Min algorithm with RTS. Simulation reveals a makespan decrease of 9.87% to 13.32% achieved by the new RTS-ABC compared with classic ABC.


INTRODUCTION
Grid scheduling is the activity of allocating jobs to the available resources. Resources available in grid computing include storage space, network bandwidth, CPU cycles and software. The resources participating in a grid environment are heterogeneous and distributed. Each node in a grid environment shares its resources dynamically during the execution of an application. The selection of a resource depends on its availability, its cost and the Quality of Service (QoS) requirements of the application. The assignment of jobs to resources should be optimal so as to minimize the makespan, minimize the cost of the allocated resources and maximize the throughput (Garg et al., 2010). Various scheduling algorithms have been proposed in the literature for scheduling resources in the grid environment (Dong and Akl, 2006). The resources in grid environments are dynamic, heterogeneous and unpredictable, and they share different services between users. Because of the heterogeneous and dynamic nature of the grid, traditional methods are not applicable to grid scheduling. Scheduling must be addressed to achieve high performance, as it aims to find a suitable resource allocation for every job. Scheduling decisions should use resources effectively to reduce job tardiness. Finding an optimal resource allocation that reduces the schedule length of the jobs is a challenging research area. The scheduling problem is NP-complete (Lorpunmanee et al., 2007) and non-trivial.

Task scheduling algorithms may be categorized into deterministic and approximate algorithms. Deterministic algorithms can find exact optimal results, but they cannot solve NP-hard optimization problems quickly because their time to reach a solution increases exponentially. Approximate algorithms can find near-optimal solutions in a short time. These may be divided into heuristic and meta-heuristic algorithms. Meta-heuristic algorithms can be categorized by various criteria, for instance whether they are based on a single solution or on a population. Single-solution-based algorithms, such as Simulated Annealing (SA) and Tabu Search (TS), modify a single solution during the search. In contrast, population-based algorithms, such as Particle Swarm Optimization (PSO), Genetic Algorithms (GA) and Ant Colony Optimization (ACO), work with populations of solutions. Some algorithms explore the problem space globally, while others search it locally (Pooranian et al., 2013).
To obtain an optimal scheduling plan, evolutionary algorithms and swarm intelligence algorithms have been used effectively (SarathChandar et al., 2012). Meta-heuristic techniques such as Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO) and Genetic Algorithm (GA) have been used efficiently to solve grid scheduling problems. Metaheuristic algorithms have been found to be more effective if the initial population is selected from existing sub-optimal scheduling algorithms such as First Come First Serve (FCFS). In the literature, Longest Job First (LJF) and FCFS schedules were used to obtain the initial solution for fuzzy particle swarm based scheduling (Liu et al., 2010), Shortest Job First (SJF) was used as the initial scheduler for the Swift Scheduler algorithm (Somasundaram and Radhakrishnan, 2009), and SJF and LJF were used as the initial schedule for a GA-based scheduler (Carretero and Xhafa, 2006).

There is increasing interest in Multi-Objective Evolutionary Algorithms (MOEA) (Grosan et al., 2007), which combine evolutionary algorithms with the theoretical frameworks of multi-criteria decision making. Although a few real-life problems can be reduced to a single objective, it is usually difficult to capture all aspects of a problem in one objective. Defining several objectives gives a more nuanced picture of a task. Multi-objective evolutionary algorithms yield a set of candidate solutions, each optimal to an extent. The main challenge in multi-objective optimization is minimizing the distance of the generated solutions to the Pareto set and maximizing the diversity of the developed Pareto set (Coello, 2006). A good Pareto set is achieved by guiding the search through the design of the reproduction operators and the fitness assignment mechanism. To maintain diversity, special care is taken in the selection process. Similar care prevents non-dominated solutions from being lost.

The ABC algorithm is a meta-heuristic approach based on the foraging behavior of honey bee swarms (Karaboga and Basturk, 2008). Unlike genetic algorithms, it does not require crossover or mutation rates to solve the problem. The ABC algorithm has been used effectively to solve constrained and unconstrained function optimization problems. The advantages of ABC over other optimization algorithms include (Bolaji et al., 2013):
• Simplicity, flexibility and robustness
• Use of fewer control parameters
• Ease of hybridization with other algorithms
• Ability to handle objective costs of a stochastic nature
• Ease of implementation with basic logical operations

We studied the impact of various initialization techniques used by optimization algorithms in grid scheduling in our previous work (Vigneswari and Maluk Mohamed, 2014a) and derived a hybrid algorithm, Cluster Heterogeneous Min-Min Artificial Bee Colony (CHMM-ABC), for efficient grid scheduling (Vigneswari and Maluk Mohamed, 2014b).

MATERIALS AND METHODS
Literature survey: Many heuristic and population-based algorithms, such as Min-Min, fast greedy, Tabu Search and Ant System, have been suggested to optimize grid scheduling. The grid scheduling problem was studied by Ruda and Rudová (2005), who aim for an optimal assignment of jobs to resources.
A load-balanced Min-Min algorithm was suggested by Kokilavani and Amalarethinam (2011). This algorithm aims not only at reducing the makespan but also at improving resource utilization. The second phase of this algorithm reschedules the underutilized resources. A Double Min-Min algorithm based on efficient set pair analysis was proposed by Miriam and Easwarakumar (2010). Along with reducing the makespan, this algorithm ensures system availability. The QoS-guided Min-Min proposed by He et al. (2003) schedules tasks requiring high bandwidth and outperforms the classical Min-Min algorithm. A Min-mean heuristic was proposed by Kamalam and Muralibhaskaran (2010); it reschedules the schedule given by Min-Min by considering the mean makespan of all resources. Its performance degrades when the heterogeneity of the tasks increases. The Quality of Service Guided Weighted Mean Time-Min (QWMTM) and Quality of Service Guided Weighted Mean Time Min-Min Max-Min Selective (QWMTS) algorithms were proposed by Chauhan and Joshi (2010), with network bandwidth considered as the QoS variable. The algorithm proposed by Singh and Suri (2008) selects the next job based on applied heuristics, using either the QoS-based Max-Min or the QoS-based Min-Min algorithm, and utilizes history information about job execution. All of the above Min-Min-based scheduling algorithms produce a better makespan, but their resource utilization is inefficient because they consider the shortest job first. Our work aims at improving both the makespan and the resource utilization.

The combination of the GELS algorithm with PSO was used by Barzegar et al. (2009) for grid scheduling. A job that cannot complete on a particular resource within its deadline can be switched to another resource by using the proposed objective function. That study aims at improving QoS parameters. A scheduling strategy using a PSO algorithm that uses position and velocity vectors instead of real vectors was proposed by Abraham et al. (2006). This algorithm aims at completing the tasks in the minimum time span with efficient resource utilization. Izakian et al. (2009) proposed an algorithm for task scheduling in grid systems that focuses on concurrently minimizing makespan and flowtime. It uses matrices in which the columns represent the resource allocation of jobs and the rows denote the jobs allotted to resources. Workflow scheduling was solved by Raj and Vasudevan (2011) using Simulated Annealing, which is efficient in a grid environment, but its rate of convergence is low; the rate of convergence is good in our algorithm. A PSO-based scheduling strategy using position and velocity vectors instead of real vectors, with the same aims of minimum time span and efficient resource utilization, was also proposed by Liu et al. (2010). Another algorithm using PSO based on TS was proposed by Mathiyalagan et al. (2010); it provides fast convergence by modifying the inertia equation. Zhang et al. (2008) solve task scheduling through PSO with the Small Position Value (SPV) rule taken from the random key representation. The SPV rule transforms continuous position values into discrete permutations in PSO. Simulations reveal that PSO outperforms GA in large-scale optimization problems. PSO is used for task scheduling with two heuristic models, Latest Finish Time (LFT) and Best Performance Resource (BPR), in Chen and Wang (2011); these are used for deciding task priorities in resource queues.

Yusof and Stapa (2010) propose a TS algorithm, a local search algorithm used to schedule tasks in a grid system. TS uses a perturbation strategy for pair changing. Tabu search was used to provide better scheduling by Benedict and Vasudevan (2008). The objectives were maximizing the job completion ratio and minimizing the penalties of grid schedulers in selecting specific sequences. FCFS, EDF and LCFS were also compared with this method. Tabu search was also used by Fayad et al. (2007) to generate good schedules and to explore the robustness of schedules when processing times differ, through assessment of performance in fuzzy as well as crisp modes.

In Omara and Arafa (2010), the Critical Path Genetic Algorithm (CPGA) and the Task Duplication Genetic Algorithm (TDGA) are suggested; both alter the standard genetic algorithm to enhance its efficiency. Two greedy models are added to the genetic algorithm so that the waiting time for jobs to begin, and ultimately the makespan, is decreased. The TDGA (El-Rewini et al., 1994) merges the Duplication Scheduling Heuristic (DSH) algorithm with a genetic algorithm. Overloads and transmission delays are decreased and overall execution times are minimized. The chaos-genetic algorithm (Gharooni-Fard et al., 2010) is a genetic algorithm for scheduling dependent jobs with respect to QoS, in which chaos parameters are used rather than a randomly generated original population. Merging the benefits of genetic algorithms and chaos parameters to explore the search space avoids premature convergence and provides solutions rapidly. The Integer Genetic Algorithm (IGA) (Tao et al., 2010) is a GA for dependent task scheduling that concurrently considers three QoS variables: time, cost and dependability. As the variables conflict, they cannot be optimized simultaneously, because an improvement in one degrades another. Weights are assigned to every variable either randomly or by the user. The Group Leaders' Optimization Algorithm (GLOA) (Pooranian et al., 2013) uses a new evaluation (distributed) algorithm to schedule independent jobs in grid computing. The algorithm owes its inspiration to the impact of leaders in social groups. The outcomes of GLOA were compared with GA, SA, GGA and GSA. The comparison shows that its runtime and makespan are lower than those of the other AI methods.

Xu et al. (2003) suggested a simple Ant Colony Optimization method in a grid simulation architecture and used response time and average resource utilization as evaluation indexes. Lu et al. (2004) and Yan et al. (2005) suggested enhanced Ant Colony Optimization models that improved the job finishing ratio, but they did not use several evaluation indexes to assess them. ACO has been discussed with several variations in Chang et al. (2009), Ku-Mahamud and Abdul Nasir (2010) and Zhu and Wei (2010). The performance of nature-inspired metaheuristics for task scheduling in grid computing was studied by Abraham et al. (2008), who evaluated GA, SA, ACO and PSO.
Many works in the literature discuss using the ABC model to schedule tasks in grid computing. A brief survey of ABC and Bee Colony Optimization (BCO) methods used in grid scheduling is available in Alyaseri and Ku-Mahamud (2013); it emphasizes further optimization of the algorithm to get better results. ABC scheduling was used by Vivekanandan et al. (2011) to reduce finish time and average waiting time. A binary implementation of ABC (BABC) and its extended version are available in Kim et al. (2013); they balance diversification and convergence of the search. Another approach, Multi-Objective Artificial Bee Colony (MOABC) (Arsuaga-Ríos et al., 2011), provides decision support to users in selecting resources with respect to execution time and cost. BCO was improved by Taheri et al. (2013) to minimize the makespan and data transfer time. Makespan, deadline and priority requirements are considered by Mousavinasab et al. (2011) for grid scheduling. In our work, we modify the ABC algorithm to obtain a better makespan and better resource utilization. We propose a Clustered HEFT-Min-Min algorithm to choose the initial population, and therefore obtain a good convergence rate in fewer iterations. We also propose a hybrid of CHMM-ABC and RTS to further improve the makespan and resource utilization.

Background details:
The methods used in this investigation are presented in this section. The scheduling algorithms Heterogeneous Earliest Finish Time (HEFT), Min-Min and ABC are briefly described.

Heterogeneous Earliest Finish Time scheduling (HEFT):
HEFT is a list scheduling heuristic based on two components: a priority function that orders the nodes of the task graph at compile time, and an objective function that is to be minimized. HEFT begins by setting the computation costs of tasks and the communication costs of edges to their mean values. Each task is then assigned a value called its upward rank. The upward rank of a task $t_i$ is the length of the longest path, measured in mean computation and mean communication costs, from $t_i$ to the exit task. A task list is created by sorting the tasks in decreasing order of upward rank; ties are broken randomly. At each scheduling step, the unscheduled task with the greatest upward rank is chosen and assigned to the processor that minimizes its finish time, using an insertion-based scheduling policy. HEFT has three phases (Wieczorek et al., 2005):
• Weighting: assigns a weight to each node and edge in the workflow.
• Ranking: generates an ordered task list, arranged in execution order.
• Mapping: assigns tasks to resources.
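The weighting and ranking phases can be illustrated with a minimal sketch. It assumes the usual recursive form of the upward rank (mean computation cost of a task plus the largest value of mean communication cost plus successor rank); the function names, the dictionary-based graph encoding and the four-task example are hypothetical and not taken from the study.

```python
from functools import lru_cache


def upward_ranks(mean_cost, mean_comm, successors):
    """Return the upward rank of every task in a task graph (DAG).

    mean_cost[t]       mean computation cost of task t over all resources
    mean_comm[(t, s)]  mean communication cost of the edge t -> s
    successors[t]      immediate successors of t (missing or empty for exit tasks)
    """
    @lru_cache(maxsize=None)
    def rank(task):
        succ = successors.get(task, [])
        if not succ:                      # exit task: only its own cost
            return mean_cost[task]
        return mean_cost[task] + max(mean_comm[(task, s)] + rank(s) for s in succ)

    return {task: rank(task) for task in mean_cost}


def ranking_phase(mean_cost, mean_comm, successors):
    """Order tasks by decreasing upward rank (ties keep insertion order here)."""
    ranks = upward_ranks(mean_cost, mean_comm, successors)
    return sorted(ranks, key=lambda task: -ranks[task])


# Hypothetical diamond-shaped workflow: A -> {B, C} -> D
cost = {"A": 10, "B": 5, "C": 8, "D": 4}
comm = {("A", "B"): 2, ("A", "C"): 3, ("B", "D"): 1, ("C", "D"): 2}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
order = ranking_phase(cost, comm, succ)
```

The mapping phase would then walk `order` and place each task on the resource giving the earliest finish time; that step is omitted here because it depends on the resource model.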
Min-min algorithm: The Min-Min algorithm schedules tasks by considering the execution times of tasks on resources. The algorithm starts with a set U of unscheduled tasks. The minimum completion time of every task in U is found. Next, the task with the overall minimum completion time is chosen and assigned to the corresponding resource. Finally, the newly scheduled task is removed from U, and the procedure repeats until all tasks are scheduled.
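As an illustration of the steps above, the following is a minimal sketch of Min-Min for independent tasks. The expected-time-to-compute matrix `etc`, the function name and the three-task example are assumptions made for the illustration; arrival times and communication costs are ignored.

```python
def min_min(etc):
    """Min-Min for independent tasks.

    etc[t][r] is the expected execution time of task t on resource r.
    Returns (assignment, makespan), where assignment maps task -> resource.
    """
    n_resources = len(etc[0])
    ready = [0.0] * n_resources            # time at which each resource becomes free
    unscheduled = set(range(len(etc)))     # the set U
    assignment = {}
    while unscheduled:
        best = None                        # (completion_time, task, resource)
        for task in sorted(unscheduled):
            for resource in range(n_resources):
                completion = ready[resource] + etc[task][resource]
                if best is None or completion < best[0]:
                    best = (completion, task, resource)
        completion, task, resource = best
        assignment[task] = resource        # map the task with the overall minimum
        ready[resource] = completion       # the resource is busy until then
        unscheduled.remove(task)
    return assignment, max(ready)


# Hypothetical instance: 3 tasks, 2 resources
assignment, makespan = min_min([[4, 6], [3, 5], [8, 2]])
```

Searching all (task, resource) pairs for the smallest completion time is equivalent to first taking each task's minimum completion time and then the minimum over tasks, which is how the algorithm is usually stated.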
Artificial Bee Colony (ABC) algorithm: Employed bees look for food locations in parallel and inform other bees by dancing. The number of employed bees equals the number of food sources, with one employed bee assigned to each food source. On arriving at its source, a bee computes a new solution and retains the better of the two (greedy selection). When a source fails to improve after a number of iterations, it is abandoned and replaced by a food source located by a scout bee, which involves randomly computing a new solution. Onlooker bees evaluate and choose the best solution from among those reported by the employed bees. Scout bees start a new solution search (Suter et al., 2004). ABC has four phases (Kiran and Gündüz, 2012).
i. Initialization Phase: Food sources of a given population size are created randomly by scout bees. A food source $x_m$ is a candidate solution vector for the optimization problem; $x_m$ has $D$ parameters, where $D$ is the dimension of the search space of the objective function. Initial food sources are produced randomly by Eq. (1):

$$x_{mi} = l_i + \mathrm{rand}(0,1)\,(u_i - l_i) \tag{1}$$

where $u_i$ and $l_i$ are the upper and lower bounds of the solution space of the objective function and rand(0, 1) is a random number in the range [0, 1].

ii. Employed Bee Phase: Employed bees locate a new food source in the neighbourhood, and the food source of higher quality is selected. A neighbour food source $v_m$ is determined by Eq. (2):

$$v_{mi} = x_{mi} + \phi_{mi}\,(x_{mi} - x_{ki}) \tag{2}$$

where $x_k$ is a randomly selected food source, $i$ is a randomly selected parameter index and $\phi_{mi}$ is a random number in the range [-1, 1]. The fitness of a food source is essential for locating a global optimum. Fitness is calculated by Eq. (3), given here in the form standard for ABC:

$$\mathrm{fit}_m(x_m) = \begin{cases} \dfrac{1}{1 + f_m(x_m)}, & f_m(x_m) \ge 0 \\ 1 + |f_m(x_m)|, & f_m(x_m) < 0 \end{cases} \tag{3}$$

where $f_m(x_m)$ is the objective function value of $x_m$. A greedy selection is then applied between $x_m$ and $v_m$.

iii. Onlooker Bee Phase: Onlooker bees watch the waggle dance in the dance area and assess the profitability of the food sources, choosing a more profitable source with higher probability. The probability $p_m$ of choosing a food source is its profitability relative to the profitability of all $SN$ sources, defined by Eq. (4):

$$p_m = \frac{\mathrm{fit}_m(x_m)}{\sum_{j=1}^{SN} \mathrm{fit}_j(x_j)} \tag{4}$$

In the scout phase, an abandoned food source is replaced by a new one generated by Eq. (5):

$$x_{mi} = l_i + \mathrm{rand}(0,1)\,(u_i - l_i) \tag{5}$$

where $x_m$ is the newly generated food source, rand(0, 1) is a random number in the range [0, 1], and $u_i$ and $l_i$ are the upper and lower bounds of the solution space of the objective function.
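The phase equations can be written as a handful of small functions. This is a sketch of the standard continuous ABC update rules, not the authors' implementation; the fitness mapping is assumed to be the usual ABC form (1/(1+f) for non-negative objective values, 1+|f| otherwise), and all function names and the sphere objective are hypothetical.

```python
import random


def init_source(lower, upper, rng):
    """Eq. (1) and Eq. (5): x_mi = l_i + rand(0,1) * (u_i - l_i)."""
    return [l + rng.random() * (u - l) for l, u in zip(lower, upper)]


def neighbour(x_m, x_k, i, phi):
    """Eq. (2): only parameter i moves; phi is drawn from [-1, 1] by the caller."""
    v_m = list(x_m)
    v_m[i] = x_m[i] + phi * (x_m[i] - x_k[i])
    return v_m


def fitness(f_value):
    """Assumed standard ABC fitness for a minimisation objective."""
    return 1.0 / (1.0 + f_value) if f_value >= 0 else 1.0 + abs(f_value)


def greedy_select(x_m, v_m, objective):
    """Keep the neighbour only if its fitness is strictly higher."""
    return v_m if fitness(objective(v_m)) > fitness(objective(x_m)) else x_m


def onlooker_probabilities(sources, objective):
    """Eq. (4): fitness of each source divided by the total fitness."""
    fits = [fitness(objective(x)) for x in sources]
    total = sum(fits)
    return [f / total for f in fits]


def sphere(x):
    """Hypothetical test objective: sum of squares."""
    return sum(value * value for value in x)


rng = random.Random(0)
x0 = init_source([-5.0, -5.0], [5.0, 5.0], rng)
v = neighbour([1.0, 2.0], [3.0, 5.0], i=1, phi=0.5)
kept = greedy_select([1.0, 2.0], v, sphere)
probs = onlooker_probabilities([[0.0, 0.0], [1.0, 0.0]], sphere)
```

For scheduling, a continuous vector of this kind still has to be decoded into a task-to-resource assignment; how the study performs that decoding is not stated in this section.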

PROPOSED SCHEDULING TECHNIQUES
In this section we briefly describe the two algorithms we propose for enhancing grid scheduling. We propose a Clustered HEFT-Min-Min (CHMM) ABC algorithm, which has two stages. First, the resources are clustered into groups and the Min-Min algorithm is applied to select the initial swarm; then the ABC algorithm is applied. We also suggest a novel hybrid Reactive Tabu Search-Artificial Bee Colony (RTS-ABC) algorithm to find an optimum schedule for the submitted tasks.

Proposed Cluster HEFT (CHEFT):
In the proposed Cluster HEFT (CHEFT) method, the grid is first partitioned into clusters for better utilization of the distributed resources. The grid is represented by an acyclic graph G(V, E), where V is a set of nodes denoting resources and E is a set of directed edges denoting the interconnections between resources. The fan-out of a resource is the set of edges leaving it, and its fan-in is the set of edges entering it. Primary inputs are resources with zero fan-in and primary outputs are resources with zero fan-out. In the acyclic graph G, every node in V is assigned a weight, except for the primary input resources, which get zero weight. Grid resources are clustered into a sparser network such that the maximum fan-out is reduced; hence the number of edges decreases in the clustered network. After clustering, the number of edges equals the number of clusters, as only one fan-out edge leaves each cluster, which produces a good initial solution for scheduling. The pseudo code of the proposed Cluster HEFT (CHEFT) is:

CLUSTER (resource, vector, i)
  Represent fan-in resources of i as vector
  While test vector
    If vector > 0
      i = a randomly selected resource from vector
      j = number of resources that can be reached from i
      If current cluster size + j < max cluster size
        Assign the set S of resources that can be reached from i to the current cluster
        Remove the clustered resources from vector
  Do While

HEFT (cluster, s)
  Store the processing capability of each cluster in descending order in s
  S_max = (j), q_j = 1 (cluster with q CPU time)
  While set of tasks u ≠ 0
    For every task i
      Calculate the optimal capacity S_i*
    For each ready task i

Proposed Cluster Heterogeneous Min-Min Artificial Bee Colony (CHMM-ABC):
In the first step of the new CHMM-ABC algorithm, the initial swarm of solutions $x_i$ ($i = 1, \ldots, SN$) is produced by the new CHEFT algorithm with the Min-Min algorithm, which runs until all tasks are assigned. In the second stage, a new source is generated for every employed bee; the number of employed bees equals half the food sources. Then each onlooker bee selects a food source with a probability based on fitness and generates a new source at the chosen food source. The stages of CHMM-ABC are given in Nallusamy et al. (2015). The fitness function is defined with weights $\alpha$ and $\beta$, where $\alpha + \beta = 1$, $P_i$ are the selected clusters, $B_{avg}$ is the average bandwidth among the selected clusters and $B_{i,i+1}$ is the difference in bandwidth between two hops. Onlookers are distributed to the sources, which are then assessed to see whether they should be abandoned. If the number of cycles for which a source cannot be improved exceeds a predefined limit, the source is regarded as depleted. The employed bee linked to a depleted source becomes a scout that searches randomly in the problem domain, as in the scout phase of ABC.

Hybrid artificial bee colony-reactive tabu search algorithm:
In the new hybrid Reactive Tabu Search-Artificial Bee Colony (RTS-ABC) algorithm, the initial population is created by random initialization and RTS is applied to it. The output of RTS is the input for ABC. Reactive Tabu Search is described in Battiti and Tecchiolli (1994). In the next stage of our hybrid algorithm, a new source is generated for every employed bee; the number of employed bees equals half the food sources. In the following step, each onlooker bee selects a food source with a fitness-based probability and produces a new source at the chosen food source site.
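To make the role of RTS concrete, the sketch below refines one task-to-resource assignment with a simplified reactive tabu search before it would be handed to ABC. Everything in it is an assumption made for illustration: the move (reassigning a single task), the makespan objective for independent tasks, the aspiration rule and the way the tabu tenure reacts to revisited solutions are simplified stand-ins, not the procedure used in the study.

```python
def makespan(assignment, etc, n_resources):
    """Largest total load on any resource; etc[t][r] is the time of task t on r."""
    loads = [0.0] * n_resources
    for task, resource in enumerate(assignment):
        loads[resource] += etc[task][resource]
    return max(loads)


def reactive_tabu_search(start, etc, n_resources,
                         iterations=100, tenure=2, max_tenure=20):
    """Simplified reactive tabu search over single-task reassignments."""
    current = list(start)
    best, best_cost = list(current), makespan(current, etc, n_resources)
    tabu_until = {}                 # (task, resource) -> last iteration it is tabu
    visited = {tuple(current)}
    for it in range(iterations):
        move = None                 # (cost, task, resource) of the best allowed move
        for task in range(len(current)):
            for resource in range(n_resources):
                if resource == current[task]:
                    continue
                trial = current[:task] + [resource] + current[task + 1:]
                cost = makespan(trial, etc, n_resources)
                is_tabu = tabu_until.get((task, resource), -1) >= it
                if is_tabu and cost >= best_cost:
                    continue        # aspiration: tabu moves need a new global best
                if move is None or cost < move[0]:
                    move = (cost, task, resource)
        if move is None:
            continue                # every move is tabu; wait for tenures to expire
        cost, task, resource = move
        tabu_until[(task, current[task])] = it + tenure   # forbid moving straight back
        current[task] = resource
        if cost < best_cost:
            best, best_cost = list(current), cost
        state = tuple(current)
        if state in visited:        # reaction: a repeated solution lengthens the tenure
            tenure = min(tenure + 1, max_tenure)
        else:                       # new territory: relax the tenure again
            visited.add(state)
            tenure = max(1, tenure - 1)
    return best, best_cost


# Hypothetical instance: 3 tasks, 2 resources, everything initially on resource 0
etc = [[4, 6], [3, 5], [8, 2]]
refined, refined_cost = reactive_tabu_search([0, 0, 0], etc, 2)
```

In a hybrid of the kind described, each member of the initial population would be refined this way and the refined assignments used as the initial food sources for ABC.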

RESULTS AND DISCUSSION
Here, we perform a threefold analysis:
• Comparison of ACO, the ABC algorithm and the ABC algorithm with proper initialization techniques.
• Comparison of the performance of ABC scheduling with the proposed CHMM-ABC and hybrid ABC scheduling algorithms. The makespan is the metric for this comparison.
• Considering the dynamic nature of grid resources, variation of the number of resources participating in the grid and analysis of the performance of the proposed algorithms to evaluate resource utilization.
We assume that the jobs are independent of each other and that the resources are dynamically distributed. The jobs have dynamic arrival times and different resource requirements. The makespan is calculated during each run. The initial solution starts with 40 bees, of which 20 are worker bees and the remaining 20 are onlooker bees. The ABC parameters used in this study are given in Table 1.
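The quantities compared in this section can be computed from a schedule as in the following sketch. It assumes independent jobs that all arrive at time zero, takes average utilization to be the mean busy fraction of the resources over the makespan (a common definition, not one stated in this study), and uses hypothetical names and numbers throughout.

```python
def resource_loads(assignment, etc, n_resources):
    """Total execution time placed on each resource; etc[t][r] is task t on r."""
    loads = [0.0] * n_resources
    for task, resource in enumerate(assignment):
        loads[resource] += etc[task][resource]
    return loads


def makespan(assignment, etc, n_resources):
    """Completion time of the last job, i.e. the largest resource load."""
    return max(resource_loads(assignment, etc, n_resources))


def average_utilization(assignment, etc, n_resources):
    """Mean fraction of the makespan during which a resource is busy (assumed definition)."""
    loads = resource_loads(assignment, etc, n_resources)
    return sum(loads) / (n_resources * max(loads))


def percent_decrease(baseline, improved):
    """Relative makespan reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical instance: 3 jobs, 2 resources
etc = [[4, 6], [3, 5], [8, 2]]
schedule = [0, 0, 1]
span = makespan(schedule, etc, 2)
util = average_utilization(schedule, etc, 2)
```

A percentage such as those reported for the comparisons with classic ABC would be obtained by calling `percent_decrease` with the two average makespans.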

Analysis of ABC, CHMM-ABC and analysis of proposed hybrid ABC scheduling algorithms:
Here we compare classic ABC scheduling with the proposed CHMM-ABC and hybrid ABC scheduling algorithms. Figure 2 shows the average makespan over 5 runs achieved in the simulation for ABC, CHMM-ABC and the proposed hybrid ABC scheduling algorithm. The average makespan of the proposed CHMM-ABC is reduced after 25 iterations and keeps decreasing with further iterations, as CHMM-ABC iteratively refines the solution under the proposed fitness function. In all five runs, the solution converged within 100 iterations.
Figure 3 shows the average makespan over 5 runs achieved in the simulation. CHMM-ABC achieves a makespan decrease of 16.13% compared with classic ABC, and the proposed hybrid RTS-ABC achieves a decrease of 19.82% compared with classic ABC. In Fig. 4, for 175 jobs with 5 resources, RTS-ABC achieves a makespan decrease of 8% compared with Min-Min and 1.88% compared with classic ABC. ACO, which is widely used in scheduling, underperforms ABC and the proposed RTS-ABC by 3.19% and 5.08%, respectively, for 175 jobs.

Case study: A sensor grid middleware has been designed to receive vital signs from patients and monitor them continuously for deviations from normal values. This sensor grid has been proposed in the literature, wherein sensor data is integrated into the grid for processing. The scheduling component of this middleware is a crucial element that schedules the data onto computational resources for execution in a SaaS. We have already tested the efficiency of ABC scheduling in Vigneswari and Maluk Mohamed (2014c). The efficiency of our proposed ABC scheduling is analyzed in this section. Experiments were also carried out by increasing the number of jobs and the number of resources. Tables 2 and 3 show the makespan for an increasing number of jobs with 5 and 15 resource clusters, respectively.

CONCLUSION
The CHMM-ABC algorithm presented in this work provides a novel approach to grid scheduling. It aims at decreasing the makespan and increasing resource utilization. We have also presented a hybrid CHMM-ABC-RTS algorithm, in which our CHMM-ABC algorithm is combined with RTS and provides better performance. Detailed performance studies of the two proposed algorithms in a telemedicine application reveal their benefit. The new method reaches a low makespan in the first run because the initial swarm is created by the new CHEFT and Min-Min algorithm, with reactive tabu search further improving the method. An interesting direction for future research is to include the trustworthiness of resources as one of the objectives.
In Eq. (4), $\mathrm{fit}_m(x_m)$ is the fitness of $x_m$. Scout phase: Scouts randomly search for new solutions. When a solution $x_i$ is abandoned, a new solution $x_m$ is discovered, as defined by Eq. (5).

Fig. 1: Average makespan

Comparison of average makespan for grid scheduling with three initialization methods: The effectiveness of the scheduling depends on the rate of convergence from the initial solution. Various studies have shown that the convergence rate is better if a proper initialization method is used. Therefore, we first scheduled jobs with the ABC algorithm and with the ABC algorithm using three initialization techniques: random, orthogonal and chaotic. The makespan is calculated during every run, and 100 iterations are performed in total. Figure 1 compares the average makespan values of the Min-Min algorithm, ACO, ABC without initialization and ABC with the three initialization methods. In Fig. 1, the average makespan of 3 runs is shown for a varying number of iterations. ABC achieves a better average makespan when a proper initialization technique is used.

Table 1: Parameters used in the initial simulation

Table 2: Makespan by varying the number of jobs with 5 resources

Table 3: Makespan by varying the number of jobs with 15 resources