Hybrid Ant Colony System and Genetic Algorithm Approach for Scheduling of Jobs in Computational Grid

Abstract: Metaheuristic algorithms have been used to solve scheduling problems in grid computing. However, stand-alone metaheuristic algorithms do not always show good performance in every problem instance. This study proposes a high level hybrid approach between ant colony system and genetic algorithm for job scheduling in grid computing. The proposed hybrid approach is evaluated using the static benchmark problems known as the ETC matrix. Experimental results show that the proposed hybridization between the two algorithms outperforms the stand-alone algorithms in terms of best and average makespan values.


INTRODUCTION
Grid computing technology is considered an intelligent multi-level platform that provides a wide range of services (Kołodziej and Khan, 2012). Grid computing is defined as "geographically distributed computers, linked through the Internet in a grid-like manner and are used to create virtual supercomputers of vast amount of computing capacity able to solve complex problems from e-Science in less time than known before" (Xhafa and Abraham, 2010). Another definition for grid computing is "a form of distributed computing that coordinates and provides the facility of resource sharing over various geographical locations" (Rajni and Chana, 2013). From these definitions, grid computing could be defined as a technology of connecting various resources distributed in different locations with the aim to provide various services.
Grid systems evolve from existing technology such as distributed computing, web service and the Internet (Magoules et al., 2009). Grid systems are classified as modern High Performance Distributed Systems (HPDSs) along with clusters and cloud systems (Kolodziej, 2012). However, there are crucial characteristics which differ between them, such as scale, network type, administrative domain, resources structure and security (Hsu et al., 2011; Kołodziej et al., 2014; Montes et al., 2012).
There are many different types of grid systems, such as:
• Sensor grid, which is based on sharing sensor resources in a sensor network (Li and Li, 2014)
• Campus grid, which is implemented in campus environments in order to facilitate unified access to the distributed and heterogeneous resources such as clusters, storage and scientific instruments (Bose et al., 2004)
• Data grid, which is mainly designed to provide data-intensive applications that need to access, transfer and modify massive data stored in distributed storage resources (Venugopal et al., 2006)
• Desktop grid, another important type of grid developed to connect Personal Computers (PCs) with large-scale networks using the Internet or any other high-speed connection media (Kolodziej, 2012)
• Utility grid, which is based on providing computing services to the users or organizations in return for regular payment (Babafemi et al., 2013; Garg et al., 2009)
Grid systems have been utilized in various fields, such as the high energy physics grid in Japan (Sakamoto, 2007), molecular systems using grid environment (Costantini et al., 2014), multi-physics coupled applications using Batch Grid (Murugavel et al., 2011), enzyme design process using University of California Grid (Wang et al., 2011), medical informatics using GridR environment which is based on embedding R software into grid framework (Wegener et al., 2009), processing of scientific knowledge using high performance grid computing by means of natural language processing and text mining (Jeong et al., 2014), climate simulations for Europe and India regions based on grid computing environment (Cozzini et al., 2014), 3D electrophoresis coupled problem simulation based on asynchronous grid computing environment (Chau et al., 2013), high-resolution agricultural systems modelling using grid computing and parallel processing (Zhao et al., 2013), services for neuroimaging analysis using intelligent grid based on the neuGRID project (Richard et al., 2013), the Earth System Grid Federation (ESGF) which is based on nodes that are geographically distributed around the world (Cinquini et al., 2012) and chemistry experiment tools based on PL-Grid environment (Eilmes et al., 2014).
One of the main components in grid computing systems is the Resource Management System (RMS), which is required for providing and sharing the resources efficiently in the grid environment (Czerwinski et al., 2012; Siddiqui and Fahringer, 2005). RMS could be implemented with one or multiple resource management nodes called Resource Manager (RM) (Qu et al., 2005). Resource management in grid computing is a challenging task due to the heterogeneous, dynamic, autonomous and ephemeral grid resources (Li and Li, 2012). RMS has several services, such as Grid Information Services (GIS), monitoring the status of tasks and environment, resource scheduler, resource reservation, accounting and reporting (Czerwinski et al., 2012; Abraham et al., 2000). The scheduler has the main influence on grid computing performance (Amiri et al., 2014). The scheduler's responsibility is to map the jobs submitted by users to suitable and available resources (Qureshi et al., 2014). The efficiency of the scheduler depends on the implemented scheduling algorithm. Scheduling could be done using simple algorithms such as a greedy or random approach. However, using more sophisticated algorithms will enhance the scheduler's efficiency, which in turn will enhance the grid performance in general.
Scheduling of jobs in grid computing is known as an NP-complete problem due to the complexity and intractable nature of the problem (Burkimsher et al., 2013; Prajapati and Shah, 2014), which could be solved using metaheuristic algorithms. These types of algorithms have the ability to find a near optimal solution in reasonable time, as opposed to an optimal solution in a very long processing time (Xhafa et al., 2011a). Metaheuristic algorithms, such as Tabu Search (TS), Genetic Algorithm (GA) and Ant Colony Optimization (ACO), show very promising performance in solving various types of scheduling problems (Zapfel et al., 2010). However, hybridizing two or more algorithms shows better performance than applying a stand-alone algorithm (Kolodziej, 2012). This is due to the ability of the hybrid approach to escape from local minima, using more options available in the algorithms used for the hybridization. Hybrid approaches between ACO and GA have been studied in Chaari et al. (2012) and Al-Mahmud and Akhand (2014). These hybridized approaches are different from the hybridized approach proposed in this study. The Ant System (AS), which is a variant of ACO, has been used in Chaari et al. (2012) and Al-Mahmud and Akhand (2014) to solve robot path planning and university class scheduling respectively. In this study, the Ant Colony System (ACS), which is another variant of ACO, is used to solve job scheduling in a static grid computing environment.

METAHEURISTIC ALGORITHM FOR NP PROBLEMS
In computational grid systems, the scheduler is an important component for resource management. The scheduling algorithm has the responsibility to schedule jobs efficiently (Amiri et al., 2014). Job scheduling is known as an NP-complete problem which needs metaheuristic algorithms to be solved (Folino and Mastroianni, 2010). One of the best metaheuristic algorithms in the field of optimization is ACO. ACO is considered a swarm intelligence algorithm which mimics the behaviours of real biological ants. ACO is implemented to solve various problems, such as routing (Wang et al., 2013), scheduling (Neto and Filho, 2013) and classification (Michelakos et al., 2011). Many studies have implemented and enhanced ACO for job scheduling in grid computing. An ACO approach for job scheduling in grid systems in Kant et al. (2010) proposed two types of ants, namely the red and black ants, for the purpose of sharing the search load. The performance of this algorithm was compared with the Min-Min algorithm presented in Liu et al. (2009) and the first come first serve algorithm. Experimental results show that this algorithm outperforms the other two algorithms.
A study presented by Chang et al. (2009) proposed the Balanced ACO (BACO) algorithm for job scheduling in grid. The proposed algorithm is based on the basic ideas from the ACO algorithm. Each ant in the system represents a job in the grid system. In addition, the pheromone value represents the weight for a resource in the grid system. Higher weight means that the resource has a better computing capability. The study also considered the bandwidth speed availability between the scheduler and resource. This algorithm has been implemented in the Taiwan UniGrid which consists of more than 20 campuses. The experimental results show that the BACO algorithm outperforms the improved ACO in Yan et al. (2005), fastest processor to largest task first (Menasce et al., 1995) and Sufferage (Silva et al., 2003).
A hybrid ACO approach (HACO) for job scheduling in grid computing proposed in Nithya et al. (2011) has integrated the heuristic information to make the algorithm converge faster to the solution. The experiments conducted have used the benchmark model known as the Expected Time to Compute (ETC) model presented in Braun et al. (2001). The performance of HACO was compared with ACO in terms of the makespan criterion. Empirical results show that HACO outperforms the existing ACO algorithm.
Kumar and Sumathi (2011) applied a successful variant of ACO to job scheduling in computational grid, namely the Ant Colony System (ACS) developed by Dorigo and Gambardella (1997). The ACS algorithm enhances the ant system in three ways. First, the exploitation of the accumulated search experience becomes stronger due to the implementation of a more aggressive action choice rule. Second, only the ant that found the best solution is allowed to deposit pheromone on the arcs which belong to that solution. Third, evaporation is applied to the arcs used by the ants in order to increase the exploration of alternative arcs (Dorigo and Stutzle, 2004).
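The second and third enhancements correspond to the global and local pheromone updates of ACS as described by Dorigo and Stutzle (2004). The following is a minimal sketch, assuming pheromone is stored per (task, machine) pair and that the deposit is the inverse of the best makespan; the names `xi`, `rho` and `tau0` and their default values are illustrative assumptions, not settings reported in this study.

```python
def local_update(tau, task, machine, xi=0.1, tau0=0.01):
    """Applied each time an ant uses (task, machine): pulls pheromone towards tau0."""
    tau[task][machine] = (1.0 - xi) * tau[task][machine] + xi * tau0


def global_update(tau, best_schedule, best_makespan, rho=0.1):
    """Only pairs that belong to the best solution evaporate and receive a deposit."""
    deposit = 1.0 / best_makespan  # assumed deposit: shorter makespan, larger deposit
    for task, machine in enumerate(best_schedule):
        tau[task][machine] = (1.0 - rho) * tau[task][machine] + rho * deposit
```

Because the local update lowers the pheromone on pairs that were just used, later ants in the same iteration are nudged towards other assignments.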
Besides ACO-based algorithms, there are many other algorithms that have been successfully applied to solve optimization problems. One of these algorithms is GA, which is a metaheuristic algorithm that imitates the principle of the genetic process in living organisms (Sivanandam and Deepa, 2008). GA mimics the evolutionary process by applying selection, recombination and mutation to generate solutions from the search space. Genetic algorithm is a well-known algorithm for solving various types of combinatorial optimization problems. Enhanced genetic-based scheduling for grid computing is proposed in Kołodziej et al. (2011b). The authors presented an implementation of Hierarchic Genetic Strategy (HGS) for job scheduling in a dynamic computational grid environment. HGS has the ability to search the solution space concurrently using various evolutionary processes. The study focused on bi-objective optimization, specifically the simultaneous optimization of makespan and flowtime. Experiments were conducted under heterogeneous, large scale and dynamic environments using the grid simulator. HGS was tested with static and dynamic grid computing environments. The experiment with the static environment is based on the ETC matrix model presented by Ali et al. (2000a) and for the dynamic environment, the authors used a simulator presented by Xhafa and Carretero (2009). HGS was also compared with two other GA-based schedulers presented in Braun et al. (2001) and Carretero et al. (2007). The results show that HGS outperforms the other GA-based schedulers. However, it is not known how HGS will perform against other metaheuristic algorithms, since only GA-based algorithms were used for comparison.
A study presented by Xhafa et al. (2011c) proposed a hybrid approach between GA and TS for independent batch job scheduling in grid computing. The hybrid algorithm aims to optimize the makespan and flowtime as a bi-objective scheduling problem. In addition, the authors proposed hierarchical and simultaneous approaches for optimizing makespan and flowtime.
Two types of hybridization were provided, namely low and high level hybridization, which are known as the GA(TS) and GA+TS algorithms. The experiments conducted have considered static and dynamic grid computing environments using the HyperSim-G simulator developed by Xhafa et al. (2007a). The proposed algorithms were compared with GA presented by Carretero et al. (2007) and TS presented by Xhafa et al. (2009a). Experimental results show that the proposed hybrid algorithms outperform the stand-alone algorithms in terms of the makespan criterion. However, in terms of the flowtime criterion, the GA and TS stand-alone algorithms outperform the proposed hybrid algorithms. Such a contradiction is normal for job scheduling in grid computing. In spite of the limitation on the experiments and benchmarking problem, the study has clearly illustrated the implementation of the hybrid algorithms.
Kim et al. (2013) applied Artificial Bee Colony (ABC) for job scheduling in computational grid. The authors proposed the Binary ABC (BABC), Efficient Binary Artificial Bee Colony (EBABC1) and flexible ranking strategy (EBABC2) algorithms. The study aimed to minimize the makespan criterion for job scheduling in grid computing. The experiments were conducted using a series of benchmark problems defined in Liu et al. (2010). The proposed algorithms were compared with the genetic algorithm, simulated annealing and particle swarm optimization algorithms. In terms of the makespan criterion, the EBABC1 and EBABC2 algorithms achieved the best results among all other algorithms, with superior performance for EBABC2.
Nayak et al. (2012) proposed an algorithm called Genetic Bacterial Foraging (GBF), which combined the merits of the genetic algorithm and the bacterial foraging optimization algorithm. The proposed algorithm implemented a dynamic mutation as presented in Michalewicz (1996) and a crossover operator developed by Michalewicz (1999). The aim of the study is to reduce the execution time as a cost function. The experiment was conducted using a dynamic environment generated with a simulator developed by the authors. The proposed algorithm was compared with the Bacterial Foraging Optimization (BFO) algorithm. The experiment results show that the proposed GBF algorithm outperforms the BFO algorithm. However, the experiment scenario was very small, using only four resources and five tasks. Therefore, more studies are required to understand the behavior of the Bacterial Foraging Optimization algorithm.
Rajni and Chana (2013) conducted a study on the Bacterial Foraging Optimization (BFO) algorithm for resource scheduling on computational grid systems. The study aimed to optimize makespan and cost values by considering Resource Provisioning (RP) units adopted from Aron and Chana (2012). The proposed approach was implemented using the GridSim simulator developed by Buyya and Murshed (2002). The experiments were conducted by generating a workload using a model defined in Lublin and Feitelson (2003) and the expected time to compute model presented in Ali et al. (2000b). The authors compared the proposed algorithm with the genetic algorithm, simulated annealing and GA-TS algorithms. The experiment results show that the proposed BFO algorithm outperforms the other algorithms in terms of makespan and cost values for both low and high machine heterogeneity benchmark problems. In addition, the results show that the Coefficient of Variation (CV) of the proposed algorithm is in the range 0-2%, which confirms the stability of the proposed algorithm.
A comparison of four metaheuristic algorithms for task scheduling in computational grid systems was presented by Meihong et al. (2010). The algorithms compared in their study are the genetic algorithm, ant colony optimization, particle swarm optimization and simulated annealing algorithms. The evaluation criteria are makespan and mean response time. The authors conducted experiments using a static environment. The results show that the PSO algorithm has the best performance among the compared algorithms. However, the experiments were conducted using very small scenarios (5 users and 3 resources). Therefore, the robustness of the compared algorithms is not proven. In addition, only classical versions of the algorithms are used, while enhanced versions are better in terms of performance. In order to obtain a clear picture about which metaheuristic is better, more investigations and experiments are required using a known benchmark such as the one presented in Braun et al. (2001).
Izakian et al. (2010) proposed a discrete particle swarm optimization for job scheduling in grid computing. Their approach aims to minimize the makespan and flowtime simultaneously. In their study, they provided two representations for mapping between a problem solution and a PSO particle. The first representation used a direct encoding, that is, a vector with a size equal to the number of tasks. Each element in the vector represents the machine number. The second representation used a binary matrix of size (number of jobs * number of machines), whose values are either 0 or 1. The benchmark problem used to evaluate the proposed algorithm is based on the expected time to compute model presented by Braun et al. (2001). The proposed algorithm was compared with the GA, ACO, PSO and Fuzzy PSO algorithms. The experiment results show that the proposed algorithm achieved good results in makespan reduction, while for flowtime, the algorithm performed the worst. Although the study aims to minimize makespan and flowtime, the contradiction between them is clear, such that the algorithm could not reduce both of them simultaneously. This contradiction in grid computing is also mentioned by Xhafa and Abraham (2010). In general, the proposed algorithm performs better than the other algorithms.
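The two solution representations described for the discrete PSO carry the same information, which can be seen by converting one into the other. The sketch below is illustrative only: the function names are hypothetical, and it assumes one row per job and one column per machine, with exactly one 1 in each row.

```python
def direct_to_binary(assignment, num_machines):
    """Turn a direct encoding (assignment[job] = machine) into a jobs x machines 0/1 matrix."""
    matrix = [[0] * num_machines for _ in assignment]
    for job, machine in enumerate(assignment):
        matrix[job][machine] = 1
    return matrix


def binary_to_direct(matrix):
    """Recover the direct encoding by locating the single 1 in each row."""
    return [row.index(1) for row in matrix]
```

The direct encoding needs one integer per job, whereas the binary matrix needs one cell per job and machine pair, which matters for large instances.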
Another study using fuzzy particle swarm optimization for job scheduling in grid computing has been proposed in Liu et al. (2010). In their algorithm, they extended the velocity and position of particles from real vectors to fuzzy matrices. The advantages of using fuzzy matrices in PSO are the speed of convergence and the increased ability to find a feasible solution faster. The study used the makespan criterion to measure the algorithm's performance. The performance of the proposed algorithm was compared with the genetic algorithm and the simulated annealing algorithm. The experiment results show that the proposed algorithm outperforms the other algorithms, especially in terms of execution time. However, the study did not use a common benchmark in order to evaluate the proposed algorithm against other approaches. In addition, only the genetic algorithm and simulated annealing algorithms were used for comparison, which is not enough to give a complete picture.
Proposed ACS+GA for job scheduling: Hybridization is a term which refers to the approach that combines two or more algorithms in order to achieve a result which is not achievable using a stand-alone approach (Xhafa et al., 2009b). Algorithms could be fully or partially hybridized to be able to get the best features of the combined algorithms. There are two levels of hybridization between algorithms, namely high level and low level (Xhafa et al., 2011c). In high level, which is also called loosely coupled hybridization, each algorithm preserves its identity. In other words, each algorithm operates fully in the hybridized approach. This type of hybridization can be seen as a chain of algorithm execution ($Algorithm_1 \rightarrow Algorithm_2 \rightarrow \cdots \rightarrow Algorithm_n$). This execution can be further looped for a certain number of iterations until the termination condition is satisfied. Through the algorithm execution, the output solution is passed from $Algorithm_1$ to $Algorithm_2$ and so on. In low level hybridization, also known as strongly coupled, the algorithms interchange their inner procedures. The level of hybridization reflects the degree of inner exchange among the hybridized algorithms. In low level hybridization, one of the algorithms is the main algorithm, which calls other algorithms at any time of execution (depending on the hybridization design). The low level hybridization algorithm could be presented as $Algorithm_1(Algorithm_2)$. In this representation, $Algorithm_1$ is the main algorithm and $Algorithm_2$ is the subordinated algorithm (Jourdan et al., 2009; Xhafa et al., 2011b).
This study implements a high level hybridization approach, namely ACS+GA. ACS will start first for a specific time and after ACS finishes execution, GA will start to enhance the solution found by ACS. In other words, the solution found by ACS will be a part of the initial population of GA.
Fig. 1: ACS+GA (high level) pseudocode
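The chaining of the two stages can be sketched as follows. This is not the pseudocode of Fig. 1: `stand_in_acs` and `stand_in_ga` are deliberately trivial placeholders for the real ACS and GA stages, and all function names are hypothetical. The sketch only illustrates how the first stage's solution becomes one member of the second stage's initial population.

```python
import random


def makespan(etc, schedule):
    """Largest total load over all machines for schedule[task] = machine."""
    loads = [0.0] * len(etc[0])
    for task, machine in enumerate(schedule):
        loads[machine] += etc[task][machine]
    return max(loads)


def stand_in_acs(etc, rng):
    # Placeholder for ACS: assign each task to its fastest machine.
    return [min(range(len(row)), key=row.__getitem__) for row in etc]


def stand_in_ga(etc, population, rng):
    # Placeholder for GA: return the best individual of the initial population.
    return min(population, key=lambda s: makespan(etc, s))


def high_level_hybrid(first_stage, second_stage, etc, population_size, rng):
    """Run first_stage fully, then seed second_stage's population with its result."""
    num_tasks, num_machines = len(etc), len(etc[0])
    seed = first_stage(etc, rng)
    population = [seed] + [
        [rng.randrange(num_machines) for _ in range(num_tasks)]
        for _ in range(population_size - 1)
    ]
    return second_stage(etc, population, rng)
```

Since the seed is always part of the population, a second stage that never discards its best individual cannot return a worse makespan than the first stage found.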
For the ACS implementation, the heuristic information needs to be defined. For the static environment, the heuristic value is calculated from the ETC matrix using $\eta_{ij} = 1 / (ETC_{ij} + Load_j)$, where $ETC_{ij}$ represents the expected time to compute task $i$ on machine $j$ and $Load_j$ is the previous load assigned to machine $j$ (Ku-Mahamud and Alobaedy, 2012). Longer computing time and more load will produce a smaller heuristic value, which will make the probability of selecting this machine smaller, and vice versa. The machine $j$ to which ant $k$ maps task $i$ is determined by:

$$j = \begin{cases} \arg\max_{l \in N_i^k} \{ \tau_{il} [\eta_{il}]^{\beta} \}, & \text{if } q \le q_0; \\ J, & \text{otherwise;} \end{cases} \qquad (1)$$

where $\tau$ is the pheromone value, $\eta$ is the heuristic value, $\beta$ is a parameter which determines the relative influence of the heuristic information, $N_i^k$ is the set of machines that ant $k$ may select for task $i$, $q$ is a random variable uniformly distributed in [0, 1], $q_0$ ($0 \le q_0 \le 1$) is a parameter which determines the exploration/exploitation rate and $J$ is a random variable selected according to the probability given by Eq. (2) with $\alpha = 1$ (Dorigo and Stutzle, 2004):

$$p_{ij}^{k} = \frac{[\tau_{ij}]^{\alpha} [\eta_{ij}]^{\beta}}{\sum_{l \in N_i^k} [\tau_{il}]^{\alpha} [\eta_{il}]^{\beta}}, \quad \text{if } j \in N_i^k \qquad (2)$$

For the GA implementation, the output from the ACS algorithm will be a part of the initial population of GA. The solution will be in the form of a vector. The index of each element represents the task number, while the value of the vector element represents the machine number assigned to it. Therefore, the vector size is equal to the total number of tasks and each element holds a non-negative integer in the range 0 to m-1, where m is the total number of machines in the grid. Figure 1 depicts the pseudocode of the proposed algorithm.
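The machine selection step of ACS described above can be sketched as follows. This is a minimal illustration under stated assumptions: every machine is taken to be selectable for every task, ETC values are strictly positive, and the defaults `beta=2.0` and `q0=0.9` are placeholder values rather than the settings of this study; `choose_machine` is a hypothetical name.

```python
import random


def choose_machine(task, tau, etc, loads, beta=2.0, q0=0.9, rng=random):
    """Pseudo-random proportional rule: exploit if q <= q0, otherwise sample."""
    machines = list(range(len(loads)))
    # Heuristic value 1 / (ETC + load): slow or already busy machines score lower.
    eta = [1.0 / (etc[task][j] + loads[j]) for j in machines]
    scores = [tau[task][j] * eta[j] ** beta for j in machines]
    if rng.random() <= q0:
        # Exploitation: take the machine with the highest pheromone-heuristic score.
        return max(machines, key=scores.__getitem__)
    # Biased exploration: sample a machine with probability proportional to its score.
    return rng.choices(machines, weights=scores, k=1)[0]
```

After a machine is chosen, the caller would add `etc[task][machine]` to `loads[machine]` so that the heuristic reflects the updated load for the next task.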

Problem formulation:
The problem in job scheduling for grid computing is known as a multi-objective problem due to the various criteria in computational grid such as makespan, flowtime, load balancing, utilization, matching proximity, turnaround time, total weighted completion time and average weighted response time (Xhafa and Abraham, 2008). In this study, two criteria are implemented, namely makespan and flowtime, with priority given to makespan as the main optimization objective. The makespan metric measures the general productivity of grid computing. The best scheduling algorithm is the one that can produce a small value of makespan, which means that the algorithm is able to map tasks to machines in a good and efficient way. Therefore, the objective in this study is to minimize the makespan. Makespan is defined as the time when the last task finishes execution, formally defined as:

$$\text{makespan}: \min_{S_i \in Sched} \left\{ \max_{j \in Jobs} F_j \right\}$$

where $Sched$ is the set of all possible schedules, $Jobs$ is the set of all jobs to be scheduled and $F_j$ denotes the time when task $j$ finalizes (Xhafa and Abraham, 2008). Flowtime is the second criterion used in this study, which refers to the response time to the user submissions of task executions. Flowtime is defined as the sum of the finalization times of all tasks, formally defined as:

$$\text{flowtime}: \min_{S_i \in Sched} \left\{ \sum_{j \in Jobs} F_j \right\}$$

These criteria could conflict with each other since limited resources could be the bottleneck of the system (Xhafa and Abraham, 2008).
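For a single schedule, both criteria can be evaluated directly from the ETC matrix and the assignment vector. The sketch below is a minimal illustration; it assumes that each machine executes its assigned tasks in task index order, an ordering the text does not specify, and the function names are hypothetical.

```python
def completion_times(etc, schedule):
    """Finishing time of every task; each machine runs its tasks in index order."""
    ready = [0.0] * len(etc[0])  # time at which each machine becomes free
    finish = []
    for task, machine in enumerate(schedule):
        ready[machine] += etc[task][machine]
        finish.append(ready[machine])
    return finish


def makespan(etc, schedule):
    """Time at which the last task finishes."""
    return max(completion_times(etc, schedule))


def flowtime(etc, schedule):
    """Sum of the finishing times of all tasks."""
    return sum(completion_times(etc, schedule))
```

Makespan depends only on the total load per machine, whereas flowtime also depends on the order in which each machine runs its tasks, which is one reason the two criteria can pull in different directions.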
In order to test the proposed algorithm, a suitable benchmark is required to reflect the robustness of the algorithm. The benchmark should reflect the environment attributes such as resource and job heterogeneity. The considered benchmark for static grid computing is based on the successful model known as ETC, introduced by Braun et al. (2001) to generate benchmarks on grid computing. This model is widely accepted by researchers for job scheduling in grid (Braun et al., 2001; Garg et al., 2010; Kolodziej et al., 2011a, 2011b; Ritchie and Levine, 2004). The benchmark defines a matrix called Expected Time to Compute. Each row $i$ in the $ETC[i, j]$ matrix contains the expected time to compute task $i$ on each machine $j$. Therefore, ETC has $n \times m$ entries, where $n$ represents the number of tasks and $m$ represents the number of machines. The ETC matrix is further characterized by three metrics, namely task heterogeneity, machine heterogeneity and consistency. Task heterogeneity measures the variance in execution time among tasks, while machine heterogeneity measures the variance in speed among machines. The heterogeneity of tasks and of machines is each represented with one of two values, "high" or "low". In addition, the ETC matrix captures other possible features of a real heterogeneous computing system using three types of consistency, namely consistent, inconsistent and semi-consistent. The ETC matrix is considered consistent if, whenever a machine $m_j$ executes a task $t$ faster than another machine $m_k$, machine $m_j$ executes all other tasks faster than machine $m_k$. The ETC matrix is considered inconsistent when a machine $m_j$ could execute some tasks faster than machine $m_k$ and some others slower. Finally, the semi-consistent ETC matrix is an inconsistent matrix which has a consistent submatrix of specific size. Combining all these metrics will generate 12 distinct types of possible ETC matrix (Braun et al., 2001).
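The structure of such a benchmark can be illustrated with a small generator and a consistency check. This is a hedged sketch of a range-based generation scheme (a per-task baseline multiplied by a per-machine factor), not the exact procedure or range values used for the benchmark instances in this study; `task_range` and `machine_range` are assumed parameters, larger values standing for higher heterogeneity, and the semi-consistent case is omitted.

```python
import random


def generate_etc(num_tasks, num_machines, task_range, machine_range, consistent, rng):
    """Build an ETC matrix: one row per task, one column per machine."""
    etc = []
    for _ in range(num_tasks):
        baseline = rng.uniform(1, task_range)  # controls task heterogeneity
        row = [baseline * rng.uniform(1, machine_range) for _ in range(num_machines)]
        if consistent:
            row.sort()  # machine 0 is then the fastest for every task
        etc.append(row)
    return etc


def is_consistent(etc):
    """True if the machine ordering of the first task holds for every task."""
    order = sorted(range(len(etc[0])), key=etc[0].__getitem__)
    return all(
        all(row[a] <= row[b] for a, b in zip(order, order[1:]))
        for row in etc
    )
```

Sorting every row imposes the same fastest-to-slowest machine ordering on all tasks, which is exactly the consistency property described above; leaving the rows unsorted yields an inconsistent matrix in almost all cases.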

EXPERIMENTS AND RESULTS
Metaheuristic algorithms, such as ACS and GA, have many parameters that need to be tuned. The values of the parameters need a lot of tuning in order to achieve the desired performance (Zapfel et al., 2010). Therefore, the best values have been adopted from the literature. In this experiment, the parameter values for ACS and GA are selected based on the recommended values from Dorigo and Stutzle (2004) and Xhafa et al. (2007b) respectively. Table 1 presents the parameter values for the ACS algorithm.
Table 2 shows the parameter values for GA. The total population size of GA is set to 10, while the size of the intermediate (selected) population is set to 6. The probability of applying a crossover operation is 0.9, while the probability of applying a mutation operation is 0.4 (Xhafa et al., 2007a).
Important operators in GA are presented in Table 3. To select individuals from the population pool, many operators are available, such as the roulette wheel and ranking. This study has implemented a tournament operator with value 3 as the selection operator. For the crossover operator, the fitness-based operator is found to be the best operator compared with m-point crossover and uniform crossover (Xhafa et al., 2007b). Finally, a re-balanced operator is used as the mutation operator, which is considered better than random mutation.
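Tournament selection with value 3 can be sketched as below. The sketch assumes that lower fitness is better (for example, makespan) and that contestants are drawn without replacement; `tournament_select` is a hypothetical name and the fitness-based crossover and re-balanced mutation operators are not shown.

```python
import random


def tournament_select(population, fitness, tournament_size=3, rng=random):
    """Draw tournament_size individuals at random and return the fittest one."""
    contestants = rng.sample(population, tournament_size)
    return min(contestants, key=fitness)  # minimization: smallest fitness wins
```

Calling the function six times on a population of ten would produce an intermediate population of the size mentioned above.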
Experiments have been conducted using an Intel® Core(TM) i7-3612QM CPU @ 2.10 GHz and 8 GB RAM. The grid computing simulator is developed using Visual C#. The time given for each experiment is 90 sec (45 sec for each algorithm). This time restriction is a very important requirement to mimic the real environment for job scheduling in grid computing (Carretero et al., 2007; Xhafa and Duran, 2008). Each algorithm is executed 10 times in order to calculate the average values as well as to get the best run. The first column of each table represents the instance name with an abbreviation code x-yyzz, as follows: x represents the type of consistency, where c means consistent, i means inconsistent and s means semi-consistent; yy represents the heterogeneity of the tasks, where hi means high and lo means low; zz represents the heterogeneity of the machines, where hi means high and lo means low.
For example, c_hilo means a consistent environment with high heterogeneity in tasks and low heterogeneity in machines.
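The naming scheme can be decoded mechanically. The helper below is purely illustrative and its name is hypothetical; it reads the first character and the last four characters of the code so that it works regardless of the separator used between them.

```python
def parse_instance(code):
    """Decode an instance name such as 'c_hilo' into its three attributes."""
    consistency = {"c": "consistent", "i": "inconsistent", "s": "semi-consistent"}
    level = {"hi": "high", "lo": "low"}
    return {
        "consistency": consistency[code[0]],
        "task_heterogeneity": level[code[-4:-2]],
        "machine_heterogeneity": level[code[-2:]],
    }
```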
The results show that the proposed algorithm is able to reduce the makespan significantly on seven instances as illustrated in Table 4, which shows the best makespan values.
Table 5 depicts the average values for makespan.The proposed algorithm is able to achieve good results on five instances.However, GA also performs well on four instances.
The experiments show different performance for the flowtime objective. The AS algorithm outperforms the other algorithms for the best and average flowtime values, as shown in Tables 6 and 7 respectively. This behavior is expected due to the contradiction between makespan and flowtime.
In order to represent the performance of the proposed algorithm visually, a geometric mean is used to normalize the makespan and flowtime values of the 12 instances (Izakian et al., 2009). Figure 2 shows that the proposed algorithm achieves the best result among the compared algorithms for the best makespan values. In addition, Fig. 3 shows the same for the average makespan values.
For the best and average flowtime values, Fig. 4 and 5 present the geometric mean values of the 12 instances respectively. The results show that the AS algorithm outperforms the other algorithms.
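A geometric mean condenses the values of the 12 instances into one number per algorithm without letting the instances with the largest absolute values dominate. The sketch below shows only the plain geometric mean, computed through logarithms to avoid overflow; the exact normalization procedure of Izakian et al. (2009) is not reproduced here and the function name is hypothetical.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

Applied to the per-instance makespan values of one algorithm, a smaller geometric mean indicates better overall performance across the instances.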

CONCLUSION
Job scheduling in grid computing systems needs a metaheuristic algorithm to be solved efficiently. Due to the complexity of the problem, a stand-alone algorithm is insufficient for some cases. However, hybrid metaheuristic algorithms perform better than stand-alone algorithms in solving many combinatorial problems. This study has implemented a high level hybridization between ACS and GA to solve job scheduling in grid computing systems. The results showed that the proposed algorithm outperforms the other algorithms in terms of makespan reduction. Future work related to the proposed hybridization algorithm will focus on hybridizing ACS with local search algorithms and on implementing the hybrid algorithm in a dynamic grid computing environment.