Hybrid RSO Algorithm with SFLA for Scientific Workflow Scheduling in Cloud Using Clustering Techniques

Virtualization is an essential part of the cloud computing environment: it multiplexes many virtual machines (VMs) on a single physical machine while providing every VM with an isolated environment. An important issue in cloud computing is workflow scheduling, which maps the tasks of a workflow to VMs based on various functional and non-functional requirements. Workflow scheduling is an NP-hard optimization problem, so an optimal schedule is hard to achieve. Metaheuristic algorithms have helped in solving the cloud task scheduling problem and have been compared with other heuristics. Reactive Search Optimization (RSO) consists of a local search heuristic over a given neighborhood, complemented by a memory-based mechanism. The Shuffled Frog Leaping Algorithm (SFLA) is a swarm-based evolutionary method that imitates the information exchange among frogs, divided into memeplexes, when searching for food. This paper proposes a new set of optimization heuristics, along with a hybrid optimization (RSO-SFLA), to solve combinatorial optimization problems.


Introduction
The Internet has become one of the most sought-after technologies and an inevitable part of day-to-day life. Many technologies that depend on the Internet, such as cloud computing, the World Wide Web (WWW), and Internet Relay Chat (IRC), have been in use for many years. In today's computing world, cloud computing is commonly used and can be termed a new style of technology. It is present almost everywhere and provides user-friendly, on-demand network access to computing resources (e.g., networks, servers, storage, applications, and services) that can be managed with minimal effort and little interaction with the service provider.
Cloud computing is a resource provided over the Internet on an on-demand, rental basis, with a fairly minimal rent. It is an advancing technology that offers computational and storage services under pay-as-you-go models. The Internet or a network is conventionally drawn as a cloud in diagrams, which gives the technology its name, 'cloud computing'; it is nothing but the use of the Internet to provide diverse computing resources to different groups of users with varying needs in different places. The main aim of cloud computing is to distribute work among varied groups of individuals, or to let them interact in a conducive manner, while using the resources of other organizations that perform large-scale computation. The disciplines of cloud computing include load balancing, interoperability, virtualization, Quality of Service, and so on [2].
The software that isolates the physical machine from the various virtual machines is called the Virtual Machine Monitor (VMM), or hypervisor. The original definition of a Virtual Machine (VM) is an efficient, isolated duplicate of a real machine that allows multiplexing of the underlying physical machine.
Virtualization allows resources to be allocated to virtual machines in a fine-grained manner. Virtual machine technology allows the workload on servers to be managed dynamically, with increased elasticity. A separate machine is allocated to each user application, and resources are provided based on the requirements of the application. The user can plan the development platform based on the requirements of the application, such as the OS and other environment variables. Thus, multiple applications can run on a single physical machine with different VMs. The necessary resources are allocated so that the Service Level Agreement is not violated, even at peak demand. Virtualization gives the administrator full control of resource allocation, which results in optimal use of resources [15].
Scheduling is one of the main activities executed in the cloud computing environment. It is done to improve work proficiency and attain the maximum benefit. The main objective of scheduling algorithms in the cloud environment is to utilize resources properly while balancing the load among them so that the execution time is minimized.
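The scheduling objective described above can be made concrete with a small sketch. The function below is illustrative only and is not taken from the paper: it assumes independent tasks with known runtimes, identical VMs, and a hypothetical helper name `evaluate_schedule`. It computes the makespan (the finishing time of the busiest VM) and the mean utilization of a task-to-VM assignment.

```python
from typing import Dict, List


def evaluate_schedule(runtimes: List[float], assignment: List[int],
                      n_vms: int) -> Dict[str, float]:
    """Makespan and mean VM utilization of a task-to-VM assignment.

    Assumptions (not from the paper): tasks are independent, VMs are
    identical, and assignment[i] is the VM index of task i.
    """
    busy = [0.0] * n_vms
    for task, vm in enumerate(assignment):
        busy[vm] += runtimes[task]
    makespan = max(busy)
    # Utilization: fraction of the total VM time (n_vms * makespan) spent busy.
    utilization = sum(busy) / (n_vms * makespan) if makespan > 0 else 0.0
    return {"makespan": makespan, "utilization": utilization}


runtimes = [4.0, 2.0, 3.0, 1.0]
balanced = evaluate_schedule(runtimes, [0, 1, 1, 0], n_vms=2)
skewed = evaluate_schedule(runtimes, [0, 0, 0, 1], n_vms=2)
```

In this toy case the balanced assignment finishes sooner and keeps both VMs busy for the whole schedule, which is the trade-off a scheduler tries to optimize. A real workflow scheduler would also have to respect task dependencies and heterogeneous VM speeds.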
This paper proposes a hybrid RSO-SFLA based method to improve workflow scheduling in the cloud [5]. The proposed method uses clustering techniques to enhance the scheduling. Related work from the literature is presented in Section 2. The methods used are described in Section 3. The results are discussed in Section 4, and Section 5 concludes the paper.

Literature Review
Shirvani [12] proposed a hybrid metaheuristic algorithm for scheduling parallelizable scientific workflows on an elastic cloud platform, on the grounds that a single approach cannot obtain an optimal solution to such complicated problems. The scientific workflows were modeled as Directed Acyclic Graphs (DAGs) that capture the data dependencies between tasks. In the cloud marketplace, every provider delivers a set of variable Virtual Machine (VM) configurations. Scheduling such parallelizable tasks on parallel computing machines to obtain a minimum total execution time, or makespan, is an NP-hard problem. To deal with this combinatorial problem, a Hybrid Discrete Particle Swarm Optimization (HDPSO) algorithm with three phases was proposed. The first phase used a random algorithm, followed by a novel theorem, to produce the swarm members. These were the input to a novel Discrete Particle Swarm Optimization (DPSO) algorithm in the second phase.
Shishido et al. [13] examined the effects of Particle Swarm Optimization (PSO) and Genetic-based Algorithms (GA) when optimizing workflow scheduling. A security- and cost-aware workflow scheduling algorithm was chosen for evaluating the performance of the metaheuristics. Three algorithms were evaluated on three different real-world workflows under a risk-rate constraint ranging from 0 to 1 in steps of 0.1. The findings showed that the GA-based algorithms outperformed PSO in terms of response time and cost-effectiveness.
Singh et al. [14] proposed a workflow scheduling algorithm inspired by the Hybrid Chemical Reaction Optimization (HCRO) algorithm. The scheme, referred to as the Energy Efficient Workflow Scheduling (EEWS) algorithm, proved to be energy efficient and to minimize the makespan. EEWS introduced a novel measure that determines the energy to be conserved in a DVS-enabled environment.
Simulations on various scientific workflow applications demonstrated that the proposed scheme performed better than HCRO and the Multiple Priority Queues Genetic Algorithm (MPQGA) on performance metrics that include the makespan and the energy conserved. The significance of the proposed algorithm was judged using an Analysis of Variance (ANOVA) test and an LSD analysis.
Wang et al. [16] proposed a multi-objective workflow scheduling approach based on a dynamic game-theoretic model. It aimed at reducing the workflow makespan and total cost while maximizing system fairness, defined in terms of the workload distribution among heterogeneous cloud Virtual Machines (VMs). Extensive case studies were conducted on workflow templates along with real-world, third-party commercial IaaS clouds. The experimental results suggested that the proposed approach outperformed the traditional ones by achieving a lower workflow makespan, lower cost, and a better level of system fairness.
On-demand provisioning and the availability of resources in cloud computing make it ideal for executing scientific workflow applications. An application begins execution with a minimum set of resources, and more are allocated when needed [5]. Workflow scheduling, however, is an NP-hard problem, so metaheuristic solutions have been widely explored. Kaur and Mehta [7] proposed an augmented SFLA (ASFLA) method for resource allocation and workflow scheduling in IaaS clouds. Its performance was compared with the SFLA and PSO algorithms. The efficacy of ASFLA was assessed on several workflows using a Java-based simulator. The simulations showed a significant reduction in the cost of execution.

Methodology
This section presents RSO, SFLA, and the proposed hybrid RSO-SFLA method for scheduling in the cloud.

Reactive Search Optimization (RSO)
Reactive Search (RS) consists of a local search heuristic over a given neighborhood, complemented by a memory-based mechanism that is designed to avoid cycles.
The RSO framework employs history-based reinforcement learning to acquire knowledge by generalizing from past experience during the search. The crucial parts of an RS algorithm are the objective function to be optimized and the representation chosen for the potential solutions. The algorithm is designed for discrete optimization problems [5].
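One common way to realize a memory-based, cycle-avoiding local search is a reactive tabu scheme, sketched below as an illustration rather than as the paper's implementation. The function name `reactive_search`, the single-task reassignment neighborhood, and the simple rule that raises the prohibition period (`tenure`) when a configuration repeats and lowers it otherwise are all assumptions made for this example.

```python
import random
from typing import Callable, Dict, List, Tuple


def reactive_search(cost: Callable[[List[int]], float], start: List[int],
                    n_vms: int, iters: int = 200,
                    seed: int = 0) -> Tuple[List[int], float]:
    """Local search whose prohibition period reacts to the search history."""
    rng = random.Random(seed)
    current = list(start)
    best, best_cost = list(current), cost(current)
    tenure = 1                       # prohibition period, tuned online
    last_moved: Dict[int, int] = {}  # task -> iteration of its last move
    visited = set()                  # memory of configurations seen so far
    for it in range(iters):
        key = tuple(current)
        if key in visited:           # a cycle was detected: diversify
            tenure = min(len(current) - 1, tenure + 1)
        elif tenure > 1:             # no repetition: intensify again
            tenure -= 1
        visited.add(key)
        # Neighborhood: move one non-prohibited task to a different VM.
        moves = [(t, vm) for t in range(len(current)) for vm in range(n_vms)
                 if vm != current[t]
                 and (t not in last_moved or it - last_moved[t] > tenure)]
        if not moves:
            break
        rng.shuffle(moves)           # random tie-breaking among equal moves

        def after(move: Tuple[int, int]) -> List[int]:
            trial = list(current)
            trial[move[0]] = move[1]
            return trial

        task, vm = min(moves, key=lambda m: cost(after(m)))
        current[task] = vm           # best allowed move, even if it is worse
        last_moved[task] = it
        c = cost(current)
        if c < best_cost:
            best, best_cost = list(current), c
    return best, best_cost


runtimes = [4.0, 2.0, 3.0, 1.0, 5.0, 6.0]


def makespan(assign: List[int]) -> float:
    load = [0.0] * 3
    for t, vm in enumerate(assign):
        load[vm] += runtimes[t]
    return max(load)


best, best_cost = reactive_search(makespan, [0] * 6, n_vms=3)
```

The search always takes the best non-prohibited move, so it can leave a local optimum, while the visited-configuration memory is what drives the self-tuning of `tenure`.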
The Shuffled Frog Leaping Algorithm (SFLA) is a swarm-based evolutionary method that imitates the information exchange among frogs, divided into memeplexes, when searching for food. The algorithm needs a mixture of global search and local search within the memeplexes to arrive at an optimal solution [17].
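The memeplex idea can be illustrated with a minimal sketch of the textbook SFLA building blocks: frogs are ranked by cost and dealt round-robin into memeplexes, and the worst frog of a memeplex leaps toward the memeplex best, then toward the global best, and is re-drawn at random if neither leap helps. The function names, the continuous encoding, and the bounds `lo` and `hi` are assumptions for this example and are not taken from the paper.

```python
import random
from typing import Callable, List

Frog = List[float]


def partition(frogs: List[Frog], cost: Callable[[Frog], float],
              m: int) -> List[List[Frog]]:
    """Rank frogs by cost (best first) and deal them into m memeplexes."""
    ranked = sorted(frogs, key=cost)
    return [ranked[i::m] for i in range(m)]


def leap_worst(memeplex: List[Frog], global_best: Frog,
               cost: Callable[[Frog], float], rng: random.Random,
               lo: float, hi: float) -> None:
    """Local search step: improve the worst frog of one memeplex in place."""
    memeplex.sort(key=cost)
    worst = memeplex[-1]
    for leader in (memeplex[0], global_best):
        r = rng.random()
        # Leap a random fraction of the way toward the leader, within bounds.
        trial = [min(hi, max(lo, w + r * (b - w)))
                 for w, b in zip(worst, leader)]
        if cost(trial) < cost(worst):
            memeplex[-1] = trial
            return
    # Neither leap improved the frog: replace it with a random one.
    memeplex[-1] = [rng.uniform(lo, hi) for _ in worst]


def sphere(x: Frog) -> float:
    return sum(v * v for v in x)


frogs = [[3.0], [1.0], [4.0], [0.5], [2.0], [5.0]]
parts = partition(frogs, sphere, m=2)
leap_worst(parts[1], global_best=[0.5], cost=sphere,
           rng=random.Random(0), lo=-5.0, hi=5.0)
```

After all memeplexes have taken their local steps, the frogs are pooled and partitioned again, which is the shuffling that spreads information globally.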
The NP-hard nature of the scheduling problem makes it very challenging to identify optimal solutions. Hybrid approaches that combine a Genetic Algorithm (GA), Artificial Bee Colony (ABC), and a decoding heuristic have been used to search for near-optimal solutions; the decoding heuristic generates feasible solutions from the mapping of tasks to VMs [6].
Another line of work is hybrid optimization, which assumes that two or more algorithms are available for the same optimization and uses a heuristic to choose the best of them for a given situation. One example is a hybrid register allocator that chooses between two register allocation algorithms, linear scan and graph coloring [3]. Its primary goal was to balance two factors: finding a good packing of variables into registers (and thus good running-time performance) and reducing the overhead of the allocator. A hybrid optimization can reduce compilation effort by using an efficient algorithm in most cases and applying a more effective but more expensive algorithm only when the additional benefit is deemed worthwhile.
Another hybrid optimization approach, applied to the dimensional synthesis of linkages, combines the advantages of stochastic and deterministic optimization. The former is based on real-valued Evolutionary Algorithms (EA) and is used to explore the design variable space extensively while looking for a suitable linkage [11]. The latter uses a different optimization technique to improve efficiency by reducing the high CPU time that EA techniques need for such applications. The deterministic approach is implemented in two stages. The first is the evaluation of fitness, in which the deterministic approach is employed to obtain a new error estimator. In the second stage, the deterministic approach further refines the solutions produced by the evolutionary part of the algorithm. The new estimator is used to evaluate the individuals in every generation, so that well-adapted linkages that other methods cannot detect are not missed. In this work, a hybrid RSO-SFLA method is proposed for improving cloud workflow scheduling. The method makes use of clustering techniques for scheduling.

Hybrid Optimization methods
The Novel Firefly Algorithm (NFA) differs from the Standard Firefly Algorithms (SFAs), which were originally proposed for continuous optimization problems, in that it maps real-encoded fireflies to solutions (task sequences). It employs a new distance-based mapping operator that considers the distance between a firefly and the brightest one, as well as the best solution [18]. The mapped solutions are evaluated using an efficient task assignment strategy called Fast Task Assignment (FTA). To ensure a good start, an effective composite heuristic is used to generate high-quality initial solutions. A movement scheme is also developed so that the mapping operator has a better probability of inheriting the "good genes" found in the current best solution.
In the standard Particle Swarm Optimization (PSO) algorithm, all initial particles are created randomly. This randomness can reduce the chances of converging to the best solution. The Best-Fit (BF) algorithm was therefore merged into PSO to improve convergence, while all the other steps of the standard PSO algorithm were retained. At the same time, the Tabu Search (TS) algorithm was used as a neighborhood search algorithm that performs an intelligent search with flexible memory techniques. TS is designed to guide other algorithms in escaping from local optima [1], so it was merged into PSO for solving task scheduling problems, both to avoid being trapped in local optima and to accelerate the search.
A Multi-Objective Hybrid Artificial Bee Colony (MOHABC) algorithm has also been proposed for Service Composition and Optimal Selection (SCOS) in cloud manufacturing, where energy consumption and quality of service are considered together, the environment and the economy being the two pillars of sustainable manufacturing. MOHABC uses the concept of Pareto dominance to direct the search of the bee swarm and maintains the non-dominated solutions in an external archive. To achieve a good distribution of solutions along the Pareto front, the CS with Lévy flight was introduced.
Ant Colony Optimization (ACO) has been used in an improved variant of the load-balancing scheme based on Ant Colony and Complex Network Theory (ACCNT) for open, distributed cloud computing federations. The algorithm uses ant pheromones to gather and update information about the cloud, selecting a particular node to which a task is assigned so that work is distributed evenly. In the proposed algorithm, the ants start from a head node and traverse the system, moving forward or backward, to locate underloaded or overloaded nodes [9].
The Firefly Algorithm (FA) is a popular population-based algorithm that is well suited to global search: it has a high degree of convergence, and every firefly attempts to identify its best state individually, so it can avoid local optima and search for the global optimum. Simulated Annealing (SA), in turn, has a convenient local search process. For this reason, the two algorithms have been combined into an FA-SA algorithm that benefits from the advantages of both and performs better in cloud task scheduling. In this method, the FA runs first to perform a global search within the search space. Once the FA completes, the SA is executed to perform a local search around the solution returned by the FA. This means that the initial solution used by the SA is not chosen randomly but is the optimum value provided by the FA [4].
In the hybrid SFLA-Genetic Algorithm, an independent local search is performed in every memeplex, followed by a shuffling step in which the frogs exchange information; this gives a speedy search, as in other algorithms. However, the search can get trapped in local optima, which prevents it from identifying the global optimal solution. To avoid these traps, the GA is combined with the SFLA as a substitute for the SFLA update equation while the local search is conducted. In this algorithm, each memeplex evolves independently to search locally in a different region of the solution space, after which the memeplexes are shuffled and re-divided into new ones. This achieves a global search through the exchange of information [6].
There are several coding models for GA chromosomes, whereas SFLA frogs are coded in the real-number mode. To allocate resources with these algorithms, a common coding model was defined: the position of a solution was modified and represented as a binary string whose length equals the number of resources. A bit value of 1 indicates a chosen resource, and a bit value of 0 indicates a resource that is not selected. With both kinds of solution (chromosomes and frogs) represented as binary strings, the SFLA and the GA could be combined into the SFLA-GA, and the representations were interchanged during selection. While the memeplexes evolve in the SFLA, the frogs are treated as GA chromosomes and the GA is used to conduct the local search.
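The binary coding described above can be sketched in a few lines. The helper names `encode`, `decode`, and `binarize` are hypothetical, and the thresholding rule used to turn a real-valued frog position into bits is an assumption for illustration, since the text does not state how the conversion is done.

```python
from typing import Iterable, List


def encode(selected: Iterable[int], n_resources: int) -> str:
    """Binary string of length n_resources: '1' marks a chosen resource."""
    chosen = set(selected)
    return "".join("1" if i in chosen else "0" for i in range(n_resources))


def decode(bits: str) -> List[int]:
    """Indices of the resources whose bit is set."""
    return [i for i, b in enumerate(bits) if b == "1"]


def binarize(position: List[float], threshold: float = 0.5) -> str:
    """Assumed rule: a real-coded component at or above threshold maps to '1'."""
    return "".join("1" if x >= threshold else "0" for x in position)


bits = encode({0, 3}, n_resources=5)
chosen = decode(bits)
frog_bits = binarize([0.9, 0.1, 0.5])
```

With a shared bit-string form, a frog and a chromosome can be passed between the two algorithms without any further conversion.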

Proposed Reactive Search Optimization with Shuffled Frog Leaping Algorithm
In this section, the RSO and the SFLA are discussed together. In a reactive search, the search history is used to derive new parameter values from feedback. This tunes the algorithm so that it keeps the internal flexibility needed to cover a variety of problems. The tuning is automated and carried out while the algorithm runs and monitors its past behavior. An automated balance also has to be struck between diversification and intensification. This manages the "exploration versus exploitation" dilemma through feedback mechanisms that begin with intensification and progressively move to diversification.
The two algorithms are merged to form a new hybrid SFLA-RSO algorithm. The steps involved are as follows. First, an initial population is created in which each memeplex has a random position, and the cost (light intensity) is computed for that position.
Second, the SFLA is executed. Here, the cost (light intensity) of each memeplex is evaluated against that of the other memeplexes and is used to update the memeplex.
Third, the algorithm moves to the RSO process, in which the neighborhood is chosen randomly.
Fourth, the reactive component responds to events observed while alternative solutions are tested, using an internal online feedback loop to self-tune the critical parameters.
Fifth, the RSO produces a new memeplex, which is compared with the previous one. If the cost (light intensity) of the new memeplex is better than that of the earlier one, the global best is replaced by the new memeplex.
Lastly, if the termination criterion is met, the best memeplex is returned as the output. Otherwise, the next iteration begins again from the first step.
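The steps of the hybrid procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: tasks are independent with known runtimes, the cost is the makespan, the discrete frog leap copies a random half of the memeplex leader's assignments, the reactive step widens its random neighborhood (`width`) whenever the global best stalls, and a fixed iteration count serves as the termination criterion. All function and parameter names are hypothetical.

```python
import random
from typing import List, Tuple


def makespan(assign: List[int], runtimes: List[float], n_vms: int) -> float:
    load = [0.0] * n_vms
    for t, vm in enumerate(assign):
        load[vm] += runtimes[t]
    return max(load)


def hybrid_sfla_rso(runtimes: List[float], n_vms: int, n_frogs: int = 12,
                    n_memeplexes: int = 3, iters: int = 50,
                    seed: int = 0) -> Tuple[List[int], float]:
    rng = random.Random(seed)
    n = len(runtimes)

    def cost(a: List[int]) -> float:
        return makespan(a, runtimes, n_vms)

    # Step 1: random initial population and its cost.
    frogs = [[rng.randrange(n_vms) for _ in range(n)] for _ in range(n_frogs)]
    best = list(min(frogs, key=cost))
    width = 1  # size of the random neighborhood, tuned by feedback
    for _ in range(iters):
        # Step 2: SFLA phase. Rank the frogs, then let the worst frog of
        # each memeplex leap toward the best frog of that memeplex.
        frogs.sort(key=cost)
        for m in range(n_memeplexes):
            idx = list(range(m, n_frogs, n_memeplexes))
            leader, worst_i = frogs[idx[0]], idx[-1]
            trial = [l if rng.random() < 0.5 else w
                     for l, w in zip(leader, frogs[worst_i])]
            if cost(trial) < cost(frogs[worst_i]):
                frogs[worst_i] = trial
            else:
                frogs[worst_i] = [rng.randrange(n_vms) for _ in range(n)]
        current = min(frogs, key=cost)
        if cost(current) < cost(best):
            best = list(current)
        # Steps 3-4: RSO phase on a randomly chosen neighborhood of tasks.
        candidate = list(best)
        for task in rng.sample(range(n), min(n, width)):
            candidate[task] = min(
                range(n_vms),
                key=lambda vm: cost(candidate[:task] + [vm] + candidate[task + 1:]))
        # Step 5: keep the new solution if it is better; react otherwise.
        if cost(candidate) < cost(best):
            best, width = candidate, 1           # success: intensify
            worst_i = max(range(n_frogs), key=lambda i: cost(frogs[i]))
            frogs[worst_i] = list(best)          # share it with the population
        else:
            width = min(n, width + 1)            # stall: widen the neighborhood
    # Step 6: termination criterion met, return the best solution found.
    return best, cost(best)


runtimes = [4.0, 2.0, 3.0, 1.0, 5.0, 6.0]
schedule, span = hybrid_sfla_rso(runtimes, n_vms=3)
```

The design choice here is to let the SFLA phase supply population-level diversity while the reactive phase refines only the global best, so each iteration costs little more than a plain SFLA iteration.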

Results And Discussion
The RSO without clustering, RSO with clustering, and SFLA-RSO optimization methods are evaluated. The experiments use 600 to 1000 tasks. The optimization ratios for makespan, resource utilization, and computational cost are presented in Tables 5.1 to 5.4 and Figures 5.2 to 5.5.

When The Number Of Tasks Is Increased
The RSO without clustering, RSO with clustering, and SFLA-RSO optimization methods are evaluated. The experiments use 5000 to 25000 tasks. The optimization ratios for makespan, resource utilization, and computational cost are presented in Tables 5.5 to 5.8 and Figures 5.6 to 5.9.

Conclusion
The standard SFLA has good local exploitation ability along with fast convergence speed, whereas the traditional RSO has effective exploration ability. When applied to workflow scheduling problems, the SFLA may get stuck in local optima because its global exploration is inefficient, and the RSO converges slowly in certain cases because it lacks powerful local exploitation. To overcome these disadvantages of both algorithms, this work proposed a new hybrid approach based on the RSO and the SFLA for scheduling scientific workflows in IaaS clouds under the pay-per-use pricing model. A decoding heuristic is integrated into the hybrid approach to generate feasible solutions. The performance of the proposed algorithm was evaluated against the other algorithms, which demonstrated its effectiveness in solving workflow scheduling problems in IaaS clouds. The results showed that SFLA-RSO with clustering had a higher utilization ratio than the RSO without clustering and the RSO with clustering, respectively: by about 3.47% and 1.14% for 600 tasks, 4.59% and 2.27% for 700 tasks, 4.55% and 9.3% for 800 tasks, 4.49% and 4.49% for 900 tasks, and 6.74% and 4.44% for 1000 tasks.

Future Work
In the future, other metaheuristics may be chosen for the hybrid technique to improve on the current results.
A security mechanism can be introduced to the proposed algorithm.

Declarations
Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.
Funding: This study was not funded by any funding agency.
Tables: Table 4 is not available with this version.
Figure 1: The flowchart for the hybrid SFLA-RSO algorithm.
Figure: Makespan for SFLA-RSO with clustering.