Load Balancing Strategy for Hybrid Cloud-based Rendering Service

Abstract: This paper focuses on a SaaS-type Hybrid Cloud architecture in which the computing resources of the Private Cloud are limited in quantity. In this case, the workload balancing problem must be solved, since some projects are lost because of under-provisioning of Private Cloud resources, while costly resources are wasted during non-peak periods. A workload balancing algorithm for Hybrid Cloud is proposed, assuming that a certain part of the incoming projects in the workload can be postponed for a short time period. This algorithm is adapted for a rendering service to ensure its Cloud-based delivery. A cash flow model is built to determine the adequate quantity of computing resources in Private Cloud when the demand for resources is variable and follows some pattern in time. Implementing the proposed workload balancing strategy is of practical value, since the same flow of incoming projects is serviced with a smaller amount of own computing resources, thus reducing their downtime. An experimental study of the expected effect is presented.


I. INTRODUCTION
A key feature of Cloud computing is improved peak-load handling and dynamic resource provisioning without the need to set up new software or hardware infrastructure in every location. However, the owner of such an environment must solve workload balancing problems, as well as the challenge of determining the optimal number of resources required as the workload varies in time. Within this context, the resource allocation management problem is considered in this paper, mainly for SaaS platform deployments. The work is motivated by the fact that, in practice, estimates of average server utilization range from 5 % to 20 %, while for many services the peak workload exceeds the average by factors of 2 to 10. Thus, it is a big challenge to make sure that all users receive their service and that the downtime of private resources is minimized [1], [2].
A Cloud service owner usually faces two problematic options (Fig. 1): either to deploy the maximum quantity of resources (Fig. 1, Max line), wishing to satisfy all users' requests, in which case cost-effective scalability is not achieved because of idle processes and resources during non-peak periods; or to keep the minimum quantity of resources (Fig. 1, Min line) in full usage even when the users' load is at its minimum level, in which case the revenues from potential customers are lost if the quantity of servers is too low.
Manuscript received July 18, 2013; accepted October 2, 2013.

II. THE CONTRIBUTION OF THIS PAPER
This paper is focused on a project management model that classifies users into two categories: users that send projects of high priority (HP) and pay the full service price (with a guarantee that the service will be fully granted), and users that send projects of low priority (LP) and pay a minimal service price (with a certain probability that the service may not be received) [3]. It is clear that such a management model does not ensure that a certain part of LP projects will be serviced.
The contribution of this work is a workload balancing approach which ensures that both HP and LP projects are carried out. All projects are arranged in a queue which is serviced according to the method presented in the paper. The main idea is depicted in Fig. 1: sector [A] denotes LP projects whose execution is postponed to a later time during workload balancing, since they cannot be serviced at the moment because all of the few available resources are busy; sector [B] denotes the execution of the LP projects that have been postponed for a certain time period from sector [A]. The concept of this approach implies that all LP projects that cannot be serviced at the current moment are postponed to a later time, when the users' demand for the SaaS-type service has decreased. Sector [C] denotes both HP and LP projects which are serviced at the current moment thanks to a sufficient amount of resources.
The workload balancing strategy considered in this paper is based on scheduled reservations for the future [4], [5], which help to plan and optimize balancing decisions. Some studies [6]-[8] focus on a resource management strategy that provisions resources based on the Service Level Agreement (SLA) of a virtual machine. As argued in [9], a resource management approach based on SLA lacks flexibility, because virtual machines may not require as many resources as defined by the SLA, causing the waste of Cloud resources. Second, the load imbalance of the whole system is the reason for an excessively large number of different types of virtual machines. Thus, the model of load balancing and scheduling developed in this paper is based on workflow prediction in a Cloud that devotes virtual machines to users.
The current quantity of resources has to be managed in an effective manner. Thus, a global utility function is constructed as the objective in order to maximize the Cloud service owner's profit. The papers [10], [11] propose that utility functions, combined with optimization algorithms that seek to maximize utility for a workload on the given resources, may provide an effective paradigm for managing workload execution in Cloud computing. Considering another aspect of this problem, the optimal quantity of resources can be computed based on future workload forecasts. This idea is applied in this paper.
We posit that the approach to be proposed for balancing workloads, consisting of HP and LP projects, will allow decreasing the quantity of current resources as well as ensuring the service for all users.
For this purpose, the expected effect will be evaluated by simulation. The simulation results will quantify the performance of the proposed strategy before real system design experiments.

III. A CASE STUDY
In this paper, a project rendering service is considered as an instance of a SaaS-type service. A rendering service requires many computing resources and can easily be processed in parallel by assigning one video frame to a server. The users of a video rendering service are willing to wait for a certain time period, especially when big video projects must be rendered, i.e. they prefer to get many resources for their projects later rather than render them on their own computers. The priority system is also used in academic networks. Such a network usually comprises several organizations. HP is assigned to the projects that are received from the members of the organization which owns the cloud.

IV. DETERMINING THE ADEQUATE QUANTITY OF RESOURCES IN PRIVATE CLOUD
To ensure the delivery of SaaS service, the architecture of Hybrid Cloud is selected since it allows resending projects to Public Cloud if resources of Private Cloud are fully loaded.
It is important for the owner of Private Cloud to determine the adequate quantity of resources, as shown in Fig. 1. Since the owner aims to gain profit, its obtained utility is maximized subject to a constraint [equations (1)-(3) and their symbol definitions are not recoverable from this copy].

V. CONCEPT OF LOAD BALANCING
As noted above, another very important goal of SaaS service owners is to service all projects sent by users. It was observed that the dynamics of users' demand follows some pattern in time (periodic variations). Thus, it is reasonable to postpone projects to a later time (Fig. 1, from sector [A] to [B]). Only the part of the projects that are LP can be postponed, for a certain time ∆t (at most 4 hours). In the following, an algorithm is presented which ensures that all received projects start executing within the time period ∆t, in some special cases by resending them to Public Cloud.
The result of this project balancing method is depicted in Fig. 2. In this example, one LP project and one HP project are forwarded to Public Cloud, as all own resources (Private Cloud) are busy. The LP project is sent to the Public Cloud because its postponement time ∆t has ended; in Fig. 2, ∆t is equal to two time intervals. The HP project is sent to the Public Cloud because there are more HP projects than Private Cloud can service at the current time moment. The effect depicted in Fig. 2 will be experimentally evaluated for different postponement times ∆t.
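The postponement idea can be illustrated with a minimal slot-based sketch. It is not the algorithm of this paper: it assumes, for illustration only, that every project occupies exactly one cluster for one time slot, and the names `simulate_postponement`, `capacity` and `max_delay` are hypothetical.

```python
from collections import deque

def simulate_postponement(hp_arrivals, lp_arrivals, capacity, max_delay):
    """Toy model: count projects resent to Public Cloud.

    Assumptions (not from the paper): one project = one cluster for one
    slot; `capacity` Private Cloud clusters are available in every slot.
    Returns (HP sent to public, LP sent to public, LP still waiting).
    """
    waiting = deque()              # arrival slots of postponed LP projects
    public_hp = public_lp = 0
    for t, (hp, lp) in enumerate(zip(hp_arrivals, lp_arrivals)):
        waiting.extend([t] * lp)
        # HP projects start immediately; the overflow goes to Public Cloud.
        served_hp = min(hp, capacity)
        public_hp += hp - served_hp
        free = capacity - served_hp
        # The oldest LP projects take the remaining Private Cloud capacity.
        while free > 0 and waiting:
            waiting.popleft()
            free -= 1
        # LP projects whose postponement time has ended are resent.
        while waiting and t - waiting[0] >= max_delay:
            waiting.popleft()
            public_lp += 1
    return public_hp, public_lp, len(waiting)

hp = [1, 3, 0, 0]
lp = [2, 2, 0, 0]
print(simulate_postponement(hp, lp, capacity=2, max_delay=0))
print(simulate_postponement(hp, lp, capacity=2, max_delay=2))
```

With this toy input, a postponement of two slots keeps every LP project in the Private Cloud, whereas no postponement sends three of them to the Public Cloud; the HP overflow is the same in both cases.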
The workload balancing method, which includes the optimization component (1)-(3), reserves resources in advance to achieve the maximum service for HP projects, while LP projects can be postponed to a later time. The reservation is organized based on forecasts of users' demand.

VI. FORECASTING OF WORKLOAD DYNAMICS
The workload balancing strategy presented in this paper includes a forecasting module which is responsible for predicting the future resource usage according to variability patterns in the client workload. Workload statistical data constitutes a time series, which is characterized by trends, seasonality, and peaks.
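As a rough illustration of seasonality and peaks in such a series, the sketch below averages the load at each position of a cycle and flags observations that lie far above that seasonal profile. This is a simple stand-in chosen for brevity, not the fitted time series model or the hotspot detection algorithm of [13], [14]; the function names and the threshold `k` are assumptions.

```python
from statistics import mean, pstdev

def seasonal_profile(history, period):
    """Average load at each position of the cycle (e.g. hour of day)."""
    return [mean(history[i::period]) for i in range(period)]

def hotspots(history, period, k=2.0):
    """Indices whose load exceeds the seasonal profile by more than k sigma."""
    profile = seasonal_profile(history, period)
    residuals = [x - profile[i % period] for i, x in enumerate(history)]
    sigma = pstdev(residuals)
    return [i for i, r in enumerate(residuals) if r > k * sigma]

# Hypothetical load with a cycle of length 2 and one sudden peak at the end.
load = [1, 5, 1, 5, 1, 5, 1, 12]
print(seasonal_profile(load, period=2))
print(hotspots(load, period=2))
```

The seasonal profile can serve as a naive forecast for the next cycle, while the flagged indices mark the sudden peaks that a plain seasonal average would miss.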
Prediction methods used to improve the utilization of Cloud computing resources can be grouped into two categories, namely statistical model based approaches and artificial intelligence based algorithms [12].
Since this paper is focused on workflow analysis and forecasts, the main attention is paid to time-series-based forecasting methods. The fitted time series model is accompanied by a hotspot detection algorithm to model the sudden peaks in the forecasts of resource usage [13], [14]. This approach is applied to forecasting the dynamics of HP projects, since the workload balancing algorithm presented in this paper requires the reservation of resources for HP projects only. Forecasts are also computed for the durations of video frame rendering. The exact service time required for a whole project can only be known by rendering every frame. The main idea of our prediction algorithm is given in Fig. 3. Based on captured project rendering files, the statistical relationships among rendering times for different video resolutions are estimated using the correlation coefficient. The obtained results showed that the rendering time of a frame at its current size can be predicted from the same frame rendered at a lower resolution. Inaccuracies become significant when the resolution is reduced to 0.1 of the original size or when the video is stationary.
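The relationship between low-resolution and full-resolution rendering times can be sketched as follows. The example assumes a linear relationship fitted by least squares, which is one possible reading of the correlation-based estimate and not necessarily the procedure of Fig. 3; the sample timings and all function names are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def fit_line(x, y):
    """Least-squares slope and intercept of y as a function of x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def predict_project_time(low_res_times, slope, intercept):
    """Predicted total full-resolution time for a list of frames."""
    return sum(slope * t + intercept for t in low_res_times)

# Hypothetical calibration frames: seconds at low and at full resolution.
low = [1.0, 2.0, 3.0, 4.0]
full = [4.0, 8.0, 12.0, 16.0]
slope, intercept = fit_line(low, full)
print(pearson(low, full), slope, intercept)
print(predict_project_time([2.0, 2.5], slope, intercept))
```

A correlation close to 1 on the calibration frames suggests that the cheap low-resolution renders are a usable predictor; a low value would signal the inaccurate cases mentioned above.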

VIII. MAIN WORKLOAD BALANCING ALGORITHM
The purpose of the algorithm (Fig. 4) is to balance the projects to be rendered between Private and Public Clouds with minimum usage of Public Cloud, and also to ensure that all projects are started on time. In the rendering system, this algorithm starts each minute if the previous instance is no longer running. All projects are rendered in Cloud clusters. The size of a cluster can be, for example, 10-20 servers. Any cluster in the system can be in one of three states: the Free state means that a cluster is ready to render any project; the Reserved state means that a cluster is free but is waiting for a predicted HP project to be rendered, so an LP project cannot be rendered on this cluster; the Busy state means that a cluster is currently rendering a project and will not accept any other project until its state changes. The main steps of the algorithm are explained below.
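The three cluster states and the acceptance rule they imply can be written down compactly. This is a minimal sketch; `ClusterState` and `can_accept` are illustrative names, not identifiers from the rendering system.

```python
from enum import Enum

class ClusterState(Enum):
    FREE = "free"          # ready to render any project
    RESERVED = "reserved"  # free, but held for a predicted HP project
    BUSY = "busy"          # currently rendering

def can_accept(state, priority):
    """Return True if a cluster in `state` may take a project of `priority`."""
    if state is ClusterState.FREE:
        return True
    if state is ClusterState.RESERVED:
        return priority == "HP"   # LP projects cannot use reservations
    return False                  # a busy cluster accepts nothing

print(can_accept(ClusterState.RESERVED, "HP"))
print(can_accept(ClusterState.RESERVED, "LP"))
```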
Reservation. The workload balancing model uses the algorithms for forecasting workload dynamics and rendering durations to reserve clusters in Private Cloud for HP projects. Based on historical observations, a cluster is reserved for each HP project for the average project rendering duration. Let t1 denote the time period which describes how frequently the reservation is run, and t2 the reservation period.
Queue sorting. It is checked whether there are any projects in the queue. If there are, HP projects are assigned to clusters first. HP projects are arranged by forecasted rendering time (in decreasing order). This ensures that the longest HP projects are rendered in Private Cloud. Secondly, LP projects are sorted by the time moment at which they must be started. The start time of an LP project is computed as the incoming time of the project plus the period of time for which the project can be postponed (this period is declared in the system settings and depends on the system size). So if all clusters in Private Cloud are in the 'busy' or 'reserved' state, the LP project can be delayed until its start time is reached. If more than one LP project must be started at the same moment, they are arranged by predicted rendering time (the same as HP projects). If the start time of an LP project becomes equal to the current time, the LP project becomes HP and is sorted with the other HP projects.
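The ordering rules of this step can be sketched as below. The `Project` record, its field names and the use of minutes as the time unit are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Project:
    name: str
    priority: str          # "HP" or "LP"
    arrival: int           # arrival time, in minutes (assumed unit)
    predicted_time: float  # forecasted rendering time

def sort_queue(queue, now, max_delay):
    """Order the queue: HP first (longest first), then LP by start time."""
    hp, lp = [], []
    for p in queue:
        start_time = p.arrival + max_delay
        # An LP project whose start time has been reached is treated as HP.
        if p.priority == "HP" or start_time <= now:
            hp.append(p)
        else:
            lp.append(p)
    hp.sort(key=lambda p: -p.predicted_time)
    # LP: earliest start time first; ties broken by longest predicted time.
    lp.sort(key=lambda p: (p.arrival + max_delay, -p.predicted_time))
    return hp + lp

queue = [
    Project("A", "LP", 0, 30), Project("B", "HP", 50, 20),
    Project("C", "HP", 55, 90), Project("D", "LP", 40, 10),
    Project("E", "LP", 40, 60),
]
print([p.name for p in sort_queue(queue, now=60, max_delay=60)])
```

In this example project A has waited for the full postponement period, so it is promoted and sorted among the HP projects.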
Reservation influence. HP projects may have reserved certain clusters in Private Cloud. By this rule, LP projects cannot occupy reserved clusters even if there is no HP project at the moment (it is presumed that a predicted HP project may arrive the next minute).
Balancing between Clouds. HP projects must be started without delay. An HP project is sent to the Public Cloud immediately if there is neither a reserved nor a free cluster in the Private Cloud. An LP project can be delayed for a defined period of time to get a cluster in the Private Cloud. An LP project becomes HP when its waiting time ends.
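The balancing rule can be condensed into a single decision function. This is a sketch of the rule as stated above, with hypothetical names; `free` and `reserved` are counts of Private Cloud clusters in the respective states.

```python
def dispatch(priority, free, reserved, waited, max_delay):
    """Decide where a project goes: 'private', 'public' or 'wait'."""
    # An LP project whose waiting time has ended is handled as HP.
    if priority == "LP" and waited >= max_delay:
        priority = "HP"
    if priority == "HP":
        # HP must start now: use a reserved or free cluster, else go public.
        return "private" if (free > 0 or reserved > 0) else "public"
    # LP may only use free clusters; otherwise it keeps waiting.
    return "private" if free > 0 else "wait"

print(dispatch("HP", free=0, reserved=1, waited=0, max_delay=240))
print(dispatch("LP", free=0, reserved=1, waited=10, max_delay=240))
print(dispatch("LP", free=0, reserved=0, waited=240, max_delay=240))
```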
Cluster allocation when the Private Cloud is lightly loaded. Every time a project is sent to the Private Cloud, it is checked whether there are more free clusters than projects in the queue. During this check, the number of projects that will arrive in the future is estimated and their rendering time is forecasted in order to obtain the number of clusters that will remain free. If free clusters are available, the project is given additional clusters. The number of additional clusters depends on the number of waiting projects and free clusters. All projects in the queue get an equal number of additional clusters, starting from the first in the queue, as long as free additional clusters remain.
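One way to read the equal-share rule is sketched below: spare clusters are split evenly, and any remainder goes to the projects at the head of the queue. Treating forecasted arrivals as a simple count to subtract is an assumption of this example, as are the function and parameter names.

```python
def extra_clusters(n_waiting, free_clusters, expected_arrivals=0):
    """Additional clusters per queued project, in queue order.

    Each waiting project and each expected arrival first keeps one cluster;
    only the clusters left over are shared out as extras.
    """
    spare = free_clusters - n_waiting - expected_arrivals
    if n_waiting == 0 or spare <= 0:
        return [0] * n_waiting
    base, rest = divmod(spare, n_waiting)
    # The first `rest` projects in the queue receive one cluster more.
    return [base + (1 if i < rest else 0) for i in range(n_waiting)]

print(extra_clusters(n_waiting=3, free_clusters=10, expected_arrivals=2))
print(extra_clusters(n_waiting=3, free_clusters=3))
```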

IX. EXPERIMENT BY SIMULATION
The goal of the simulation is to demonstrate the advantages of the workload balancing for the rendering service presented in this paper, under the assumption that a certain part of the incoming projects can be postponed for a short time. The research focuses on estimating the adequate quantity $N^*$ of resources in Private Cloud while matching them to the workload behaviour in time. The owner of Private Cloud aims to maximize its profit, which relates to the maintenance costs $C_{PR}(t, N)$, depending on the number of working stations $N$.
The functional dependency between resources and profit is depicted in Fig. 5. The figure is obtained from simulation, where the workload of projects has been predicted based on statistical data, and the current number of working stations $N$ has been increased by some step.
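The procedure of increasing $N$ step by step and recording the profit can be illustrated with a deliberately simplified profit function. It is not the cash flow model of this paper: the prices, the cost terms and the demand pattern below are invented for the example, and the overflow is simply charged at a Public Cloud rate.

```python
def profit(n, demand, price=10.0, own_cost=4.0, public_cost=12.0):
    """Toy profit over a horizon for n owned clusters (all rates assumed).

    demand[t] is the number of clusters requested in slot t; requests
    beyond n are served in Public Cloud at `public_cost` each.
    """
    total = 0.0
    for d in demand:
        overflow = max(0, d - n)
        total += price * d - own_cost * n - public_cost * overflow
    return total

def best_n(demand, max_n, **rates):
    """Scan n = 0..max_n and return the value with the highest profit."""
    return max(range(max_n + 1), key=lambda n: profit(n, demand, **rates))

demand = [2, 2, 3, 8, 10, 8, 3, 2]   # hypothetical daily pattern with a peak
for n in (7, 8, 10):
    print(n, profit(n, demand))
print("best:", best_n(demand, max_n=12))
```

Even in this toy setting the curve rises while added clusters are kept busy often enough to pay for themselves and falls once they mostly stand idle, which is the shape discussed for Fig. 5.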
In Fig. 5, profit grows while the amount of resources $N$ approaches the value $N^*$ from the left, since all working stations in Private Cloud are fully loaded. If the number of resources $N > N^*$, the utility obtained by the owner of Private Cloud starts to decrease. This is because certain time periods occur when all or part of the resources are not employed. The fourth zone denotes a situation in which a certain part of the resources is always in a 'free' state. It means that the available resources are not fully employed, which is caused by an excessive amount of current resources. In general, the owner of Private Cloud can earn more profit with a lower amount of available resources if users agree to wait until their LP projects start executing. Figure 6 depicts the functional dependencies between the amount of resources and profit under different delays for LP projects. In this figure the curve with 0 h postponement time matches the curve from Fig. 5, and the notation $N_0^*$ coincides with $N^*$.
Thus, the owner of the cloud-based rendering service gains more profit by offering users the lower service price $P_{LP}^{PR}$ for LP projects. Figure 6 allows comparing the behaviour of the profit function for the different postponements allowed for LP projects. It can be seen that the optimal number of resources decreases from $N_0^*$ to $N_4^*$, and that the  zone expands when a higher postponement time for LP projects is available. Broader limits of the  zone allow the owner of Private Cloud to service the incoming workload in a more flexible way.

X. CONCLUSIONS
The performed research can be summarized with the following conclusions.
A novel workload balancing strategy for a Hybrid Cloud is proposed. The algorithm improves the utilization of computing resources in Private Cloud according to variability patterns in the projects' workload.
The paper presents the profit-making components important to the owner of Private Cloud. It is determined that one of the main components is $C_{PR}(t, N)$, which depends on the optimal value of $N$ to be computed.
The simulation has shown that the application of the proposed algorithm can increase the profit of the owner of Private Cloud due to the following factors: the broadening of the  zone caused by a higher postponement time for LP projects; the selection of the optimal quantity of resources; and a more evenly distributed workload.
The simulation results confirmed the benefit of the proposed load balancing strategy for a Hybrid Cloud-based rendering service. In future, this research will continue with strategy-based software development in a real-world Cloud environment.

Fig. 1. Allocation of projects in time.

Fig. 5. Profit dependency on resources.

The simulation was performed for a fixed time horizon with a workload of a certain pattern. Four zones have been distinguished according to the amount of resources. These zones are depicted in Fig. 5 and are described as follows: the first zone is characterized by an insufficient amount of resources, since a certain part of the projects is resent to Public Cloud by the algorithm presented in this paper, and all available resources are fully loaded; the second zone denotes the recommended amount of resources, when all incoming projects are serviced in Private Cloud and there are no time periods during the whole modelled time horizon when resources are in a 'free' state; the third zone is characterized by short-run occurrences when all resources are free.