A New Workload Recognition Strategy to Improve the Speed of Resource Provisioning in PaaS Layer of Cloud for Real-Time Demands

A real-time system should guarantee in advance that all critical timing constraints will be met. Many distributed systems, such as cloud environments, give the user access to a large number of shared resources, but their nondeterministic structure is a serious problem for real-time workloads. In addition, launching a new resource in the IaaS layer of a Cloud is not instantaneous. A prediction model, risk management in the PaaS layer and monitoring in the IaaS layer are the most important parts of a real-time system, because such a system must understand both itself and the behavior of its workload completely. The results of analysis, monitoring and prediction have a serious impact on how the system reacts. Understanding the workload is an important challenge in all systems, which use different models to identify workload types or to predict changes over time. A prediction model must be able to produce and shape the pattern of workloads with low overhead. In this study, we propose an enhancement of the profiling process based on a continuous Markov chain to make hosts deterministic for users. The effectiveness and accuracy of the proposed model are measured in the evaluation part. The number of failed tasks is also counted to show how successful the proposed model is.


INTRODUCTION
In the Cloud, a user has the opportunity to access a large number of resources. However, these resources are often shared with other users, and many of the available resources vary greatly over time (Bible et al., year). In addition, the delay of resource initialization in the IaaS layer may cause a failure or a system delay during a process. The system may appear to work well for a period, yet collapse in certain rare but possible situations (Bible et al., year). The general structure of a Cloud environment is depicted in Fig. 1. If not all critical timing constraints can be verified, the system could collapse, because the scheduling algorithm in the core system layer of a cloud does not include specific mechanisms for handling real-time tasks (Wang, 2012). For programs running in a cloud, resource provisioning is one of the key issues. Allocating resources far beyond the request has a negative effect on cost and utilization for users and may cause an over-provisioning problem for providers. Allocating fewer resources than requested, in turn, has a major impact on system delay and task failure. This means that, in order to maximize application performance, the user must carefully select a subset of the resources and schedule the application to run on them before the application is launched in the cloud. Dynamic resource scaling is one of the key characteristics that distinguish Cloud systems from traditional computing hosts (Kusic and Kandasamy, 2007).
Initializing a new virtual instance in the PaaS layer of the Cloud is not immediate: hardware resource allocation in the IaaS-layer hosting platforms introduces a delay of several minutes. Current technologies indicate that the VM initialization time can be reduced (Islam et al., 2012). Some technologies, such as VM streaming, allow the customer to preview the VM before it is completely ready. The simple solution is to ask all customers to declare their future VM requests, so that the cloud service provider in the SaaS layer can prepare all the VMs on time. However, this seems impossible because, first, customers have no duty to propose their schedule; second, customers are unable to know when computing resources will be needed; third, the combination of customers is always changing; and fourth, the actual schedule may change at any time (Jiang et al., 2011). In our view, the only way to overcome the technology limitations and satisfy user constraints is to predict the demand and prepare the VMs in advance. Predicting and monitoring user demand is a fundamental issue when tasks run on a virtualized system (Jiang et al., 2011).
Performance analysis and prediction models need a solid understanding of the system, mainly because real-time control depends completely on sensory input data and environmental conditions. The system must be analyzable in order to achieve a desired level of performance and to predict effects in the workload such as burstiness, because the workload has a critical impact on resource provisioning and on the performance of cloud-based applications. Most predictors and performance analyzers find it challenging to understand workloads completely and to model them. They use different techniques to identify the type of a workload and to predict how that type changes over time (Yin et al., 2014; Elnaffar and Martin, 2009).
In this study, we pursue the following objectives. First, we introduce real-time workload characteristics and constraints and explain how to find the pattern of a workload. Next, we use that pattern to introduce the prototype implementation of our model for performing real-time tasks in the Cloud.

MATERIALS AND METHODS
In this section we describe how a workload can be defined. Existing models and methods describe the real-time workload in different ways to make it more predictable.

Real-time workload requirements and characteristics: Resource management requires hard timing constraints on the execution of tasks and needs to be supported by a proper prediction model. Predictability can be achieved only by introducing fundamental changes in the basic design paradigm. If a task cannot be guaranteed within its time constraints, the system must give notice in advance so that alternative actions can be taken (Buttazzo, 2011).
Predictability is one of the most important characteristics of a hard real-time system. The system should be able to predict the evolution of its tasks and guarantee in advance that all critical timing constraints will be met. The prediction models proposed for real-time systems must be used to assist the derivation of actions, and the uncertainty of the prediction model must be taken into account. As shown in Fig. 2, all tasks must finish before their deadline, and a hard real-time system should reserve slack time to avoid failures. Slack time provides room to deal with uncertainty (Su et al., 2013). As noted in the introduction, customers have no duty to propose their schedule and are unable to know when computing resources will be needed. Deterministic behavior of a component is therefore desirable, because it simplifies the understanding of real-time behavior and makes the time evolution of the system predictable. In all deterministic systems the following issues must be completely clarified:
• Timeline
• Logical reasoning based on a deterministic cause-and-effect relationship
• Testability of the system (Systems, 2012)
Deterministic behavior is achievable only with an estimated probability. A real-time implementation can fail to meet the desired property of determinism for the following reasons (Systems, 2012):
• The base of the computation is not precisely defined
• There is a hardware failure
• The concept of time is unclear
• The system contains non-deterministic design constructs
In case of indeterminism, the user considers the system predictable if it allows temporal bounds on its outputs to be computed within a reasonable time. In this research, the different characteristics of uncertain and undetermined data are taken into account in order to handle uncertainty appropriately in real time.
First, the system needs a representation of uncertainty at the level of attribute values in the prediction model for the real-time system. Second, comprehensive models must consider both of the following aspects:
• Uncertainty over arbitrary domains for long-term prediction
• Temporal uncertainty that is relevant to planning and forecasting, because events can occur in an undetermined way over time (Eisenreich et al., 2011)
All tasks on real-time computer systems must complete their computation within a predetermined deadline. In this case, the results are computed reliably and are accurate. Furthermore, favorable algorithms provide a high level of locality and parallelism. For large-scale real-time architectures it would be very attractive to have a common high-level algorithm that solves the major problems, satisfies the real-time requirements and maximizes the utilization of the available resources.
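The role of slack time in a hard real-time guarantee can be illustrated with a minimal sketch. The function names and the definition used here (slack is the deadline minus the predicted finish time) are assumptions made for illustration; Fig. 2 may define the quantity differently.

```python
def slack_time(deadline, start, predicted_runtime):
    """Return the time to spare before the deadline.

    Assumption: slack = deadline - (start + predicted_runtime).
    """
    return deadline - (start + predicted_runtime)


def is_feasible(deadline, start, predicted_runtime, uncertainty=0.0):
    """Accept a task only if its slack covers the prediction uncertainty."""
    return slack_time(deadline, start, predicted_runtime) >= uncertainty
```

A task that starts at time 2 with a predicted runtime of 5 and a deadline of 10 has a slack of 3; it would be accepted only if the uncertainty of the runtime prediction does not exceed that value.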
The impact of profiling in non-deterministic systems: Most prediction models forecast short-term requests based on historical knowledge (Mallick et al., 2012). The historical knowledge is taken from the monitoring service, which continuously logs information as a profile in a searchable database (Anderson et al., 1997); the whole process is shown in Fig. 3. A proper analysis tool dissects the stored profile information at several levels. The information produced by the analysis tools allows users to explain in detail the static and dynamic changes incurred (Verboven et al., 2013). The profile creation process has four steps: data granularity, monitoring, processing and storing. It faces different challenges in every step, from data definition to storing. These challenges are consistency, stability, extra overhead, oversizing, efficiency and the integrity of distributed knowledge across wide system ranges (Anderson et al., 1997; Verboven et al., 2013; Ren et al., 2010).
The resource request is planned after the resources have been estimated with a performance model and a workload model. Both models use past knowledge for training. They construct [...] (Hameed et al., 2014). Rafael Weingärtner presented the MAPE-K autonomic loop to show how knowledge is produced. In this model, many parts and components have a serious impact on the knowledge creation process, which is shown in Fig. 4. Weingärtner divided his model into two important parts: profiling and forecasting. Sensing, monitoring and analyzing belong to the profiling process, while the plan, execute and effector parts, as shown in Fig. 4, belong to the forecasting process (Hameed et al., 2014; Weingärtner et al., 2015). The plan part of the forecasting process handles the optimization of resource utilization and the maintenance of QoS and QoE (quality of experience), and should take appropriate action based on this responsibility. QoE is the behavior perceived by end users and is a way to understand them (Weingärtner et al., 2015).

Methodology for workload pattern recognition:
In this section we discuss our model and the methods we applied. The prototype has been prepared to perform real-time tasks in the cloud.

Workload pattern recognition flow chart:
The performance of a prediction model depends heavily on the workload (Hutchison and Mitchell, 2005). Workload characterization is an attempt to find an accurate description that can reproduce the performance from historical workload traces (Zhang et al., 2011), such as CPU utilization, waiting time, virtual machine cost and response time. Using a historical workload, the influence of changes can be determined accurately, which minimizes the risk of performance regressions. For this purpose, the characteristics of the workload must be well understood; if they are, the provider will be able to model the workload (Hameed et al., 2014). The reviewed papers use three techniques to estimate the workload of upcoming tasks: workload profiling, workload modeling and workload prediction. Profiling uses statistical estimation techniques to extract reliable workload statistics, although these may not be appropriate for predicting workloads with large variation. In workload modeling, researchers build a model of the workload by observing the characteristics of specific applications and use it to compute predictions for upcoming tasks (Gregoriades and Sutcliffe, 2008; Sun et al., 2013; Calzarossa and Serazzi, 1993). A workload model probably predicts the workload more accurately, but this prediction cannot be used in all applications. Workload prediction applies specific strategies in a specific prediction model to predict the workload of upcoming tasks (Kuang et al., 2014). The main steps for the construction of workload models can be summarized as follows:
Formulation: A workload model is a conceptual description of the task parameters (Hutchison and Mitchell, 2005). All prediction models include task decomposition in their frameworks to find the workload pattern (Kousiouris et al., 2014). They define specific parameters for their work, such as task scheduling, delay, machine resource utilization or the required processing nodes, to reach the desired performance (Hameed et al., 2014; Hutchison and Mitchell, 2005).

Collection of the parameters:
The objective of this part is to determine what data the system already has and what additional data the prediction model will need to collect. This part directly affects the whole model, because most prediction models rely on historical knowledge to forecast short-term user resource requests (Mallick et al., 2012). Monitoring cannot collect the values of all parameters and metrics; it should work on specific metrics and parameters taken from the data logged during time intervals while the workload is executed (Mallick et al., 2012). All prediction models include a raw data filter in monitoring that removes unnecessary information from the raw data (Jiang et al., 2011).

Statistical analysis of the measured data:
Statistical analysis is applied to the monitored metrics to understand the behavior of the full system and to produce applicable outcomes. Monitoring techniques are classified as on-line, off-line and hybrid. On-line prediction models use on-line monitoring techniques and are more accurate than off-line ones, but they involve a lot of overhead because monitoring continuously calculates the parameters and resets them during the process. In off-line monitoring, there is no instruction to reset the parameters and the monitoring technique uses previously logged data; in this case, monitoring is less accurate than on-line monitoring. Hybrid monitoring typically measures the parameters at fixed time intervals (Elnaffar and Martin, 2009) and instructs the model to reset its parameters at every such interval. All monitoring techniques consist of the following steps:
• Collect data for elementary analysis to extract the basic system behavior, such as upward and downward trends in the parameters (Hutchison and Mitchell, 2005)
• Transform the original parameter values into a new form and eliminate outliers. The most common approach to transformation is a distribution model (Hutchison and Mitchell, 2005)
• Pick a reasonable amount of knowledge as a sample, because the prediction model otherwise suffers from inadequate performance data for training in machine learning techniques. The sample must also contain only a small group of parameters; this is called data distillation. When the data is unstructured, messy and crude, data distillation uses an extraction method to select the relevant data. The distilled data is exported as a set to the next phase for filtering. The prediction model implements a filter or uses a normal distribution function to divide the data into relevant and irrelevant data (Mallick et al., 2012; Jiang et al., 2011)
• Classify the data for static analysis, because the classifier has a serious impact on monitoring. If the prediction model keeps the classifier active all the time, it helps on-line monitoring to reduce the overhead (Elnaffar and Martin, 2009). A robust classification is obtained when the classifier finds a similarity in some parameters and does so at common intervals
Representativeness: Tools are used to represent a workload. One or more parameters are used to interpret and model the workload (Sharma et al., 2011).
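The transformation and outlier-elimination steps listed above can be sketched as follows. This is a minimal illustration rather than the filter used in the cited works: the z-score rule, its threshold, the min-max scaling and the function names are all assumptions.

```python
from statistics import mean, pstdev


def drop_outliers(values, z_max=3.0):
    """Keep values whose z-score magnitude is at most z_max (assumed rule)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= z_max]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

With small samples a single extreme value inflates the standard deviation, so a stricter threshold such as `z_max=2.0` may be needed before it is detected.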
Decision making: All decision makers follow the same steps in their process, as shown in Fig. 5. In the event phase, events are monitored to identify certain or unpredictable events for future demands. In the action phase, the decision maker selects a course of action based on certain criteria and finds alternative actions. Finally, the decision is made in the consequence phase and the resulting outcome is sent to resource provisioning. Most decision makers consider three elements in every decision: first, the available or alternative choices; second, unpredictable events, which are not under the control of the decision maker; third, the cost of the decision (Fredericks and Schneider, 2009).

Prediction evaluation:
The evaluator measures error metrics as its evaluation metrics. Most recent prediction models have checkpoints at which the model is evaluated at run time, as shown in Fig. 6(a). As depicted in Fig. 6(b), if the prediction error is high in one step, the prediction coefficient is refitted for the next step. Ideally, the prediction error is normally distributed, which helps the predictor to remain stable (Dinda, 2008).

Risk management and analyzer:
Resource management in the PaaS layer has the potential to lead the system into an undesirable situation, with a risk of penalties and customer dissatisfaction. Risk analysis is therefore a proper way to evaluate these risks. The entire risk management process contains many steps and needs to be discussed thoroughly. It consists of the following steps: first, establish the context; second, identify the risks involved; third, evaluate each of the identified risks; fourth, identify techniques to manage each risk; fifth, create, implement and review the risk management plan (García et al., 2014).

The overall model for workload recognition:
A sequence of events that are usually measured at consecutive times and placed at stable time intervals is a time series.
Fig. 7: Overview of the structure of the prediction model
Training: The proposed abstraction flow diagram for an initial prediction model is shown in Fig. 8, based on (Yin et al., 2014; Elnaffar and Martin, 2009; Mallick et al., 2012; Hameed et al., 2014; Doulamis et al., 2007; Mian et al., 2013). Most methodologies predict future demand from recent requests and historical knowledge. All required metrics are collected during a measurement trace from representative environments and imported into the learning part. The process in the learning part is the first step for most prediction models that use historical knowledge (Eq. 1). It begins with a sequence of measurement values collected at periodic intervals; the modeler then creates a model from those values and the model template. The model template contains information about the structure of the model desired by users. These processes form the training part of the prediction model. The returned model represents a fit of the measurement sequence to the model structure described in the model template. The result of this part is the initial prediction model, which is formed from the training. The user must choose a learning algorithm that is efficient and effective in the forecasting paradigm.
In this study, we worked with real data and divided the data stream into three parts: training, testing (evaluating) and performing. The first part of the workload is used as a warm-up to train the system until it reaches a steady state (Doulamis et al., 2007; Dick et al., 2014). Some other researchers used a benchmark for training their prediction model. Then, depending on factors such as the volume and distribution of the training data, they filtered (Tobaruela et al., 2014), smoothed (Sallam et al., 2014) or refined (Yin et al., 2014; Elnaffar and Martin, 2009) the data. In the next step, the cleaned data is used to create knowledge, together with the results of the monitored data, the static analysis and the initial results of performing. When the volume of knowledge is very high, knowledge and data are partitioned into mutually exclusive classes. The number of classes varies and depends on the scenarios and the users. After classification, the modeler tries to find the proper model for those classes. We therefore applied normalization to avoid out-of-range data. Afterward, we distilled the task data to extract the proper characteristics. Finally, we grouped the data into different classes using K-means classification in our experiments (Fig. 9 and 10).
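The grouping step can be sketched with a plain version of Lloyd's K-means algorithm. This is an illustrative sketch, not the implementation used in the experiments: the function name, the deterministic seeding with the first k points and the two-feature example are assumptions.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on tuples of floats.

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic. Returns (labels, centroids).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids
```

Each point could, for example, hold the normalized cloudlet size and the normalized number of requested CPUs of one task, so that tasks with similar demands end up in the same class.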

Testing and evaluating:
The system tests and evaluates the model; during training, the predictor uses an m vector-valued prediction stream for comparison with the actually observed values. The predictor also produces error estimates, which serve to compute a confidence interval for the prediction (Doulamis et al., 2007). Most testing parts use evaluation metrics as feedback. The evaluation metrics assess the prediction accuracy, and the error correction computed from them is applied by the system to fit the model. The evaluator compares the actual results with the forecasted results to obtain an accurately fitted model. The complete process is shown in Fig. 11. The evaluator produces considerable overhead in the prediction (Cully et al., 2008). At every checkpoint, the evaluator is triggered to compute the evaluation metrics, so when the number of checkpoints is large, the system faces an overhead problem. The first important challenge for users is how and when to define checkpoints. Philipp Leitner suggested the checkpoint predictor shown in Fig. 9. His concern is where a prediction should be carried out. The hook is the exact point that triggers the checkpoint. It is defined by the following inputs: first, a concrete point determined by a user or a timer; second, the prediction error and facts; third, the retraining strategy of the evaluator for rebuilding the checkpoint prediction. Checkpoints have one limitation: if no or too little historical data is available, the checkpoint must be suspended by the predictor manager until enough training data has been collected (Leitner et al., 2010).

Failure recovery strategy:
This strategy is designed for an unstable Cloud system, that is, for the case in which the Cloud fails to finish tasks without deadline violations. Resources in the cloud are shared among many customers, and this sharing can overload the resources.
The failure recovery strategy works by reducing the number of customers and increasing each user's share of resources. The number of users is cut in half when the Cloud fails to perform real-time tasks within a deadline in the first phase. In the proposed model, if the Cloud succeeds in any phase, the system starts to share resources among more customers. This method of sharing needs a failure recovery strategy because, if sharing does not go well and some tasks fail to finish before their deadline, the number of users is decreased in the next phase according to Eq. (2), where S is the number of successful phases, c is the number of the current phase, K is the number of classes and user_s is the number of users in class k.
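The behavior described in the text, halving the number of users after a failed phase and admitting more users after a successful one, can be sketched as below. This sketch does not implement Eq. (2): the function name, the growth factor applied on success and the lower bound of one user are hypothetical choices made only for illustration.

```python
def next_user_count(current_users, phase_succeeded, max_users, growth=2):
    """Return the number of users admitted in the next phase.

    Halving on failure follows the text; the growth factor on success
    is a hypothetical stand-in and is not taken from Eq. (2).
    """
    if phase_succeeded:
        return min(max_users, current_users * growth)
    # Never drop below a single user.
    return max(1, current_users // 2)
```

Starting from 1,000 users, a failed phase would leave 500 users for the next phase, while a successful phase would raise the count to 2,000, capped at `max_users`.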

Model construction for evaluation:
For the model construction, a Markov chain is used as a multiple time series prediction model. The Markov chain uses information from the previous job to capture the sequential dependencies of the next job submission. A Markov chain is defined by a small set of relevant states and moves from one state to another with certain probabilities (Yin et al., 2014; Mallick et al., 2012). A Markov model uses some memory and can be described completely by a transition matrix. Because a Markov model is complicated to construct, the number of states must be limited: if the model has many distinct states, all of them must be represented in the trace. A Markov model is a state space model, which means that the state of the system contains all the information on the interdependence between the past and the future of the system and works like a memory. Given the current state, the future evolution becomes independent of the past: x(t) is an unknown state and x(t+1) is estimated from it.
In the Markov model, if the matrix A in Eq. (1) is stable, the future evolution becomes independent of the past. The output of the model is then based on Eq. (3), and Eq. (5) is obtained for u ≥ t. The Markov model was used to explore the sequential correlations in workload pattern changes. This allows us to predict the workload of an individual VM based on the groups found in the previous step. This study is based on real measurement data collected from the real-time workload of the CEA supercomputer and hence provides insights that help system administrators to recognize typical cloud workload patterns and to manage resources better.
We used a Markov chain as the prediction model and considered six different states in the proposed model. These states are presented in Table 1.
They cover all the possibilities in the experiments. The proposed model uses these states to perform real-time tasks within a deadline. The model also controls resource utilization separately in each class, which means that each class can act individually.
Table 1 (excerpt):
Condition: current utilization > 100% and previous utilization < 100%. Action: average the number of resources between these two situations.
State 5. Condition: utilization = 100% and the system is overloaded. Action: start to absorb more resources.
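How a transition matrix can be obtained from an observed sequence of states, and then used to pick the most probable next state, can be sketched as follows. The function names are hypothetical and the counting estimator is a standard technique assumed here for illustration; the source does not state how its matrices were estimated.

```python
def transition_matrix(states, n_states=6):
    """Estimate transition probabilities from an observed state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows of states that were never left stay all-zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def most_likely_next(matrix, current):
    """Return the state with the highest transition probability from `current`."""
    row = matrix[current]
    return max(range(len(row)), key=row.__getitem__)
```

In such a matrix, a model that never violates a deadline shows up as a column for state zero that contains only zeros.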

RESULTS
In this section, we show how CloudSim was configured to implement our prototype. We also evaluate the prototype using the results obtained, to show how well the proposed prediction model can predict future demand.

CloudSim setting:
In these experiments, we set up a cloud environment in CloudSim with five data centers to provide a large number of available resources in the resource pool. The VMs have the same specification as VMs in Amazon EC2. The experiment was run 24 times to reach the maximum user sharing. We took 80% as the stable point for resource utilization. The system distills data from the received tasks based on the cloudlet size and the number of requested CPUs. The maximum number of users was set to one million (a large number of users), and the model, with or without classification, made its best effort to share resources among these customers. During the learning part, we used a simple time series model, MA(2), to reconsider some resources separately in each class. VMs are allocated exclusively to each class. At the end of a phase, each class is in one of the six situations defined in Table 1. For the evaluation part, we implemented the Markov chain model to turn the Cloud into a deterministic host for real-time tasks (Table 2).
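A simple reading of MA(2) is sketched below. It assumes that MA(2) denotes a two-point simple moving average of the most recent observations, rather than a moving-average error model of order 2; the source does not say which is meant, and the function name is hypothetical.

```python
def moving_average_forecast(history, window=2):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("not enough observations for the chosen window")
    recent = history[-window:]
    return sum(recent) / window
```

For a class whose last two observed demands were 20 and 30 CPUs, this forecast would reserve 25 CPUs for the next phase.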
Evaluation metrics: In line with the objectives of this study, several metrics were computed and compared to evaluate the performance of our model. The first objective was to design a model that ensures that real-time tasks do not violate their time limit. For this reason, the number of time limit violations was examined and compared across all models. The second objective was to determine how efficient our model is, for which the following metrics were studied. The RMSE was used to evaluate the efficiency of the models and R² was used to determine how accurate they are. The number of users was used as an evaluation metric to understand which models are successful in sharing resources among more users, and the average CPU utilization was used to understand the correlation between task failure and utilization.
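The two error metrics can be computed as in the following sketch, which uses the standard definitions of RMSE and of the coefficient of determination R². The function names are illustrative, and the sketch assumes that the actual values are not all identical (otherwise R² is undefined).

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives an RMSE of 0 and an R² of 1, while a predictor that always returns the mean of the observed values gives an R² of 0.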
Evaluation results: In the first experiment, we ran the model without any classification and obtained the transition matrix in Eq. (7) and the state diagram in Fig. 10. This experiment clearly shows that our model can perform tasks within the time limit without any deadline violation, because it never reached state zero. As explained in Table 1, state zero occurs when there is a deadline violation.
In the next experiments we evaluated our model with task classification. In the first step we distributed the tasks into 5 different classes and obtained the state diagram (Fig. 11) and the transition matrix (Eq. 8). The results show that the proposed model never visited state zero; as in the previous experiment, there were no failed tasks. For the third experiment, we ran our model with ten classes and obtained the transition matrix in Eq. (9) (Fig. 12). The results of the third experiment also show that the model with ten classes never visited state zero. We also used non-feedback-based models, namely the autocorrelation and Mean models, for the training process and for prediction. These models were run with the maximum user sharing and executed five times, like the proposed model. Classification was also used to improve the results, but the outcomes show very clearly that the autocorrelation and Mean models cannot guarantee that a deadline violation never happens. The results also show that the recovery plan for avoiding failure works well in all models: after five runs, every model reached zero failed tasks. In a subsequent experiment, the number of end users after five runs was identified.
Table 3 shows how successful the proposed model is at sharing resources among more users. As the results show, the failure recovery strategy decreased the number of end users until the Cloud system reached a steady state. The Markov models, with or without classification, remained steady during the five runs. Table 4 shows how many CPUs our real-time tasks received during these experiments. It is clear that the failure recovery strategy increased the resource share to solve the problem of overloaded CPUs. The proposed models (the Markov models with or without classification), in contrast, reached state 3 in the Markov chain state diagram. State 3 is responsible for stabilizing the Cloud system and establishing load balancing. In all the experiments with ten classes of tasks, fewer CPUs were absorbed, although according to Table 4 they reached better user sharing (Table 5).
In all the experiments, the average CPU utilization settled around 80%, as the following tables from the different tests show. All of the models were therefore evaluated to find which one converges better (above 80%) in terms of utilization.

Table 3: Number of end users in each run
Model                                 First run   Second run   Third run   Fourth run   Fifth run
Markov chain without classification   1,000,000   1,000,000    1,000,000   1,000,000    1,000,000
Markov chain with 5 classes           1,000,000   1,000,000    1,000,000   1,000,000    1,000,000
Markov chain with 10 classes          1,000,000   1,000,000    1,000,000   1,000,000    ...

Table 4: Number of CPUs received in each run
Model                                           First run   Second run   Third run   Fourth run   Fifth run
Markov chain without classification             22946       22882        23010       22850        22978
Markov chain with 5 classes                     23290       23162        23226       23226        23114
Markov chain with 10 classes                    15804       15484        15884       16540        18204
Mean model without classification               26          2272         24386       21378        15010
Mean model with 5 classes                       186         570          3306        8234         13994
Mean model with 10 classes                      810         1310         2428        3420         4524
Auto correlation without tasks classification   10          496          22786       20546        15138
Auto correlation with 5 classes                 170         332          1626        4890         9466
Auto correlation with 10 classes                810         1134         1660        2172         2540

Table 5: Situations based on utilization
Utilization                                              Risk
Utilization < 0.4                                        No risk for task failure and ready to share between more customers
0.4 < Utilization < 0.8                                  Low risk for task failure, and ready to absorb a little bit more resources and share resources between more users
0.8 < Utilization ≤ 1                                    High risk for task failure, and ready to absorb more resources and share resources between more users
Utilization = 1 and there is a long waiting task queue   Very high risk for task failure. No more resource sharing; the system needs more resources to solve the overloading problem

A percentage over 100 means overloading, in which case there is a waiting task queue. Tables 6 and 2 show that the Mean and Autocorrelation models were overloaded in the first run and that the number of failed tasks was too high.
However, the failure recovery strategy shows its impact on the results from the second run onward. The number of deadline violations decreased as utilization improved, although utilization does not seem to converge for the prediction models based on the Mean and Autocorrelation. In Fig. 13, we compute the RMSE of the models to evaluate which one is the most efficient. As can be seen, the Markov chain models, with or without classification, give better results than the other models. The impact of classification shows that the number of classes has a positive effect on efficiency.
In the last test, the models were evaluated in terms of precision. As mentioned under "Evaluation metrics", R² is a common metric for determining how accurate a prediction model is. This metric was computed for all models (Table 7); the model whose R² is nearest to 1 has the best accuracy.
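The utilization bands of Table 5 can be expressed as a small lookup function. This is a minimal sketch with a hypothetical function name; Table 5 leaves the boundary values 0.4 and 0.8 unassigned, and placing them in the higher-risk band is an assumption made here.

```python
def utilization_risk(utilization, long_waiting_queue=False):
    """Map CPU utilization (as a fraction) to the risk levels of Table 5.

    Assumption: the boundary values 0.4 and 0.8 fall into the
    higher-risk band, since Table 5 does not assign them.
    """
    if utilization >= 1.0 and long_waiting_queue:
        return "very high"
    if utilization < 0.4:
        return "none"
    if utilization < 0.8:
        return "low"
    return "high"
```

A class running at 90% utilization would be rated "high", and the same class at full utilization with a long waiting queue would be rated "very high", the situation in which no further resource sharing should take place.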

DISCUSSION
In this study, a new multi-objective model was proposed and a new model for predicting demands and anomalies was described. The model shares resources among more customers in a Cloud system while keeping the host ready to perform real-time workloads. The experiments showed that a feedback-based prediction model can perform real-time tasks in a Cloud system. However, there is no guarantee that the Cloud can perform real-time tasks without any deadline violation. The proposed model can identify the resources required for real-time tasks (i.e., by using the Markov chain, a predictive model, and automatic adaptive resource scaling and sharing). Cloud infrastructure providers can adopt our approach not only to offer their customers response time guarantees but also to minimize the resources allocated to the customers. It is difficult to say that task classification had a positive impact in all respects. The experiments showed that all models with task classification converged slowly toward 80% CPU utilization, but the proposed model improved the prediction efficiency and accuracy.
One way to extend this system is to support a real-time economic model on the cloud. Another is to extend the resource scaling strategy, which is currently used only to attract inexpensive resources. Absorbing idle resources and migrating them in advance to support overloaded resources, in order to overcome the virtual machine boot-up latency problem, is another open direction for researchers.

ACKNOWLEDGMENT
The authors confirm that there is no conflict of interest.