A Review of Dynamic Resource Management in Cloud Computing Environments

In a cloud environment, Virtual Machine (VM) consolidation and resource provisioning are used to address workload fluctuations. VM consolidation moves VMs from one host to another in order to reduce the number of active hosts and save power, whereas resource provisioning provides additional resource capacity to the VMs as needed in order to meet Quality of Service (QoS) requirements. However, these techniques incur additional costs in terms of migration time, scaling time and energy overhead, which need further consideration. Therefore, this paper presents a comprehensive literature review on dynamic resource management (i.e., VM consolidation and resource provisioning) in cloud computing environments, along with an overall discussion of the closely related works. The outcomes of this research can be used to enhance the development of predictive resource management techniques that are aware of performance variation, energy consumption and cost, in order to manage cloud resources efficiently.


Introduction
Cloud computing has changed the way in which businesses and individuals use Information Technology (IT) by offering customers on-demand services such as applications, platforms and infrastructures at competitive prices that depend on their usage (e.g., the pay-as-you-go model). However, the widespread adoption of cloud computing and the rising number of cloud customers have increased the overall operating costs for cloud providers [1][2][3][4][5]. Thus, reducing the operational costs of different cloud services is an active area of research.
A number of mechanisms have been adopted by cloud service providers in order to achieve economies of scale in a cloud environment [6]. For example, dynamic consolidation presents a solution to improve resource utilization and achieve energy efficiency in clouds. Virtual Machine (VM) consolidation allows VMs to move from one Physical Machine (PM) to another through live migration, without any interruption to the service. This mechanism plays a major role in load balancing between the PMs and reduces the overall energy consumption by switching off the idle hosts. However, live migration of VMs is a resource-intensive operation that affects the performance of the migrating VM and, in turn, the services running on other VMs [7]. There are also additional costs [8] in terms of migration time and energy overhead, which need to be explored further [9]. Therefore, understanding the impact of VM live migration is essential to designing an efficient VM consolidation strategy. Resource provisioning, in the form of VM auto-scaling, is another solution that provides additional capacity to the VMs on the fly in order to handle service performance variations. However, it can take a few minutes for this process to start [10], which is inappropriate for VMs that need to scale rapidly during computation [11]. There are also additional costs [8] in terms of scaling time (booting/rebooting), license fees for the new VMs (horizontal scaling) and energy overhead that need attention [12]. Hence, understanding the impact of VM auto-scaling is important for designing an efficient resource provisioning technique.
Furthermore, most studies in the literature have concentrated on reducing energy consumption and optimizing resource utilization, rather than enhancing service performance. To illustrate, cloud providers such as Amazon EC2 [13] base their Service Level Agreements (SLAs) on the availability of services, without any assurance of service performance [14]. For example, consider the situation where a number of VMs run on the same PM, and each VM is allocated its fair share of resources. If the workload of one VM increases and the available resources are insufficient to absorb this increase (e.g., the workload reaches an upper Central Processing Unit (CPU) utilization threshold such as 95%), there may be resource competition leading to VM performance degradation, which may affect the fulfilment of the SLAs and therefore the revenue of the cloud service provider. Thus, predictive mechanisms have the advantage of taking preventive actions (e.g., live migration and auto-scaling) at an early stage to avoid service performance degradation.
The aim of this research is to investigate the dynamic resource management issues and the impact of VMs consolidation and resource provisioning in cloud computing environments. This would help to enhance the development of predictive resource management techniques, by considering the awareness of performance variation, energy consumption and cost to efficiently manage the cloud resources.
The remainder of this paper is organized as follows: Section 2 presents the fundamental concepts of cloud computing with a description of its definition, services types, deployment types and virtualization technologies. The aspects of cloud applications and their workload patterns as well as related benchmarks are discussed in Section 3. Section 4 reviews the existing work on cloud resource management, including VMs consolidation and resource provisioning. Section 5 includes the overall discussion, along with a comparison summary of the closely related works. Section 6 concludes this paper.

Virtualization
Virtualization is a key component of the cloud computing infrastructure and is defined as "a technology that combines or divides computing resources to present one or many operating environments using methodologies like hardware and software partitioning or aggregation, partial or complete machine simulation, emulation, time-sharing, and many others" [18, p. 2]. One of the main advantages of virtualization is that it abstracts the hardware of the Physical Machines (PMs) in order to provide Virtual Machines (VMs) that can work in isolation and run different applications with different operating systems. Through virtualization, the VMs can be consolidated (e.g., using live migration) to minimize the number of active PMs, which reduces power consumption and lowers the operational cost. Thus, virtualization adds essential value to the cloud infrastructure by increasing physical resource utilization, achieving significant energy savings and reducing the operational cost in cloud environments [19].

Virtual Infrastructure Manager
Cloud infrastructure providers use a Virtual Infrastructure Manager (VIM) to manage their physical resources and provide virtualized resources that meet their customers' service requirements. Several open-source cloud management platforms are available to build, deploy and manage virtualized infrastructures in clouds. Major examples are OpenNebula [20], OpenStack [21] and CloudStack [22]. Tab. 1 summarizes some of the features of these VIMs. OpenNebula, OpenStack and CloudStack share a common role in providing a platform for deploying, managing and provisioning compute, storage and networking resources through interfaces such as a Web User Interface (Web UI) and a Command Line Interface (CLI). However, their architectures differ in terms of configuration, settings and deployment. For instance, OpenStack has many components to install, which may increase the complexity of installation and configuration as well as the management overhead [23]. To avoid this, the OpenStack administrator should install only the components required to meet the needs of the cloud deployment. In contrast, OpenNebula does not have such constraints, as it provides centralized deployment and has a fine-grained core [23]. In addition to OpenNebula, OpenStack and CloudStack, other VIMs are available freely or commercially for the deployment and management of cloud infrastructures, such as OpenQRM [24], Eucalyptus [25] and Nimbus [26].

Hypervisors
Hypervisor-based virtualization abstracts the underlying physical hardware to provide isolated instances, called VMs, each of which can run its own operating system (guest OS) [27]. These VMs are managed by the hypervisor, also referred to as the Virtual Machine Monitor or Manager (VMM), which controls the amount of resources allocated to each VM. The hypervisor sits between the physical hardware and the OS, and is also responsible for creating, running, migrating, copying and deleting the VMs [27]. Further, hypervisors can be implemented in different ways, such as full virtualization, where the hypervisor runs on an underlying host OS, and hardware virtualization, where the hypervisor runs directly on the underlying physical hardware. Some examples of hypervisors include Kernel-based Virtual Machine (KVM) [28], Xen [29], VMware [30] and VirtualBox [31].

Containers
Container-based virtualization modifies the underlying host OS to provide isolated instances, called containers, that can run different applications while sharing the same host OS [27]. Containers provide new ways of developing, shipping and running applications faster. A container is a lightweight alternative to a VM; thus, instead of building one application, developers can build a suite of components, called micro-services, which come together over the container [32]. Most cloud service providers, such as Microsoft, Google and Amazon Web Services, have moved to Docker [33] to provide infrastructure that supports the container standard [34]. Containers are better suited to micro-services than VMs because they can start up and shut down more rapidly and their resources can be scaled independently. However, containers do not provide full isolation, which may cause security issues. Therefore, hypervisor-based virtualization is more appropriate than container-based virtualization in terms of isolation and security. Some examples of container technologies include Docker [33], Linux Containers (LXC) [35] and Warden Container [36].

Cloud Computing Applications
Cloud applications should be designed specifically for a cloud computing architecture; thus, they need to be broken down into separate components that can be distributed among cloud resources. Cloud applications should also be designed to support scalability and elasticity, which allow cloud resources to be reserved and released dynamically to match changes in the workload.

Workload Patterns
In cloud environments, different applications have different resource usage requirements. Cloud applications may also experience different patterns of workload depending on the customers' usage behaviors, and these patterns of workload consume power differently based on the services and resources they use. As indicated in Fehling et al. [37], the cloud workload patterns can be categorized as static workload, periodic workload, once-in-a-lifetime workload, unpredictable workload, and continuously changing workload.
As depicted in Fig. 1, a static workload pattern occurs when an application runs continuously with the same, stable resource utilization over a period of time. Private websites and wikis are examples of such static workloads. A periodic workload pattern is experienced when an application runs with resource utilization peaks that repeat at regular time intervals (e.g., seasonal changes). Examples of this type of workload include shopping websites during holiday periods, sporting events (the Olympics) and traffic during rush hours.
Furthermore, when an application runs with stable resource utilization and peaks only once over time, it exhibits a once-in-a-lifetime workload pattern. Payroll, billing and backup applications are examples of once-in-a-lifetime tasks or jobs. An unpredictable workload pattern occurs when an application has random, constantly fluctuating peaks of resource utilization over time. Unpredictable traffic and forecasting are examples of unpredictable workloads. Finally, when an application runs with resource utilization that is initially stable and then rapidly decreases or increases over time, it experiences a continuously changing workload pattern [37]. Examples of this type of workload include social networking (Facebook and Twitter), open-source downloads and Android applications.
As mentioned earlier, these application workload patterns can have a different impact on energy consumption depending on the resources they consume.
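As a rough illustration, the five patterns can be mimicked with synthetic CPU utilization traces. The sketch below is not taken from Fehling et al. [37]; the function name synthetic_trace, the baseline levels and the peak shapes are all assumptions chosen only to make the qualitative differences visible.

```python
import math
import random


def synthetic_trace(pattern, steps=100, seed=0):
    """Return a list of CPU utilization values in [0, 1] for one pattern.

    All numeric levels below are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)
    trace = []
    for t in range(steps):
        if pattern == "static":
            u = 0.30  # flat, stable utilization
        elif pattern == "periodic":
            u = 0.40 + 0.30 * math.sin(2 * math.pi * t / 25)  # repeating peaks
        elif pattern == "once-in-a-lifetime":
            # stable baseline with one short peak in the middle of the trace
            u = 0.90 if steps // 2 <= t < steps // 2 + 5 else 0.20
        elif pattern == "unpredictable":
            u = rng.uniform(0.05, 0.95)  # random fluctuation
        elif pattern == "continuously-changing":
            u = 0.10 + 0.80 * t / max(1, steps - 1)  # steady growth over time
        else:
            raise ValueError(f"unknown pattern: {pattern}")
        trace.append(min(1.0, max(0.0, u)))
    return trace
```

Such traces could, for example, be replayed by a workload generator to observe how each pattern affects resource usage on a given host.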

Benchmarking
Benchmark suites are adopted to evaluate cloud services to support the configuration and adaptation of applications before they start utilizing cloud resources, such as VMs and containers. Benchmarking aims at defining and reproducing execution conditions for the target system (application, resource, service) to be evaluated [38]. It also provides a set of metrics in order to quantify the relative software and hardware performance, and understand how cloud application workloads behave as the underlying cloud resources are stretched and approach full capacity [39].
In this regard, the Standard Performance Evaluation Corporation (SPEC) [40] launched a tool that provides a set of synthetic workloads, which exercise CPU, memory and disk performance and test the energy efficiency of a system at different load levels. Generally, this benchmark exerts graduated levels of load on a given machine, normally evaluating the energy consumption and performance of server hardware from idle (0%) to fully active (100%) load in 10% increments.
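The graduated-load idea can be sketched in a few lines. The example below only illustrates the 0% to 100% sweep in 10% steps described above; the throughput and power readings are made-up values for a hypothetical server, and the aggregation into a single score (total operations divided by total average power) is an assumption about how such a score might be formed, not a statement of the benchmark's official procedure.

```python
def load_levels(step_percent=10):
    """Target load levels from idle (0%) to fully active (100%)."""
    return list(range(0, 101, step_percent))


def overall_efficiency(throughput_ops, power_watts):
    """Aggregate score: total operations divided by total average power.

    Both lists hold one reading per load level, idle included.
    """
    if len(throughput_ops) != len(power_watts):
        raise ValueError("need one throughput and one power reading per level")
    return sum(throughput_ops) / sum(power_watts)


levels = load_levels()
# Hypothetical server: throughput scales with load, power is linear in load.
throughput = [1000 * pct for pct in levels]  # operations per second (made up)
power = [100 + 1.5 * pct for pct in levels]  # watts (made up)
score = overall_efficiency(throughput, power)
```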
Similarly, a simple benchmarking tool for POSIX systems, called Stress-ng [41], has been designed as a workload generator. This tool has the capability to simulate a wide range of workload patterns such as static, periodic, continuously changing, and once-in-a-lifetime workload patterns. Further, the Stress-ng workload generator is able to simulate both single and multi-threaded applications, as well as test workloads that are resource-bound in many ways, e.g., applications that are both CPU and memory intensive.

Dynamic Resource Management in Cloud Computing
Resource management is one of the most important problems in cloud infrastructures. It can be expressed as a multi-objective problem, since several conflicting objectives (e.g., maintaining performance while reducing energy and costs) need to be optimized [9,42]. Therefore, cloud service providers have applied dynamic resource management through VM consolidation and resource provisioning techniques in order to meet the performance requirements of applications while minimizing the operating costs and energy consumption of cloud data centers.

VM Consolidation
One of the benefits of virtualization is the VM consolidation strategy, which allows cloud service providers to migrate and reallocate VMs from one host to another in order to increase resource utilization and reduce energy costs in cloud data centers. The aggregation of VMs through live migration therefore has a significant impact on energy efficiency, as it gathers several VMs onto the minimum number of hosts and switches the idle hosts into a power-saving mode. However, VM consolidation is not a trivial task when demand increases unexpectedly, as it can generate unnecessary migrations, cause SLA violations and increase the operating cost due to the migration processes [43]. Therefore, dynamic VM consolidation requires an estimate of the workload demand in order to handle the fluctuating demands of cloud customers, manage cloud resources efficiently and avoid unnecessary migrations [44].
VM live migration acts as a backbone of the VM consolidation process, which can be defined as the capability of transferring a complete state of the VM (including CPU states, memory pages, storage and network connections) from the source host to the destination host, without any interruption in the service or application [45,46]. There are two types of VM migration, which are currently used in cloud data centers, namely, post-copy and pre-copy migration.
Post-copy: Transfers a VM's memory contents after its processor state has been sent to the destination host. However, this method can take a long migration time, which consumes the resources on both source and destination hosts due to the residual dependency. Also, it has some downtime initially, which makes the VM's service unavailable for a certain time period [47].
Pre-copy: First copies the memory state to the destination, through iterative phases, after which its processor state is transferred to the destination. In this way, the VM can be migrated from one host to another with a close to zero downtime [48].
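The iterative nature of pre-copy can be made concrete with a simplified analytical model. The sketch below is an assumption-laden illustration rather than a description of any particular hypervisor: it supposes a constant network bandwidth and a constant page-dirtying rate, and the function name, the stop threshold and the round limit are hypothetical parameters.

```python
def precopy_migration(mem_mb, bandwidth_mb_s, dirty_rate_mb_s,
                      stop_threshold_mb=50.0, max_rounds=30):
    """Estimate (total_time_s, downtime_s, rounds) for pre-copy migration.

    Each round resends the memory dirtied during the previous round. When
    the remaining data is small enough (or the round limit is reached), the
    VM is paused and the rest is sent in a final stop-and-copy phase.
    """
    to_send = mem_mb
    total_time = 0.0
    rounds = 0
    while to_send > stop_threshold_mb and rounds < max_rounds:
        round_time = to_send / bandwidth_mb_s
        total_time += round_time
        to_send = dirty_rate_mb_s * round_time  # memory dirtied meanwhile
        rounds += 1
    downtime = to_send / bandwidth_mb_s  # VM is paused only for this phase
    return total_time + downtime, downtime, rounds


total, downtime, rounds = precopy_migration(1024, 100, 10)
```

Under these assumptions the downtime stays small whenever the dirtying rate is well below the bandwidth, which is consistent with the near-zero downtime attributed to pre-copy; when the dirtying rate approaches the bandwidth, the loop only ends at the round limit and the downtime grows.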
Live migration efficiency of multiple VMs has been investigated in various research studies. For instance, Ye et al. [45] presented a live migration framework of multiple VMs based on different resource reservation mechanisms. This framework aims to improve migration efficiency by using parallel migration and workload-aware migration strategies. Experimental results show that the performance overheads of the live migration process are affected by workload types, memory size and the number of CPUs. Thus, parallel migration and workload-aware migration strategies can efficiently improve the performance of migrated VMs. However, the performance overhead incurred by concurrent VM migrations may increase the migration interference on the destination host.
Zhao et al. [49] presented a VM placement method based on VM service performance, which aims to address the issue of VM performance degradation when placing the VMs. This method takes the application-aware resource consumption characteristics into consideration to place the VMs on appropriate PMs in order to guarantee VM performance and ensure customers' Quality of Experience (QoE). The proposed method is evaluated on a real cloud platform (OpenStack) using video streaming applications. The results show that the proposed method can minimize PM performance degradation and guarantee VM performance compared to other methods. However, their approach focuses only on the resource consumption characteristics when performing VM placement and does not take the power consumption of the PMs and VMs into account.
Moreover, Ferreto et al. [50] proposed an approach called dynamic consolidation with migration control, which aims to reduce the number of VM migrations and the number of active hosts using linear programming formulation. This approach gives a higher priority to migrate VMs with variable workload instead of the VMs with a stable workload in order to reduce the number of migrations and required hosts with a minimal SLA violation. They compared the proposed approach with static and dynamic consolidation approaches using TU-Berlin and Google data center workloads. The evaluation results demonstrate that the suggested approach performs well in terms of the number of PMs used and VMs migrated. However, this approach does not take into account VMs power consumption and migration costs when consolidating the VMs.
Farahnakian et al. [46] presented a modified version of the Best Fit Decreasing (BFD) algorithm, named the Utilization Prediction-aware Best Fit Decreasing (UP-BFD) algorithm. This approach employs a utilization prediction model based on K-Nearest Neighbor Regression (K-NNR) to eliminate unnecessary VM migrations and reduce SLA violations. The prediction model is trained on historical data generated from different types of workloads in CloudSim. The approach also considers both the current and future utilization of resources in order to perform VM consolidation based on the hosts' CPU and memory utilization thresholds. Although this work focuses on reducing PM energy consumption, the number of VM migrations and SLA violations, it does not consider the energy consumed as a result of the VM live migration decisions.
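To make the prediction step more tangible, the following is a minimal sketch of k-nearest-neighbour regression applied to a utilization history. It is not the model of [46]: the sliding-window features, the Euclidean distance, the parameter defaults and the function names are all assumptions made for illustration.

```python
def knn_predict(history, window=3, k=2):
    """Predict the next utilization value with k-nearest-neighbour regression.

    Each training sample pairs a window of past values with the value that
    followed it; the prediction averages the targets of the k windows that
    are closest to the most recent window.
    """
    if len(history) < window + k:
        raise ValueError("history too short for the chosen window and k")
    samples = [(history[i:i + window], history[i + window])
               for i in range(len(history) - window)]
    query = history[-window:]

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(samples, key=lambda s: distance(s[0], query))[:k]
    return sum(target for _, target in nearest) / k


def predicted_overload(history, threshold=0.8, window=3, k=2):
    """True if the predicted next utilization exceeds the threshold."""
    return knn_predict(history, window, k) > threshold


cpu = [0.2, 0.4, 0.6, 0.2, 0.4, 0.6, 0.2, 0.4, 0.6, 0.2, 0.4]
next_cpu = knn_predict(cpu)
```

A consolidation policy could consult predicted_overload for a candidate destination host and skip it if the host is expected to become overloaded, which is the kind of decision that avoids migrations that would soon have to be undone.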
Further, Beloglazov et al. [51] addressed the problem of VMs consolidations under Quality of Service (QoS) constraints in cloud data centers. They employed the Markov chain model and the control algorithm to detect the overloaded hosts and then migrate some VMs in order to achieve a specified QoS goal. This dynamic VMs consolidation aims to improve the PMs resource utilization (particularly CPU utilization) for stationary workloads, which also can be applied for non-stationary workloads using the Multisize Sliding Window workload estimation technique. Simulation results using workload traces on PlanetLab servers demonstrate that the introduced method outperforms the benchmark methods while meeting the QoS goal. However, this method focused on improving the performance of cloud applications by reducing the number of overloaded hosts, but without explicitly considering energy and cost of VMs migrations, as a part of VMs consolidation decision criterion.
Xu et al. [52] proposed a lightweight interference-aware VM live migration strategy, called iAware. It focuses on the performance of VMs during and after live migration, considering the interference of the migration process on both source and destination PMs. iAware jointly estimates, analyses and minimizes both the migration time and the co-location interference among VMs based on a multi-resource demand and supply estimation model. The experiments are conducted in a real cloud environment with different workloads using a Xen hypervisor cluster platform. The results are compared with traditional interference-unaware algorithms and show that iAware can estimate VM performance interference during live migration and meet the SLA requirements. However, their work does not consider the energy consumption overhead of VM migrations.
Beloglazov et al. [53] presented an energy-efficient resource management policy for cloud data centers. The proposed method mainly focuses on the dynamic re-allocation of VMs using live migration in order to minimize energy consumption while maintaining the QoS requirements. They evaluated the proposed method using CloudSim, and the results show a reduction of energy consumption in a cloud data center. However, the proposed method does not show the effect of PM heterogeneity on energy efficiency when performing the live migration of the VMs. Furthermore, Beloglazov et al. [54] presented energy-aware VM consolidation policies to optimize resource utilization and energy efficiency in a cloud data center. In this approach, the VMs are migrated from one host to another in order to increase the overall server utilization and reduce infrastructure costs (energy costs) by switching off the idle hosts. Thus, upper and lower CPU utilization thresholds are set for each host, along with several VM selection policies, in order to identify the hosts from which the selected VMs should be migrated. The experimental results obtained in CloudSim show that this approach leads to an improvement of energy efficiency in cloud data centers. Likewise, Farahnakian et al. [42] proposed a Self-Adaptive Resource Management System (SARMS) for efficient resource management in cloud infrastructure. SARMS provides an adaptive utilization threshold (CPU and memory) mechanism to dynamically identify the overloaded and underloaded PMs. This system has two steps: migration of VMs from the overloaded PMs to prevent SLA violations, and consolidation of VMs onto a minimum number of active PMs in order to reduce energy consumption. They evaluated the proposed system using CloudSim based on real workloads from Google and PlanetLab. The obtained results show that SARMS can meet performance requirements while reducing PM energy consumption and the number of VM migrations. Nevertheless, these approaches do not consider the energy consumption overhead and the costs of VM consolidation.
Beloglazov et al. [55] proposed a technique for dynamic VM consolidation based on CPU utilization thresholds. This technique focuses on cloud resource management strategies (e.g., VM migration) with the aim of optimizing resource usage and reducing energy consumption while maintaining the SLAs. This is achieved by migrating the VMs from the underloaded hosts in order to reduce the number of active hosts and save energy. To re-allocate the VMs, a Modified Best Fit Decreasing (MBFD) algorithm is used to sort the selected hosts based on their CPU utilization and energy efficiency. They evaluated the proposed technique through simulations with different types of workloads using PlanetLab servers. The results show that this technique outperforms other migration policies in terms of the number of VM migrations and SLA violations, while showing a similar level of energy consumption. However, the proposed technique does not consider the actual cost and power consumption caused by VM consolidation.
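A power-aware best-fit-decreasing placement of this kind can be sketched as follows. This is a simplified illustration and not the exact algorithm of [55]: it assumes that VMs are sorted by decreasing CPU demand, that every candidate host is already switched on, and that host power grows linearly with CPU utilization; the Host class and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    capacity_mips: float
    p_idle: float  # watts at 0% CPU utilization (assumed)
    p_max: float   # watts at 100% CPU utilization (assumed)
    used_mips: float = 0.0

    def power(self, used_mips):
        """Linear power model: idle power plus a load-proportional part."""
        share = used_mips / self.capacity_mips
        return self.p_idle + (self.p_max - self.p_idle) * share


def place_vms(vms, hosts):
    """Map each VM name to the host with the smallest power increase.

    vms maps VM names to requested MIPS. VMs are handled in decreasing
    order of demand; a VM that fits on no host is mapped to None.
    """
    placement = {}
    for vm, demand in sorted(vms.items(), key=lambda kv: kv[1], reverse=True):
        best, best_increase = None, float("inf")
        for host in hosts:
            if host.used_mips + demand > host.capacity_mips:
                continue  # not enough spare capacity on this host
            increase = (host.power(host.used_mips + demand)
                        - host.power(host.used_mips))
            if increase < best_increase:
                best, best_increase = host, increase
        if best is not None:
            best.used_mips += demand
        placement[vm] = best.name if best is not None else None
    return placement


hosts = [Host("A", 1000, 100, 200), Host("B", 1000, 80, 130)]
placement = place_vms({"v1": 600, "v2": 500, "v3": 300}, hosts)
```

In this example host B adds fewer watts per MIPS than host A, so the largest VM goes to B, the second VM no longer fits there and goes to A, and the smallest VM fills the remaining space on B.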
Also, Malekloo et al. [56] introduced a Multi-objective Ant Colony Optimization (MACO) approach for VM placement and consolidation algorithms. In this regard, the VM placement algorithm aims to minimize energy consumption, CPU resource wastage and communication cost, while the VM consolidation algorithm aims to reduce SLA violations, VM migrations and the number of active PMs. They evaluated the proposed approach using CloudSim based on eight performance metrics. The results show that this approach outperforms the other approaches in terms of achieving a balance between energy consumption, system performance and QoS requirements. Yet, this approach focuses on minimizing PM energy consumption without taking into consideration the energy consumption incurred by VM consolidation.
Zhou et al. [57] proposed an adaptive strategy for energy- and performance-efficient VM consolidation, called DADTA. The DADTA strategy aims to minimize energy consumption while satisfying the SLAs in the cloud data center. They applied a specific adjustment of thresholds to adapt to dynamic workload changes and then performed VM consolidation using DADTA in order to improve the overall optimization. To evaluate the proposed strategy, a modified prediction model run in CloudSim is used to deal with the time-series data obtained from the Google cluster workload trace, and the findings show that the proposed DADTA outperforms other benchmarks in terms of minimizing PM energy consumption and SLA violations. In their work, the consolidated VMs are homogeneous and only PM power consumption is considered.
Moreover, Beloglazov et al. [43] presented adaptive algorithms for dynamic VM consolidation based on a statistical analysis of historical workload data. Statistical models are used to calculate the upper and lower CPU utilization thresholds of each host. If the host is determined to be overloaded, one or more VMs are selected to be migrated from the host to another suitable one in order to optimize the resource usage and maintain a high level of SLAs. On the other hand, if the host is determined as underloaded, all hosted VMs are selected to be migrated from the host and switch it to the sleep mode in order to reduce the energy consumption. They evaluated the proposed algorithms through the CloudSim using workload traces from PlanetLab, considering the heterogeneity of PMs and VMs. The results of the experiments show that the proposed algorithms outperform other dynamic VM consolidation algorithms in terms of the level of SLA violations and the number of VM migrations. However, this work only considers PMs energy consumption and does not refer to VMs energy consumption.
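One way to derive a utilization threshold from historical data is to lower it as the recent utilization becomes more volatile. The sketch below assumes the median absolute deviation (MAD) as the statistic and a hypothetical safety parameter; the text above does not specify which statistical models [43] uses, so this should be read as an illustration of the idea only.

```python
from statistics import median


def mad(values):
    """Median absolute deviation, a robust measure of spread."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def upper_threshold(cpu_history, safety=2.5):
    """Adaptive upper CPU threshold: more volatile hosts get a lower limit.

    The safety factor is an assumed tuning knob, not a value from the source.
    """
    return max(0.0, 1.0 - safety * mad(cpu_history))


def is_overloaded(cpu_history, safety=2.5):
    """Compare the latest utilization with the adaptive threshold."""
    return cpu_history[-1] > upper_threshold(cpu_history, safety)


stable = [0.50, 0.52, 0.48, 0.51, 0.49]
volatile = [0.10, 0.90, 0.20, 0.80, 0.95]
```

With these made-up histories the stable host keeps a threshold close to 100% and is not flagged, whereas the volatile host receives a much lower threshold and its latest reading counts as an overload.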
Verma et al. [58] emphasized the importance of taking migration cost into account for a fine-grained VM consolidation strategy. Therefore, Zakarya et al. [59] proposed a VM consolidation technique named Consolidation with Migration Cost Recovery (CMCR). This technique aims to explore the ability of the VMs to recover their migration costs. To achieve this, the VMs should first be migrated to an energy-efficient host and then continue to run there for a certain period of time. A linear power model is used to estimate the power consumption of the target host in order to check the ability of the VMs to recover their migration costs. They evaluated CMCR through CloudSim using real workload traces from a Google cluster. The results show that, by using CMCR, the majority of the migrated VMs can recover their migration cost. However, their work is applicable only to hosts that follow a linear power model and does not consider the heterogeneity of PMs or VMs. Similarly, Verma et al. [58] introduced a power-aware application placement framework for virtualized server clusters, called pMapper, which dynamically places the VMs to minimize the power consumption and the migration cost while meeting the performance requirements. In their framework, they extended the First Fit Decreasing (FFD) heuristic algorithm in order to migrate the VMs to suitable hosts. This aims to minimize the data center's energy consumption by reducing the number of active hosts, while taking the VM migration cost into account. They implemented the pMapper framework on an IBM testbed with heterogeneous hosts using a set of benchmark applications. The results show that pMapper outperforms other power-unaware algorithms in terms of minimizing PM power consumption and VM migration costs, while meeting the application performance guarantees. However, their framework does not provide any information on how the migration costs are calculated.
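The idea of recovering a migration cost, as described for CMCR above, can be illustrated with a back-of-the-envelope calculation. The sketch below is not the CMCR algorithm itself: it assumes a linear power model on both hosts, treats the migration energy as a known input, and uses hypothetical function names and values.

```python
def marginal_power(vm_cpu_share, p_idle, p_max):
    """Extra watts a VM adds to a host under a linear power model."""
    return (p_max - p_idle) * vm_cpu_share


def recovery_time_s(vm_cpu_share, src, dst, migration_energy_j):
    """Seconds the VM must run on dst before the migration energy is repaid.

    src and dst are (p_idle, p_max) tuples in watts. Returns None when the
    destination is not more efficient, so the cost is never recovered.
    """
    saving_w = (marginal_power(vm_cpu_share, *src)
                - marginal_power(vm_cpu_share, *dst))
    if saving_w <= 0:
        return None
    return migration_energy_j / saving_w


def worth_migrating(vm_cpu_share, src, dst, migration_energy_j,
                    expected_runtime_s):
    """Migrate only if the VM is expected to run long enough to break even."""
    t = recovery_time_s(vm_cpu_share, src, dst, migration_energy_j)
    return t is not None and expected_runtime_s >= t


# Hypothetical hosts: the destination draws fewer watts per unit of load.
t = recovery_time_s(0.2, src=(100, 250), dst=(80, 180),
                    migration_energy_j=3000)
```

In this made-up case the VM saves about 10 W after moving, so a migration that costs 3000 J pays for itself after roughly 300 seconds; a VM expected to finish sooner than that would be better left in place.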

Resource Provisioning
Cloud service providers support an on-demand resource provisioning model, called auto-scaling, which provides additional resources requested by applications using vertical and horizontal scaling techniques. Generally, the auto-scaling can be defined as the ability of a system or users to add and remove resources (such as CPU, memory), which is beneficial for adapting to workload variations and ensuring consistent performance with lower costs [8,12]. Cloud providers such as Amazon Web Services (AWS) [60] offer this service.
Auto-scaling is a dynamic property of cloud computing and comes in two types, namely vertical and horizontal scaling. Vertical scaling dynamically adds or releases virtual resources (e.g., virtual CPUs and memory) inside the VMs, whereas horizontal scaling creates or deletes VMs; both are driven by application requirements. However, the latter mechanism may take a few minutes to initiate [10,61,62,63], which may be unsuitable for VMs that need to scale rapidly during the computation [11,64].
To achieve scalability of cloud resources, a combination of these two scaling techniques can help to find an optimal scaling strategy [63]. However, most vertical and horizontal scaling approaches are reactive methods, which act only after detecting that an application does not have enough resources [64,65]. Thus, it is desirable to scale resources before the workload actually increases. This can be achieved by using proactive methods that predict the workloads of applications and scale the resources commensurate with the predicted workload.
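The difference between reactive and proactive scaling can be shown with a small sketch. The reactive rule looks only at the current utilization, whereas the proactive rule extrapolates a least-squares line through recent measurements. The threshold, the forecasting horizon and the choice of a linear trend are assumptions for illustration; real predictors may be considerably more elaborate.

```python
def reactive_scale_up(current_util, threshold=0.8):
    """Reactive rule: act only once the threshold is already exceeded."""
    return current_util > threshold


def linear_forecast(history, horizon=1):
    """Fit a least-squares line to the history and extrapolate it."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_t = (n - 1) / 2
    mean_u = sum(history) / n
    cov = sum((t - mean_t) * (u - mean_u) for t, u in enumerate(history))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_u - slope * mean_t
    return intercept + slope * (n - 1 + horizon)


def proactive_scale_up(history, threshold=0.8, horizon=3):
    """Proactive rule: act if the forecast crosses the threshold."""
    return linear_forecast(history, horizon) > threshold


cpu = [0.50, 0.55, 0.60, 0.65, 0.70]
```

For this rising trace the reactive rule does not fire yet because the latest reading is 70%, while the proactive rule forecasts about 85% three steps ahead and would trigger scaling early enough to hide the start-up delay of new resources.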
A number of solutions have been proposed to support resource elasticity for cloud applications. For example, Ficco et al. [9] presented a new approach for managing elastic resource reallocation in cloud infrastructures using the coral-reefs algorithm and game theory optimization. This approach uses multi-objective optimization to maintain customers' SLAs and minimize resource consumption and cost during the auto-scaling and migration processes. In their work, the coral-reefs algorithm is used to model the elasticity of cloud resources, whereas game theory is used to optimize the aims of the service provider, expressed through resource reallocation strategies, with respect to the customer's requirements. The experimental results show that the combination of the coral-reefs algorithm and game theory optimization achieves elasticity of cloud resources and leads to significant performance improvements. However, the energy-related cost of performing auto-scaling and migration is not considered in their approach.
Likewise, Tighe et al. [66,67] developed a rule-based approach that combines the auto-scaling of applications with dynamic VM allocation to match current workload demands and maintain SLA achievement. In their approach, vertical scaling is performed to scale up and down the VMs according to their resource requirements to run applications, as well as the VMs are consolidated into a minimal number of PMs using live migrations in order to switch off the idle PMs and saving energy costs. As shown on their simulation results, they argued that their combined approach can achieve better application performance with a reduction in VM live migrations compared to the independent approaches. However, their approach only considers the vertical scaling of the scaled resources and do not consider the prediction of these resources. In addition, the costs of the scaled resources are not considered.
Dawoud et al. [68] proposed a dynamic resource provisioning approach that aims to allocate the minimum resources required to handle the future workload demands while maintaining the Service Level Objectives (SLOs). Their approach includes three controllers for CPU, memory, and application to guarantee efficient resource allocation and optimize the application performance. A linear prediction model is used to predict the future resource requirements for efficient allocation and correspond with the workload demands. They have evaluated the proposed approach using the Xen hypervisor with a synthetic workload, and the results show that their controllers are capable to horizontally scale the VMs to correspond with the workload demands while mitigating the SLO violation. However, their approach only considers the horizontal scaling to cope with VMs workload demands without considering the vertical scaling technique. Also, the energy consumption of provisioned resources is not considered.
Moreover, Meng et al. [69] proposed a joint-VM provisioning approach that estimates the VMs' capacity needs through statistical multiplexing principles based on their workload patterns. The main idea of this approach is to borrow unused resources from lightly utilized VMs and reallocate these resources to highly utilized VMs in order to meet the application performance requirements. The proposed approach is evaluated in simulation using data collected from commercial data centers. The results demonstrate that the proposed joint-VM provisioning approach improves the overall resource utilization by 45% compared to individual-VM provisioning approaches.
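The statistical multiplexing principle behind joint provisioning can be illustrated with a small sketch. This is not the algorithm of Meng et al. [69]: it simply compares the capacity reserved when each VM is sized for its own peak with the capacity reserved when VMs are sized together for the peak of their aggregate demand. The two demand traces and all names are assumptions.

```python
def individual_capacity(demands):
    """Capacity reserved when every VM is sized for its own peak demand."""
    return sum(max(series) for series in demands.values())


def joint_capacity(demands):
    """Capacity reserved when the VMs are sized together for the peak of
    their aggregate demand (peaks that do not coincide are multiplexed)."""
    return max(sum(step) for step in zip(*demands.values()))


# Assumed CPU demand (in % of one core) over four intervals; the peaks of
# the two VMs occur at different times.
demands = {
    "vm_a": [20, 80, 30, 25],
    "vm_b": [70, 15, 25, 60],
}

separate = individual_capacity(demands)  # 80 + 70
joint = joint_capacity(demands)          # largest per-interval sum
saving = 1 - joint / separate            # fraction of capacity saved
```

Because the two assumed traces peak at different times, the joint reservation is smaller than the sum of the individual peaks; when peaks coincide, the two values are equal and joint provisioning brings no benefit.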
Also, Gandhi et al. [12] investigated the impact of resource auto-scaling on cost, performance and provisioning times for cloud applications. They employed Amdahl's Law to model service time scaling, queueing-theoretic concepts to model performance scaling, and a Kalman filtering approach to estimate the performance model parameters. They implemented their approach on OpenStack, and the results show that it is able to determine the most cost-effective scaling option for a given workload, considering both horizontal and vertical scaling. However, this approach considers neither the prediction of resource requirements nor their energy consumption when making scaling decisions. Dutta et al. [8] presented an automatic scaling framework called SmartScale, which uses a combination of horizontal and vertical scaling in order to optimize the resource usage and the reconfiguration cost incurred due to scaling. SmartScale is a proactive technique that uses polynomial regression to estimate the resource requirements and make the scaling decisions for the next time interval. They evaluated their framework using a real cloud testbed, and the results show that SmartScale can scale the resources required to run applications with the lowest reconfiguration cost. However, this framework does not consider the power consumption of the required resources incurred due to scaling decisions.
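To make the modelling ingredients attributed to Gandhi et al. [12] more concrete, the following sketch combines Amdahl's Law with a simple queueing formula to compare a vertical and a horizontal scaling option. It is an illustration under stated assumptions, not their model: each VM is treated as an M/M/1 queue, horizontal scaling is assumed to split arrivals evenly across identical VMs, the Kalman-filter parameter estimation is omitted, and all parameter values are hypothetical.

```python
def amdahl_service_time(base_time, parallel_fraction, vcpus):
    """Service time on `vcpus` vCPUs under Amdahl's Law, where
    `parallel_fraction` of the work speeds up linearly with vCPUs."""
    return base_time * ((1.0 - parallel_fraction) + parallel_fraction / vcpus)


def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue; infinite when overloaded."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")
    return service_time / (1.0 - utilization)


def vertical_response(arrival_rate, base_time, parallel_fraction, vcpus):
    """One VM with more vCPUs: shorter service time, same arrival rate."""
    service = amdahl_service_time(base_time, parallel_fraction, vcpus)
    return mm1_response_time(arrival_rate, service)


def horizontal_response(arrival_rate, base_time, vms):
    """More single-vCPU VMs: same service time, arrivals split evenly."""
    return mm1_response_time(arrival_rate / vms, base_time)


# Hypothetical workload: 8 requests/s, 0.1 s per request on one vCPU,
# 80% of the work parallelizable.
baseline = mm1_response_time(8.0, 0.1)
scale_up = vertical_response(8.0, 0.1, 0.8, vcpus=2)
scale_out = horizontal_response(8.0, 0.1, vms=2)
```

Under these assumed parameters both options reduce the mean response time relative to the single-vCPU baseline, and which option is preferable depends on the parallelizable fraction and on the relative price of vCPUs and VMs, which is the kind of trade-off a cost-aware scaling policy has to evaluate.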

Overall Discussion
Cloud resource management can adapt VMs consolidation and resource provisioning in order to meet the performance requirements of applications and to minimize the operating costs and energy consumption of cloud data centers. Section 4 reviewed the related work on VMs consolidation and resource provisioning mechanisms in cloud environments.
VMs consolidation is closely related to a well-known NP-hard optimization problem, in which the most important objectives are minimizing resource usage and energy consumption while satisfying the SLAs. As discussed in Section 4.1, the work in Ye et al. [45,49,52] aimed to improve VM performance during the migration process by considering application-aware resource consumption characteristics, but their models focus only on resource consumption and do not consider the energy overhead of VM migrations. Moreover, the work presented in Farahnakian et al. [42,43,53,54,55,56] mainly focused on the dynamic re-allocation of VMs using live migration to increase overall server utilization and minimize energy consumption while maintaining the required QoS. Yet, these approaches focused on minimizing the energy consumption of PMs without taking into consideration the energy consumption incurred by VMs consolidation. Also, the work presented in Verma et al. [58,59] has addressed the issue of migration cost, considering the energy consumption at both the PM and VM levels. However, these works are still limited: the model in Verma et al. [58] does not provide any information on how the migration cost is calculated, whereas the work in Zakarya et al. [59] is applicable only to hosts that follow a linear power model and does not consider the heterogeneity of PMs or VMs. Further, the work presented in Farahnakian et al. [46,51,57] employed workload prediction models based on historical data to eliminate unnecessary VM migrations and to minimize energy consumption and SLA violations. These models focused on improving the performance of cloud applications by reducing the number of overloaded hosts, but without explicitly considering the energy and cost of VM migrations as part of the VMs consolidation decision criteria.
In terms of VMs resource provisioning, fine-grained provisioning that ensures the performance and the SLAs of applications is required, which makes finding the optimal and efficient scaling option a very challenging problem. In Section 4.2, the work in Gandhi et al. [12] investigated the impact of resource auto-scaling on cost, performance, and provisioning times in order to determine the most cost-effective scaling option for cloud applications. Further, the work presented in Ficco et al. [9,66,67] combined the auto-scaling of applications with dynamic VM allocation to match current workload demands and maintain SLA achievement. However, the energy consumption related to the auto-scaling and migration decisions is not considered in their approaches. Moreover, the work presented in Dawoud et al. [8,68,69] considered the prediction of resource provisioning to handle future workload demand while maintaining the SLOs, but these approaches do not consider the power consumption of the required resources incurred due to scaling decisions.
Thus, there is still a need for predictive modelling that dynamically supports VMs live migration and auto-scaling decisions, considering the trade-off between cost, power consumption, and performance during service operation, which can help cloud providers to make better use of their infrastructures and efficiently manage cloud resources [70,71].
Tab. 2 provides a comparison summary of the closely related works on VMs consolidation and resource provisioning that consider the workload, energy consumption and cost in cloud environments, and Tab. 3 provides a comparison summary of the closely related works on the prediction of these mechanisms.

Conclusion
This paper has presented a comprehensive review on the subject of dynamic resource management in cloud computing environments. Firstly, it has introduced the fundamental aspects of cloud computing, including its definition, service types, deployment types and virtualization technologies. Secondly, it has presented the concepts of cloud applications and their workload patterns, as well as related benchmarks. This is followed by positioning the work in the relevant literature, focusing on cloud resource management issues. A thorough review of related works that focus on VMs consolidation and resource provisioning, as well as their predictive technologies, has been presented. Finally, the paper has concluded with an overall discussion that serves as a set of potential research directions, along with a comparison summary of the closely related works.
Funding Statement: The author received no specific funding for this study.

Conflicts of Interest: The author declares that they have no conflicts of interest to report regarding the present study.