Online Supervisory Control and Resource Management for Energy Harvesting BS Sites Empowered with Computation Capabilities

The convergence of communication and computing has led to the emergence of Multi-access Edge Computing (MEC), where computing resources (supported by Virtual Machines (VMs)) are distributed at the edge of the Mobile Network (MN), i.e., in Base Stations (BSs), with the aim of ensuring reliable and ultra-low latency services. Moreover, BSs equipped with Energy Harvesting (EH) systems can decrease the amount of energy drained from the power grid, resulting in energetically self-sufficient MNs. The combination of these paradigms is considered here. Specifically, we propose an online optimization algorithm, called ENergy Aware and Adaptive Management (ENAAM), based on foresighted control policies exploiting (short-term) traffic load and harvested energy forecasts, where BSs and VMs are dynamically switched on/off towards energy savings and QoS provisioning. Our numerical results reveal that ENAAM achieves energy savings ranging from 57% to 69% with respect to the case where no energy management is applied. Moreover, the extension of ENAAM within a cluster of BSs provides a further gain ranging from 9% to 16% in energy savings with respect to the optimization performed in isolation for each BS.


I. INTRODUCTION
The full potential of 5G radio access technology can be realized through the use of distributed intelligence, whereby content, control, and computation are moved closer to mobile users, hereby referred to as the network edge. This evolution has led to the emergence of the Multi-access Edge Computing (MEC) paradigm, which allows network functions to be virtualized and then deployed at the network edge to guarantee the low latency required by some applications. In this paper, we consider a hybrid edge computing architecture where computing servers are co-located with each Base Station (BS), and a centralized controller (a point within range of a set of BSs) is utilized to manage them, deciding upon the allocation of their computing and transmission resources. This type of architecture is in line with recent trends [1].
The convergence of communication and computing (MEC [2]) within the mobile space poses new challenges related to energy consumption, as BSs are densely-deployed to maximize capacity and also empowered with computing capabilities to minimize latency. To cope with these challenges, previous studies have put forward BS sleep modes [3] [4], as BSs are dimensioned for the expected maximum capacity, yet traffic varies during the day. In addition, energy savings within the virtualized computing platform are of great importance, as virtualization can also lead to energy overheads. Therefore, a clear understanding and a precise modeling of the server energy usage can provide a fundamental basis for server operational optimizations. The experimental results in [5] [6] show that the locus of energy consumption for the Virtualized Network Function (VNF) components is the Virtual Machine (VM) instance where the VNF is instantiated and executed. Thus, for a given expected traffic load, the energy consumption can be minimized by launching an optimal number of VMs, a technique referred to as VM soft-scaling, together with BS power saving methods, i.e., BS sleep modes.
Along these lines, we propose a controller-based network architecture for managing Energy Harvesting (EH) BSs empowered with computation capabilities, where on/off switching strategies allow BSs and VMs to be dynamically switched on/off, depending on the traffic load and the harvested energy forecast, over a given lookahead prediction horizon. To solve the energy consumption minimization problem in a distributed manner, the controller partitions the BSs into clusters based on their location; then, for each cluster, it minimizes a cost function capturing the individual communication site energy consumption and the users' Quality of Service (QoS). To manage the communication sites, the controller performs online supervisory control by forecasting the traffic load and the harvested energy using a Long Short-Term Memory (LSTM) neural network [7], which is utilized within a Limited Lookahead Control (LLC) policy (a predictive control approach [8]) to obtain the system control actions that yield the desired tradeoff between energy consumption and QoS. This work is an extension of [9], where we considered energy savings within a single off-grid BS scenario (i.e., a BS powered by wind and/or solar energy sources), taking into account the need for MEC in remote/rural areas. In this paper, however, a dense environment is considered, similar to an urban or semi-urban scenario, where each BS is powered by hybrid energy supplies (solar and power grid) and empowered with computation capabilities. Moreover, the optimization problem is extended to multiple BSs, where energy management procedures are executed within a BS cluster, in contrast with the single BS case of [9].
The rest of the paper is structured as follows. The related work is discussed in Section II, and the system model is presented in Section III. In Section IV, we detail the optimization problem and the proposed LLC-based online algorithm for a single communication site. The multiple BS communication site case is addressed in Section V. Our contribution is evaluated in Section VI, and lastly, concluding remarks are given in Section VII.
II. RELATED WORK AND PAPER CONTRIBUTION
Next, we first provide a literature review related to BS sleep mode techniques. Then, we review the mathematical tools that we use in this paper, followed by the literature review related to energy savings in virtualized computing platforms (i.e., works related to soft-scaling). Finally, we put forward our contributions and the novelty of our work.
Sleep-mode strategies in mobile networks: cellular networks are dimensioned to support traffic peaks, i.e., the number of BSs deployed in a given area should be able to provide the required QoS to the mobile subscribers during the highest load conditions. However, during off-peak periods the network may be underutilized, which leads to an inefficient use of network resources and to excessive energy consumption. For these reasons, sleep modes have been proposed to dynamically turn off some of the BSs when the traffic load is low. This has been extensively studied in the literature; here, we highlight the main techniques that are related to this work.
Clustering algorithms have been proposed as a way of switching off BSs to reduce the energy consumption. In [10], centralized and distributed algorithms group BSs exhibiting similar traffic profiles over time. In [11], a dynamic switching on/off mechanism locally groups BSs into clusters based on location and traffic load. The optimization problem is formulated as a non-cooperative game aiming at minimizing the BS energy consumption and the time required to serve their traffic load. Simulation results show energy costs and load reductions, while also providing insights into when and how the cluster-based coordination is beneficial.
Reducing the energy consumption involves some tradeoffs in the optimization problem. QoS has been widely used as a tradeoff metric [12] [13]. The Quality of Experience (QoE) is included in [14], where a dynamic programming switching algorithm is put forward. Other parameters that have been considered are the coverage probability and the BS state stability parameter, i.e., the number of on/sleep state transitions. For instance, a set of BS switching patterns engineered to provide full network coverage at all times, while avoiding channel outage, is presented in [15]. According to the BS state stability concept, a two-objective optimization problem is formulated in [16] and solved with two algorithms: (i) one that is near optimal but not scalable, and (ii) one with low complexity, based on particle swarm optimization. The QoE is also affected by the UE position due to channel propagation phenomena. In this respect, in [17] the BSs to be switched off are selected so as to minimize the impact on the UEs' QoE, according to the distance from the handed off BSs.
To support sleep modes, neighboring cells must be capable of serving the traffic from the switched off cells. To achieve this, proper user association strategies are required. A framework to characterize the performance (outage probability and spectral efficiency) of cellular systems with sleeping techniques and user association rules is proposed in [18]. In that paper, the authors devise a user association scheme where a user selects its serving BS considering the maximum expected channel access probability. This strategy is compared against the traditional maximum SINR-based user association approach and is found superior in terms of spectral efficiency when the traffic load is inhomogeneous. User association mechanisms that maximize energy efficiency in the presence of sleep modes are addressed in [19]. There, a downlink HetNet scenario is considered, where the energy efficiency is defined as the ratio between the network throughput and the total energy consumption. Since this leads to a rather complex integer optimization problem, the authors propose a Quantum particle swarm optimization algorithm to obtain a suboptimal solution.
A marketing approach to foster the opportunistic utilization of the unexploited small cell (SC) BS capacity in dense heterogeneous networks (HetNets) is presented in [20]. There, an offloading mechanism is introduced, where the operators lease the capacity of a SC network owned by a third party in order to switch off their BSs (Macro BSs) and maximize their energy efficiency, when the traffic demand is low. The allocation of the SC resources among a set of competing operators is mathematically formulated as an auction problem.
A comprehensive power management model employing a BS switching on/off mechanism, within a BS system powered by green energy, is presented in [21]. The model considers weather conditions, user mobility, different green energy harvesting rates, energy storage with self-discharge effect, and switching on/off frequency. The authors propose two algorithms: the first decides which BSs are to be active based on the minimum energy cost, i.e., the energy price per time period, while the second one determines the active BSs by first prioritizing the minimum power consumption of the system, and then the energy cost. The relationship between installing a solar harvesting system to power a BS and the energy management under varying demand is investigated in [22]. The authors present a solar installation planning model by explicitly modelling solar panels, batteries, inverters and charge controllers, as well as the cellular network demand and energy management. They found that the solar installation and the energy management of the base stations are so coupled that even the order in which these technologies are introduced can have a major impact on the network cost and performance.
The survey paper [23] presents a taxonomy of existing energy sustainable paradigms and methods to address energy savings in network elements (i.e., BSs) equipped with EH capabilities. Here, the authors discuss the shortcomings of previous studies related to efficient energy management procedures, the lack of relevant discussion related to the integration of EH into future networks, and lastly, energy self-sustainability in future networks. The current work is a technical contribution where we address some of the shortcomings that were identified in [23], also proposing the use of Machine Learning (ML) tools for pattern forecasting and adaptive control schemes for decision making. In addition, this work is in line with the research topics which can be found in our review paper [24].
The majority of the works on BS switching-off mechanisms consider clusters of BSs from a single mobile operator perspective, where some functions of the BS can be switched off and the remaining active BSs then handle the upcoming traffic. A new approach is presented in [25], which exploits the coexistence of multiple BSs from different mobile operators in the same area. An intra-cell roaming-based infrastructure-sharing strategy is proposed, followed by a distributed game-theoretic switching-off scheme that takes into account the conflicts and interaction among the different operators. Moreover, in [26], the authors investigate the energy and cost efficiency of multiple HetNets (i.e., each HetNet is composed of eNodeBs (eNBs) and SC BSs from one operator) that share their infrastructure and are also able to switch off part of it. Here, a form of roaming-based sharing is also adopted, whereby the operator can roam its traffic to a rival operator during a predefined period of time and area. An energy efficient optimization problem is formulated and solved using a cooperative greedy heuristic algorithm. Regarding the cost efficiency, the cooperation and cost sharing decisions among the operators are modeled using a Shapley Value based bankruptcy game.
Pattern forecasting along with foresighted optimization: control-theoretic and Machine Learning (ML) methods for resource management have been successfully applied to various problems, e.g., task scheduling, bandwidth allocation, network management policies, etc. In the paradigm of supervisory control for managing Mobile Networks (MNs), online forecasting using ML techniques and the LLC method can yield the desired system behavior when taking into account the environmental expectations, i.e., traffic load and energy to be harvested. Next, we briefly review the mathematical tools that we use in this paper, namely the LLC method and LSTM neural network [7].
Control-theoretic algorithms and the LLC method have been used to obtain control actions that optimize the system behavior, by employing a forecasting mathematical model, over a limited look-ahead prediction horizon. LLC is conceptually similar to Model Predictive Control (MPC) [27]. In [28], an online supervisory control scheme based on LLC policies is proposed. Here, after the occurrence of an event, the next control action is determined by estimating the system behavior a few steps into the future, using the currently available information as inputs. The control action exploration is performed using a search tree assuming that the controller knows all future possible states of the process over the prediction horizon. Moreover, in [8], an online control framework for resource management in switching hybrid systems is proposed, where the system's control inputs are finite. The relevant parameters of the operating environment, e.g., workload arrival, are estimated and then used by the system to forecast future behavior over a look-ahead horizon. From this, the controller optimizes the predicted system behavior following the specified QoS through the selection of the system controls.
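The limited lookahead idea described above can be illustrated with a minimal Python sketch. It is a generic exhaustive search over a finite control set and is not the algorithm of [8] or [28]; the function name `llc_first_action` and the caller-supplied `step` (system model) and `cost` functions are hypothetical names introduced only for illustration.

```python
from itertools import product
from typing import Any, Callable, Sequence


def llc_first_action(
    state: Any,
    forecasts: Sequence[Any],
    actions: Sequence[Any],
    step: Callable[[Any, Any, Any], Any],
    cost: Callable[[Any, Any], float],
) -> Any:
    """Return the first action of the cheapest plan over the lookahead horizon.

    forecasts holds one predicted environment input per future slot, so the
    horizon length is len(forecasts). Every action sequence is explored,
    which mirrors a full search tree over a finite control set.
    """
    if not forecasts:
        raise ValueError("the lookahead horizon must contain at least one slot")

    best_cost = float("inf")
    best_first = None
    for plan in product(actions, repeat=len(forecasts)):
        s, total = state, 0.0
        for action, forecast in zip(plan, forecasts):
            s = step(s, action, forecast)   # predicted next state
            total += cost(s, action)        # accumulate the per-slot cost
        if total < best_cost:
            best_cost, best_first = total, plan[0]
    # Only the first action is applied; the search is repeated in the next slot.
    return best_first
```

The number of explored plans grows exponentially with the horizon, which is why such schemes are typically used with short horizons and small control sets.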
To model time-series datasets, the LSTM network is used as it is able to handle the long-term dependencies due to its inherent capability of storing past information and then recalling it. In [29], a distributed LSTM online method based on the particle filtering algorithm is presented with an aim of investigating the performance of online training of LSTM architectures in a distributed network of nodes. An LSTM based model for variable length data regression is proposed, and then put into a nonlinear state-space form to train the model in an online fashion. Then, financial and real life datasets are used for performance evaluation, and it is observed that the distributed online approach yields the same results that are obtained in the centralized case, when considering the mean square errors as the performance measure. Moreover, an LSTM forecasting method is utilized in [9] within an LLC-based algorithm to obtain the system control actions yielding the desired tradeoff between energy consumption and QoS, for a remote site powered by only green energy.
Energy savings in virtualized platforms through soft-scaling: with the advent of virtualization, it is expected that the Network Function Virtualization (NFV) framework can exploit the benefits of virtualization technologies to significantly reduce the energy consumption of large scale network infrastructures. In virtualized computing environments, the locus of energy consumption for components is due to the VMs running in the server(s). Thus, energy saving studies within the virtualized computing environment have involved the scaling down of the number of computing nodes/servers (autoscaling [30]), VM migration [31] (movement of a VM from one host to another) and soft resource scaling [32] (shortening of the access time to physical resources), all hereby referred to as VM soft-scaling, i.e., the reduction of computing resources per time instance.
Algorithms for the dynamic on/off switching of servers have been proposed as a way of minimizing energy consumption in computing platforms. In [30], at the beginning of each time slot, computing resources are provisioned depending on the expected server workloads via a reinforcement learning-based resource management algorithm, which learns on-the-fly the optimal policy for dynamic workload offloading and the autoscaling of servers. Then, in [9], computing resources (VMs) are provisioned based on an LLC policy after forecasting the future workloads and harvested energy. In [31], the Central Processing Unit (CPU) utilization thresholds are used to identify over-utilized servers. Hence, migration policies, enabled by the live VM migration method [33], are applied for moving the VMs between physical nodes (servers). The VMs are only moved to hosts that will accept them without incurring high energy cost, i.e., without any increase in the CPU utilization. Subsequently, the idle servers are switched off.
Power management is also of interest in virtualized computing platforms, i.e., data centers using virtualization technologies. In [32], a power management approach called VirtualPower is presented. The algorithm exploits hardware power scaling, i.e., the dynamic power management strategies using Dynamic Voltage and Frequency Scaling (DVFS) [34] [35], and software-based methods, i.e., scaling the allocation of physical resources to VMs using the hypervisor scheduler, for controlling the power consumption of underlying platforms. Due to the low power management benefits obtained from hardware scaling, a soft resource scaling mechanism is proposed whereby the scheduler shortens the maximum resource usage time for each VM, i.e., the time slice allocated for using the underlying physical resources.
Novelty of this work: here, we consider the aforementioned scenario, where each BS is equipped with EH hardware (a solar panel for EH and an Energy Buffer (EB) for energy storage) and a MEC server co-located with the BS for computation purposes, under the management enabled by the controller.
Motivated by the potential capabilities of EH, MEC and the presence of the controller, 1) we introduce the use of virtualization with the aim of investigating how VMs can be soft-scaled based on the forecasted server workloads, as VMs are the source of energy consumption in computing environments. 2) We put forward the edge controller-based architecture for small cell BSs management, as one of the future trends for small cells [1] in 5G MNs. 3) We reconsider the BS sleeping control mechanism under the new MEC paradigm, which has not been sufficiently covered in the literature. In addition, we use a clustering method for enabling energy savings within the MN. 4) We estimate the short-term future traffic load and harvested energy in BSs, by using an LSTM neural network [36]. 5) We develop an online supervisory control algorithm for the radio access (edge) network management based on a predictive method, specifically the LLC method, along with clustering and energy management procedures. The main goal is to enable Energy Savings (ES) strategies within the access network, namely BS sleep modes and VM soft-scaling, following the energy efficiency requirements of a virtualized infrastructure from [37]. The proposed management algorithm is called ENergy Aware and Adaptive Management (ENAAM) and is hosted in the edge controller. The ENAAM algorithm considers the future BS traffic load and the onsite green energy in the EB, and then provisions access network resources, per communication site, based on the learned information, i.e., energy saving decisions are made in a forward-looking fashion.

Figure 2: Example traces for normalized BS traffic loads (vertical axis: normalized traffic load; curves: Cluster 1 to Cluster 4). The data from [38] has been split into four representative clusters.
The proposed optimization strategy leads to a considerable reduction in the energy consumed by the edge computing and communication facilities, promoting self-sustainability within the mobile network through the use of green energy. This is achieved under the controller guidance, which makes use of forecasting, clustering, control theory and heuristics.
III. SYSTEM MODEL
As a major deployment of MEC and in line with current trends for future mobile networks as suggested by prominent network operators (e.g., Huawei Technologies [1]), the considered scenario is illustrated in Fig. 1. It consists of a densely-deployed MN featuring N BSs and co-located cache-enabled MEC servers. Each MEC server hosts M VMs. Each communication site, i.e., the BS and the co-located MEC server, is empowered with EH capabilities through a solar panel and an EB that enables energy storage. Energy supply from the power grid is also available. Moreover, the Energy Manager (EM) is an entity responsible for selecting the appropriate energy source and for monitoring the energy level of the EB. All BSs communicate with a centralized entity called the edge controller, which is responsible for managing the access network apparatuses. The energy level information is reported periodically to the edge controller through the pull file transfer mode procedure (e.g., File Transfer Protocol [39]). Moreover, we consider a discrete-time model, whereby time is discretized as t = 1, 2, ..., and each time slot t has a fixed duration τ. The list of symbols that are used in the paper is reported in Table I.

A. Traffic Load and Energy Consumption
Table I: List of symbols.

Symbol                              Description
$L_n(t)$                            BS $n$ traffic load profile in time slot $t$; $n$ is the BS index
$\Gamma_n(t)$                       workload handled by the MEC server at BS $n$ in time slot $t$
$\Gamma'_n(t)$                      standard (non MEC) traffic at time $t$
$\theta_0$                          BS load independent energy consumption or operation energy
$f_{\max}$                          maximum processing rate for VM $m$
$\mathcal{F}$                       finite set of available processing rates for VM $m$
$\theta_{\mathrm{ov},m}(t)$         energy overheads incurred when turning on/off VMs
$\theta_{\mathrm{idle},m}(t)$       static energy consumed by VM $m$ in the idle state
$\theta_{\max,m}(t)$                maximum energy consumed by VM $m$ at maximum processing rate
$\gamma_m(t)$                       workload fraction to be computed by the $m$-th VM
$\gamma_{\max}$                     maximum computation load per VM
$\Delta$                            maximum per-slot and per-VM allowed processing time
$\theta_{\mathrm{idle}}$            energy consumption of network interfaces in idle mode
$\theta_{\mathrm{data}}$            energy cost of exchanging one unit of data between the server and the BS
$\beta_{\max}$                      maximum energy buffer capacity
$\beta_{\mathrm{up}}$, $\beta_{\mathrm{low}}$   upper and lower energy buffer thresholds

Variables
$\theta_{\mathrm{tot},n}(t)$        total energy consumption for the communication site $n$
$\theta_{\mathrm{BS},n}(t)$         BS $n$ energy cost at $t$
$\theta_{\mathrm{MEC},n}(t)$        server consumption due to computation activities
$\theta_{\mathrm{TX},n}(t)$         data transmission energy consumption between the BS and the MEC server
$\zeta_n(t)$                        BS $n$ switching status indicator at $t$
$M(t)$                              number of VMs to be active in time slot $t$
$\theta_{\mathrm{load}}(t)$         total wireless transmission power
[...]                               the total amount of load that is served by the BS site
$\beta_n(t)$                        energy buffer level in slot $t$
$H_n(t)$                            harvested energy profile in slot $t$
$Q_n(t)$                            purchased grid energy in slot $t$

Mobile traffic volume exhibits temporal and spatial diversity, and also follows a diurnal behavior [40]. Therefore, traffic volume at individual BSs can be estimated using historical mobile traffic datasets. In this paper, real MN traffic load traces obtained from the Big Data Challenge organized by Telecom Italia Mobile (TIM) [38] are used to emulate the computational load¹.
Specifically, the data used was collected in the city of Milan during the month of November 2013, and it is the result of user interactions within the TIM MN, based on Call Detail Record (CDR) files for a day considering four BS sites representing the traffic load profiles. A CDR file consists of SMS, call and Internet records with timestamps.
To understand the behavior of the mobile data, we have applied the X-means clustering algorithm [41] to classify the load profiles into several categories. In our numerical results, each BS $n = 1, 2, \dots, N$ is assigned a load profile $L_n(t)$, which is picked at random as one of the four clusters (each cluster represents a typical BS load profile) in Fig. 2. $L_n(t)$ consists of computation workloads $\Gamma_n(t)$ ([MB]) and standard workloads $\Gamma'_n(t)$ ([MB]). According to [42], we assume that 80% of $L_n(t)$ is delay sensitive and, as such, requires processing at the edge, i.e., $\Gamma_n(t) = 0.8 L_n(t)$, whereas the remaining 20% pertains to standard flows, delay tolerant traffic, i.e., $\Gamma'_n(t) = 0.2 L_n(t)$.

¹ In fact, the dataset is not a true representative of future applications that require processing at the edge, but contains data that is exchanged with the purpose of communication. We nevertheless use it due to the difficulties in finding open datasets containing computing requests.

The total energy consumption ([J]) for the communication site $n$ at time slot $t$ is formulated as follows, inspired by [9], [43], [44], [45] and [46]:

$$\theta_{\mathrm{tot},n}(t) = \theta_{\mathrm{BS},n}(t) + \theta_{\mathrm{MEC},n}(t) + \theta_{\mathrm{TX},n}(t), \qquad (1)$$

where $\theta_{\mathrm{BS},n}(t)$ is the BS energy consumption term, $\theta_{\mathrm{MEC},n}(t)$ is the MEC server consumption term due to computation activities, and $\theta_{\mathrm{TX},n}(t)$ represents the data transmission energy consumption between the BS and the MEC server.
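The load split and the per-site energy bookkeeping can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the total is assumed to be the plain sum of the three consumption terms named in the text (BS, MEC server, and BS-to-server transmission).

```python
from typing import Tuple


def split_load(load_mb: float, edge_fraction: float = 0.8) -> Tuple[float, float]:
    """Split a load sample [MB] into (edge workload, standard traffic).

    The default 0.8 follows the 80%/20% assumption stated in the text.
    """
    edge = edge_fraction * load_mb
    return edge, load_mb - edge


def total_site_energy(theta_bs: float, theta_mec: float, theta_tx: float) -> float:
    """Total site energy [J] in one slot.

    Assumption: the BS, MEC and transmission terms simply add up.
    """
    return theta_bs + theta_mec + theta_tx
```

For example, `split_load(100.0)` assigns 80 MB to the MEC server and leaves 20 MB as standard traffic.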
BS energy consumption:

$$\theta_{\mathrm{BS},n}(t) = \zeta_n(t)\,\theta_0 + \theta_{\mathrm{load}}(t),$$

where $\zeta_n(t) \in \{\varepsilon, 1\}$ is the BS switching status indicator (1 for active mode and $\varepsilon$ for power saving mode), and $\theta_0$ is a constant (load independent) value representing the operation energy, which includes baseband processing, radio frequency power expenditures, etc. The constant $\varepsilon \in (0, 1)$ accounts for the fact that the baseband energy consumption can be scaled down as well whenever there is no or little channel activity, i.e., in power saving mode. $\theta_{\mathrm{load}}(t)$ represents the total wireless transmission (load dependent) power needed to meet the target transmission rate from the BS to the served user(s) and to guarantee low latency at the edge. Since we assume a noise-limited channel and the guarantee of low latency requirements at the edge, $\theta_{\mathrm{load}}(t)$ is obtained by using the transmission model in [43] (see Eq. (5) in this reference). Here, we neglect the imbalance of traffic volumes in uplink and downlink, and we do not account for the switching energy cost for the BS mode transition [45], due to the fact that future BS functions will be virtualized [47].
MEC server energy consumption: it depends on the number of VMs running in time slot $t$, named $M(t) \le M$, and on the CPU frequency that is allotted to each virtual machine. Specifically, VMs are instantiated on top of the physical CPU cores, and each VM is given a share of the host server CPU, memory and network input/output interfaces. The CPU is the main consumer of energy in the server [31] due to the VM-to-CPU share mapping. Hence, in this work we focus on the CPU utilization only. With $f_m(t) \in [0, f_{\max}]$ we denote the instantaneous processing rate [48], expressed in bits per second, where $f_{\max}$ is the maximum processing rate for VM $m$. In this paper, $f_m(t)$ takes values in a finite set $\mathcal{F} = \{f_0, \dots, f_{\max}\}$, where $f_0 = 0$ represents zero speed of the VM (e.g., deep sleep or shutdown). At any given time $t$, the total energy consumption of a virtualized server with $M(t)$ running VMs is:

$$\theta_{\mathrm{MEC},n}(t) = \sum_{m=1}^{M(t)} \left[ \theta_{\mathrm{op},m}(t) + \theta_{\mathrm{ov},m}(t) \right],$$

where $\theta_{\mathrm{op},m}(t)$ is the energy consumption of VM $m$ during operation and $\theta_{\mathrm{ov},m}(t) \ge 0$ is the energy cost incurred by turning the VM on/off, i.e., $\theta_{\mathrm{ov},m}(t) > 0$ only when VM $m$ is switched on/off and it is zero otherwise. $\theta_{\mathrm{op},m}(t)$ is obtained using the linear relationship between the CPU utilization contributed by VM $m$ and the energy consumption, from [48] and [49] (see Eq. (4) in the second reference):

$$\theta_{\mathrm{op},m}(t) = \theta_{\mathrm{idle},m}(t) + \alpha_m(t)\left(\theta_{\max,m}(t) - \theta_{\mathrm{idle},m}(t)\right),$$

where $\theta_{\mathrm{idle},m}(t)$ represents the static energy drained by VM $m$ in the idle state, and $\theta_{\max,m}(t)$ is the maximum energy it drains. The quantity $\alpha_m(t)\left(\theta_{\max,m}(t) - \theta_{\mathrm{idle},m}(t)\right)$ represents the dynamic energy component, where $\alpha_m(t)$ is a load dependent factor. Note that $\alpha_m(t)$ and $f_m(t)$ are deterministically related, as $f_{\max}$ is a constant. $\theta_{\mathrm{ov},m}(t)$ is obtained from [49] (see Eq. (5) in this reference) as a constant and is typically limited to a few hundreds of mJ per MHz$^2$.
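A minimal Python sketch of this per-VM energy model follows. Two points are assumptions made only for illustration: the load dependent factor is taken as the plain ratio of the processing rate to the maximum rate (the text only states that the two are deterministically related), and the switching overhead is charged as one constant per on/off event. All function and parameter names are hypothetical.

```python
from typing import Sequence


def vm_operating_energy(f: float, f_max: float,
                        theta_idle: float, theta_max: float) -> float:
    """Operating energy [J] of one VM: static part plus dynamic part."""
    alpha = f / f_max  # ASSUMPTION: load factor taken as the linear ratio f / f_max
    return theta_idle + alpha * (theta_max - theta_idle)


def server_energy(rates: Sequence[float], prev_rates: Sequence[float],
                  f_max: float, theta_idle: float, theta_max: float,
                  theta_ov: float) -> float:
    """Server energy [J] in one slot, summed over the VMs.

    rates[m] is the rate of VM m in this slot and prev_rates[m] its rate in
    the previous slot; a rate of 0 means the VM is off (deep sleep/shutdown).
    """
    total = 0.0
    for f, f_prev in zip(rates, prev_rates):
        if f > 0:
            total += vm_operating_energy(f, f_max, theta_idle, theta_max)
        if (f > 0) != (f_prev > 0):
            # ASSUMPTION: constant overhead for each on/off transition
            total += theta_ov
    return total
```

With `theta_idle = 4`, `theta_max = 10` and `f_max = 100`, a VM running at rate 50 costs 7 J in this sketch, while a VM that is switched off in the current slot contributes only the overhead term.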
Conventionally, for each BS site, the hypervisor, i.e., the software that provides the environment in which the VMs operate, is in charge of allocating $f_m(t)$ and the workload fraction to be computed by the $m$-th VM, named $\gamma_m(t)$. In our setup, we have $\sum_{m=1}^{M(t)} \gamma_m(t) \le \Gamma_n(t)$, where equality is achieved when the workload is fully served by the $M(t)$ VMs. We also note that, in practical application scenarios, the maximum per-VM computation load is generally limited to an assigned value, named $\gamma_{\max}$. Motivated by the energy efficiency requirements from [37], i.e., the hypervisor's ability to accept and implement policies from a management entity, in this paper the use of an edge controller is pursued. Here, the edge controller determines the $f_m(t)$ value that will yield the desired or expected processing time, $\mu_m(t) = \gamma_m(t)/f_m(t)$, considering the workload $\gamma_m(t)$ allotted to VM $m$. $\mu_m(t)$ must be less than or equal to the maximum per-slot and per-VM processing time (in seconds), named $\Delta$, i.e., $\mu_m(t) \le \Delta$. Note that $\Delta$ is also the server's response time, i.e., the maximum time allowed for processing the total computation load.
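The processing-time constraint can be illustrated with a short Python sketch that selects a rate from the finite set of available rates. Choosing the lowest feasible rate is an assumption made here for illustration (lower rates are presumed to cost less energy); it is not claimed to be the selection rule used by the edge controller. The function name is hypothetical.

```python
from typing import Optional, Sequence


def pick_processing_rate(gamma: float, rates: Sequence[float],
                         delta: float) -> Optional[float]:
    """Return the lowest rate f such that gamma / f <= delta.

    gamma: workload allotted to the VM, rates: available processing rates
    (may include 0 for a switched-off VM), delta: maximum processing time.
    Returns 0.0 when there is no workload, and None when no rate is feasible.
    """
    if gamma == 0:
        return 0.0
    feasible = [f for f in rates if f > 0 and gamma / f <= delta]
    # ASSUMPTION: prefer the slowest feasible rate to limit energy use
    return min(feasible) if feasible else None
```

For instance, with rates `[0, 10, 20, 40]`, a workload of 30 and a deadline of 2, the rate 10 is too slow (3 time units), so the sketch returns 20.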
We remark that, as a result of the allocation procedure that is developed in this paper, for any BS site $n$, the processing rates $f_m(t)$ shall be found, similar to [49] (see Remark 1 in this reference). Then, the total amount of load that is served by the BS site may be set as:

The objective of the considered optimization is to find the operating mode for the BS (either "on" or "power saving"), the number of VMs $M(t)$ that are to be allocated and, for each of them, the processing rate $f_m(t)$. In doing so: 1) the amount of delay sensitive load that is not served at the edge, $\Gamma_n(t) - \sum_{m=1}^{M(t)} \gamma_m(t)$, shall be minimized, while exploiting as much as possible the energy harvested from the solar panels, so that the mobile network will be energetically self-sufficient, and 2) the load is computed in a time shorter than or equal to $\Delta$. The details of the proposed optimization algorithm are provided in Section IV.
Data transmission energy consumption: we assume that the inter-communication between the BS and the MEC server is bi-directional and symmetric. Hence, under steady-state operating conditions, for the communication site n, θ TX,n (t) is obtained as θ TX,n (t) = θ idle (t) + θ data (t) B n (t) by using the VM migration hint from [50], where θ idle (t) (fixed value in J) is the energy drained by the network interfaces in idle mode over a time slot t, θ data (fixed value in J/byte) is the cost of exchanging one byte of data between the MEC server and the BS per time slot t, and B n (t) is the amount of data exchanged. These parameters, θ idle (t) and θ data (t), are obtained from [50]. Note that B n (t) also corresponds to the amount of data to be processed at the MEC server in bytes.
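The transmission term above is an affine function of the exchanged data volume, which a short Python sketch makes explicit. The function and parameter names are hypothetical, and the numeric values in the example are illustrative only; they are not taken from [50].

```python
def transmission_energy(theta_idle: float, theta_data: float,
                        data_bytes: float) -> float:
    """Energy [J] spent on the BS-to-server link in one slot.

    theta_idle: idle energy of the network interfaces over the slot [J],
    theta_data: energy cost per exchanged byte [J/byte],
    data_bytes: amount of data exchanged in the slot [bytes].
    """
    return theta_idle + theta_data * data_bytes
```

For example, with an idle cost of 2 J, a per-byte cost of 1e-6 J and 4,000,000 bytes exchanged, the sketch yields about 6 J for the slot.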

B. Energy Patterns and Storage
The energy buffer is characterized by its maximum energy storage capacity β max . At the beginning of each time slot t, the EM provides the energy level report to the edge controller through the local MEC server, thus the EB level β n (t) is known, enabling the provision of the required computation resources, i.e., the VMs. The energy level report/file from the EM to the MEC server is transferred using the pull mode procedure (e.g., File Transfer Protocol) [39].
In this work, the amount of harvested energy $H_n(t)$ in time slot $t$ in the communication site $n$ is obtained from open source solar traces [51] (see Fig. 3). The dataset is the result of daily environmental records. In our numerical results, $H_n(t)$ represents a daily solar radiation record for three different areas. From the three solar profiles, the energy profile of each communication site is picked at random to represent the daily energy harvested and then scaled to fit the EB capacity $\beta_{\max}$ of 490 kJ. Thus, the available EB level $\beta_n(t+1)$ at the beginning of time slot $t + 1$ is calculated as follows:
$$\beta_n(t+1) = \beta_n(t) + H_n(t) - \theta_{\mathrm{tot},n}(t) + Q_n(t), \tag{4}$$
where $\beta_n(t)$ is the energy level in the battery at the beginning of time slot $t$, $\theta_{\mathrm{tot},n}(t)$ is the energy consumption of the communication site over time slot $t$, see Eq. (1), and $Q_n(t) \ge 0$ is the amount of energy purchased from the power grid. We remark that $\beta_n(t)$ is updated at the beginning of time slot $t$ whereas $H_n(t)$ and $\theta_{\mathrm{tot},n}(t)$ are only known at the end of it.
For decision making in the edge controller, the received EB level reports are compared with the following thresholds: $\beta_{\mathrm{low}}$ and $\beta_{\mathrm{up}}$, respectively termed the lower and the upper energy threshold, with $0 < \beta_{\mathrm{low}} < \beta_{\mathrm{up}} < \beta_{\max}$. $\beta_{\mathrm{up}}$ corresponds to the desired energy buffer level at the BS and $\beta_{\mathrm{low}}$ is the lowest EB level that any BS should ever reach. If $\beta_n(t) < \beta_{\mathrm{low}}$, then BS $n$ is said to be energy deficient. The optimization in the following section makes sure that $\beta_n(t)$ never falls below $\beta_{\mathrm{low}}$ due to the transmission and computing activities of the BS within a time slot. Instead, if for any time slot we have $\beta_n(t) < \beta_{\mathrm{up}}$, then the amount of energy $Q_n(t) = \beta_{\mathrm{up}} - \beta_n(t)$ is purchased from the power grid to compensate for the deviation from the desired EB level (due to previous BS activity).
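The per-slot energy bookkeeping described above (current buffer level, plus harvested and purchased energy, minus the site consumption) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the value used for the upper threshold is made up, and clipping the level to the interval [0, beta_max] is our assumption rather than something stated in the text.

```python
def grid_purchase(beta, beta_up):
    """Energy bought from the grid when the buffer is below the desired level beta_up."""
    return max(0.0, beta_up - beta)


def next_buffer_level(beta, harvested, consumed, purchased, beta_max):
    """Buffer level at the start of the next slot.

    Clipping to [0, beta_max] is an assumption of this sketch.
    """
    level = beta + harvested - consumed + purchased
    return min(max(level, 0.0), beta_max)


# Illustrative values in kJ; only beta_max = 490 kJ comes from the text.
beta_max = 490.0
beta_up = 343.0          # hypothetical upper threshold
beta_now = 200.0
q = grid_purchase(beta_now, beta_up)
beta_next = next_buffer_level(beta_now, harvested=50.0, consumed=80.0,
                              purchased=q, beta_max=beta_max)
```

Note that the purchase is decided from the level reported at the beginning of the slot, whereas the harvested and consumed amounts are only known at the end of it, which is why forecasts are needed for look-ahead decisions.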

IV. OPTIMIZATION FOR A SINGLE COMMUNICATION SITE
In this section, we formulate an optimization problem to obtain energy savings through short-term traffic load and harvested energy predictions, along with energy management procedures for a single communication site. The optimization problem is defined in section IV-A, and the communication site management procedures are presented in section IV-B.

A. Problem Formulation
At the beginning of each time slot $t$, the edge controller receives the energy level report $\beta_n(t)$ from each EM (via the MEC application responsible for energy profiles in the MEC server), using the pull mode file transfer. Here, we aim at minimizing the overall energy consumption in the communication site over time, i.e., the consumption related to the BS transmission activity and the MEC server, by applying BS power saving modes and VM soft-scaling, i.e., tuning the number of active virtual machines. To achieve this, we first consider the optimization for a single communication site. We define two cost functions as: F1) $\theta_{\mathrm{tot},n}(t)$, which weighs the energy consumption due to transmission (BS) and computation (MEC server); and F2) a quadratic term $(\Gamma_n(t) - B_n(t))^2$, which accounts for the QoS cost.
In fact, F1 tends to push the system towards self-sustainability solutions, i.e., $\zeta_n(t) \to \varepsilon$. Instead, F2 favors solutions where the delay sensitive load is entirely processed by the local MEC server, i.e., $B_n(t) \to \Gamma_n(t)$. A weight $\eta \in [0, 1]$ is utilized to balance the two objectives F1 and F2, with $\eta \to 1$ prioritizing the QoS. The corresponding (weighted) cost function is defined as:
$$J(\zeta, \alpha, t) = (1 - \eta)\, \theta_{\mathrm{tot},n}(t) + \eta\, (\Gamma_n(t) - B_n(t))^2, \tag{5}$$
where $\zeta$ and $\alpha$ collect the control actions; for the first time slot, by $\alpha$ we mean the sequence of factors $\alpha_1(1), \alpha_2(1), \dots, \alpha_{M(t)}(1)$. Hence, letting 1 be the current time slot and $T$ be the time horizon, the following optimization problem is formulated over time slots $1, \dots, T$:
$$\mathbf{P1}: \ \min_{\zeta, \alpha} \sum_{t=1}^{T} J(\zeta, \alpha, t), \tag{6}$$
subject to constraints C1 to C6. Among these, C2 forces the required number of VMs, $M(t)$, to be always greater than or equal to a minimum number $b \ge 1$: the purpose of this is to be always able to handle mission critical communications. C3 makes sure that the EB level is always above or equal to a preset threshold $\beta_{\mathrm{low}}$, to guarantee energy self-sustainability over time. Note that this constraint may imply that in certain time slots the BS is to be switched off, although the workload may be non-negligible. When managing a single BS site (the formulation in this section), this implies that the load will not be served, but this fact may be compensated for when multiple communication sites are jointly managed, e.g., by handing off the workload to another, energy richer, BS. This is dealt with in Section V. Furthermore, C4 and C5 bound the maximum processing rate and the workload of each running VM $m$, with $m = 1, \dots, M(t)$, respectively. Constraint C6 represents a hard limit on the corresponding per-slot and per-VM processing time.

B. Communication Site Management
In this subsection, a traffic load and energy harvesting prediction method and an online management algorithm are proposed to solve the previously stated problem P1. In subsection IV-B1, we discuss the prediction of the future (short-term) traffic load and harvested energy processes, and then in subsection IV-B2, we solve P1 by first constructing the state-space behavior of the control system, where online control key concepts are introduced. Finally, the algorithm for managing the single communication site is presented in subsection IV-B3.

Table II: Modeling steps
Step 1: load and normalize the dataset
Step 2: split dataset into training and testing
Step 3: reshape input to be [samples, time steps, features]
Step 4: create and fit the LSTM network
Step 5: make predictions
Step 6: calculate performance measure
1) Traffic load and energy forecasting: ML techniques constitute a promising solution for network management and energy savings in cellular networks [52] [53]. In this work, given a time slot duration of $\tau = 30$ min, we perform time series prediction, i.e., we obtain the $T = 3$ estimates of $\hat{L}_n(t)$ and $\hat{H}_n(t)$, by using an LSTM network developed in Python with the Keras deep learning library (Sequential, Dense, LSTM). The network has a visible layer with one input, one hidden layer of four LSTM blocks (neurons), and an output layer that makes a single value prediction. This type of recurrent neural network uses back-propagation through time for learning and memory blocks for regression [7]. The dataset is split as 67% for training and 33% for testing. The network is trained using 100 epochs (2,600 individual training trials) with a batch size of one. As for the performance measure of the model, we use the Root Mean Square Error (RMSE). The prediction steps are outlined in Table II. Fig. 4a and Fig. 4b show the prediction results that will be discussed in Section VI.
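To make the data handling around the forecaster concrete, the following sketch covers the non-learning steps of the procedure in plain Python: normalization, the 67%/33% split, reshaping into [samples, time steps, features], and the RMSE. It is only an illustration: the load values are made up, the helper names are hypothetical, and a simple persistence predictor (next value equals the last observed one) stands in for the trained LSTM network, which is not reproduced here.

```python
import math


def normalize(series):
    """Min-max scale a series to [0, 1]."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for constant series
    return [(v - lo) / span for v in series]


def split_train_test(series, train_fraction=0.67):
    """Chronological split: first part for training, remainder for testing."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def make_windows(series, look_back=1):
    """Build inputs shaped [samples, time steps, features] and next-step targets."""
    inputs, targets = [], []
    for i in range(len(series) - look_back):
        inputs.append([[v] for v in series[i:i + look_back]])
        targets.append(series[i + look_back])
    return inputs, targets


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up half-hourly load samples, for illustration only.
load = [12.0, 15.0, 21.0, 30.0, 28.0, 22.0, 16.0, 11.0, 9.0]
scaled = normalize(load)
train, test = split_train_test(scaled)
x_test, y_test = make_windows(test, look_back=1)

# Persistence baseline used as a placeholder for the LSTM predictions.
y_pred = [window[-1][0] for window in x_test]
score = rmse(y_test, y_pred)
```

In an actual experiment the placeholder predictions would be replaced by the output of the fitted network, while the remaining steps stay unchanged.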
2) Edge system dynamics: we denote the system state vector at time $t$ by $x(t) = (M(t), \beta_n(t))$, which contains the number of active VMs, $M(t)$, and the EB level, $\beta_n(t)$, for the BS site $n$. $\varsigma(t) = (\zeta(t), \{\alpha_m(t)\})$ is the input vector, i.e., the control action that drives the system behavior at time $t$. The system evolution is described through a discrete-time state-space equation, adopting the LLC principles [8] [28]:
$$x(t+1) = \Phi(x(t), \varsigma(t)), \tag{7}$$
where $\Phi(\cdot)$ is a behavior model that captures the relationship between $(x(t), \varsigma(t))$ and the next state $x(t + 1)$. Note that this relationship accounts for 1) the amount of energy drained $\theta_{\mathrm{tot},n}(t)$, that harvested $H_n(t)$ and that purchased from the power grid $Q_n(t)$, which together lead to the next buffer level $\beta_n(t + 1)$ through Eq. (4), and 2) the traffic load $L_n(t)$, from which we compute the server workload $\Gamma_n(t)$, which leads to $M(t)$ and to the control $\varsigma(t)$. The network management algorithm in the edge controller, the ENAAM algorithm, finds the best control action vector for the communication site, following a model predictive control approach. Specifically, for each time slot $t$, problem (6) is solved, obtaining control actions for the whole time horizon $t, t + 1, \dots, t + T - 1$. The control action that is applied at time $t$ is $\varsigma^*(t)$, which is the first one in the retrieved control sequence. This control amounts to setting the BS radio mode according to $\zeta^*(t)$, i.e., either active or power saving, and the number of instantiated VMs, $M^*(t)$, along with their obtained $\{\alpha^*_m(t)\}$ values (see Remarks 1 and 2 below). This is repeated for the following time slots $t + 1, t + 2, \dots$.

Remark 1 (Role of prediction): state $x(t)$ and control $\varsigma(t)$ are respectively measured and applied at the beginning of time slot $t$, whereas the offered load $L_n(t)$ and the harvested energy $H_n(t)$ are accumulated during the time slot and their value becomes known only by the end of it. This means that, being at the beginning of time slot $t$, the system state at the next time slot $t + 1$ can only be estimated, which we formally write as:
$$\hat{x}(t+1) = \Phi(x(t), \varsigma(t)). \tag{8}$$
The same applies to the subsequent time slots in the optimization horizon $t + 2, t + 3, \dots, t + T - 1$. For these estimations we use the forecast values of the load, $\hat{L}_n(t)$, and of the harvested energy, $\hat{H}_n(t)$, from the LSTM forecasting module.
Remark 2 (VM number and workload allocation): a remark on the VMs provisioned per time slot and per MEC server, $M(t)$, is in order. Specifically, the number of active VMs (i.e., the VM computing cluster) depends on the predicted load, $\hat{L}_n(t + 1)$, where the expected server workload is $\hat{\Gamma}_n(t + 1) = 0.8 \hat{L}_n(t + 1)$. Each VM can compute an amount of up to $\gamma_{\max}$. Then, an estimate of the number of virtual machines that shall be active in time slot $t$ to serve the predicted server workload is here obtained as $M(t) = \lceil \hat{\Gamma}_n(t + 1)/\gamma_{\max} \rceil$, where $\lceil \cdot \rceil$ returns the nearest upper integer. We heuristically split the workload among virtual machines by allocating a workload $\gamma_m(t) = \gamma_{\max}$ to the first $M(t) - 1$ VMs, $m = 1, \dots, M(t) - 1$, and the remaining workload $\gamma_m(t) = \hat{\Gamma}_n(t + 1) - (M(t) - 1)\gamma_{\max}$ to the last one, $m = M(t)$.
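As an illustration of the heuristic in Remark 2, the sketch below computes the number of VMs and the per-VM workloads from a predicted load. The function name and the example values are hypothetical. Two choices are assumptions of this sketch: the last VM receives the remainder of the expected workload (so that the shares add up to it and never exceed the per-VM maximum), and at least one VM is always kept, which mirrors the smallest admissible value of the minimum VM number.

```python
import math


def allocate_vms(predicted_load, gamma_max, workload_fraction=0.8):
    """Return (number of VMs, per-VM workloads) for a predicted load.

    The expected server workload is a fixed fraction (0.8) of the predicted load.
    """
    workload = workload_fraction * predicted_load
    # Nearest upper integer of workload / gamma_max, with a floor of one VM.
    num_vms = max(1, math.ceil(workload / gamma_max))
    shares = [gamma_max] * (num_vms - 1)                 # first M - 1 VMs are filled
    shares.append(workload - (num_vms - 1) * gamma_max)  # last VM takes the remainder
    return num_vms, shares


# Example with a predicted load of 30 MB and the two per-VM limits used in the results.
vms_5mb, shares_5mb = allocate_vms(30.0, gamma_max=5.0)
vms_10mb, shares_10mb = allocate_vms(30.0, gamma_max=10.0)
```

Doubling the per-VM limit roughly halves the number of VMs that have to be instantiated for the same workload, which is the effect discussed later in the numerical results.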
Controller decision-making: the controller is obtained by estimating the relevant parameters of the operating environment, i.e., the BS load $\hat{L}_n(t)$ and the harvested energy $\hat{H}_n(t)$, and subsequently using them to forecast the future system behavior through Eq. (8) over a look-ahead time horizon of $T$ time slots. The control actions are picked by minimizing $J(\zeta, \alpha, t)$, see Eq. (5). At the beginning of each time slot $t$ the following process is iterated:
1) Future system states, $\hat{x}(t + k)$, for a prediction horizon of $k = 1, \dots, T$ steps are estimated using Eq. (8). These predictions depend on past inputs and outputs up to time $t$, on the estimated load $\hat{L}_n(\cdot)$ and energy harvesting $\hat{H}_n(\cdot)$ processes, and on the control $\varsigma(t + k)$, with $k = 0, \dots, T - 1$.
2) The sequence of controls $\{\varsigma(t + k)\}_{k=0}^{T-1}$ is obtained for each step of the prediction horizon by optimizing the weighted cost function $J(\cdot)$, see Eq. (5).
3) The control ς * (t) corresponding to the first control action in the sequence with the minimum total cost is the applied control for time t and the other controls ς * (t + k) with k = 1, . . . , T − 1 are discarded.
The algorithm, specified in Alg. 1, uses the technique in [8]: the search starts (line 01) from the system state at time $t$, $x(t)$, and continues in a breadth-first fashion, building a tree of all possible future states up to the prediction depth $T$. A cost is initialized to zero (line 01) and is accumulated as the algorithm travels through the tree (line 06), accounting for predictions, past outputs and controls. The set of states reached at every prediction depth $t + k$ is referred to as $S(t + k)$. For every prediction depth $t + k$, the search continues from the set of states $S(t + k - 1)$ reached at the previous step $t + k - 1$ (line 03), exploring all feasible controls (line 04), obtaining the next system state from Eq. (8) (line 05), updating the accumulated cost as the previous accumulated cost plus the cost associated with the current step (line 06), and updating the set of states reached at step $t + k$ (line 07). When the exploration finishes, the initial action (at time $t$) that leads to the best final accumulated cost, at time $t + T - 1$, is selected as the optimal control $\varsigma^*(t)$ (lines 08, 09, 10). Finally, for line 04, we note that $\Gamma_n$ belongs to the continuous set $[\Gamma_n, \hat{L}_n(t + k - 1)]$. To implement this search, we quantized this interval into a number of equally spaced points, obtaining a search over a finite set of controls.
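The breadth-first look-ahead search described above can be sketched as follows. This is not Alg. 1 itself but a simplified illustration: the state is reduced to the buffer level only, the control is the amount of workload served (taken from a quantized grid), and the energy model, forecasts and thresholds are made-up values. All function and variable names are hypothetical.

```python
def llc_first_action(state, controls, step_model, step_cost, horizon):
    """Explore all control sequences breadth-first and return the first
    control of the cheapest one, together with its accumulated cost."""
    # Each frontier entry: (state, accumulated cost, first control on the path).
    frontier = [(state, 0.0, None)]
    for k in range(horizon):
        next_frontier = []
        for s, cost, first in frontier:
            for u in controls(s, k):                      # feasible controls only
                s_next = step_model(s, u, k)              # estimated next state
                new_cost = cost + step_cost(s, u, k)      # accumulate the cost
                next_frontier.append((s_next, new_cost, u if first is None else first))
        frontier = next_frontier
    best = min(frontier, key=lambda entry: entry[1])      # fails if nothing is feasible
    return best[2], best[1]


# Toy instance (all numbers are illustrative assumptions).
eta = 0.5                       # weight between energy and QoS terms
harvest = [0.0, 0.0, 3.0]       # forecast harvested energy per slot
workload = [4.0, 4.0, 4.0]      # forecast server workload per slot
beta_low = 1.0                  # lower buffer threshold
grid = [0.0, 2.0, 4.0]          # quantized served workload (equally spaced)


def energy(u):
    """Hypothetical consumption: idle cost plus a term linear in the served load."""
    return 1.0 + 0.5 * u


def step_model(beta, u, k):
    return beta + harvest[k] - energy(u)


def controls(beta, k):
    return [u for u in grid
            if u <= workload[k] and step_model(beta, u, k) >= beta_low]


def step_cost(beta, u, k):
    return (1.0 - eta) * energy(u) + eta * (workload[k] - u) ** 2


best_action, best_cost = llc_first_action(5.0, controls, step_model, step_cost, horizon=3)
```

In this toy instance a myopic controller would serve the full workload in the first slot, whereas the look-ahead search serves only part of it, because no energy is forecast to arrive before the last slot. The number of explored paths grows with the number of quantization points raised to the horizon length, which is why short horizons and coarse grids are used.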

V. MULTIPLE COMMUNICATION SITES
In this section, we extend the work from section IV by considering the energy savings for multiple communication sites. We formulate an optimization problem to obtain energy savings through short-term traffic load and harvested energy predictions, clustering, along with energy management procedures for the clustered BS sites. The problem formulation for multiple communication sites is described in section V-A, then cluster formation is discussed in section V-B, and the edge management procedure for each cluster, enabled by the edge controller, is presented in section V-C.

A. Problem Formulation
Our objective is to improve the overall energy savings of the network by clustering BSs based on the similarity of their locations (distance measures), and then optimizing the energy savings within each cluster by employing the single-site optimization described in section IV. From an energy efficiency perspective, in a cluster of BS nodes, one or more BSs may prefer to switch off, by first offloading their traffic load to neighboring BSs that have enough spare capacity for handling extra traffic load, and then switching off. The whole traffic load offloaded from the BS, denoted by BS $n$, is allocated to a neighboring cluster member (an active BS), where orthogonal resource allocation helps mitigate intra-cluster interference. The selected neighboring BS, denoted by BS $n'$, is allocated the incremental load $L_{nn'}(t) \triangleq L_n(t)$. Whenever a BS is switched off, it should maintain service to its users via a re-association process in order to offload the users to the neighboring active BS having extra resources for handling the upcoming extra traffic load. The re-association process involves notifying the connected users to try and connect to neighboring BSs with extra resources.
In view of the above, we consider that all BSs are grouped into a set of clusters $\mathcal{O} = \{\mathcal{O}_1, \dots, \mathcal{O}_{|\mathcal{O}|}\}$. Here, a given cluster $\mathcal{O}_i \in \mathcal{O}$, with $i = 1, \dots, |\mathcal{O}|$, consists of a set of BSs that coordinate with the controller. The clustering mechanism is discussed in Section V-B. For each cluster $\mathcal{O}_i \in \mathcal{O}$, we aim to minimize the energy consumption, i.e., the consumption due to BS transmission and the running VMs in the servers, using BS power saving modes and VM soft-scaling per active cluster member. To do so, we define a cost function which captures the individual communication site energy consumption and its QoS. The (weighted) cost for each cluster member, BS $n \in \mathcal{O}_i$, is redefined as $J_n(\zeta, \alpha, t)$ in Eq. (9), where $\zeta_n(t)$ is the activity status of BS $n$ (either power saving or active) and $\{\alpha_m(t)\}_n$ is the set of factors for the allocated VMs at BS $n$. Moreover, $\Lambda_n(t) \leftarrow L_n(t)$ if BS $n$ only handles its own traffic, whereas $\Lambda_n(t) \leftarrow L_n(t) + \Delta L_n(t)$ in case one (or multiple) BSs are switched off in time slot $t$ and their traffic is redirected (handed off) to BS $n$. The computation of $\Delta L_n(t)$ is addressed in section V-C. The per cluster cost $\Upsilon_{\mathcal{O}_i}(\zeta_i, \alpha_i, t)$ is the aggregated cost of all cluster members, $\Upsilon_{\mathcal{O}_i}(\zeta_i, \alpha_i, t) = \sum_{n \in \mathcal{O}_i} J_n(\zeta, \alpha, t)$. Hence, over the time horizon $t = 1, \dots, T$, the optimization problem P2 in Eq. (10) is defined to minimize the per cluster costs subject to a set of constraints, where the optimization variable is the collection of variables to be reconfigured for all the BS clusters (the whole MN), for all time slots $t = 1, \dots, T$. As for the constraints, C7 and C8 ensure that each BS is part of only one cluster. Solving P2 in Eq. (10) involves BS clustering, the forecasting method from section IV-B1, a heuristic rule for the selection of which BSs have to be switched off, and the ENAAM algorithm from section IV-B3. Once P2 is solved, the control action to be applied at time $t$, per cluster $\mathcal{O}_i$, corresponds to the elements in $\{\zeta_i, \alpha_i\}$ that are associated with the first time slot in the optimization horizon. As above, Eq. (10) can iteratively be solved at any time slot $t \ge 1$, by just redefining the time horizon as $t' = t, t + 1, \dots, t + T - 1$.

B. Cluster Formation
Clustering algorithms have been proposed as a way of enabling energy saving mechanisms in BSs, where groups of inactive BSs or BSs with low loads are switched off. With the advent of EH BSs, the BSs with β n (t) < β low can be switched off, while still guaranteeing the QoS through the other active BSs. That is, within each formed cluster, the controller tries to minimize the cost function, which captures the trade-off between the energy efficiency and the QoS of each cluster member. The key step in clustering is to identify similarities or distance measures between BSs in order to group BSs with similar characteristics. In this paper, we use the location of the BSs as it defines the relative neighborhood (the distance measures) with the other BSs. Using the location of the BSs and the distance between the BSs, we obtain a distance-based similarity matrix W d . In addition, we assume that the network topology is static during the clustering algorithm execution.
In section V-B1 we detail the clustering measure that we use to obtain the similarities between BSs based on location, followed by the distance-based clustering algorithm in section V-B2.
1) Relative neighborhood based on BS adjacency and Gaussian similarity: similar to [11], we model the MN as a graph $G = (\mathcal{N}, \mathcal{E})$, where $\mathcal{N}$ represents the set of BSs, while the set $\mathcal{E}$ contains the edges between any two BSs. There is an edge $(n, n') \in \mathcal{E}$ if and only if $n$ and $n'$ can mutually receive each other's transmission. In this case, we say that $n$ and $n'$ are neighbors. We use a parameter $r_{nn'} \in \{0, 1\}$ to characterize the presence of a link between nodes. Let $y_n$ be the coordinates of BS $n \in \mathcal{N}$ in the Euclidean space. The relative neighborhood of BS $n$ is defined by the nearness of the BSs in its $e_d$-radio propagation space (or neighborhood):
$$\mathcal{Z}_n = \{ n' \in \mathcal{N},\ n' \neq n : \| y_n - y_{n'} \| \le e_d \}. \tag{11}$$
If $n' \in \mathcal{Z}_n$ we say that BSs $n$ and $n'$ are neighbors, and we set $r_{nn'} = 1$, otherwise $r_{nn'} = 0$. The links between the vertices in $\mathcal{N}$ are weighted based on their similarities. Based on the distance between BS $n$ and $n'$, we can classify the BSs based on their location using the Gaussian similarity measure [11] (a classification kernel function used in machine learning), which is defined as:
$$w^d_{nn'} = \begin{cases} \exp\left( -\dfrac{\| y_n - y_{n'} \|^2}{2\sigma_d^2} \right), & \text{if } \| y_n - y_{n'} \| \le e_d, \\ 0, & \text{otherwise}, \end{cases} \tag{12}$$
where $2\sigma_d^2$ adjusts the impact of the neighborhood size. In Eq. (12), we assume that the BSs located far from each other have low similarities, compared to those that are close to each other, as those that are close are more likely to cooperate with each other. The distance-based similarity matrix $W_d$ is formed using $w^d_{nn'}$ as the $(n, n')$-th entry.
2) Distance-based clustering: the BS clustering is performed after obtaining the similarity matrix $W_d$ of the MN graph $G = (\mathcal{N}, \mathcal{E})$. Given the matrix $W_d$, we employ a centralized clustering method, specifically K-means [54], as the matrix provides the full location knowledge. K-means partitions the set of nodes into clusters in which each node belongs to the cluster with the nearest mean distance. In addition, the value of $K$, i.e., the number of clusters ($|\mathcal{O}|$), is known a priori and is a design parameter. This algorithm requires knowledge of all the BS locations, thus, it is categorized as a centralized method. In our case, this process does not incur any computation delay as the edge controller is assumed to have high computation capabilities.
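The construction of the distance-based similarity matrix can be illustrated with the short sketch below. It assumes the standard Gaussian kernel, exp(-d^2 / (2 sigma^2)), applied only to pairs of BSs within the neighborhood radius, with zero similarity otherwise and a zero diagonal; the latter convention, the function name, and the coordinates and parameter values are assumptions made for illustration.

```python
import math


def similarity_matrix(coords, radius, sigma):
    """Gaussian similarity between BS locations, restricted to neighbors.

    coords: list of (x, y) BS coordinates in meters.
    radius: neighborhood radius; pairs farther apart get zero similarity.
    sigma:  kernel width controlling the impact of the neighborhood size.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if a == b:
                continue  # no self-similarity in this sketch
            d = math.dist(coords[a], coords[b])
            if d <= radius:
                w[a][b] = math.exp(-d ** 2 / (2.0 * sigma ** 2))
    return w


# Three hypothetical BS positions: two close to each other, one far away.
positions = [(0.0, 0.0), (30.0, 0.0), (200.0, 0.0)]
w_d = similarity_matrix(positions, radius=80.0, sigma=40.0)
```

The resulting matrix is symmetric, and distant BSs end up with zero similarity, so a clustering step applied to it tends to group nearby BSs together.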

C. Edge Network Management
Our aim is to implement and validate an LLC framework for dynamic resource provisioning in multiple communication sites with the goal of achieving energy savings within the access network through BS sleep modes and VM soft-scaling. Given the formation of clusters and the load and energy forecasts, our next goal is to develop a mechanism for solving P2 (Eq. (10)), where each cluster of BSs adjusts its transmission parameters and its computing cluster entities based on the forecast information. In order to minimize the per cluster cost function, we introduce the notion of network impact in Section V-C1, whereas we describe the edge management procedure in Section V-C2.
1) Network Impact: The dynamic BS switching off strategies may have an impact on the network due to the traffic load that is offloaded to the neighboring BSs. To limit this impact, the BS to be switched off must be carefully identified within a BS cluster. To determine whether a particular BS can be switched off or not, we follow the work done in [55]. As an example, we consider one cluster $\mathcal{O}_i$, together with its cluster members $n \in \mathcal{O}_i$, and from it we choose one BS, BS $n$, whose set of neighbors is denoted by $\mathcal{N}_n$. Note that BS $n' \in \mathcal{N}_n$ is the BS to which the traffic load will be offloaded after turning off BS $n$. Also, BS $n$ can only be switched off if there exists a neighboring BS $n'$ that satisfies the following feasibility constraint [55]:
$$L_{n'}(t) + L_{nn'}(t) \le 1, \tag{13}$$
where $L_{n'}(t)$ is the original traffic load of BS $n'$ and $L_{nn'}(t)$ is the incremental traffic load from BS $n$ (the switched off BS) to BS $n'$ (the neighboring BS). We recall that the load $L_{n'}(t)$ is normalized with respect to the maximum load that a BS can sustain, so the inequality in Eq. (13) means that it is feasible for BS $n'$ to take the extra load from BS $n$. To quantify how the incremental system load affects the overall network load due to the switching off process, we introduce the notion of network impact. For every BS $n$ within cluster $\mathcal{O}_i$, $i = 1, \dots, K$, its network impact due to the offloaded system load onto one of the neighboring BSs is defined as:
$$I_n(t) = \max_{n' \in \mathcal{N}_n} \left( L_{n'}(t) + L_{nn'}(t) \right). \tag{14}$$
Here, the maximum network impact value $I_n(t)$ over the neighboring BSs is considered as a measure for each BS towards switching off and generating extra traffic loads for its neighboring BSs. In this work, considering cluster $\mathcal{O}_i$, we switch off the BS $n^*$ that has the least network impact, i.e.,
$$n^* = \operatorname*{argmin}_{n \in \mathcal{O}_i} I_n(t). \tag{15}$$
The BS that takes the load from n * is selected as the BS n ′ that minimizes L n ′ (t) + L n * n ′ (t) over the set of active BSs that are on within the cluster O i . For BS n ′ , we then set L n ′ (t) ← L n ′ (t) + L n * n ′ (t). This procedure is sequentially repeated for all the cluster members until there is no active BS whose neighbors satisfy the feasibility condition of Eq. (13). Note that here, we focus only on which BS to switch off, as for the BS turning on state, we assume that the commitment time (time configured so that the BS automatically wakes up without external triggers) is a system parameter that is pre-configured when the BS is switched off.
2) Edge management procedure: Here, we propose a distributed edge network management procedure that makes use of the ENAAM algorithm (see section IV-B3). The decision making criterion only depends on the BS information and on its neighboring BSs, thus, the BS switching off decision can be localized within each cluster. To decide which BSs shall be switched off, we follow a sequential decision process. While this is heuristic, it allows coping with the high complexity associated with an optimal allocation approach, in which all BSs are jointly assessed. The edge management procedure is as follows.
For each BS cluster $\mathcal{O}_i$, with $i = 1, \dots, K$, do:
1) Initialize an allocation variable $\Delta L_n(t) = 0$ for all BSs $n \in \mathcal{O}_i$. Compute $I_n(t)$, using Eq. (14), for all BSs $n$ and obtain the BS with the least network impact $n^*(t)$, using Eq. (15). Switch off BS $n^*(t)$ and assign its load to the neighboring BS $n' \in \mathcal{O}_i$ that minimizes $L_{n'}(t) + \Delta L_{n'}(t) + L_{n^* n'}(t)$. Update the extra allocation for BS $n'$ as $\Delta L_{n'}(t) \leftarrow \Delta L_{n'}(t) + L_{n^* n'}(t)$. Recompute $I_n(t)$ for all the BSs that are still on and identify the next BS that can be switched off, i.e., the one with the least network impact. This procedure is repeated until none of the BSs in the cluster verifies Eq. (13). At this point, we have identified all the BSs $n^*$ that shall be switched off in $\mathcal{O}_i$.
2) For each active BS $n' \in \mathcal{O}_i$, the ENAAM algorithm is executed using $L_{n'}(t) + \Delta L_{n'}(t)$, where $\Delta L_{n'}(t) = 0$ if BS $n'$ does not take extra load, whereas it is greater than zero otherwise. Note that $\Delta L_{n'}(t)$ corresponds to the total traffic that is handed over to BS $n'$, possibly from multiple nearby BSs.
Edge network management complexity: The algorithm is independently executed for each cluster and the corresponding time complexity is obtained as follows. Considering Step 1 above, the time complexity associated with the computation of the BS having the least network impact is linear in the size of the cluster, $|\mathcal{O}_i|$. Once that is computed, the complexity associated with updating the load allocation for the active BSs is $|\mathcal{O}_i| - 1$, which leads to a total complexity of $|\mathcal{O}_i|(|\mathcal{O}_i| - 1) = O(|\mathcal{O}_i|^2)$. Moreover, this process is iterated for each BS that is switched off. In the worst case, where all the BSs but one are switched off, the final complexity of Step 1 is $O(|\mathcal{O}_i|^3)$. As for Step 2 above, the computation complexity depends on the ENAAM algorithm, which is independently executed by each active BS. Thus, in the worst case (no BSs are switched off), the total aggregated complexity is $O(|\mathcal{O}_i| N_x N_\varsigma T)$, which is linear in all variables, namely, the number of cluster members, the number of BS states, the number of actions and the time horizon $T$.
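Step 1 of the procedure above can be sketched as a simple loop over the cluster. The sketch makes several assumptions that should be read as such: loads are normalized to a capacity of one, the network impact of a BS is taken as the largest resulting load over its active neighbors, the whole current load of a switched-off BS (including load it accepted earlier) is handed to its least loaded feasible neighbor, and ties are broken by name order. The function name, data layout and example loads are hypothetical.

```python
def plan_switch_off(loads, neighbors, capacity=1.0):
    """Sequentially pick BSs to switch off within one cluster.

    loads:     dict BS name -> normalized traffic load.
    neighbors: dict BS name -> set of neighboring BS names in the cluster.
    Returns (hand-offs as (switched-off BS, receiving BS), extra load per BS, active set).
    """
    load = dict(loads)                 # current load: own traffic plus accepted hand-offs
    extra = {n: 0.0 for n in loads}    # extra allocation per BS, initialized to zero
    active = set(loads)
    handoffs = []
    while True:
        best = None
        for n in sorted(active):
            nbrs = [m for m in sorted(neighbors[n]) if m in active]
            # Feasibility: some active neighbor can absorb the whole load of n.
            feasible = [m for m in nbrs if load[m] + load[n] <= capacity]
            if not feasible:
                continue
            impact = max(load[m] + load[n] for m in nbrs)   # network impact of n
            if best is None or impact < best[0]:
                best = (impact, n, feasible)
        if best is None:
            break                      # no BS satisfies the feasibility condition
        _, n_star, feasible = best
        target = min(feasible, key=lambda m: load[m])       # least loaded receiver
        extra[target] += load[n_star]
        load[target] += load[n_star]
        load[n_star] = 0.0
        extra[n_star] = 0.0
        active.remove(n_star)
        handoffs.append((n_star, target))
    return handoffs, extra, active


# Hypothetical four-BS cluster in which every BS neighbors all the others.
cluster_loads = {"A": 0.1, "B": 0.2, "C": 0.4, "D": 0.9}
cluster_neighbors = {n: set(cluster_loads) - {n} for n in cluster_loads}
handoffs, extra_load, still_on = plan_switch_off(cluster_loads, cluster_neighbors)
```

After this step, the single-site algorithm would be run for each BS that is still on, using its own load plus the extra load recorded for it.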

VI. PERFORMANCE EVALUATION
In this section, we show some selected numerical results for the scenario of Section III. The parameters that were used for the simulations are listed in Table III.

A. Simulation Setup
We consider multiple BSs, each with a coverage radius of 40 m and co-located with a MEC server. In addition, we use a virtualized server with specifications from [56] for a VMware ESXi 5.1-ProLiant DL380 Gen8. Our time slot duration $\tau$ is set to 30 min and the time horizon is set to $T = 3$ time slots. The simulations are carried out in Python.

B. Numerical Results
Pattern forecasting: we show real and predicted values for the traffic load and harvested energy over time in Figs. 4a  and 4b, where we track the one-step predictive mean value at each step of the online forecasting routine. Then, Table IV shows the average RMSE of the normalized harvested energy and traffic load processes, for different time horizon values, T ∈ {1, 2, 3}. Note that the predictions for H(t) are more accurate than those of L(t) (confirmed by comparing the average RMSE), due to differences in the used dataset granularity. However, the measured accuracy is deemed good enough for the proposed optimization.
Single communication site: Figs. 5a and 5b are computed with $\eta = 0$ using Cluster 1 and Solar 1 as traffic load and harvested energy profiles for each BS (see Figs. 2 and 3). Moreover, $\gamma_{\max} = 5$ MB and 10 MB, respectively. They show the mean energy savings achieved over time when on-demand and energy-aware edge resource provisioning is enabled (i.e., BS sleep modes and VM soft-scaling), in comparison with the case where they are not applied. Our edge network management algorithm (ENAAM) is benchmarked against another one that heuristically selects the amount of traffic that is to be processed locally, $B_n(t) \le \Gamma_n(t)$, depending on the expected load behavior. It is named Dynamic and Energy-Traffic-Aware algorithm with Random behavior (DETA-R). Both ENAAM and DETA-R are aware of the predictions in future time slots (see Section IV-B1); however, DETA-R provisions edge resources using a heuristic scheme. The DETA-R heuristic works as follows: if the expected load difference is $\hat{L}(t + 1) - \hat{L}(t) > 0$, then the normalized workload to be processed by BS $n$ in the current time slot $t$, $B_n(t)$, is randomly selected in the range $[0.6, 1]$, otherwise, it is picked evenly at random in the range $(0, 0.6)$.
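The DETA-R selection rule described above is simple enough to be written down directly. The sketch below is an illustration with a hypothetical function name; note that Python's `random.uniform` may return the interval endpoints, so the open interval (0, 0.6) is only approximated.

```python
import random


def deta_r_workload(load_now, load_next, rng=random):
    """Normalized workload to process locally, following the DETA-R rule.

    A growing expected load selects a value in [0.6, 1]; otherwise a value
    below 0.6 is drawn uniformly at random.
    """
    if load_next - load_now > 0:
        return rng.uniform(0.6, 1.0)
    return rng.uniform(0.0, 0.6)


# Seeded generator so that the example is reproducible.
generator = random.Random(7)
rising = deta_r_workload(0.3, 0.5, rng=generator)
falling = deta_r_workload(0.5, 0.3, rng=generator)
```

Because the choice is random within each range, the benchmark reacts to the load trend but does not weigh the energy cost of the decision, which is the main difference with respect to the look-ahead optimization.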
Average results for the ENAAM scheme show energy savings of 69% ($\gamma_{\max} = 10$ MB) and 57% ($\gamma_{\max} = 5$ MB), while DETA-R achieves 49% ($\gamma_{\max} = 10$ MB) and 43% ($\gamma_{\max} = 5$ MB) on average, where these savings are with respect to the case where no energy management is performed, i.e., the network is dimensioned for maximum expected capacity. The results show that the maximum load allocated to each VM, $\gamma_{\max}$, has an impact on the energy savings. An increase in energy savings is observed when $\gamma_{\max} = 10$ MB, due to the fact that the number of VMs demanded per time slot is reduced when compared to the allocation with $\gamma_{\max} = 5$ MB.
The evolution of the energy savings with respect to $\eta$ is presented in Fig. 6, taking into account the load allocated to each VM, $\gamma_{\max}$. The results were obtained using Cluster 1 and Solar 1 as traffic load and harvested energy profiles (see Fig. 2 and Fig. 3). As expected, a drop in energy savings is observed when QoS is prioritized, i.e., $\eta \to 1$, as in this case the BS energy consumption is no longer considered. It can be observed that ENAAM achieves energy savings of 50% or above for $\eta \in [0, 0.4]$ when $\gamma_{\max} = 5$ MB and for $\eta \in [0, 0.7]$ when $\gamma_{\max} = 10$ MB. This shows that the higher the load allocated to each VM, the less energy is drained, as fewer VMs are running. DETA-R operates below 50% for all $\eta$ and $\gamma_{\max}$ values.
Multiple communication sites: each BS randomly picks its own traffic load and harvested energy profile (see Figs. 2 and 3) at the beginning of the optimization process. Here, to select the BS to be switched off, we use the management procedure of section V-C. As for DETA-R, a BS is randomly selected to switch to power saving mode and offload its load to a nearby BS (in this case, the least loaded neighboring BS is selected), without taking into account its network impact measure. Fig. 7a shows the average energy savings obtained when clustering is adopted, i.e., here, the cluster size is increased from $|\mathcal{O}_i| = 1$ to 10 and $\eta = 0$. The obtained energy savings are with respect to the case where all BSs are dimensioned for maximum expected capacity (maximum value of $\theta_{\mathrm{tot},n}(t)$, with $M = 27$ VMs, $\forall\, t$, $\forall\, n \in \mathcal{O}_i$). It should be noted that the energy savings increase as the size of the cluster grows, thanks to the load balancing among active BSs, which cannot be implemented in the single communication site scenario (i.e., when BSs are independently managed).
Then, Fig. 7b shows the average energy savings with respect to $\eta$, when the cluster size is set to an intermediate case ($|\mathcal{O}_i| = 6$). Again, here the energy savings are obtained with respect to the case where all the BSs are dimensioned for maximum capacity. As expected, there is a drop in the energy savings achieved as the value of $\eta$ increases, as QoS is prioritized. It can be observed that ENAAM achieves a value of 50% or above for $\eta \in [0, 0.8]$ (at $\gamma_{\max} = 10$ MB) and for $\eta \in [0, 0.6]$ (at $\gamma_{\max} = 5$ MB). DETA-R achieves a value of 50% or above for $\eta \in [0, 0.4]$ (at $\gamma_{\max} = 10$ MB) and for $\eta \in [0, 0.1]$ (at $\gamma_{\max} = 5$ MB).
Comparing Figs. 6 and 7b, an average gain of 9% in the energy savings is observed when clustering is applied, by considering the mean energy savings with respect to $\eta$ achieved with ENAAM for both cases. From Fig. 7a we see that this gain can be as high as 16% for ENAAM with $\gamma_{\max} = 5$ MB (red curve) and bigger for the DETA-R approach. These results support the notion that performing a clustering-based optimization is beneficial thanks to the additional cooperation within each neighborhood of BSs. This cooperation allows more BSs to be switched off through load balancing, increasing the energy savings while still controlling the users' QoS.

VII. CONCLUSIONS
In this paper, we have envisioned an edge network where a group of BSs are managed by a controller, for ease of BS organization and management, and also a mobile network where the edge apparatuses are powered by hybrid supplies, i.e., using green energy in order to promote energy self-sustainability and the power grid as a backup. Within the edge, each BS is endowed with computation capabilities to guarantee low latency to mobile users, offloading their workloads locally. The combination of energy saving methods, namely, BS sleep modes and VM soft-scaling, for single and multiple BS sites helps to reduce the mobile network's energy consumption. An edge energy management algorithm based on forecasting, clustering, control theory and heuristics, is proposed with the objective of saving energy within the access network, possibly making the BS system self-sustainable. Numerical results, obtained with real-world energy and traffic load traces, demonstrate that the proposed algorithm achieves energy savings between 57% and 69%, on average, for the single communication site case, and a gain ranging from 9% to 16% on energy savings is observed when clustering is applied, with respect to the allocated maximum per-VM loads of 5 MB