An Integrated Affinity Propagation and Machine Learning Approach for Interference Management in Drone Base Stations

Drone small cells (DSCs) can provide on-demand air-to-ground wireless communications in unexpected situations such as traffic jams or natural disasters. However, a DSC faces challenges such as severe co-channel interference, limited battery capacity, and fast topology changes. To improve the energy efficiency of DSCs and the quality of service of customers, this paper presents a learning-based multiple drone management (LDM) framework that controls the transmission power and the three-dimensional location of DSCs based on user location data and the reference signal received power of users. Since labeled throughput data are typically not available in emergency situations, we develop unsupervised learning DSC management techniques: 1) an affinity propagation interference management scheme to mitigate interference and reduce energy consumption, and 2) a K-means position adjustment scheme to determine the new three-dimensional positions of drones. Our numerical results show that the proposed LDM framework, which combines affinity propagation clustering and K-means clustering, can enhance the energy efficiency of DSCs by 25% and the signal-to-interference-plus-noise ratio of ground users by 56%.

DSC customers for burst communications services include flash mobs, vehicles in traffic jams, rescue teams in natural disaster areas, and troops on the battlefield [2]-[4]. Compared to fixed terrestrial small cells, the advantages of DSCs include: 1) line-of-sight (LoS) communication links to ground users, 2) the mobility to change base station locations for tracking moving users, and 3) the flexibility to install base stations over hazardous areas [5]. However, DSCs need to overcome many challenges, including three-dimensional interference, instant network topology changes, and limited power capacity.
In this paper, we focus on three challenging issues: three-dimensional interference, instant network topology changes, and limited power capacity. The first two challenges are related to the interference management of DSCs with dynamically changing topology. Co-channel interference affects DSC system performance in a more dynamic and less predictable way than in traditional small cells because DSCs need to change their positions to track mobile users. For example, multiple DSCs may fly close to each other and cause severe co-channel interference. As for the limited power capacity, it has been reported that a small cell using the same battery as the drone reduces the flight time of the drone by 16% [6]. Therefore, it is important to adjust the transmit power and positions of DSCs to improve the system energy efficiency.
A few interference mitigation techniques for DSCs are reported in the literature [7]-[9]. However, these works focused on DSC interference management for uniformly distributed user locations rather than the clustering phenomenon of flash mobs. For clustered flash mobs, the interference scenarios and traffic demands change frequently. Intuitively, decreasing the transmission power of DSCs is one way to reduce high interference in DSC networks. However, too many low-powered DSCs will also degrade the total system throughput. Hence, it is important to determine an appropriate number of low-powered DSCs from the viewpoints of both system throughput and energy efficiency. Achieving high throughput and high energy efficiency simultaneously is quite challenging, especially in a dynamic environment that must provision immediate on-demand burst communication services.
Data-driven resource management techniques can extract the knowledge of a complex system in a dynamic environment, which has attracted a lot of attention recently [10]-[12]. The main concepts of data-driven approaches are delivered by computational intelligence and machine learning methods. Machine learning algorithms can be categorized into supervised and unsupervised learning [13], which are applied to different wireless network problems. Supervised learning techniques are suitable for wireless network problems where prior output data about the environment exist [14]. On the other hand, unsupervised learning techniques are suitable for solving wireless network problems when labeled data are difficult to obtain, as in a temporarily dispatched network. Our previous work [15] proposed an unsupervised learning approach to reduce interference and energy consumption for plug-and-play ultra-dense small cell (UDSC) networks (i.e., an instant network topology). Nevertheless, the advantage of a DSC in flexibly changing locations was not exploited in [15].
In this paper, we develop a learning-based multiple drone management (LDM) framework to mitigate the interference of DSC, aiming at maximizing the DSC energy efficiency while guaranteeing the required data rates for ground users.
The key techniques of the proposed LDM framework are the affinity propagation interference management (APIM) scheme for controlling transmit power and the K-means position adjustment (KMPA) scheme for position rearrangement. We adjust the transmit power and positions of the DSCs to resolve severe co-channel interference. Both methods have an impact on the energy efficiency of the DSCs. Because the power consumption of a DSC affects the hover time of its drone, we also discuss how adjusting the transmit power and positions of drones affects the energy efficiency of the DSC network. The main contributions of this paper are summarized as follows:
• We propose the APIM scheme to reduce interference and enhance energy efficiency by 25% compared to the baseline scheme where all the DSCs transmit with the maximum power. In the affinity propagation clustering (APC) algorithm, the reference signal received power (RSRP) of each ground user is utilized to find the hidden mutual interference structure.
• We propose the KMPA scheme to improve the minimal user throughput by 56% compared to the case without the KMPA scheme. In the K-means clustering (KMC) algorithm, the observed location data of users are utilized to find the hidden mutual distance structure. Position adjustment is needed because, when a DSC is switched to the low-power mode, the neighboring DSCs should adjust their positions to serve the non-served users of the low-power DSC in the same group.
The rest of this paper is organized as follows. Section II gives a brief literature survey on the related works of UAV interference mitigation and positioning. We introduce the considered cell architecture, user group mobility model, radio propagation model, and performance metrics in Section III. We formulate the optimization problem to maximize the system energy efficiency of the DSC network in Section IV. The LDM framework is discussed in Section V.
The APIM scheme and the KMPA scheme are presented in Sections VI and VII, respectively. Section VIII gives our simulation results. Finally, Section IX provides our concluding remarks.
• UAV deployment issue: In [16], a new placement algorithm for UAV-mounted mobile base stations (MBSs) was proposed, which places the MBSs sequentially from the periphery of the service area in an inward spiral manner until all users are covered. The proposed algorithm can minimize the number of MBSs required to cover all users. In order to alleviate the overload caused by flash mob traffic in 5G networks, the authors of [17] proposed a proactive drone-cell deployment framework. This framework includes a prediction scheme and an operation control scheme to identify the appropriate number and locations of drone-cells. In [18], the authors designed the per-Drone Iterated Particle Swarm Optimization (DI-PSO) algorithm to find the optimized deployments corresponding to different numbers of drone-cells. The drone-cell deployments generated by the DI-PSO algorithm can achieve a higher coverage ratio than the pure PSO based approach.
• Interference mitigation for multiple UAV base stations: The authors of [7] determined the UAV locations using the genetic algorithm to optimize the system-wide spectral efficiency of the public safety communications heterogeneous network (HetNet). The inter-cell interference arising in the HetNet is then mitigated by jointly applying the 3rd Generation Partnership Project (3GPP) Release-11 further-enhanced inter-cell interference coordination. In [8], by considering the practical mobility constraints of commercial drones, the authors used game theory to develop mobility control algorithms for drone base stations, which are constantly moving at a fixed height above their cells. A drone mobility algorithm was designed to reduce the inter-cell interference and improve the spectral efficiency of drone cells. In [9], an interference control problem was investigated for a dense DSC downlink network, which was modeled as a mean-field game. All DSCs adjust their altitude to improve the available signal-to-interference-plus-noise ratio (SINR). In [19], the initial coverage area of the DSC was formulated in the presence of full interference between the two closest DSCs. With the objective of offering the maximum coverage for the target area, the optimal altitude and the distance between two DSCs over the area were investigated. In [20], the authors proposed a joint power control and trajectory design that provides more flexibility to mitigate the co-channel interference. Consequently, their proposed method can maximize the minimum throughput over all ground users in the downlink communication.
We observe that joint position management and interference mitigation techniques for DSC networks have rarely been investigated. To close this gap, one should simultaneously consider numerous time-varying group users and effective interference reduction with adaptive small cell on/off switching for a dynamic multiple-DSC system. By frequently observing the DSC operation data, our proposed LDM framework can perform power control for small cells and manage the positions of drones in such a dynamic environment to reduce interference and satisfy the users' communication quality.
Fig. 1 shows the scenario considered in this paper, where the service areas of DSCs are not covered by the macrocell. A DSC serves one group of mobile users with LoS communication links.
It is assumed that these mobile users served by DSCs are involved in certain team activities. Our system model consists of three parts: the initial deployment architecture of the cell, the group user mobility model, and the radio propagation model.

A. Cell Architecture
We assume that each DSC is positioned in the center of its served users. The macrocell and DSCs are connected by multiple drone relays via a millimeter-wave wireless backhaul. In order to improve transmission performance and coverage, data transmissions between the DSCs and ground users are operated at sub-6 GHz at the same time. We consider a 3D Cartesian coordinate system in which ground user n is located at the horizontal coordinate $(x_n, y_n)$ and each drone $f \in F$, where $F$ denotes the drone set, operates at an altitude of $h$ above the users.

B. Group User Mobility Model
We jointly consider the random waypoint mobility (RWM) model [21] and the reference point group mobility (RPGM) model [22], where mobile users are involved in team activities. In the RWM model, the moving velocity $\nu_n(t) \in [\nu_{min}, \nu_{max}]$ and the direction $\theta_n(t) \in [0, 2\pi]$ of user n at time t are chosen based on a memoryless random process [23]. When user n moves, we assume $\nu_n(t)$ is a constant [24]. Given the time interval for data collection $T_{int}$, we have
$$x_n(t) = x_n(t - T_{int}) + \nu_n(t)\, T_{int} \cos\theta_n(t), \qquad y_n(t) = y_n(t - T_{int}) + \nu_n(t)\, T_{int} \sin\theta_n(t),$$
where $x_n(t)$ and $y_n(t)$, respectively, represent the x-coordinate and y-coordinate of the destination of user n at time t.
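As an illustration of the RWM update described above, the following Python sketch advances one user over a single data-collection interval at constant speed. The function names `draw_motion` and `rwm_step`, the speed bounds, and the interval length are hypothetical choices for this example and are not taken from the simulation settings.

```python
import math
import random

def draw_motion(v_min, v_max, rng):
    """Memoryless draw of a speed in [v_min, v_max] and a heading in [0, 2*pi]."""
    return rng.uniform(v_min, v_max), rng.uniform(0.0, 2.0 * math.pi)

def rwm_step(x, y, speed, theta, t_int):
    """Move a user for one interval t_int at constant speed along heading theta."""
    return (x + speed * t_int * math.cos(theta),
            y + speed * t_int * math.sin(theta))

# Hypothetical values: speeds between 1 and 3 m/s, interval of 2 s.
rng = random.Random(0)
speed, theta = draw_motion(1.0, 3.0, rng)
x1, y1 = rwm_step(0.0, 0.0, speed, theta, t_int=2.0)
```

The displacement of each step equals the speed multiplied by the interval, regardless of the heading that is drawn.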
The predefined paths of a group leader are generated by the RWM model. The group members follow their group leader based on the RPGM model and are randomly distributed around the leader. As shown in Fig. 2, the difference vector M is defined as the deviation of a user's movement from its group leader [25]. Denote $\delta_a$ and $\delta_s$ as the angle deviation factor and the speed deviation factor, respectively, where $-1 < \delta_a < 1$ and $-1 < \delta_s < 1$. The average direction and moving velocity of the group users are expressed in terms of $\delta_a$ and $\delta_s$, so adjusting $\delta_a$ and $\delta_s$ can represent various group mobility scenarios [25].

C. Radio Propagation Model
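In the DSC network, LoS communication links exist between the DSCs and the mobile users. Following the free-space path loss model, the channel gain $H_{f,n}$ from drone f to user n is proportional to $d_{f,n}^{-\alpha}$, where $\alpha$ is the path loss exponent and $d_{f,n}$ denotes the distance from drone f to user n, i.e.,
$$d_{f,n} = \sqrt{(x_f - x_n)^2 + (y_f - y_n)^2 + h^2},$$
with $(x_f, y_f)$ the horizontal coordinate of drone f. Thus, the received signal strength $\Phi_{f,n}$ and the RSRP $\sigma_{f,n}$ for the DSC network can be expressed as
$$\Phi_{f,n} = P_{f,s} H_{f,n}, \qquad \sigma_{f,n} = P_{f,r} H_{f,n},$$
where $P_{f,s}$ is the transmission power of served user data from drone f and $P_{f,r}$ is the transmission power of the reference signal from drone f.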
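The propagation model can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the channel gain is taken as exactly `d ** (-alpha)` (a unit reference gain, which the text does not specify), and the function names and numeric values are hypothetical.

```python
import math

def distance_3d(drone_xy, user_xy, h):
    """Distance from a drone hovering at altitude h to a ground user."""
    dx = drone_xy[0] - user_xy[0]
    dy = drone_xy[1] - user_xy[1]
    return math.sqrt(dx * dx + dy * dy + h * h)

def channel_gain(d, alpha=2.0):
    """Distance-based path loss; the unit reference gain is an assumption."""
    return d ** (-alpha)

def received_power(p_tx_watt, drone_xy, user_xy, h, alpha=2.0):
    """Received power for data (P_{f,s}) or reference signal (P_{f,r})."""
    return p_tx_watt * channel_gain(distance_3d(drone_xy, user_xy, h), alpha)

# Hypothetical example: 0.1 W reference signal, user 50 m away horizontally,
# drone at 10 m altitude.
rsrp = received_power(0.1, (0.0, 0.0), (30.0, 40.0), h=10.0)
```

The same function gives the received signal strength or the RSRP depending on which transmission power is passed in.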

IV. PROBLEM FORMULATION
In this section, we express the total power consumption $P_{total}$ and the total cell throughput $R_{total}$ in Sections IV-A and IV-B, respectively. Then, we define the system energy efficiency EE (Mbits/J) in terms of $R_{total}$ and $P_{total}$ for the DSC network in Section IV-C. Finally, we formulate the optimization problem with joint power control and position adjustment of DSCs to maximize the system energy efficiency of the DSC network in Section IV-D.

A. Power Consumption Model
In the DSC network, the battery of the drone supplies the energy of the installed small cell. The small cells have three operation modes: active mode, deep sleeping mode, and light sleeping mode [26]. When the small cell is in the light sleeping state, the radio frequency (RF) components and the power amplifier (PA) can be deactivated to save energy. Furthermore, the additional transient time of the cell on/off process in the light sleeping state is very small compared with the time the small cell operates in the active or sleeping modes [27]. Therefore, the extra power consumption of the cell on/off process can be ignored. Then, the power $P_f$ consumed by small cell f can be expressed as
$$P_f = \beta_f P_{act,f} + (1 - \beta_f) P_{sle,f},$$
where $P_{act,f}$ and $P_{sle,f}$ are the power consumption of the active mode and the sleeping mode in the small cell of drone f, respectively. $\Delta$ is the PA efficiency and $P_0$ is the basic circuit power consumption. In addition, $\beta_f \in \{0, 1\}$: if small cell f is in the active mode, $\beta_f = 1$; otherwise, $\beta_f = 0$. Thus, the total consumed power of all F small cells can be expressed as
$$P_{total} = \sum_{f=1}^{F} P_f.$$
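The mode-dependent power model can be illustrated with a short Python sketch. Note the assumption: the active-mode power is modeled here as the transmit power divided by the PA efficiency plus the basic circuit power, which is a common form but is not confirmed by the text above. All names and numeric values are hypothetical.

```python
def cell_power(active, p_tx, pa_efficiency, p_circuit, p_sleep):
    """Power drawn by one small cell.

    Assumed form: active mode = p_tx / pa_efficiency + p_circuit,
    sleeping mode = p_sleep.
    """
    if active:
        return p_tx / pa_efficiency + p_circuit
    return p_sleep

def total_power(cells):
    """Sum the power of all small cells, each described by a parameter dict."""
    return sum(cell_power(**cell) for cell in cells)

# Hypothetical two-cell network: one active cell and one sleeping cell.
cells = [
    {"active": True, "p_tx": 0.5, "pa_efficiency": 0.25,
     "p_circuit": 6.8, "p_sleep": 4.3},
    {"active": False, "p_tx": 0.5, "pa_efficiency": 0.25,
     "p_circuit": 6.8, "p_sleep": 4.3},
]
p_total = total_power(cells)
```

Switching a cell to the sleeping mode replaces its active-mode term by the sleeping-mode term, which is the saving that the power control exploits.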

B. Total Cell Throughput and Link Reliability
Let $f_n$ denote the DSC that serves the n-th user, i.e., the DSC from which this user receives the maximum signal strength:
$$f_n = \arg\max_{f} \Phi_{f,n}.$$
The total number of users served by the f-th DSC is $U_f = \sum_{n} \Upsilon_{f,n}$, where $\Upsilon_{f,n}$ is equal to 1 if the n-th user is served by the f-th DSC and 0 otherwise. The downlink SINR $\Gamma_{f_n,n}$ can be obtained by
$$\Gamma_{f_n,n} = \frac{\Phi_{f_n,n}}{\sum_{l \neq f_n} \Phi_{l,n} + \eta_0},$$
where $\eta_0$ is the thermal noise and $\Phi_{l,n}$ is the interference signal strength from DSC l. The total cell throughput $R_{total}$ for F DSCs can be expressed as
$$R_{total} = \sum_{n} \frac{B}{U_{f_n}} \log_2\left(1 + \Gamma_{f_n,n}\right),$$
where B is the channel bandwidth for each cell. We assume that each channel is fully loaded [28], while each ground user can use all the available bandwidth. In practice, $U_{f_n}$ is bounded because it is determined by the type of the small cells [29]. As a result, $B / U_{f_n}$ can satisfy the minimum demanded bandwidth of each ground user. The link reliability probability $L_{rel}$ can be written as
$$L_{rel} = \Pr\left(\Gamma_{f_n,n} > \Gamma_{th}\right),$$
which represents the probability that $\Gamma_{f_n,n}$ is higher than the required effective threshold $\Gamma_{th}$.
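The association rule, SINR, and throughput above can be sketched as follows. This is a minimal illustration: `rx[f][n]` is a hypothetical table of received power at user n from DSC f in linear units, the bandwidth of each cell is shared equally among its users, and the link reliability is estimated as the fraction of users whose SINR exceeds a linear-scale threshold, which is an assumption about how the probability would be evaluated empirically.

```python
import math

def associate(rx, user):
    """Index of the DSC with the strongest received signal at this user."""
    return max(range(len(rx)), key=lambda f: rx[f][user])

def sinr(rx, serving, user, noise):
    """Downlink SINR: serving power over co-channel interference plus noise."""
    interference = sum(rx[l][user] for l in range(len(rx)) if l != serving)
    return rx[serving][user] / (interference + noise)

def total_throughput(rx, bandwidth_hz, noise):
    """Sum of per-user rates, with bandwidth split equally within each cell."""
    n_users = len(rx[0])
    serving = [associate(rx, n) for n in range(n_users)]
    load = [serving.count(f) for f in range(len(rx))]
    return sum(
        bandwidth_hz / load[serving[n]]
        * math.log2(1.0 + sinr(rx, serving[n], n, noise))
        for n in range(n_users)
    )

def link_reliability(rx, noise, sinr_threshold):
    """Fraction of users whose SINR exceeds the (linear) threshold."""
    n_users = len(rx[0])
    ok = sum(1 for n in range(n_users)
             if sinr(rx, associate(rx, n), n, noise) > sinr_threshold)
    return ok / n_users

# Hypothetical two-DSC, two-user example in arbitrary linear units.
rx = [[4.0, 1.0], [1.0, 4.0]]
rate = total_throughput(rx, bandwidth_hz=1.0, noise=1.0)
```

In this toy case each user sees a SINR of 2, so lowering the power of the interfering DSC raises both the SINR and the rate of the neighboring user.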

C. Energy Efficiency
The energy efficiency EE (Mbits/J) for the DSC network is defined as the ratio of the total cell throughput $R_{total}$ to the total power consumption $P_{total}$, which can be written as
$$EE = \frac{R_{total}}{P_{total}}.$$
Energy efficiency is an important indicator in developing the network system, with the goal of reducing the energy required to deliver products and services.

D. DSC Energy Efficiency Optimization Problem
We formulate the optimization problem in terms of the joint power control and position adjustment of DSCs as maximizing the system energy efficiency EE of the DSC network subject to five constraints, where constraint 1) means that the link reliability is greater than $\Gamma_{PT}$; constraints 2) and 3) mean that each DSC operates in either the sleeping mode or the active mode; constraint 4) signifies that each DSC is classified into one group, where $G_i$ is group i; and constraint 5) is the operation range of the transmission power. This problem is difficult to solve using traditional optimization algorithms, such as mixed integer programming. In Section VIII, we compare the normalized execution time and energy efficiency of our proposed LDM framework (i.e., the joint KMPA scheme and APIM scheme with the sleeping mode) with the results of the exhaustive searching algorithm (i.e., the optimal solution).

V. LEARNING-BASED MULTIPLE DRONE MANAGEMENT
We design an efficient LDM framework for multiple DSCs to execute optimization automatically and to satisfy the users' communication requirements in the DSC network. A learning-based method can discover useful knowledge of the complex and dynamic system based on the observed data. All the observed data can be collected by the central control unit of a macro cell. Fig. 1 shows that the central controller connects all the DSCs via wireless links. In the LDM framework, the observed data from the DSC networks are collected first. Based on the collected data, LDM adopts APIM and KMPA to decide the power level configuration and the new appropriate positions of the DSCs in the dynamic system, respectively. APC and KMC are the two important unsupervised learning techniques suggested for DSC interference management. The former does not require prior knowledge of the number of groups, while the latter assumes this number is known. The APC algorithm exchanges messages between pairs of objects (i.e., data points) until a set of exemplars for different groups emerges. In the KMC algorithm, the number of clusters is known and the cluster center need not be one of the objects (it is the centroid of the data points in each cluster). Depending on the requirements of the application, we can select the appropriate clustering method.
In Fig. 3, we show the detail of the LDM framework when implementing the APIM scheme and the KMPA scheme. The operation procedures of LDM include five steps: 1) data collection, 2) data pre-processing, 3) APC for drones, 4) KMC for mobile users, and 5) dynamic network reconfiguration.
1) Data Collection: Various operation data in all DSCs can be collected by the control unit of the macro-cell, including RSRP, DSC ID, user ID, transmission power per DSC, channel usage per DSC, and so on.
2) Data Pre-Processing: The data pre-processing module has two main functions: data cleaning and data formatting. Firstly, the irrelevant data (e.g., channel usage per DSC) can be removed by the data cleaning process. Secondly, the useful data can be reorganized into the format requested by the learning-based methods in the data formatting process. For example, the DSC ID, user ID, and RSRP information from mobile users are selected. In the format requested by clustering techniques, the data can be reorganized into the similarity matrix, which represents the interference relationship among the DSCs in this paper.
3) Affinity Propagation Clustering for Drones: Based on the similarity matrix, APC can automatically find the number of groups and the stronger interfering DSCs. In essence, the exemplar resulting from APC represents the strongest interfering source compared to the other group members in the same cluster. After identifying these interfering DSCs, we can lower their transmission power or switch them to the deep or light sleeping mode to reduce the overall interference in the dispatched multi-cellular DSC network.
4) K-Means Clustering for Mobile Users: If an interfering DSC switches to the deep sleeping mode, its group members (i.e., the other DSCs in the same cluster) need to serve the customers of this sleeping-mode DSC. To this end, its group members need to adjust their positions to serve the mobile users of the sleeping-mode DSC in addition to their own mobile users. We propose to adopt the KMC algorithm to calculate the new position of each DSC based on the locations of the target served mobile users.
5) Dynamic Network Reconfiguration: Based on the LDM framework, the dispatched DSC system can be reconfigured to the new power levels with the new DSC positions, thereby improving the expected network performance.
Fig. 3. The learning-based multiple drones management (LDM) framework includes the data collection, the data pre-processing, unsupervised learning for drones and group mobile users, and dynamic network reconfiguration.
In summary, we use two unsupervised learning DSC management techniques: 1) affinity propagation interference management scheme to mitigate interference and energy consumption, and 2) K-means position adjustment to adjust the new 3-dimension positions of drones. In Sections VI and VII, we introduce affinity propagation interference management and K-means position adjustment in detail, respectively.
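The data cleaning and data formatting functions of the pre-processing step can be illustrated with a short Python sketch. The record layout (keys `dsc_id`, `user_id`, `rsrp`, `channel_usage`) and the function name `preprocess` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def preprocess(records):
    """Keep (DSC ID, user ID, RSRP) and organize RSRP per DSC and per user.

    Data cleaning: drop fields that clustering does not need (for example
    channel usage) and skip reports without an RSRP measurement.
    Data formatting: return {dsc_id: {user_id: rsrp}}.
    """
    table = defaultdict(dict)
    for rec in records:
        if rec.get("rsrp") is None:
            continue
        table[rec["dsc_id"]][rec["user_id"]] = rec["rsrp"]
    return dict(table)

# Hypothetical measurement reports collected by the control unit.
records = [
    {"dsc_id": 1, "user_id": "a", "rsrp": -80.0, "channel_usage": 0.4},
    {"dsc_id": 2, "user_id": "a", "rsrp": -95.0, "channel_usage": 0.1},
    {"dsc_id": 1, "user_id": "b", "rsrp": None, "channel_usage": 0.2},
]
table = preprocess(records)
```

The resulting table is the kind of per-DSC, per-user RSRP structure from which a similarity matrix between DSCs can be assembled.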

VI. AFFINITY PROPAGATION INTERFERENCE MANAGEMENT
In the LDM framework, the APIM scheme aims to reduce interference and energy consumption in the DSC networks. If a DSC interferes with the non-served users of neighboring cells, this interfering DSC should lower its transmission power subject to the constraint of maintaining the quality of service (QoS) requirement of its serving users. However, selecting the proper DSCs to reduce their transmission power is challenging because all DSCs always cause interference to their neighboring cells. Nevertheless, the interference degree of DSCs can be estimated by observing the RSRP information from the neighboring cells.
Based on the interference relations within the dynamic DSC network, clustering techniques can group DSCs. Compared to the other group members, the exemplar in a cluster has the strongest interference. This implies that the exemplar DSC should reduce its transmission power or switch to the deep sleeping mode to decrease interference. Nevertheless, how to decide the number of clusters in a DSC network and design an effective approach to identify each cluster's exemplar is an open issue.
The APC technique [30] is the key to APIM, as it can decide the number of groups and the corresponding exemplars. Without requiring any training data, APC measures the similarity between pairs of objects. Initially, all the objects are candidates for exemplars. The messages of availability and responsibility are passed iteratively between objects until a set of exemplars converges [31]. The exemplar in a cluster has the largest similarity among all the objects in the cluster. The similarity of the objects in the same cluster is larger than that of the objects in other clusters [32]. Now we introduce the similarity matrix of APIM (denoted by S) for F DSCs:
$$S = \begin{bmatrix} s(1,1) & s(1,2) & \cdots & s(1,F) \\ s(2,1) & s(2,2) & \cdots & s(2,F) \\ \vdots & \vdots & \ddots & \vdots \\ s(F,1) & s(F,2) & \cdots & s(F,F) \end{bmatrix},$$
where l and f index the columns and rows of the matrix, respectively, and the similarities s(f, l) between DSCs are set to the interference degree from DSC f to the multiple non-served users in other DSC l as shown in Fig. 4. The diagonal elements s(l, l) in S are called preferences, and can be set to the minimum value of the off-diagonal elements (i.e., the input similarities) [33]. The similarities $s(f, l) = \sum_{n \in U_f} \Upsilon_{l,n}$ are given by the sum of the interference power, i.e., the red dotted line. $\Upsilon_{f,n}$ is the RSRP of ground user n from DSC f, and $n \in U_l$ means that ground user n is associated with DSC l.
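A minimal Python sketch of how such a similarity matrix could be assembled is given below. It follows the formula as printed, summing over the users served by DSC f the RSRP they report from DSC l, and sets the preferences to the minimum off-diagonal value. The function name and the toy RSRP values (in linear units) are hypothetical.

```python
def similarity_matrix(rsrp, serving):
    """Build the F x F similarity matrix used by APC.

    rsrp[l][n] : RSRP of user n measured from DSC l (linear units).
    serving[n] : index of the DSC that serves user n.
    S[f][l]    : sum over users served by f of the RSRP they see from l.
    Diagonal   : preferences, set to the minimum off-diagonal similarity.
    """
    n_dsc = len(rsrp)
    S = [[0.0] * n_dsc for _ in range(n_dsc)]
    for f in range(n_dsc):
        for l in range(n_dsc):
            if f != l:
                S[f][l] = sum(rsrp[l][n]
                              for n, s in enumerate(serving) if s == f)
    preference = min(S[f][l] for f in range(n_dsc)
                     for l in range(n_dsc) if f != l)
    for f in range(n_dsc):
        S[f][f] = preference
    return S

# Hypothetical example: 2 DSCs, 3 users; users 0 and 1 are served by DSC 0.
rsrp = [[5.0, 4.0, 1.0],
        [2.0, 1.0, 6.0]]
S = similarity_matrix(rsrp, serving=[0, 0, 1])
```

Here the two users of DSC 0 receive 2.0 and 1.0 from DSC 1, so the entry for row 0 and column 1 is 3.0.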
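To execute APC, the following procedures for calculating responsibility and availability are repeated until the stopping criterion is satisfied, i.e., the selected exemplars no longer change [33]. The formulas used in APC are as follows:
$$r(f,l) \leftarrow s(f,l) - \max_{l' \neq l} \left\{ a(f,l') + s(f,l') \right\}, \quad (19)$$
$$a(f,l) \leftarrow \min\Big\{ 0,\; r(l,l) + \sum_{f' \notin \{f,l\}} \max\{0, r(f',l)\} \Big\}, \quad f \neq l, \quad (20)$$
$$a(l,l) \leftarrow \sum_{f' \neq l} \max\{0, r(f',l)\}, \quad (21)$$
$$c(f,l) = r(f,l) + a(f,l). \quad (22)$$
The APIM scheme mainly has three execution steps as follows.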
• Firstly, the responsibility value r(f, l) from DSC f to candidate exemplar l and the availability value a(f, l) from candidate exemplar l to DSC f are the two messages passed between DSCs. At initialization, the availability value a(f, l) is set to zero. The responsibility value r(f, l) is then obtained by (19). Taking into account the other potential exemplars l' for DSC f, the responsibility value r(f, l) reflects the accumulated evidence on how well-suited DSC l is to serve as the exemplar for DSC f [30]. A larger responsibility value r(f, l) means that candidate exemplar l is better suited for DSC f than the other candidate exemplars l'.
• Secondly, the off-diagonal elements a(f, l) and the diagonal elements a(l, l) of the availability are computed by (20) and (21), respectively. Taking into account the other DSCs f', the availability value a(f, l) reflects the accumulated evidence on how appropriate it would be for DSC f to pick DSC l as its exemplar [30]. A larger availability value a(f, l) means that candidate exemplar l is better suited to DSC f.
• Thirdly, the criterion value c(f, l) in (22) is used to identify the exemplar. The solution converges by repeating (19) through (22). Thus, based on the final criterion value c(f, l), APC can find the appropriate exemplars and the corresponding groups.
Algorithm 1 shows the pseudocode for the APIM scheme.

Algorithm 1: APIM scheme
1: The similarities $s(f, l) = \sum_{n \in U_f} \Upsilon_{l,n}$ are given by the sum of the interference power.
2: The responsibility value r(f, l) from DSC f to candidate exemplar l and the availability value a(f, l) from candidate exemplar l to DSC f are the two messages passed between DSCs.
3: The responsibility value r(f, l) is obtained by (19).
4: The off-diagonal elements a(f, l) and the diagonal elements a(l, l) of the availability are computed by (20) and (21), respectively.
5: The candidate exemplar solution converges by repeating (19) through (22).
6: The DSC set $D_{APC,i}$ and the exemplar of group i are derived by affinity propagation clustering (APC).
7: Then, the interfering DSC is switched to the sleeping mode to reduce interference.
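The message-passing loop of APC can be sketched in plain Python as follows. This is an illustrative sketch rather than the exact implementation used in the simulations: the damping factor, the iteration limits, and the stopping rule (labels unchanged for a number of iterations) are assumptions added for numerical stability, and the toy similarity matrix uses negative squared distances between points on a line instead of interference power.

```python
def affinity_propagation(S, damping=0.5, max_iter=200, stable_iters=15):
    """Return, for each object, the index of its exemplar.

    S is an n x n similarity matrix (n >= 2) whose diagonal holds the
    preferences. Damping and the stopping rule are assumptions.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(f, l)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(f, l)
    labels, last, stable = None, None, 0
    for _ in range(max_iter):
        # Responsibility update: similarity minus the best competing offer.
        for i in range(n):
            for k in range(n):
                best = max(A[i][kp] + S[i][kp] for kp in range(n) if kp != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        # Availability updates for diagonal and off-diagonal elements.
        for k in range(n):
            pos = [max(0.0, R[ip][k]) for ip in range(n)]
            for i in range(n):
                if i == k:
                    new = sum(pos[ip] for ip in range(n) if ip != k)
                else:
                    new = min(0.0, R[k][k] + sum(
                        pos[ip] for ip in range(n) if ip not in (i, k)))
                A[i][k] = damping * A[i][k] + (1 - damping) * new
        # Criterion: each object picks the candidate maximizing r + a.
        labels = [max(range(n), key=lambda k: R[i][k] + A[i][k])
                  for i in range(n)]
        if labels == last:
            stable += 1
            if stable >= stable_iters:
                break
        else:
            last, stable = labels, 0
    return labels

# Toy input: six points on a line forming two well-separated groups.
xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
S = [[-(a - b) ** 2 for b in xs] for a in xs]
for i in range(len(xs)):
    S[i][i] = -81.0  # hypothetical preference value
labels = affinity_propagation(S)
```

In the APIM setting, the returned exemplar of each group would be the candidate for a lower transmission power or the sleeping mode.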

VII. K-MEANS POSITION ADJUSTMENT
The KMPA scheme in the LDM framework is used to decide the new positions of drones for enhancing the QoS in the DSC networks. When a DSC is switched to the deep sleeping mode, its users should be served by adjacent DSCs in the same cluster. To satisfy users' demands, decreasing the DSC-to-user distance results in a better received signal strength for all users [8]. Because the number of served users of some DSCs increases, these DSCs should be positioned in the center of the new user crowds to satisfy the users' communication quality.
KMC requires the number of clusters K to be specified in advance. After the F DSCs are partitioned by the APIM scheme of the LDM framework, the number of DSCs $|D_{APC,i}|$ in group i is known. Fig. 5(a) shows the scenario of three DSCs (i.e., drone 1, drone 2, and drone 3) in group i, so $|D_{APC,i}| = 3$ in the i-th group. When drone 2 is switched to the sleeping mode, the number of active small cells of drones in group i is $|D_{APC,i}| - 1$, as shown in Fig. 5(b). The covered users $U_{APC,i}$ in group i need to be served by the active DSCs $D_{APC,i,act}$. As a result, we can specify in advance that the number of clusters K is equal to $|D_{APC,i}| - 1$.
KMC can partition a data set into K clusters according to the distance from each data point to the cluster center [34]. We assume that the data points are the horizontal coordinates of the covered users in group i. The most common initialization method is to randomly select K samples from the data points of the i-th group [32]. Here we instead select the current horizontal coordinates of $D_{APC,i}$ as the initial cluster centers in order to reduce the convergence time of the KMC algorithm.
Next, the KMC algorithm enters the iteration stage as follows.
1) Calculate the similarity, i.e., the distance from each horizontal coordinate of the covered user $L_{n,i}$ to the cluster center $\mu_{i,j}$, where $\mu_{i,j}$ is the centroid of cluster j (denoted by $C_j$). Then, each $L_{n,i}$ is assigned to the cluster whose centroid yields the smallest within-cluster sum of squares [35]:
$$C_j = \left\{ L_{n,i} : \| L_{n,i} - \mu_{i,j} \|^2 \le \| L_{n,i} - \mu_{i,p} \|^2 \right\},$$
where $\forall p$, $1 \le p \le K$, and $j \neq p$. $\|\cdot\|^2$ is the squared Euclidean norm.
2) Calculate the centroids for the next iteration based on the covered users assigned to each cluster in the first step [35]. Then, we have
$$\mu_{i,j} = \frac{1}{|C_j|} \sum_{L_{n,i} \in C_j} L_{n,i}.$$
If the assignments of $L_{n,i}$ to the clusters do not change, the iteration stops. The goal of the KMC algorithm is to minimize
$$\sum_{j=1}^{K} \sum_{L_{n,i} \in C_j} \| L_{n,i} - \mu_{i,j} \|^2.$$
After the partitioning, the cluster center $\mu_{i,j}$ of cluster j in group i can be found. As illustrated in Fig. 6(b), the cluster center $\mu_{i,2}$ is central to user cluster 2. We can define the cluster centers as the new coordinates of the DSCs so that the edge users get better communication quality. We find the DSC f that has the closest distance to the cluster center $\mu_{i,j}$. Then, we have
$$f = \arg\min_{f \in D_{APC,i,act}} \| D_f - \mu_{i,j} \|,$$
where $D_f$ is the current position of DSC f. The specified DSC f can fly to the position of the cluster center $\mu_{i,j}$ to satisfy the user communication quality. Algorithm 2 shows the pseudocode for the KMPA scheme.
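The two KMC steps and the selection of the closest DSC can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the initial centers are the horizontal positions of the active DSCs, an empty cluster keeps its previous center, and all names, coordinates, and drone identifiers are hypothetical.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Lloyd iterations on 2D points, starting from the given centers."""
    centers = [tuple(c) for c in centers]
    assign = None
    for _ in range(max_iter):
        # Step 1: assign each user to the closest cluster center.
        new_assign = [
            min(range(len(centers)),
                key=lambda j: (p[0] - centers[j][0]) ** 2
                              + (p[1] - centers[j][1]) ** 2)
            for p in points
        ]
        if new_assign == assign:
            break  # assignments unchanged, so the iteration stops
        assign = new_assign
        # Step 2: recompute each centroid from its assigned users.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return centers, assign

def nearest_drone(center, drone_positions):
    """Identifier of the active drone closest to a cluster center."""
    return min(drone_positions,
               key=lambda f: math.hypot(drone_positions[f][0] - center[0],
                                        drone_positions[f][1] - center[1]))

# Hypothetical group: drone 2 sleeps, so K = 2 and drones 1 and 3 stay active.
users = [(0.0, 0.0), (0.0, 2.0), (8.0, 0.0), (8.0, 2.0),
         (20.0, 0.0), (20.0, 2.0)]
active = {1: (0.0, 1.0), 3: (20.0, 1.0)}
centers, assign = kmeans(users, [active[1], active[3]])
mover = nearest_drone(centers[0], active)
```

In this toy case the users of the sleeping drone join the first cluster, whose center moves to (4.0, 1.0), and drone 1 is the closest active drone to that center.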

VIII. SIMULATION RESULTS
Now we introduce the simulation platform used to evaluate the performance of the proposed APIM and KMPA schemes in the LDM framework. Fig. 1 shows the considered scenario, where the drones' small cells are assumed to be maintained at h = 10 m [2] and can change their locations depending on the assigned service area. The moving speed of the group mobile users is about 3 m/s. A small cell can operate in the sleeping mode or the active mode [26]. Table I shows the simulation parameters [20], [28], [36]. Fig. 6 compares the energy efficiency of APIM with that of the conventional K-means clustering with power control and the always-on case. The number of DSCs is 12. APIM can adjust the transmission power to 0 (the sleeping mode), 0.1 W, or 0.5 W. The K-means clustering, based on the negative squared Euclidean distance, divides the DSCs into K clusters and finds K cluster centers. If the final cluster center is not the real location of a deployed DSC in a cluster, we choose the closest DSC to execute power control (i.e., the sleeping mode, 0.1 W, or 0.5 W). In the sleeping mode, the LDM will initiate KMPA and APIM. From the figure, we have the following observations. 1) Compared to the always-on case, the APIM scheme can improve energy efficiency by 25%, 20%, 14%, and 10% when transmission power is equal to 0 W, 0.1 W, and 0.5 W, respectively. 2) The performance of the K-means clustering with power control is worse than that of the proposed APIM. This is because the final cluster center may not be the real location of a DSC and the similarities are set as the negative squared Euclidean distance. 3) In the sleeping mode, APIM with KMPA and APIM without KMPA achieve similar energy efficiency. The advantage of KMPA is to maintain the minimum user throughput, which will be discussed next. Besides, we also consider the energy consumption for supporting the flying of drones. Our approach includes adjusting the transmit power of small cells and the positions of the drones.
According to [8], the energy consumption of a drone flying at a speed below 8 m/sec is similar to the energy consumption of a hovering drone. In [8], the highest drone energy consumption is 150 W. In our simulations, users are assumed to move at a speed of about 3 m/sec (10.8 km/hr), which is smaller than 8 m/sec. Thus, the energy consumption for supporting the flying of drones can be considered a fixed value in the case that a flying drone is designed to serve the ground users. If the transmission energy consumption from small cells is considered, the basic circuit power consumption $P_0$ is equal to 156.8 W in our case.
Our proposed power control can improve the energy efficiency of drone small cells by 25% when only the communication module is considered, and by about 4% when both the communication module and the flying mechanical module are considered. This observation implies two things. First, reducing the energy consumption for supporting the flying of drones is a key to the success of drone small cells, which is worth investigating further. Second, the power system of drone small cells should separate the communication module from the flying mechanical module. Fig. 7 shows the minimum user throughput of APIM in three different modes: low transmission power 0.1 W, the pure sleeping mode, and the sleeping mode with KMPA. We change the number of users per DSC from three to seven. We observe the following phenomena: 1) In the sleeping mode, APIM without and with KMPA can improve the minimum user throughput by 190% and 310%, respectively, compared to the APIM scheme with transmission power 0.1 W. 2) The APIM with KMPA scheme can deliver a 56% improvement in terms of the minimal user throughput compared to the APIM without KMPA scheme. 3) A DSC with a higher number of users has a smaller per-user throughput because all DSCs share the same bandwidth. In addition, we further evaluate the link reliability of the served users for data transmission, where the link reliability requirement $L_{rel}$ is set to 90%. Table II presents the link reliability of the LDM mechanism with the APIM and KMPA schemes in comparison with the other methods. Firstly, when the SINR constraint requirement is $\Gamma_{th} = -1$ dB, the link reliability of the DSC in the always-on case can achieve 90%. Secondly, for $\Gamma_{th} = 3$ dB, the link reliability of the joint KMPA scheme and APIM scheme with the sleeping mode can achieve 96%. Thus, our proposed method can improve the system energy efficiency and achieve the link reliability requirement for each served user.
Next, we compare the performance of our proposed LDM framework (i.e., the joint KMPA scheme and APIM scheme with the sleeping mode) with the exhaustive search algorithm (i.e., the optimal solution) in terms of the normalized execution time and the normalized energy efficiency. First, the execution time is normalized to the maximum execution time of the exhaustive search algorithm. The normalized execution time versus the ratio of the number of users to the number of DSCs is shown in Fig. 8. As this ratio increases, the execution time of the exhaustive search algorithm grows dramatically, whereas the execution time of our proposed method remains about the same. Second, the energy efficiency is normalized to the maximum energy efficiency of the exhaustive search algorithm. The normalized energy efficiency versus the ratio of the number of users to the number of DSCs is shown in Fig. 9. Relative to the exhaustive search algorithm, our proposed method achieves a normalized energy efficiency of 96% when the ratio of the number of users to the number of DSCs is equal to 7. However, the exhaustive search algorithm requires labeled energy efficiency data. Our proposed LDM mechanism with unsupervised learning can perform power control and determine the positions of drones without labeled data in a temporarily dispatched DSC network.
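The normalization used in this comparison can be sketched as follows: both curves are divided by the maximum value of the exhaustive search curve, so the reference curve peaks at 1. The function name and all measurement values below are hypothetical placeholders and are not the data behind Fig. 8 or Fig. 9.

```python
# Illustrative sketch only. All measurement values are hypothetical.

def normalize_by_reference_max(values, reference_values):
    """Divide each value by the maximum of the reference (exhaustive search) curve."""
    peak = max(reference_values)
    if peak <= 0:
        raise ValueError("reference maximum must be positive")
    return [v / peak for v in values]


# Hypothetical execution times (seconds) for user-to-DSC ratios 3..7.
exhaustive_time_s = [0.2, 0.9, 3.5, 14.0, 56.0]
proposed_time_s = [0.05, 0.05, 0.06, 0.06, 0.06]

norm_exhaustive_time = normalize_by_reference_max(exhaustive_time_s, exhaustive_time_s)
norm_proposed_time = normalize_by_reference_max(proposed_time_s, exhaustive_time_s)

# Hypothetical energy efficiencies (bits per joule) for the same ratios.
exhaustive_ee = [5.0e4, 4.8e4, 4.6e4, 4.5e4, 4.4e4]
proposed_ee = [4.6e4, 4.5e4, 4.3e4, 4.2e4, 4.1e4]

norm_exhaustive_ee = normalize_by_reference_max(exhaustive_ee, exhaustive_ee)
norm_proposed_ee = normalize_by_reference_max(proposed_ee, exhaustive_ee)

print(norm_proposed_time)
print(norm_proposed_ee)
```

Normalizing both methods by the same reference maximum keeps the two curves directly comparable on one axis, regardless of the absolute units of the measurements.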

IX. CONCLUSION
In this paper, we presented a machine learning-based multiple drone management (LDM) framework to manage inter-DSC interference and energy consumption. The proposed LDM consists of two key learning techniques: affinity propagation and K-means. The former is designed to mitigate the inter-DSC interference, while the latter is developed to determine the positions of DSCs for enhancing the signal quality. Our simulation results show that the proposed LDM framework can enhance the energy efficiency of DSCs by 25% and the SINR of ground users by 56%. For temporary and dynamic DSC networks without labeled data (e.g., system throughput), the hidden information in instantly observed data can be found by the affinity propagation unsupervised clustering algorithm. In principle, the RSRP of each ground user is utilized to find the hidden mutual interference structure. Furthermore, the K-means clustering algorithm is adopted to find the hidden mutual distance structure of the locations of the served users. Based on these two learning features, our proposed framework can be applied to temporary wireless network architectures. In the future, the proposed LDM framework will be extended to investigate the height and power of the DSC and the corresponding cross-tier interference in integrated terrestrial and drone base station networks.