UAV cooperative search in dynamic environment based on hybrid-layered APF

Unmanned aerial vehicle (UAV) detection offers flexible deployment and avoids casualties, and it has become a force that cannot be ignored on the battlefield. Scientific and efficient mission planning can help improve the survival rate and mission completion rate of UAV search in dynamic environments. To address the mission planning problem of UAV collaborative search for multiple types of time-sensitive moving targets, a search algorithm based on hybrid-layered artificial potential fields (HL-APF) is proposed. The method consists of two parts: a distributed artificial potential field algorithm and a centralized layered algorithm. In the improved artificial potential field (IAPF), a new target attraction field function, segmented by search distance, is used to quickly search for dynamic targets. Moreover, to prevent a UAV from repeatedly searching the same area within a short time interval, a search repulsion field generated by the UAV search path is proposed. In addition, to search for unknown targets and improve area coverage, a centralized layered scheduling algorithm controlled by the cloud server (CS) is added. The CS divides the mission area into several sub-areas and allocates UAVs according to a priority function based on the search map. The CS activation mechanism makes full use of prior information, and the UAV assignment cool-down mechanism avoids repeated assignment of the same UAV. Simulation results show that, compared with the hybrid artificial potential field and ant colony optimization (HAPF-ACO) algorithm and with IAPF, HL-APF significantly improves the number of targets found and the mission area coverage. Moreover, comparative experiments on the CS mechanisms demonstrate the necessity of CS activation and cool-down for improving search performance. Finally, the robustness of the method under the failure of some UAVs is verified.

UAV collaborative search refers to the process in which multiple UAVs detect targets and reduce regional uncertainty. Mission planning is the key to improving search efficiency and UAV survival rate in UAV collaborative search [5][6][7][8][9]. UAV collaborative mission planning methods can be divided into two types: centralized control and distributed control [10]. Centralized optimization approaches are simple to operate and can reach a global optimum. Max-sum [11] is a classic centralized control algorithm that can be used to solve the UAV collaborative mission planning problem. It uses message passing and can be configured to work in an approximate mode. The main drawback of max-sum is that it cannot adapt to large-scale UAV swarms or to online re-planning in dynamic environments. The genetic algorithm (GA) [12,13] is another common algorithm. Following biological evolution theory, it uses mechanisms such as selection, evolution, and mutation to improve the fitness of individuals. Jia et al. [14] proposed an improved GA to solve the multi-constrained task assignment problem of heterogeneous UAVs. Ramirez Atencia et al. [15] presented a new multi-objective genetic algorithm for solving complex mission planning problems involving a team of UAVs and a set of ground control stations. However, none of them considered online re-planning in dynamic environments.
Distributed planning algorithms [16][17][18] are capable of parallel computing, which means that UAVs can dynamically join and exit. Thus, this type of planning algorithm is highly robust and can be applied to dynamic environments. The auction-based method is a classic distributed algorithm. The task assigner and the receivers respectively auction and bid on the task according to their own revenue functions and bidding strategies. However, the bidding negotiation overhead consumes more time and computational resources than other approaches. Swarm intelligence methods such as ant colony optimization (ACO) [19][20][21] have also been used to solve the UAV collaborative search problem. Zhen et al. [22][23][24] proposed an intelligent cooperative mission planning scheme, HAPF-ACO, for a UAV swarm to search for and attack time-sensitive moving targets in an uncertain dynamic environment. However, this method can easily fall into a local optimum and cannot cover the whole area. Alotaibi [25] considered the use of a team of multiple UAVs to accomplish a search and rescue (SAR) mission. In the context of search and rescue, the locust-inspired approach for multi-UAV task allocation (LIAM) [26,27] and the layered search and rescue (LSAR) method are studied separately. LIAM is a distributed architecture, and LSAR is a centralized architecture. A comparison of the two methods shows that LSAR achieves better results when the UAV swarm is small, while LIAM performs better when the UAV swarm is larger.
In this paper, we propose a hybrid control architecture to combine the advantages of both centralized and distributed architectures. Aiming at the UAV search problem in dynamic environments, this paper considers multiple types of targets, and proposes a search algorithm based on a hybrid layered artificial potential fields algorithm.
The main contributions of this article are: 1. An improved distributed APF algorithm, including a segmented target attraction field function adapted to distributed multi-target multi-UAV search tasks and a search repulsion field generated by the UAV search path to avoid repeated searches within short time intervals. 2. A centralized CS scheduling algorithm, including a new type of search map that helps the CS evaluate the search situation in the sub-areas and a priority function that ranks the scheduling priorities of the sub-areas. In addition, mechanisms such as CS activation and UAV assignment cool-down are added, which effectively improve the scheduling efficiency of the CS. 3. An HL-APF algorithm based on a hybrid control architecture, in which the centralized CS scheduling algorithm optimizes globally to improve the overall area coverage and discover unknown targets, while the distributed IAPF algorithm optimizes locally to increase the coverage of each sub-area and discover targets with prior information.

Methods
In this section, the system model, the improved APF algorithm, the CS scheduling and the UAV decision are introduced in turn.

System model
The mission of dynamic target collaborative search with a UAV swarm can be described as follows: there is a task area D ∈ L^2 which contains N_T dynamic moving targets and N_p unknown threats. According to the prior information, the target information, including position, speed and direction, is partially known. A swarm of UAVs connected through a cloud server accomplishes the mission. The UAVs in the swarm are homogeneous, that is, each UAV has the same functional and performance constraints. As shown in Fig. 1, the mission area is discretized into a grid, and the length and width of each grid cell are L_x and L_y respectively. The five-pointed star represents a target and the filled circle represents a threat. The detection range of a UAV is modeled as a circle with detection radius R. Targets and threats that appear within the detection range of a UAV can be detected by that UAV. Assuming that the maximum turning angle of the UAV is ϕ_max, the possible positions of the UAV at the next moment are marked in dark.
The purpose of collaborative search by a UAV swarm is to find as many targets as possible under the given constraints. The target search benefit J_T can be defined as: where N_T is the total number of targets, a_ik indicates whether the target is detected, and v_i indicates the value of the target detected by the UAV. Define the area search benefit J_E as: where the value of grid(m,n) can be 0 or 1. If grid(m,n) has been searched by moment k, grid(m,n)(k) is 1; otherwise it is 0. The area search benefit thus expresses the search coverage of the mission area.
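The two benefits can be illustrated with a minimal Python sketch. It assumes that J_T is the sum of the values of the detected targets and that J_E is the fraction of searched grid cells; these forms are inferred from the symbol definitions above rather than taken from the original equations, and the function names are hypothetical.

```python
import math

def mark_searched(searched, uav_cell, radius):
    """Set searched[m][n] = 1 for every cell within `radius` cells of the UAV."""
    rows, cols = len(searched), len(searched[0])
    um, un = uav_cell
    r = int(math.ceil(radius))
    for m in range(max(0, um - r), min(rows, um + r + 1)):
        for n in range(max(0, un - r), min(cols, un + r + 1)):
            if math.hypot(m - um, n - un) <= radius:
                searched[m][n] = 1

def area_search_benefit(searched):
    """Assumed form of J_E: share of cells whose value is 1 (coverage in [0, 1])."""
    total = sum(len(row) for row in searched)
    return sum(sum(row) for row in searched) / total

def target_search_benefit(detected, values):
    """Assumed form of J_T: sum of the values v_i of the detected targets."""
    return sum(v for d, v in zip(detected, values) if d)
```

With the scenario used later in the paper (1 km cells and a 3 km detection distance), `radius` would be 3 cells.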
The constraints are: where C_m is the maximum turning angle constraint of the UAV, which is determined by its maneuverability: the turning angle of the UAV at moment k must not exceed ϕ_max. C_c is the minimum safe distance constraint between the UAVs. The UAVs need to maintain a reasonable safe distance to avoid collisions; d_min is the minimum safe distance between the UAVs, and d_ij(k) is the distance between the i-th UAV and the j-th UAV at moment k. C_b is the boundary constraint: (x, y) are the coordinates of the UAV in the mission area, and the UAV cannot fly out of the mission area during the mission. C_r is the obstacle avoidance constraint: R_l is the threat radius, and d_il(k) is the distance between the i-th UAV and the l-th threat. For multi-UAV mission planning, the optimization model is: where ω_1 and ω_2 are the weight coefficients of the target search benefit and the area search benefit respectively. The values of the coefficients are determined by the specific combat mission. When ω_2 is 0, the mission is a pure target mission, such as subsequent continuous monitoring or attack; when ω_1 is 0, it is a scan search mission. X_i and ∼X_i represent the status of the UAV and of its communicable UAVs respectively. The target search benefit and the area search benefit J_E^i(X_i, ∼X_i) are evaluated on these states, and the optimization is subject to the above-mentioned constraints.
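As a rough illustration, the four constraints can be checked for a candidate UAV state as follows. This is a sketch under stated assumptions: headings are given in radians, C_r is taken to mean that the UAV must stay strictly outside every threat radius, and the function and parameter names are hypothetical.

```python
import math

def heading_change(prev_heading, new_heading):
    """Smallest absolute angle (radians) between two headings."""
    diff = (new_heading - prev_heading + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)

def is_feasible(pos, heading, prev_heading, others, threats, size, phi_max, d_min):
    """Check C_m, C_c, C_b and C_r for one candidate position and heading.

    others:  positions (x, y) of the other UAVs
    threats: tuples (x, y, R_l) of known threats
    size:    (width, height) of the mission area in grid cells
    """
    x, y = pos
    if heading_change(prev_heading, heading) > phi_max + 1e-9:            # C_m
        return False
    if any(math.hypot(x - ox, y - oy) < d_min for ox, oy in others):      # C_c
        return False
    if not (0 <= x < size[0] and 0 <= y < size[1]):                       # C_b
        return False
    if any(math.hypot(x - tx, y - ty) <= r for tx, ty, r in threats):     # C_r
        return False
    return True
```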

Improved APF algorithm
The basic idea of the artificial potential field [28,29] is to regard the movement of the UAV in the task space as motion under force in a virtual force field. The target point forms an attraction field, which generates an attractive force that draws the UAV towards it. A no-fly zone or threat forms a repulsive field, which generates a repulsive force that pushes the UAV away from the threat, and the UAV moves under the combined effect of the attractive and repulsive forces. This section introduces the search algorithm based on the improved artificial potential field (IAPF), which mainly includes the target attraction field, the search repulsion field and the other repulsive fields.

Target attraction field
Based on the APF method, the target generates an attraction field that attracts the UAV. The basic attraction field has a linear relationship with the distance between the UAV's current position and the target. In a multi-target mission, this attraction field causes the UAV to oscillate between targets. For this reason, the target attraction field is redefined as follows: where k_i,att is the magnitude of the target attraction field, which is proportional to the value of the i-th target. v_j is the unit vector of the movement direction of the j-th UAV. d_i,j is the distance between the i-th target and the j-th UAV. The inner field range, which arises from the uncertainty of the target position, can be set to the UAV detection radius R. Within this range, the direction of the target attraction field acting on the UAV is determined by the UAV's own motion direction, and the field's amplitude is fixed. When the UAV enters this range, it does not need to change its flight direction and can continue to search the possible location of the target. x_i,j is the unit vector of the direction from the UAV to the target. L_i,max is the maximum range of the target attraction field, which can be set as: where N_max is the maximum number of iterations and N_T,known is the number of known targets. Setting the maximum range not only reduces the computational complexity effectively, but also helps the UAV focus on possible unknown targets. The action of the target attraction field is shown in Fig. 2. When the possible location of the target is within the detection range of the UAV, the UAV can maintain its current flight direction and complete the search. This also prevents the UAV from receiving an opposing force just after it flies past the target point, which would force it to turn around. When the distance exceeds the maximum range, the UAV searches for other targets instead.
In other cases, the UAV receives an attractive force in the direction of the potential target, which drives the UAV to fly toward the target. Compared with algorithms that use a target probability map, the method in this paper does not need to store and update the probability map, which reduces the computational complexity while maintaining search efficiency.
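The segmented behaviour described above can be sketched in Python. The three regions follow the text; the constant magnitude k_att between the inner range and the maximum range is an assumption made for illustration, since the original piecewise formula is not reproduced here, and the names `r_unc` and `l_max` are hypothetical stand-ins for the inner range and L_i,max.

```python
import math

def target_attraction(uav_pos, uav_heading, target_pos, k_att, r_unc, l_max):
    """Attraction (fx, fy) exerted on one UAV by one target.

    uav_heading: unit vector v_j of the UAV's current motion direction
    r_unc:       inner range caused by target uncertainty (e.g. detection radius R)
    l_max:       maximum range of the attraction field
    """
    dx = target_pos[0] - uav_pos[0]
    dy = target_pos[1] - uav_pos[1]
    d = math.hypot(dx, dy)
    if d > l_max:
        # Beyond the maximum range the target exerts no attraction.
        return (0.0, 0.0)
    if d <= r_unc:
        # Inside the uncertainty range: fixed amplitude along the UAV's own heading,
        # so the UAV keeps flying straight instead of turning back.
        return (k_att * uav_heading[0], k_att * uav_heading[1])
    # Otherwise pull toward the target along the unit vector x_ij
    # (constant magnitude is an assumption of this sketch).
    return (k_att * dx / d, k_att * dy / d)
```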

Search repulsive field
To avoid repeated searches within a short time interval, the related work [24] uses the ant colony algorithm, which reduces the pheromone on the searched path according to fixed rules in order to increase the probability that the UAVs fly to undetected areas. Each iteration needs to update the pheromone of the entire task area, which leads to high computational complexity. Based on the artificial potential field algorithm, this paper designs a search repulsion field: the UAV leaves a series of search repulsion fields along its search path. Further, in order to adapt to the uncertainty of the moving targets, the search repulsion field decays with time. This method reduces the amount of calculation while maintaining search efficiency. Define the k-th search repulsion field center position left by the j-th UAV as: where N_srp is the time interval constant of the search repulsion field, which can be set as . Loc_uav(j, k) indicates the position of the j-th UAV at time k. Define the search repulsion field between the i-th search repulsion field center and the j-th UAV as: where k_srp is the search repulsion field constant. L_srp1 and L_srp2 are the uncertainty range and the maximum range of action respectively. β is the time decay factor, which represents the uncertainty of the environment. The larger β is, the more likely the UAV is to perform a second search in the same area.
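A possible implementation of this idea is sketched below. The exponential decay exp(-β Δk) is an assumed form, chosen only because it matches the statement that a larger β makes a repeated search more likely; the inner range L_srp1 is omitted for brevity, and all function names are hypothetical.

```python
import math

def drop_centers(path, n_srp):
    """Keep every n_srp-th position of a UAV path as a repulsion centre, with its time step."""
    return [(k, pos) for k, pos in enumerate(path) if k % n_srp == 0]

def search_repulsion(uav_pos, now, centers, k_srp, l_srp2, beta):
    """Sum of time-decayed repulsion forces from past search positions."""
    fx = fy = 0.0
    for k, (cx, cy) in centers:
        dx, dy = uav_pos[0] - cx, uav_pos[1] - cy
        d = math.hypot(dx, dy)
        if d == 0.0 or d > l_srp2:
            # No defined direction at the centre, and no effect beyond the maximum range.
            continue
        weight = k_srp * math.exp(-beta * (now - k))  # assumed decay law
        fx += weight * dx / d
        fy += weight * dy / d
    return fx, fy
```

Because only a short list of centres is stored per UAV, each update touches far fewer values than a pheromone map covering the whole task area.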

Other repulsive field
According to reference [24], the threat repulsive field between the i-th threat and the j-th UAV is designed as: where k_trp is the threat repulsive field constant, d_max,i is the radius of action of the i-th threat, and d_0 is the minimum safe distance. The repulsive field between the i-th and j-th UAVs is designed as: where b and c are determined. The magnitude of the repulsion between the UAVs depends only on the distance between them.

CS scheduling
In the first stage of the mission, the improved artificial potential field algorithm helps the UAVs quickly search for the targets with prior knowledge (such as the start location), and the target search benefit J_T increases quickly. In the later stage, when most of the targets with prior knowledge have been detected, the search efficiency of the UAV swarm drops. To solve this problem, a layered scheduling algorithm run by the cloud server is added.

Partitioning
Following the idea of reference [25], the task area is partitioned: the entire mission area is divided into N × N sub-areas, and the length and width of each sub-area are L/N.

Search map
The CS manages the sub-areas by storing search maps in order to make assignment decisions. Define the basic search map as: where la_i,j = 1 if grid(i, j) has been covered, otherwise la_i,j = 0.
Here, because the targets are dynamic, some areas may need to be searched twice. After each iteration, the values of the searched area therefore decay with time. Meanwhile, in order to distinguish the searched area from the unsearched area, an infimum must be set for the searched area. Moreover, there are no-fly zones in the area, and the search map value of a detected no-fly zone is set to a constant. This value is recommended to lie between the infimum and 1, which is adapted to the UAV's search of the area. In addition, in order to avoid repeated assignment of the same location, the future flight target point of the UAV is also regarded as a detected area. In summary, the improved search map is defined as Algorithm 1, where τ is the attenuation coefficient, used to characterize the dynamic environment, and γ is the threat search map constant. UAV_m is the m-th UAV in the CS. UAV_m_Detect_Range is the detection range of UAV_m, and UAV_m_Assign_Detect_Range is its detection range when UAV_m arrives at the designated place. Threat_m is the m-th known threat of the CS, and Threat_m_Range is the range of Threat_m. The CS obtains the global search coverage of the UAV swarm by storing and updating the search map. Therefore, the CS can take the overall situation into account and help the UAV swarm escape local optima, thereby improving the overall coverage.
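One iteration of such a search map update might look like the following Python sketch. It is not a reproduction of Algorithm 1: the multiplicative decay by τ and the order of the steps are assumptions, and the function and argument names are hypothetical.

```python
def update_search_map(s_map, detected_cells, assigned_cells, threat_cells,
                      tau, infimum, gamma):
    """Update the CS search map in place for one iteration.

    detected_cells: cells inside the current detection range of any UAV
    assigned_cells: cells inside the detection range at each UAV's assigned destination
    threat_cells:   cells inside the range of known threats
    """
    # 1. Decay previously searched cells, but never below the infimum,
    #    so they stay distinguishable from unsearched cells (value 0).
    for i, row in enumerate(s_map):
        for j, value in enumerate(row):
            if value > 0.0:
                s_map[i][j] = max(infimum, value * tau)
    # 2. Cells searched now, or about to be searched at an assigned destination, count as covered.
    for i, j in list(detected_cells) + list(assigned_cells):
        s_map[i][j] = 1.0
    # 3. Known threat cells hold a constant value gamma (between infimum and 1).
    for i, j in threat_cells:
        s_map[i][j] = gamma
    return s_map
```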

Sub-area priority
According to the priority of each sub-area, the CS selects a closer and more idle UAV for the sub-area that needs to be detected. Define the search value of the k-th sub-area, part_value_k, as the sum of the search map values of the sub-area:

$$\mathrm{part\_value}_k = \sum_{\mathrm{grid}(i,j)\,\in\,\mathrm{part}_k} \mathrm{s\_map}(i,j) \qquad (12)$$

The smaller part_value_k is, the more the sub-area needs to be detected. The idle state of each UAV is defined by the amplitude of its potential field force: the more force the UAV receives, the busier the UAV. Based on the above, define the k-th sub-area priority as: where w_p0, w_p1 and w_p2 are the search map, distance and field coefficients respectively. d_i,k is the distance between the i-th UAV and the center of the k-th sub-area. F_i represents the magnitude of the potential field force on the i-th UAV. The CS traverses all sub-areas to find the minimum value of the priority. It should be noted that the priority also implies the assigned UAV.
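The selection of a sub-area and its UAV can be sketched as below. The weighted sum w_p0 · part_value_k + w_p1 · d_i,k + w_p2 · F_i is an assumed form of the priority, inferred from the coefficient names; sub-areas are indexed in row-major order, and the function names are hypothetical.

```python
import math

def part_value(s_map, n_parts, k):
    """Eq. (12): sum of the search map values inside sub-area k."""
    size = len(s_map) // n_parts
    r0, c0 = (k // n_parts) * size, (k % n_parts) * size
    return sum(s_map[i][j] for i in range(r0, r0 + size) for j in range(c0, c0 + size))

def part_center(grid_size, n_parts, k):
    """Centre coordinates of sub-area k."""
    size = grid_size // n_parts
    return ((k // n_parts) * size + size / 2, (k % n_parts) * size + size / 2)

def best_assignment(s_map, n_parts, uavs, w_p0, w_p1, w_p2):
    """Return (priority, sub_area, uav_index) with the minimum priority.

    uavs: list of ((x, y), F_i) pairs, F_i being the field magnitude on UAV i.
    """
    best = None
    for k in range(n_parts * n_parts):
        pv = part_value(s_map, n_parts, k)
        cx, cy = part_center(len(s_map), n_parts, k)
        for i, ((x, y), force) in enumerate(uavs):
            priority = w_p0 * pv + w_p1 * math.hypot(x - cx, y - cy) + w_p2 * force
            if best is None or priority < best[0]:
                best = (priority, k, i)
    return best
```

Minimizing over both k and i in one pass reflects the remark that the priority also implies the assigned UAV.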

CS control mechanisms
The CS control mechanism includes two aspects. The first is the CS activation mechanism. In the initial stage of the mission, in order to make maximum use of the prior information on the targets, the UAVs search for targets with the IAPF algorithm, while the CS does not participate in the control of the UAVs and remains inactive. When most of the known targets have been found, the CS becomes active and participates in the control of the UAVs. We define CS_ACT as the activated state of the CS. The second is the cool-down mechanism. When a UAV arrives at the center of its designated sub-area during CS scheduling, it needs to search the sub-area autonomously to improve the search efficiency of the sub-area. Uninterrupted CS scheduling would cause the UAV to fly back and forth between the center points of the sub-areas while ignoring other locations within each sub-area. Therefore, the CS needs a cool-down mechanism for the scheduling of each UAV. During the cool-down time, the UAV is in the unready state and cannot be controlled by the CS.

Flow chart of CS
Based on the above design, the flow of the CS scheduling is shown in Fig. 3. In order to make full use of the target prior information, CS_ACT = 1 when the number of undetected targets with prior information is less than or equal to one. A UAV is defined to be in the ready state if it is not currently assigned and has not been assigned during the preceding cool-down iterations. The purpose of the assignment cool-down period is to let the assigned UAV search the target sub-area effectively after it arrives there. When the number of UAVs in the ready state is greater than zero, the CS finds the minimum priority over the sub-areas and sets the CS_flag of the assigned k-th UAV to true.
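The flow can be summarized in a short Python sketch. It assumes that the cool-down window is counted from the iteration at which a UAV's CS_flag was last cleared; the callback `pick`, which stands in for the priority-based selection, and the other names are hypothetical.

```python
def cs_active(known_detected):
    """CS_ACT = 1 once at most one target with prior information remains undetected."""
    return sum(1 for d in known_detected if not d) <= 1

def ready_uavs(cs_flag, released_at, now, n_cool):
    """Indices of UAVs that are unassigned and outside their cool-down window."""
    return [i for i, busy in enumerate(cs_flag)
            if not busy and (released_at[i] is None or now - released_at[i] > n_cool)]

def schedule_step(known_detected, cs_flag, released_at, now, n_cool, pick):
    """One CS scheduling step; returns (uav, sub_area) or None if nothing is assigned.

    pick(ready) must return the chosen (uav_index, sub_area) among the ready UAVs.
    """
    if not cs_active(known_detected):
        return None
    ready = ready_uavs(cs_flag, released_at, now, n_cool)
    if not ready:
        return None
    uav, sub_area = pick(ready)
    cs_flag[uav] = True  # the UAV now flies toward the centre of sub_area
    return uav, sub_area
```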

Hybrid-layered APF
In summary, this paper proposes the hybrid-layered APF method, which combines a distributed artificial potential field algorithm with a centralized layered algorithm scheduled by the CS. Under the hybrid control architecture, the UAV not only conducts self-organized search but is also scheduled by the CS. When it is under CS command, the UAV flies as directly as possible to the center of the k-th sub-area that needs to be detected, while meeting the constraints. Assuming the current grid is s_i, the transition rule when CS_flag is true is designed as: where s_j is the next grid, chosen from the candidate grid set that meets the constraints, and d_jk is the distance between s_j and the center of the k-th sub-area.
When the UAV reaches the center of the designated sub-area or finds that the center of the sub-area is within the no-fly zone, CS_flag is set to 0. In this situation, the UAV will enter the cool-down state and will not be assigned by CS in the next N cool iterations. Meanwhile, the UAV will make decisions based on the IAPF.
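A minimal sketch of the CS-commanded transition is given below. It assumes an 8-connected grid in which the 45° maximum turning angle allows at most one heading step per move, and it selects the candidate grid with the smallest distance d_jk to the sub-area center; the names `MOVES`, `candidate_grids` and `cs_transition` are hypothetical.

```python
import math

# Eight headings on the grid, 45 degrees apart, starting from +x and turning counter-clockwise.
MOVES = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def candidate_grids(pos, heading_idx, size, blocked=()):
    """Neighbours reachable with at most one 45-degree turn, inside the area and not blocked."""
    out = []
    for turn in (-1, 0, 1):
        h = (heading_idx + turn) % 8
        nxt = (pos[0] + MOVES[h][0], pos[1] + MOVES[h][1])
        if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in blocked:
            out.append((nxt, h))
    return out

def cs_transition(pos, heading_idx, center, size, blocked=()):
    """Pick the candidate (grid, heading) whose grid is closest to the sub-area centre."""
    cands = candidate_grids(pos, heading_idx, size, blocked)
    if not cands:
        return None
    return min(cands, key=lambda c: math.hypot(c[0][0] - center[0], c[0][1] - center[1]))
```

Here `blocked` can hold the cells excluded by the threat and collision constraints; once the returned grid equals the sub-area center, CS_flag would be cleared and the cool-down would start.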
When CS_flag is 0, the UAV obtains the total field of the current grid and of the surrounding grids. When the field directions of the current grid and its surrounding grids are consistent, there may be targets or no-fly zones nearby, and the UAV makes its decision based on the direction of the field. Otherwise, the UAV keeps its current flight direction. The UAV decision under IAPF is designed as: where θ_j is the angle between the path from the current grid to the candidate grid s_j and the direction of the potential field at the current grid. std(F_i) is the standard deviation of the field direction among the current grid and the surrounding grids.
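The decision logic can be sketched as follows. This is an illustration under assumptions: the consistency test uses a circular standard deviation compared with a hypothetical threshold `std_max`, and a consistent field leads to the candidate heading with the smallest angle θ_j to the field direction.

```python
import math

def circular_std(angles):
    """Circular standard deviation (radians) of a list of directions."""
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    r = min(max(math.hypot(c, s), 1e-12), 1.0)  # clamp to avoid log(0) and rounding above 1
    return math.sqrt(-2.0 * math.log(r))

def angle_gap(a, b):
    """Smallest absolute angle between two directions."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)

def iapf_decision(candidate_headings, heading, field_angle, neighbour_angles, std_max):
    """Choose the next heading among the feasible candidates.

    If the field directions around the UAV agree (std <= std_max), follow the field;
    otherwise keep the current flight direction as closely as possible.
    """
    consistent = circular_std(neighbour_angles) <= std_max
    reference = field_angle if consistent else heading
    return min(candidate_headings, key=lambda h: angle_gap(h, reference))
```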

Results and discussion
In order to verify the superiority of the HL-APF algorithm for the search problem of a UAV swarm, Python-based simulations are carried out.

Mission scenarios and parameter settings
To set up our simulation, we consider a mission area of 100 km × 100 km, discretized into 100 × 100 grids. The number of targets is 20. The targets are divided into five types, and each type has 4 targets. The target parameters are shown in Table 1.
The information on targets 13-20 is unknown, and the information on the other targets in the table is known. To verify the obstacle avoidance performance of the mission planning algorithm, 5 unknown threats are added to the above mission area. The threat information is shown in Table 2. Assume that the speed of each UAV is 100 m/s and each decision step is 10 s, so the UAV moves 1 km, that is, one grid, in a single iteration. The UAV's maximum turning angle is 45°, and the detection distance is 3 km. The relevant parameters of the HL-APF algorithm are shown in Table 3.

Algorithm comparison
Assuming that 10 UAVs perform the missions, the following three algorithms are investigated: Algorithm 1 HAPF-ACO [24]: It is based on a distributed ACO algorithm with an improved transition rule considering the range constraint, and the APF is introduced considering the dynamic updating of the TPM. The target attraction field, threat repulsive field and repulsive field are also constructed for the environmental cognition. The UAV search trajectories after 250 iterations obtained from the above three algorithms are shown in Fig. 4. The solid lines are the targets' trajectories, the dotted lines are the trajectories of the UAVs, and the "+" sign marks the location where a target was found. Furthermore, 100 simulations were performed for each algorithm, with 250 iterations per simulation. We then average the number of discovered targets and the area coverage over the 100 experiments. The final statistical results are shown in Fig. 5. It can be seen from Fig. 4 that there is no patchy unsearched area with the HL-APF method, whereas both HAPF-ACO and IAPF leave patchy unsearched areas. The final statistical results also show that, compared with HAPF-ACO and IAPF, HL-APF finds more targets and achieves a better area coverage rate in the later stage of mission execution. This indicates that adding the CS assignment mechanism greatly improves the search efficiency of the UAV swarm.

Analysis of the number of UAVs
It is assumed that the swarm is composed of 10 UAVs, 15 UAVs and 20 UAVs respectively. The UAV search trajectories using HL-APF after 250 iterations with 15 UAVs and 20 UAVs are shown in Fig. 6. Furthermore, 100 simulations were performed for each swarm size, with 250 iterations per simulation. We then average the number of discovered targets and the area coverage over the 100 experiments. The statistical results are shown in Figs. 7 and 8.
It can be seen from Fig. 7 that as the number of UAVs increases, the detection efficiency of the swarm also increases. This indicates that the method is suitable for mission planning of large-scale UAV swarms. Meanwhile, it can also be seen that the overall detection performance has an upper limit as the number of UAVs increases. When the number of UAVs is 20, the average number of targets found is 19.9 and the average area coverage is 99.8%, which means that the UAV swarm comes close to completing the task perfectly. Furthermore, it can be seen from Fig. 8 that HL-APF with 10 UAVs achieves performance close to that of HAPF-ACO and IAPF with 20 UAVs. To sum up, the empirical results confirm the superiority of the method.

Analysis of the CS mechanisms
It is assumed that the swarm is composed of 10 UAVs. In order to compare the impact on search effectiveness of the CS_ACT and cool-down mechanisms described in Sect. 2.3.4, the following three groups of controlled experiments are set up. We then average the number of discovered targets and the area coverage over the 100 experiments. The statistical results are shown in Table 4. Comparing the results of set 1 and set 2 shows that the CS_ACT mechanism helps increase the number of targets discovered while achieving similar area coverage. This is because, in the initial stage of the mission, IAPF helps the UAVs quickly approach the targets with prior information, and activating the CS at this time would reduce the search efficiency. In the later stage of the mission, when most of the targets with prior information have been detected, the activation of the CS helps increase coverage and discover unknown targets. The comparison between set 1 and set 3 demonstrates the importance of cool-down in improving mission coverage and the number of targets found. A UAV with the cool-down mechanism can fully search its sub-area and avoids constantly moving between different sub-areas.

Robustness analysis
It is also assumed that the swarm is composed of 10 UAVs. In order to analyze the robustness of HL-APF, we randomly disable some of the UAVs in the swarm at different times. The number of disabled UAVs is 4, and the failure occurs at iteration 50, 100, 150 and 200 respectively. We then average the number of discovered targets and the area coverage over 100 experiments. The final statistical results are shown in Fig. 9.
It can be seen that when some UAVs fail, the remaining swarm can still continue the search task and obtain considerable search performance, which shows that the method in this paper is robust. Furthermore, the impact on the swarm's search performance depends on when the UAVs fail. In the early stage of the mission, swarm attrition at iteration 50 significantly reduces the number of targets found and the coverage. In the later stage of the mission, the failure of some UAVs hardly affects the search. This is because in the later stages of the mission, most of the targets and areas have been