Path planning of multiple UAVs using MMACO and DE algorithm in dynamic environment

Cooperative path planning of multiple unmanned aerial vehicles is a complex task. Collision avoidance and coordination between multiple unmanned aerial vehicles is a global optimization problem. This research addresses the path planning of multi-colonies with multiple unmanned aerial vehicles in a dynamic environment. To model the whole scenario, we combine maximum–minimum ant colony optimization and differential evolution into a metaheuristic optimization algorithm. The designed algorithm addresses the deficiencies of classical ant colony optimization and maximum–minimum ant colony optimization, namely the conflict between excessive information and global optimization. In the proposed algorithm, maximum–minimum ant colony optimization is used to limit the pheromone, and only the best ant of each colony is able to construct the path. The path produced by maximum–minimum ant colony optimization is then treated as the objective for differential evolution. This makes it possible to find the best colony globally, which provides the optimal solution for the entire colony. Furthermore, the proposed approach is able to increase robustness while preserving the global convergence speed. Finally, simulation experiments are performed in a rough dynamic environment containing high peaks and mountains.


Introduction
Over the last few decades, the fields of aeronautics and astronautics have advanced considerably. The unmanned aerial vehicle (UAV) is one of the most notable inventions in this domain. [1][2][3] UAVs have been widely used for military missions, but they can also be used for rescue, surveillance and mapping applications. Moreover, UAVs can be used for long missions, such as remote sensing and operations in dangerous areas, where accurate and flexible maneuvering is required. [4] In order to enhance performance in combat scenarios and reduce the overall mission accomplishment time, a swarm of UAVs takes part. [5] Path planning is an important area of interest in the usage of multiple UAVs (M-UAVs). [6] It requires an optimal path or route from the initial position (base station) to the target position while avoiding obstacles and consuming minimum fuel. [7] Optimization problems are encountered in real-world settings in different areas such as mathematics, engineering, science and economics. The scientific algorithms used to solve these optimization problems may require more extensive calculations as the problem size grows. Thus, we need optimization techniques that require less memory and computational power but give better outcomes. Over the last three decades, scientists have developed various stochastic bio-inspired optimization algorithms, which are more accurate and efficient than analytical methods. [8][9][10][11][12] Autonomous flight control is one of the most essential features of modern UAVs, allowing them to adapt to a dynamic environment and to find the most suitable and optimal route for the mission. [13] When using M-UAVs, we also have to deal with additional problems such as simultaneous arrival and collision avoidance. One way to solve these problems is to use classical algorithms, which are quite effective but require a lot of time and calculation. [14]
On the other hand, many scientists have developed evolutionary bio-inspired algorithms such as ant colony optimization (ACO) by Colorni et al., [15] particle swarm optimization (PSO) by Kennedy and Eberhart, [16] differential evolution (DE) by Storn and Price [17] and the genetic algorithm (GA) by John Holland. [18] These algorithms are efficient, reliable and quick in finding an optimal solution that meets the mission requirement robustly. Combinations of these methods have emerged to obtain quicker convergence and superior solutions. Examples of these hybrid algorithms include those between PSO and GA [19,20] and between ACO and DE. [24][25][26] In this research, however, our main concern is ACO, which imitates the foraging behavior of real ant colonies to solve optimization problems. It consists of artificial ant colonies that work together to find the best possible solution. The ants convey information to each other using virtual pheromones. While ACO is a very effective method, as shown by its widespread usage, it has some disadvantages too. Its convergence speed is slow, and it can fall into a local optimum. Occasionally, it produces identical outcomes, which lowers the probability of obtaining the best result.
Cekmez et al. [27] present a multi-colony ACO to counter the aforementioned issues of slow convergence speed and local optima. In contrast to normal ACO, the authors suggest using multiple ant colonies. They utilize a different pheromone table for every colony in order to explore the area completely. In every iteration, each colony produces different outcomes. After that, all colonies exchange their optimal solutions with the surrounding colonies. Then, every colony modifies its pheromone table with the acquired data. Shao et al. [28] also discuss a similar path-planning problem concerning M-UAVs. They explore a distributed cooperative particle swarm optimization (DCPSO) method intended for secure navigation of each UAV.
Dewangan et al. [29] used a grey wolf optimization (GWO) technique to solve the three-dimensional (3D) path-planning problem of a UAV. The major task there is to search for the most reasonable path while avoiding collisions and other obstacles. In Gu et al., [30] M-UAVs trajectory planning is done using PSO within a receding horizon framework. Moreover, an obstacle avoidance-based approach is developed to eliminate collisions between UAVs. Furthermore, a decentralized control hierarchy is proposed to control each individual UAV. An intelligent Bézier curve-based model for path planning is proposed in Tharwat et al. [31] The main aim of that algorithm is to search for the shortest and smoothest path between the start point and the target point. In addition, a chaotic particle swarm optimization (CPSO) strategy is applied to optimize the control points of the Bézier curve.
This research is inspired by work in the previous literature. [26,27] The presented study addresses UAV path planning in a dynamic environment by taking advantage of smart bio-inspired optimization algorithms. In this research, we combine a novel maximum-minimum ant colony optimization (MMACO) with DE to form a metaheuristic hybrid algorithm. Although combining these two algorithms increases the complexity of the system, it is intended to provide a globally optimal solution for the path planning of multi-colonies with M-UAVs.
The primary achievements of this research are as follows:
1. A multi-colony optimization is set up in which the entire ant colony contains different sub-colonies, and each sub-colony works independently to explore the overall area.
2. Three different colonies are created, and each colony independently searches for the optimal or shortest distance to the target area.
3. To enhance the performance of the entire colony, each colony shares its knowledge with the others.
4. Within a colony, the best and shortest route from the initial point to the target area is found by MMACO, which updates the route and limits the number of participating ants.
5. Finally, DE determines globally which sub-colony is most suitable to reach the target area in optimal time.
The remainder of this manuscript is structured as follows. In section "Problem statement," we define the problem statement and mission requirement. Cooperative path planning, collision avoidance and coordination between UAVs are discussed in section "Cooperative path planning." It is followed by section "Hybrid MMACO-DE algorithm," in which our designed hybrid MMACO-DE algorithm is discussed. Section "Simulation results and discussions" presents the simulation results and discussion. Finally, the manuscript is concluded in section "Conclusion."

Problem statement
The M-UAVs path-planning problem is shown in Figure 1. We consider three UAVs that take off from three different locations of an entire colony. These locations are called sub-colonies 1, 2 and 3, and the UAVs head for the designated target area. Moreover, there are obstacles in the form of mountains and high rocks along the route, and the UAVs need to avoid these threats in the fly zone robustly.
Furthermore, during the whole scenario, the UAVs are required to maintain a safe distance to avoid collision and to arrive at the target area in a time-synchronized manner by adjusting the length and the route of the complete tour.

Mission requirement
Our main concern is to find which colony is most suitable to target the designated area optimally.

Cooperative path planning
Compared with the path planning of an individual UAV, M-UAVs require close coordination between neighboring UAVs in order to reach a specific point or area without any collision. The mission requirements and physical parameters differ from those of a single UAV. The cooperation and coordination between several UAVs introduce additional cooperative constraints, namely timing coordination and avoidance of collisions with each other. In this study, we mainly focus on collision avoidance and coordination between UAVs.

Collision avoidance
An additional requirement of path planning is to avoid collisions and to minimize the threat of collision, which is also termed air coordination or formation. In our scenario, a 3D map is used, and the altitude of every UAV must change according to the unavoidable mountain peaks along the route. On such a route, we must maintain a safe distance between the UAVs and apply a proper coordination strategy. [13] The path-planning constraints of M-UAVs also maintain the time and space correlation between the routes of all UAVs. To avoid collision, we consider a short safe distance $d_{ss}$ between UAVs in our scenario. The flight distance between UAVs must satisfy the safe-distance relationship, where $y_1(t)$, $y_2(t)$ and $y_3(t)$ are the positions of UAV1 to UAV3, respectively. As per the mission requirements, when the UAVs approach the target area, they get closer and closer to each other. At that time, the safe distance is adjusted at different stages of the flight, where $T_{l,m}$ is about 85% of the full flight duration; $D$ and $d$ are adjusted according to the conditions of the specific planning area.
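The safe-distance requirement described above can be illustrated with a minimal sketch. It assumes that the relationship is a pairwise Euclidean bound, $\|y_i(t) - y_j(t)\| \ge d_{ss}$ for every pair of UAVs, and that the safe distance switches from a larger value $D$ to a smaller value $d$ once the flight time passes $T_{l,m}$. Both the exact form of the bound and the switching rule are assumptions made for illustration; the function names are hypothetical.

```python
import math
from itertools import combinations


def min_pairwise_distance(positions):
    """Smallest Euclidean distance between any two UAV positions (x, y, z)."""
    return min(math.dist(p, q) for p, q in combinations(positions, 2))


def is_safe(positions, d_ss):
    """True if every pair of UAVs is at least d_ss apart (assumed form of the bound)."""
    return min_pairwise_distance(positions) >= d_ss


def safe_distance(t, t_switch, big_d, small_d):
    """Assumed schedule: use big_d before t_switch (T_{l,m}), small_d afterwards."""
    return big_d if t < t_switch else small_d


if __name__ == "__main__":
    uavs = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
    d_ss = safe_distance(t=0.5, t_switch=0.85, big_d=2.0, small_d=0.5)
    print(min_pairwise_distance(uavs), is_safe(uavs, d_ss))
```

Checking the minimum over all pairs keeps the test independent of the number of UAVs, so the same sketch applies if more than three UAVs are planned.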

Coordination between UAVs
To meet the spatial and temporal requirements of the group of UAVs, we introduce two coordination coefficients, $f_s$ and $f_t$, which are added to each UAV as a function of its path or route. Here $f_s$ and $f_t$ are the spatial and temporal coordination coefficients, $T_1$, $T_2$ and $T_3$ are the expected arrival times of the three UAVs at the destination and $N$ is a constant. In the proposed hybrid strategy, the UAV of each sub-colony works independently to explore the overall area. [13] At the same time, when the fitness of each aircraft is evaluated, information must be exchanged between all UAVs. In the M-UAVs evolution, DE handles this and sets the optimized path for each individual UAV. The path-planning coordination of the complete scenario is presented in Figure 2.
In the path-planning scenario, each UAV communicates with the other UAVs and calculates its spatial and temporal coordination coefficients. Finally, the proposed algorithm selects the most suitable path, which meets not only the fitness criterion but also our mission requirements.

Hybrid MMACO-DE algorithm
In this section, we design an improved metaheuristic hybrid algorithm to solve the path-planning problem in a 3D environment. The metaheuristic algorithm consists of MMACO and DE. The aim is to design this algorithm and apply it to the multi-colony path planning of M-UAVs. In order to enhance the effectiveness of path planning, the UAVs must arrive at the target area using the shortest path, complete the mission and improve the reliability of the whole scenario. First, we apply the classical ACO model and then speed up the whole strategy by applying our hybrid metaheuristic algorithm. Finally, only the most suitable colony is selected to complete the mission requirement.
As shown in Figure 3, the entire ant colony is divided into three sub-colonies, and in each sub-colony only the best ant is allowed to construct and update the route. Moreover, each colony acts as an independent agent in order to fulfill the path-planning task more robustly. The main objective is to find the optimal distance over the entire colony, that is, to determine which sub-colony is most suitable to reach the target area. Figure 3 shows the block diagram of the MMACO-DE strategy; it also shows the knowledge sharing between colonies in the multi-colony path planning of M-UAVs.

Basic ACO algorithm
The basic ACO algorithm is modeled on natural ant colonies. Real ants are able to find the shortest path from a food source to their colony. They search for food without any visual clues by depositing and following pheromone. Ants walk on the ground, deposit pheromone and follow, with some probability, the pheromone previously dropped by other ants. [32] The advantage of the algorithm is that in the end all ants follow the shortest path. The procedure by which ants use their pheromones to search for the shortest path between the colony and a food source is presented in Figure 4.
ACO is inspired by this behavior of real ants: a set of virtual ants collaborate to solve a problem by exchanging information through pheromones placed on the edges of the grid.

Improved ACO algorithm
The parallel process of ACO in general comprises two essential practices: adaptability and collaboration. In the adaptability stage, the candidate solutions readapt their structure on the basis of the collected information. In the collaboration stage, the candidates exchange information in order to offer the best possible solution. ACO was first applied to the traveling salesperson problem (TSP), which is to discover the shortest closed-loop path through a set of cities or nodes. UAV path planning, in turn, is to discover the best flight path along which the UAV is able to complete the desired task and avoid unwanted threats, such as mountain peaks. [33] ACO provides a flexible way to solve the M-UAVs path-planning problem in a hazardous battlefield environment. For the nth UAV among the UAVs being planned, let m ants start from the initial position; the ants choose the next nodes in the grid according to the adaptability rule. Hence, the more ants follow a UAV route, the higher the probability that the route is chosen by other ants. This process assures that, in the end, almost all ants walk along the shortest UAV route. The main features of ACO related to the ant characteristics are the heuristic function and the pheromone, denoted as $\eta$ and $\tau$. In this research, we investigate the shortest path between the colonies and the target destination; a and b denote two nodes or cities. The transition probability of the lth ant from node a to node b, with e an edge in the grid, is defined following [15], where $\mathrm{accept}_l$ is the feasible domain of the lth ant. The parameters $\alpha$ and $\beta$ control the relative significance of the trail pheromone versus the heuristic desirability. The amount of pheromone of the nth UAV's ants between nodes a and b is written as $\tau_{n,ab}(t) = \tau_{n,ab}$. The pheromone trail $\tau$ gives the ants a clue for selecting the next node.
In ACO, ants drop their pheromone on the edges they pass. Ants allocated to the nth UAV tend to select the node that has relatively more pheromone than those of the other UAVs. The heuristic desirability of moving from node a to node b depends on $d_{b,da}$, the distance between node b and the destination area, which leads the ants situated at node a to select the nodes that are closest to the destination area. In addition, as the ants build their routes, the trail values of the route edges (a, b) are updated, where $\rho \in (0, 1)$ is the local pheromone decay parameter, representing the rate of evaporation between time t and t + 1. $\Delta\tau^{p}_{n,b,l}$ is the amount of pheromone deposited on the edge (a, b) and the node b by the lth ant of the nth UAV between t and t + 1. In the traditional ant system it is determined by Q, a constant, and $J_{n,l}$, the route cost of the lth ant.
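The transition rule and pheromone update described above can be sketched with the textbook Ant System forms. These are assumptions, not a reproduction of this paper's equations: the probability of moving from a to b is taken as $\tau_{ab}^{\alpha}\eta_{ab}^{\beta}$ normalized over the feasible nodes, evaporation as $(1-\rho)\tau$, and the deposit as $Q/J_{n,l}$ on every edge of an ant's route. The function names and the dictionary-based edge representation are hypothetical.

```python
def transition_probabilities(tau, eta, a, allowed, alpha=2.0, beta=3.0):
    """Probability of moving from node a to each feasible node b.

    tau and eta map edges (a, b) to pheromone and heuristic desirability.
    Assumed form: tau^alpha * eta^beta, normalized over the allowed nodes.
    """
    weights = {b: (tau[(a, b)] ** alpha) * (eta[(a, b)] ** beta) for b in allowed}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


def update_pheromone(tau, routes, costs, rho=0.75, q=10.0):
    """Evaporate all trails, then let each ant deposit q / cost on its route.

    Assumed form: tau <- (1 - rho) * tau + sum of deposits.
    """
    new_tau = {edge: (1.0 - rho) * value for edge, value in tau.items()}
    for route, cost in zip(routes, costs):
        for edge in zip(route, route[1:]):
            new_tau[edge] = new_tau.get(edge, 0.0) + q / cost
    return new_tau


if __name__ == "__main__":
    tau = {("a", "b"): 1.0, ("a", "c"): 1.0}
    eta = {("a", "b"): 1.0, ("a", "c"): 0.5}  # e.g. inverse distance to the destination
    print(transition_probabilities(tau, eta, "a", ["b", "c"]))
    print(update_pheromone(tau, routes=[["a", "b"]], costs=[5.0]))
```

The default values of `alpha`, `beta`, `rho` and `q` follow the simulation parameters reported later in the paper; whether $\rho$ acts as an evaporation or a persistence factor there is not recoverable from the text, so the evaporation reading is used here.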

Maximum-minimum ACO
To improve the performance of the classical ant system, to overcome the issues related to early stagnation and to speed up convergence, the following enhancement is introduced into the ACO strategy. [32] The m ants are allocated to the nth UAV, and a number of routes are created in each iteration. [13] The mean cost $\mu$ of the routes is then calculated. If and only if the route cost of the lth ant in the nth iteration fulfills $J_{n,\min}(t) \ge J_{n,l}(t)$, the lth ant updates the pheromone using equation (11). When only the iteration-best or global-best ant updates the trail pheromone, search stagnation can still occur. To escape this kind of stagnation, the probability of choosing the next solution component, which depends directly on the pheromone trails, must be influenced. This is done by limiting the influence of the pheromone trails, which prevents the differences between pheromone trails from growing too large during the run. To accomplish this, the algorithm imposes maximum and minimum limits, $\tau_{\max}$ and $\tau_{\min}$, on all pheromone trails. The trail pheromones are then updated at the end of each iteration subject to these limits.

DE
DE is a versatile optimization method, originally designed as an algorithm for global continuous optimization. It consists of three basic operators: mutation, crossover and selection. Initially, it generates random solutions to search for the target. It then builds a difference vector from two population vectors to generate a trial mutation, and combines the trial mutation with the target vector to produce an updated individual. If the updated individual has better fitness, it is accepted and replaces the previous individual. [33] In this research, the pheromone left on the route by the ants in MMACO serves as the objective in DE.
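The three DE operators described above can be illustrated with a minimal, generic sketch of the common DE/rand/1/bin variant on a toy continuous objective. It is not the pheromone-based formulation of this paper; the function name, parameter defaults and the sphere objective are assumptions chosen for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=10, f=0.5, cr=0.9,
                           generations=50, seed=0):
    """Minimize objective over box bounds with DE/rand/1/bin (pop_size >= 4)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialization: random solutions inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: base vector plus scaled difference of two other vectors.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[r1][k] + f * (pop[r2][k] - pop[r3][k]) for k in range(dim)]
            mutant = [min(hi, max(lo, v)) for v, (lo, hi) in zip(mutant, bounds)]
            # Binomial crossover: one forced index guarantees a mutant component.
            forced = rng.randrange(dim)
            trial = [mutant[k] if (rng.random() <= cr or k == forced) else pop[i][k]
                     for k in range(dim)]
            # Greedy selection: keep the trial only if it is not worse
            # (the population is updated in place).
            trial_fitness = objective(trial)
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(differential_evolution(sphere, [(-5.0, 5.0)] * 3))
```

Because selection is greedy, the best fitness in the population never gets worse from one generation to the next, which is the property the hybrid algorithm relies on when it compares the original and the updated pheromone trails.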
To solve the path-planning problem, the best ant of each colony represents the colony, and only the best ant is able to update the pheromone trails for the whole scenario. The MMACO model is modified to divide the whole colony into three different colonies; the number of colonies is denoted col, and it restricts the total number of ants m. The pheromone left between two nodes is denoted $\tau = (\tau_n)$, $n = 1, 2, 3, \ldots$, for the respective colonies. The DE mutation is then applied to it, and the mutated trail $\tau_{1n}$ is created from the pheromone trails of three different colonies that are chosen at random and indexed by the positive integer r; the real constant factor f controls the differential variation $(\tau_{2r} - \tau_{3r})$. In order to increase the diversity of the pheromone trail between two nodes, the DE crossover operator combines the new pheromone trail $\tau_{1n}$ produced by the mutation with the present target pheromone trail $\tau_n$. [26,33] The MMACO-DE strategy thus produces a new matrix $\tau_{2n}$ to store the pheromone of each colony, where $\tau^{ab}_{n}$ is the amount of pheromone between the two nodes or cities a and b for the nth ant colony and $\tau^{ab}_{2n}$ is the pheromone trail of the nth ant colony between both cities after mutation and crossover. The crossover between $\tau^{ab}_{n}$ and $\tau^{ab}_{1n}$ uses a randomly chosen positive integer randb and a crossover constant CR whose value lies between 0 and 1. The greater the value of CR, the higher the probability of crossover. If CR = 0, no DE process occurs. The newly produced pheromone matrix $\tau_{2n}$ is guaranteed to obtain at least one component from the mutated pheromone $\tau_{1n}$. Otherwise, the pheromone trail would not be updated at all, weakening the pheromone exchange among the ant colonies. [25,26,33] For the 3D path planning of UAVs, all the ants in each colony build their own routes using the transition probability $p_{a,b}$, which is calculated from the pheromone matrix $\tau_n$. $L^{best}_{1n}$ is the shortest of all the routes found by these ants. After that, the original and the newly generated pheromone are compared. In Duan et al., [26] the authors use a single rule for this, called the "greedy" selection model: if the updated pheromone trail gives a better value than the previous one, it is accepted and kept in the pheromone trail matrix for the next iterations; otherwise, the original pheromone is retained between both cities. Here $\tau_{n,(t)}$ is the original pheromone left by the nth ant colony at iteration t, and $\tau_{2n,(t)}$ is the updated pheromone trail of the nth ant colony after the DE mutation and crossover. The selected matrix $\tau'_{n,(t)}$ is whichever of $\tau_{n,(t)}$ and $\tau_{2n,(t)}$ has the better value.
The shortest route $L^{best}_{1n}$ is found by the ant colony using the original pheromone $\tau_{n,(t)}$, whereas $L^{best}_{2n}$ is the length of the optimal route found using $\tau_{2n,(t)}$, that is, the updated pheromone of the nth ant colony. After the selection process, the ants of the nth colony, which built their routes using the pheromone trails $\tau'_{n,(t)}$ or $\tau_{2n,(t)}$, deposit their pheromone on the routes they have covered and update the selected pheromone trail $\tau'_{n,(t)}$ to become the new trail pheromone $\tau_{n,(t+1)}$. [26,33] The updated pheromone is then passed to each ant colony for the next iteration to search for new and more feasible paths. The overall flow of the proposed hybrid algorithm is shown in Figure 5.
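The DE step on pheromone trails described above can be sketched as follows. The sketch assumes the mutation $\tau_{1n} = \tau_{1r} + f(\tau_{2r} - \tau_{3r})$ over three randomly chosen colony trails, a binomial crossover with the target trail $\tau_n$ using CR and one forced edge (playing the role of randb), clamping to $[\tau_{\min}, \tau_{\max}]$, and greedy selection by best route length. The exact equations are not recoverable from the text, so these forms, the function names and the dictionary-based trail representation are all assumptions.

```python
import random


def de_pheromone_step(tau_n, tau_r1, tau_r2, tau_r3, f=0.5, cr=0.9,
                      tau_min=0.1, tau_max=1.0, rng=None):
    """Build a candidate trail tau_2n from the target tau_n by mutation and crossover.

    Each argument tau_* maps an edge (a, b) to its pheromone value.
    """
    rng = rng or random.Random(0)
    edges = list(tau_n)
    forced = rng.choice(edges)  # at least this edge comes from the mutant
    tau_2n = {}
    for edge in edges:
        # Mutation (assumed form): base trail plus scaled difference of two others.
        mutant = tau_r1[edge] + f * (tau_r2[edge] - tau_r3[edge])
        # Crossover: take the mutant with probability cr, or on the forced edge.
        take_mutant = rng.random() <= cr or edge == forced
        value = mutant if take_mutant else tau_n[edge]
        # Keep the trail inside the max-min limits.
        tau_2n[edge] = min(tau_max, max(tau_min, value))
    return tau_2n


def greedy_select(tau_n, tau_2n, l_best_1n, l_best_2n):
    """Keep the updated trail only if its best route is strictly shorter."""
    return tau_2n if l_best_2n < l_best_1n else tau_n


if __name__ == "__main__":
    target = {"ab": 0.5, "bc": 0.5}
    r1 = {"ab": 0.4, "bc": 0.9}
    r2 = {"ab": 0.6, "bc": 0.9}
    r3 = {"ab": 0.2, "bc": 0.1}
    candidate = de_pheromone_step(target, r1, r2, r3, cr=1.0)
    print(candidate)
    print(greedy_select(target, candidate, l_best_1n=12.0, l_best_2n=10.5))
```

In a full implementation the two route lengths would come from letting the ants of the colony build routes with each trail matrix in turn; here they are passed in as plain numbers to keep the sketch short.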
The steps of our proposed hybrid algorithm are as follows:
Step 1: Initialize all the parameters of ACO, that is, set the maximum number of iterations along with the maximum number of ants used to explore the path-planning scenario.
Step 2: Divide the entire ant colony into three differently located ant colonies or sub-colonies. The number of ants in the nth colony is recorded as col_m(n).
Step 3: Initialize the edge pheromone for every sub-colony path, where ants are allowed to construct the route and to move to the next node.
Step 4: Check whether the ants have reached the target node; if not, repeat Step 3; if yes, proceed to the next step.
Step 5: Calculate the route covered by the ants, and let the best ant of every colony update the route. Also calculate the cost from city a to b of the lth ant that belongs to the nth UAV.
Step 6: Apply the mutation and crossover process to the original trail pheromone $\tau_n$, into which the best ant of every ant colony is carried over from the previous iteration, to produce the new trail pheromone $\tau_{2n}$.
Step 7: Starting with n = 1, each ant of the nth ant colony constructs its route to the target point according to the trail pheromone $\tau_n$. Calculate the distance traveled by each ant, select the shortest route and save it as $L^{best}_{1n}$.
Step 8: Each ant of the nth ant colony then constructs its route through the nodes of the 3D path according to the trail pheromone $\tau_{2n}$. Calculate the tour distance reached by each ant, select the shortest one and denote it $L^{best}_{2n}$.
Step 9: Compare $L^{best}_{1n}$ and $L^{best}_{2n}$ and start the DE selection operation, which selects $\tau_n$ or $\tau_{2n}$. If $L^{best}_{1n}$ is greater than $L^{best}_{2n}$, the condition for the next step is yes; otherwise, it is no.
Step 10: If the condition in the previous step is yes, the new path is explored and the pheromone is updated; otherwise, no update is needed. The DE selection process selects the route.
Step 11: Return to Step 6 until $N_c \ge N_{c\max}$.
Step 12: The proposed algorithm outputs the best route in the 3D environment.

Simulation results and discussions
The simulations of path planning via multi-colonies with M-UAVs using the MMACO-DE algorithm are divided into two different scenarios. We first specify the constraints of each UAV, including the flight length or distance, the attitude angle, the altitude and the 3D land environment. When a UAV flies from one point to the nearest waypoint, it requires some time to fine-tune its attitude and altitude in order to reach the nearest node. The length between two nodes or waypoints is calculated using equation (17), in which $L_i$ is the distance between two neighboring nodes and $L_{\max}$ is the maximum flight distance. To escape from threats, that is, high peaks and mountains, the UAV needs to change its flight path. This is done by adjusting the attitude angles of the UAV, which are limited by its maneuverability; in practice, the real attitude angles are smaller than the maximum attitude angles. In terms of maneuverability, the roll angle is constrained by $\cos\theta_{\max} \le \dfrac{a_i \cdot a_{i+1}}{|a_i|\,|a_{i+1}|}$, $i = 1, 2, 3, \ldots$, where $a_i$ and $a_{i+1}$ are the trajectory segments between consecutive nodes and $\theta_{\max}$ is the maximum roll angle of the UAV. When the UAV is flying in hilly areas, it needs to change its altitude in order to avoid collision with high peaks. With $Z_{\max}$ the maximum allowed altitude of the UAV and $Z_i$ its real-time altitude, the real altitude should not be greater than the maximum allowed altitude, that is, $Z_{\max} > Z_i$. In order to assess the feasibility and efficiency of the designed hybrid algorithm for the path planning of multi-colonies with M-UAVs, two different scenarios were simulated. Both simulations were performed in Simulink MATLAB 2016 on an Intel Core i7 8th generation processor. The initial parameters for both scenarios are set to $\alpha = 2$, $\beta = 3$, $\rho = 0.75$, $N_{c\max} = 25$, $Q = 10$, $\tau_{\min} = 0.1$, $\tau_{\max} = 1$ and $m = 10$. In both simulation scenarios, the 3D land environment is about 20 km long, 20 km wide and about 2 km in altitude.
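The flight constraints described above can be checked for a candidate route with a short sketch. It assumes that the length constraint bounds the total route length by $L_{\max}$ (the exact form of equation (17) is not recoverable from the text), that the turn between consecutive segments must satisfy $\cos\theta_{\max} \le a_i \cdot a_{i+1} / (|a_i||a_{i+1}|)$, and that every waypoint altitude must stay below $Z_{\max}$. The function names and the waypoint representation as (x, y, z) tuples are hypothetical.

```python
import math


def segment_vectors(waypoints):
    """Vectors a_i between consecutive (x, y, z) waypoints."""
    return [tuple(q[k] - p[k] for k in range(3))
            for p, q in zip(waypoints, waypoints[1:])]


def route_is_feasible(waypoints, l_max, theta_max_deg, z_max):
    """Check total length, turn angle and altitude limits for one route."""
    segments = segment_vectors(waypoints)
    lengths = [math.hypot(*s) for s in segments]
    if any(length == 0.0 for length in lengths):
        return False  # repeated waypoint, turn angle undefined
    # Assumed form of the length constraint: total route length <= l_max.
    if sum(lengths) > l_max:
        return False
    # Turn constraint: cos(theta_max) <= a_i . a_{i+1} / (|a_i| |a_{i+1}|).
    cos_limit = math.cos(math.radians(theta_max_deg))
    for i in range(len(segments) - 1):
        dot = sum(u * v for u, v in zip(segments[i], segments[i + 1]))
        if dot / (lengths[i] * lengths[i + 1]) < cos_limit:
            return False
    # Altitude constraint: Z_i < Z_max at every waypoint.
    return all(p[2] < z_max for p in waypoints)


if __name__ == "__main__":
    route = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.5, 1.2)]
    print(route_is_feasible(route, l_max=20.0, theta_max_deg=45.0, z_max=2.0))
```

A check of this kind would typically be applied to each ant's route before its cost is evaluated, so that infeasible routes never contribute pheromone.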
The dynamic 3D environment constraints are defined in Table 1.

Scenario 1
In order to verify the effectiveness of our proposed algorithm, we first compare the proposed hybrid algorithm with classical ACO. In this case, the two algorithms start from the same base station and must reach the designated area by applying the cooperative path-planning methodology of the UAV. The path-planning optimization results are shown in Figure 6(a) and (b), respectively, and they show that our designed algorithm reaches the destination using the optimal route. Figure 7 shows the estimated costs of both algorithms, and Table 2 gives the overall comparison of these strategies.

Scenario 2
In this case, the three best ants (UAVs), which belong to three different colonies, are assigned to reach the target area simultaneously in an aerial combat formation. All the initial parameters in this case are the same except the number of colonies, that is, col = 3. Moreover, in this case, air collision avoidance and coordination between all UAVs are enforced in order to avoid collisions and to reach the destination node under the cooperative path-planning constraints. Figure 8 shows the simulated path-planning scenario of multi-colonies with M-UAVs. Figure 9 presents the estimated costs of the different colonies, and Table 3 presents the comparison of all colonies. Finally, Figure 10 presents the best route found by our designed algorithm. The simulation results show that the proposed MMACO-DE strategy can find the globally optimized route and fulfill the mission requirement.

Conclusion
This study uses a hybrid metaheuristic algorithm obtained by combining MMACO with a DE strategy. The proposed algorithm provides an optimal 3D route for the path planning of M-UAVs. As seen in the simulation, the ant colony is further divided into small colonies in order to find the optimal colony for mission completion. The designed hybrid algorithm offers a platform for the multi-colony path-planning concept that can be implemented in real-world scenarios.