Particle Swarm Optimization for Target Encirclement by a UAV Formation

Abstract: This paper proposes using particle swarm optimization (PSO) to tune the control system of a decentralized unmanned aerial vehicle (UAV) formation. Simulations were run on a consensus-based decentralized UAV formation controlled by vector field guidance. The proposed fitness function is based not only on the error of distance to the circular path but also on the relative inter-UAV distance error. To demonstrate the effectiveness of the method, the tuning results are compared with those obtained by the conventional trial-and-error method.


Introduction
The use of groups of autonomous robots is an active and promising research area in mobile robotics. Decentralized control of autonomous robots is one of the more complex yet effective approaches, and many papers cover its application to ground-based robots [1][2][3] and to autonomous unmanned aerial vehicles (UAVs) [4,5]. Most UAV papers address decentralized control of rotary-wing groups, mainly quadcopter formations [6,7].
Control of decentralized swarms of autonomous robots [8][9][10] and UAVs [11,12] is currently an active research topic. However, besides control and trajectory-planning algorithms, these formations also require optimization of their transient trajectories. Since autonomous robot formations, including UAV formations, are complex nonlinear interconnected systems, soft computing could be effective for such optimization.

Preliminary Remarks and Statement of Problem
The assumptions here are the same as in [12,27]: no wind, working inter-UAV communication, and sufficiently accurate computation of the relative inter-UAV distances in the group.
The dynamics of a decentralized UAV formation can be tested via full models as well as via high-level models, which approximate the movement of the formation provided that the UAVs are equipped with fine-tuned onboard autopilots. Full models are preferable, e.g., for final testing of control algorithms or for testing formation-wide stability. High-level models, also referred to in the literature as guidance models, are more suitable for simulating spatial movement planning algorithms and for trajectory optimization. However, full models are still useful when the trial-and-error method is used to find initial values of the formation controller coefficients, which then serve as the starting point for trajectory optimization.
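To illustrate what a high-level (guidance) model can look like, the following is a minimal sketch of a planar kinematic model in which the autopilot is approximated by first-order responses to the heading and speed commands. It is a generic textbook-style form, not the specific model used in this paper; the function name `guidance_step` and the gains `b_heading` and `b_speed` are assumptions introduced for illustration.

```python
import math


def guidance_step(state, heading_cmd, speed_cmd, dt, b_heading=1.0, b_speed=1.0):
    """Advance a planar guidance model by one forward-Euler step.

    state is (x, y, heading, speed). The first-order autopilot responses
    and their gains are ASSUMED for illustration only.
    """
    x, y, heading, speed = state
    # Kinematics: move along the current heading at the current speed.
    x_new = x + speed * math.cos(heading) * dt
    y_new = y + speed * math.sin(heading) * dt
    # Autopilot approximated as first-order lags toward the commands.
    # Note: angle wrapping of (heading_cmd - heading) is not handled here.
    heading_new = heading + b_heading * (heading_cmd - heading) * dt
    speed_new = speed + b_speed * (speed_cmd - speed) * dt
    return (x_new, y_new, heading_new, speed_new)
```

Calling `guidance_step` repeatedly with commands produced by a formation controller gives a cheap approximation of a UAV trajectory, which is what makes models of this type convenient inside an optimization loop.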
UAV formation trajectory optimization is a subtask of cooperative target tracking, a task sometimes referred to as collective circumnavigation or target encirclement. The idea is to maintain a preset distance not only between the UAVs (through specified angular values) but also to the target encirclement orbit, which is a moving path. A formal statement of the problem can be found in Section 4. We covered a similar problem in [28], where we used a genetic algorithm to solve it.

Consensus-Based UAV Formation Control Algorithm
This paper uses a decentralized neighbor-to-neighbor interaction topology, where UAVs only receive data from the neighboring aircraft. The topology can be defined as a graph referred to as the "interaction graph" hereinafter. Paper [12] shows a mathematical model of interaction in a consensus-based UAV formation. The same model is described later in this section. Let N be the set of all UAV agents.
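As a small illustration of how a neighbor-to-neighbor interaction graph can be encoded, the sketch below builds the incidence matrix of an open chain (UAV 1 with UAV 2, UAV 2 with UAV 3, and so on) and uses it to turn per-agent values into neighbor-to-neighbor differences. The function names and the sign convention (+1 for the first agent of a pair, -1 for the second) are assumptions made for this example and are not taken from [12].

```python
def open_chain_incidence(n_agents):
    """Incidence matrix (n_agents x (n_agents - 1)) of an open-chain graph."""
    n_edges = n_agents - 1
    matrix = [[0] * n_edges for _ in range(n_agents)]
    for edge in range(n_edges):
        matrix[edge][edge] = 1        # first agent of the neighbor pair
        matrix[edge + 1][edge] = -1   # second agent of the neighbor pair
    return matrix


def relative_values(matrix, values):
    """One difference per edge: value of first agent minus value of second."""
    n_agents, n_edges = len(matrix), len(matrix[0])
    return [
        sum(matrix[i][e] * values[i] for i in range(n_agents))
        for e in range(n_edges)
    ]
```

For four agents the matrix has three columns, one per neighbor pair, so the relative quantities it produces live in an (N − 1)-dimensional space.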
Let $e_\theta \in \mathbb{R}^{N \times 1}$ be the vector of relative phase shift angle errors used for control, where $\mathbb{R}^{N \times 1}$ is the space of $N \times 1$ matrices with components from $\mathbb{R}$. It is assembled from selected elements of the vector of all possible relative phase shift angle errors, $\bar{e}_\theta = [\hat{e}_{i,j}] \in \mathbb{R}^{N(N-1) \times 1}$, where $\hat{e}_{i,j}$ is the error for the directly interacting $i$th and $j$th agents. The selection is dictated by the interaction architecture. In this research, the control action vector is set for open-chain interaction in the same manner as described in [12,27] (Equation (1)). In that expression, $P_\theta^T$ is a system control vector in the space of relative distances (an $(N-1)$-dimensional space generated by the columns of the interaction graph incidence matrix), and $H_\theta$ is a matrix that specifies the agents for agent-to-agent distance measurements, where $H_\theta \in \mathbb{R}^{N \times N}$, $q_i \in \mathbb{R}^{1 \times N}$, and the positions of "1" and "-1" in $q_i$ are determined by the structure of the interaction graph.
The remaining quantities in (1) are the total of the current UAV phase angles in an inertial coordinate system and the vector of current phase shift angles for directly co-engaged agents. These angles are calculated using the triple scalar product. For example, when the final movement is directed clockwise, $e_{i,i+1}$ is selected according to that product, with $e_{i,i+1} = 2\pi - \beta$ in the other cases; the full expression is given in [12,27]. Here $d_k$, $k \in N$, is the vector of aircraft-to-moving-target distance at a given time, and $n = (0, 0, 1)^T$. $M_\theta \in \mathbb{R}^{N \times N}$ is the interaction matrix for decentralized neighbor-to-neighbor interaction as used herein, and a further matrix is derived from $M_\theta H_\theta^{-1}$ by removing the $N$th column.

For collective target encirclement, this paper uses the same control laws as in [12,27]. Control based on angular errors acts on the speeds of the UAVs in the formation through the speed command vector $v_c$ (2), which uses the error vector from (1). In this law, $k_\theta$ is the positive tuning coefficient, $k_{\dot{\theta}}$ is the positive tuning coefficient for the derivative signal, $v_f$ is the maximum norm of the additional velocity vector, which is to be adjusted to the constraints of the real UAV dynamics, and $v$ is the ultimate linear cruise speed of the UAVs provided that the target is stationary.

Path error-based control relies on the heading angles of the UAVs in the formation. The control law from [12,27] is applied, with slightly modified coefficients, to the heading-angle command vector $Ø_c$ (3). In this law, $d_i$ is the distance from the $i$th UAV to the target, $\dot{d}_i$ is the corresponding derivative signal, $k_o^i$ is the tuning coefficient for the distance-to-circular-path signal for the $i$th UAV, $k_{\dot{o}}^i$ is the tuning coefficient for the distance-to-circular-path derivative signal for the $i$th UAV, $\rho$ is the radius of the circular path that the UAV follows whilst encircling the target, and $\phi_i$ is the phase angle of circumvolution around the target for the $i$th UAV.
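The two geometric ingredients described above can be sketched in code. The first function computes a phase shift angle between two neighboring UAVs from their target-relative vectors, using the triple scalar product with $n = (0, 0, 1)^T$ to choose between $\beta$ and $2\pi - \beta$. The second evaluates a heading command of the orbit-following vector field type known from the guidance literature, extended with a derivative term. Both are illustrations under stated assumptions: the sign convention in the first function and the exact way the derivative term enters the second are assumed here and may differ from laws (1) and (3) in [12,27]. All function and parameter names are hypothetical.

```python
import math


def phase_shift_angle(d_i, d_next, n=(0.0, 0.0, 1.0)):
    """Angle in [0, 2*pi) between two 3D target-relative vectors."""
    # beta: unsigned angle between the two vectors.
    dot = sum(a * b for a, b in zip(d_i, d_next))
    norms = math.sqrt(sum(a * a for a in d_i)) * math.sqrt(sum(b * b for b in d_next))
    beta = math.acos(max(-1.0, min(1.0, dot / norms)))
    # Triple scalar product (d_i x d_next) . n gives the sense of rotation.
    cx = d_i[1] * d_next[2] - d_i[2] * d_next[1]
    cy = d_i[2] * d_next[0] - d_i[0] * d_next[2]
    cz = d_i[0] * d_next[1] - d_i[1] * d_next[0]
    triple = cx * n[0] + cy * n[1] + cz * n[2]
    # ASSUMED convention: non-negative product -> beta, otherwise 2*pi - beta.
    return beta if triple >= 0.0 else 2.0 * math.pi - beta


def orbit_heading_command(d, d_dot, phi, rho, k_o, k_o_dot, direction=1):
    """Orbit vector field heading command with an ASSUMED derivative term.

    d: distance to target, d_dot: its rate, phi: phase angle on the orbit,
    rho: orbit radius, direction: +1 or -1 sets the sense of rotation.
    """
    correction = math.atan(k_o * (d - rho) / rho + k_o_dot * d_dot)
    return phi + direction * (math.pi / 2.0 + correction)
```

On the orbit (`d == rho`, `d_dot == 0`) the commanded heading is tangent to the circle, and far from the orbit it turns toward the target, which is the qualitative behavior expected of vector field guidance.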

Implementation of Particle Swarm Optimization
The standard particleswarm solver from MATLAB R2015b with default parameters was used for particle swarm optimization. A four-UAV formation was tested for this paper, hence eight tuning parameters: the two coefficients $k_o^i$ and $k_{\dot{o}}^i$ of control law (3) for each UAV, collected in the vector K. We also imposed upper and lower bounds on these coefficients in order to preserve the UAV formation stability. Stability can be lost if the control law coefficients go beyond certain limits in the absence of adaptive control.
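For readers unfamiliar with the method, the following is a minimal sketch of a bound-constrained, global-best particle swarm optimizer. It is not the MATLAB particleswarm implementation and does not reproduce its defaults; the inertia and acceleration constants, the clipping of positions to the bounds, and the placeholder fitness are all assumptions chosen for illustration. In the setting of this paper, the fitness would instead run a formation simulation for a given coefficient vector.

```python
import random


def particle_swarm_minimize(fitness, lower, upper, n_particles=30, n_iters=100,
                            inertia=0.7, c_self=1.5, c_social=1.5, seed=0):
    """Global-best PSO over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(lower)
    pos = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal bests
    best_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (inertia * vel[i][j]
                             + c_self * r1 * (best_pos[i][j] - pos[i][j])
                             + c_social * r2 * (g_pos[j] - pos[i][j]))
                # Keep each coefficient inside its lower/upper bound.
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lower[j]), upper[j])
            val = fitness(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def placeholder_fitness(k):
    """Stand-in objective (NOT the paper's fitness): minimum at k_j = 1."""
    return sum((kj - 1.0) ** 2 for kj in k)
```

A call such as `particle_swarm_minimize(placeholder_fitness, [0.0] * 8, [5.0] * 8)` searches an eight-dimensional box, mirroring the eight tuning parameters of the four-UAV formation.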
An initial guess was specified for the vector K. The chosen fitness function depends on $t_n$, the particle swarm optimization time; its remaining parameters are defined in Equations (2) and (3). Thus, this solution optimizes not only for the error of each UAV's distance to the ultimate orbit of target encirclement but also for the relative neighbor-to-neighbor distance errors. Formally, the goal is to find the coefficient vector K, subject to the bounds on the coefficients, that minimizes this fitness function.
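To make the structure of a fitness function of this kind concrete, the sketch below accumulates, over a simulated time history, both the distance-to-orbit errors of every UAV and the neighbor-to-neighbor angular errors. It is only an assumed form and not the paper's exact expression: the use of absolute values, the rectangle-rule accumulation, and the weights `w_path` and `w_angle` are illustrative choices.

```python
def encirclement_fitness(distances, angle_errors, rho, dt, w_path=1.0, w_angle=1.0):
    """Accumulated path and angular errors over a simulation run.

    distances[t][i]   : distance from UAV i to the target at step t
    angle_errors[t][m]: relative phase shift error for neighbor pair m at step t
    rho               : radius of the encirclement orbit
    dt                : simulation time step
    The weighted absolute-error form is ASSUMED for illustration.
    """
    total = 0.0
    for d_step, e_step in zip(distances, angle_errors):
        path_term = sum(abs(d - rho) for d in d_step)
        angle_term = sum(abs(e) for e in e_step)
        total += (w_path * path_term + w_angle * angle_term) * dt
    return total
```

Because path errors (meters) and angular errors (radians) have different scales, the weights control how strongly each term influences the tuned coefficients.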

Simulation Parameters
For simulation, we ran a high-level UAV model from [29]. The formation consisted of four UAVs of the same type. To make an initial guess, we also ran full UAV models of this formation, which allowed us to find, by trial and error, controller coefficients that keep the entire formation system stable. Control laws (2) and (3) were used in the simulation. The simulation parameters are shown in Table 1.

Table 1. Simulation parameters.
Parameter | Symbol | Value
Target speed (m/s) | v_target | 2
Target course angle (rad) | χ_target | π/4
Vector of initial UAV coordinates in the ICS (m) | |

Figure 1 shows how the fitness function changed during optimization. Its value stopped declining sharply after 25 iterations; reducing it further would require significantly more iterations.

Simulation Results and Discussion
Running the optimization algorithm returned the values K = K_opt that were used in the simulation.

Figure 2 shows the UAV formation angular errors before and after optimization. As the graphs show, optimization enabled the formation to reach the pre-specified relative angular positions somewhat faster. Figure 3 shows how the path errors changed in the UAV formation; UAV1 and UAV2 showed the most drastic changes. The transient trajectories before and after optimization also differ (Figure 4). Although they look similar in the figure, the difference is noticeable, especially for UAV1 and UAV2.

Notably, although the controller coefficients were tuned only for control law (3), the fitness function included the total angular error of the formation (4). The reason is that control by path errors is tied to control by angular errors in a decentralized UAV formation system. This connection can be seen, among other things, in the simulation results in Figure 2.

Conclusions
The paper demonstrates a successful use of particle swarm optimization for target encirclement and tracking by a UAV formation. The formation was decentralized and controlled by vector field guidance. Simulations showed a reduction in the proposed fitness function as well as a change in the pattern of transient trajectories. A close connection was found between optimization of the path error controller and the quality of transient trajectories for angular errors in the UAV formation.