Iterative learning-based formation control for multiple quadrotor unmanned aerial vehicles

A double-layer formation control scheme is proposed to solve repeated tasks for multiple quadrotor unmanned aerial vehicle systems. The first layer aims at achieving a formation target, in which the iterative learning control is designed based on the relative distance to neighbor unmanned aerial vehicles and the absolute distance to a virtual leader unmanned aerial vehicle. The formation controller is responsible for keeping the formation shape and generating the desired flying trajectory for each drone. The second layer aims at high-precision tracking of the desired flying trajectories generated by the formation controller. A double closed-loop proportional-derivative strategy is designed to ensure the accuracy of trajectory tracking for each individual drone. Simulations of a circle formation mission for the multiple quadrotor unmanned aerial vehicle system are given to verify the efficiency of the proposed method.


Introduction
The formation control of multiple quadrotor unmanned aerial vehicles (UAVs) has been an active branch of research in recent years because of its potential applications in military, civilian, scientific, and technological fields. [1][2][3] In a formation control system, a group of UAVs is assembled to complete a task cooperatively, which is called formation cooperation control. 4 The leader-follower strategy is one of the most commonly used methods because it is easy to implement and analyze for the formation control system. 5,6 It guarantees that each UAV follows a leader along a desired trajectory based on local information. 7,8 To solve the leader-follower formation problem, Hua et al. have proposed a finite-time control scheme and a prescribed performance control method for a group of quadrotor UAVs. 9 Dong et al. have studied time-varying formation tracking analysis and design problems for second-order multi-agent systems with switching interaction topologies, where the states of the followers form a predefined time-varying formation while tracking the state of the leader. 10 Based on the measurement of relative information, a combined controller-estimator has been designed for unmanned aircraft systems. 11 A leader-follower formation control problem has been studied for multiple quadrotor UAVs in the presence of external disturbances. 12 A novel L1 adaptive control method has been investigated by augmenting a fixed-gain linear quadratic regulator for leader-follower-based decentralized formation flying. 13 A leader-follower fault-tolerant formation control method has been studied to let follower UAVs converge into the convex hull when only a part of the follower UAVs can obtain the reference information from the leader UAV. 14 Although fruitful results on leader-follower formation problems can be found in recent publications, there is still considerable room for further investigation, for example, on finite-time problems.
The approaches mentioned above address tasks over an infinite time horizon. However, in some repeated tasks, such as forest fire detection and power inspection, UAVs have to fly back after a period of time because of limited battery power. 15 In this situation, accomplishing the task depends on repeated operation over a finite time. Iterative learning control (ILC) is an effective approach for control tasks that are repeated over a finite duration. 16 ILC improves the performance of a system that executes the same task repeatedly by learning from previous executions. 17 Moreover, ILC does not strictly require a precise system model and parameters, which is helpful in real applications. 18 Several publications have designed ILC algorithms to control multi-agent systems. [19][20][21] The consensus seeking problem based on an ILC strategy has been studied, 22 in which the multi-agent network achieves consensus and improves the system performance over multiple iterations. An ILC algorithm for the leader-follower formation tracking problem has been presented for a multi-agent system. 23 An ILC algorithm has been studied for agents that all track a time-varying desired trajectory in a directed graph. 24 However, in most of the above-cited literature, the formation system requires that every drone be controlled by the same control method, which limits the diversity of the system. It is noted that the proportional-integral-derivative control strategy has been widely applied in engineering because of its convenience and flexibility. 25,26 Therefore, it is meaningful to design a double-layer control structure, in which an ILC formation controller is in the outer-loop layer and a proportional-derivative (PD) controller is in the inner-loop layer.
In this article, an ILC-based formation control problem is studied for a multiple quadrotor UAV system that tracks a time-varying reference trajectory in a directed graph. The whole control scheme consists of a double-layer control strategy. The first layer, also called the outer-loop layer, contains an ILC formation controller. The ILC-based formation controller is designed according to the relative distance to neighboring UAVs and the absolute distance to the virtual leader UAV. The second layer, also called the inner-loop layer, contains a double closed-loop PD controller for the trajectory tracking of each individual quadrotor UAV. The main contributions of this article are as follows. In much of the literature, the formation system requires that every drone be controlled by the same control method, which limits the diversity of the system. To address this problem, a double-layer control structure is designed.
In the first layer, an ILC formation controller is designed to keep the formation shape and generate the desired flying trajectory for each drone. In the second layer, a double closed-loop PD controller is designed to keep every drone tracking its trajectory.

Preliminaries and problem formulation
Quadrotor UAV system description
Figure 1 shows the motion of a quadrotor UAV. The body frame B and the earth frame E are assumed to be at the center of gravity of the quadrotor UAV. The quadrotor UAV is a complex nonlinear system with multiple inputs and multiple outputs, strong coupling, and under-actuated characteristics. It has fewer actuators than degrees of freedom: only four actuators (control inputs) are used to control six variables, namely the coordinates $x$, $y$, and $z$ and the Euler angles $\phi$, $\theta$, and $\psi$, denoted as roll, pitch, and yaw, respectively. $\phi$, $\theta$, and $\psi$ are the angles of rotation about the x-axis, y-axis, and z-axis, with $\theta \in (-\pi/2, \pi/2)$, $\phi \in (-\pi/2, \pi/2)$, and $\psi \in (-\pi, \pi)$, respectively. The rotation matrices and the dynamic equations are given by equations (1) to (4), where $M = [M_\phi \; M_\theta \; M_\psi]^T$ is the torque vector along the body axes, $\tau_1$, $\tau_2$, $\tau_3$, and $\tau_4$ are the drag torques produced by the motors, and $l$ is the distance between the center of gravity of the quadrotor UAV and each propeller.
Usually the quadrotor UAV flies with small Euler angles, so $\phi \approx 0$ and $\theta \approx 0$. Then equation (3) is simplified to equation (5). Based on equation (5), auxiliary variables are defined in equations (6) and (7). Linking equations (2), (4), (6), and (7), the model of the quadrotor UAV is described by equation (8).

Graph-based interaction of multiple quadrotor UAVs
Graph theory is often used to describe the communication interaction among multiple quadrotor UAVs. The vertices of the graph are the quadrotor UAVs, and the edges describe the information flow from one quadrotor UAV to another. Each UAV can access information from its neighbors according to the communication interaction of the multiple quadrotor UAV system. Suppose $G = (V, \varepsilon, H)$ represents the directed graph, where $V = \{1, 2, \ldots, n\}$ is the UAV set, $\varepsilon \subseteq V \times V$ is the edge set, and $H = [h_{ij}] \in R^{n \times n}$ is the weighted adjacency matrix. The set of all neighbors of the i-th quadrotor UAV is denoted by $N_i$. If there is an edge $(j, i) \in \varepsilon$ between two quadrotor UAVs, the i-th quadrotor UAV can directly access the information from the j-th quadrotor UAV, that is, $h_{ij} > 0$; otherwise $h_{ij} = 0$. For any connection topology $G$, the Laplacian matrix is defined as $L = D - H$, where $D = \mathrm{diag}\{d_{11}, \ldots, d_{nn}\}$ and $d_{ii} = \sum_{j=1, j \neq i}^{n} h_{ij}$ is the in-degree of the i-th quadrotor UAV. More details about graph theory are available in the study by Royle. 27

The formation system model and control problem statement
The mission of the formation controller is to generate the desired motion trajectory for each UAV based on the desired formation shape. This formation shape is described by the relative deviation from the reference trajectory of the virtual leader. Following graph theory, each quadrotor in the formation system is abstracted as a rigid body that forms a vertex in the graph, and its position and attitude are controlled by the velocity and angular velocity.
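The graph quantities defined in the graph-based interaction subsection can be illustrated with a minimal Python sketch. It assumes the standard definition $L = D - H$ with $D$ holding the in-degrees on its diagonal; the three-UAV adjacency matrix in the usage note is hypothetical and is not taken from the article.

```python
import numpy as np


def laplacian(H):
    """Return L = D - H for a weighted adjacency matrix H.

    H[i, j] > 0 means UAV i can directly access information from UAV j.
    Any diagonal entries are ignored, because d_ii sums only j != i.
    """
    H = np.asarray(H, dtype=float)
    off_diag = H - np.diag(np.diag(H))   # drop self-loops if present
    D = np.diag(off_diag.sum(axis=1))    # in-degree d_ii on the diagonal
    return D - off_diag
```

For example, with the hypothetical weights `H = [[0, 0, 0], [0.5, 0, 0], [0.2, 0.8, 0]]`, UAV 3 listens to UAV 1 and UAV 2, so its in-degree is 1.0, and every row of the resulting Laplacian sums to zero.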
The dynamic model can be described by equation (9), where $i = 1, 2, \ldots, n$, and $k$ and $t$ denote the iteration and time, respectively. $U_{i,k}(t)$ are the control inputs, which denote the velocities and angular velocities of the quadrotor UAV. Because the quadrotor UAV is an under-actuated system, only four variables need to be controlled. An appropriate ILC algorithm needs to be designed to solve the formation control problem so that the formation shape can be kept. In the formation system, the reference trajectory $Y_r(t)$ can be time varying and is usually generated by a virtual leader quadrotor UAV (UAV r). The formation requires that each quadrotor UAV keep a desired deviation from the reference trajectory at all times to guarantee the desired formation shape. Thus, the formation control objective is given by equation (10).

Control strategy
ILC formation control for multiple quadrotor UAVs
Figure 2 shows the whole control structure of the formation system. The formation controller in the first layer is designed based on the ILC strategy to solve the repeated mission. It generates the desired motion trajectory $Y_{i,d}$ for each single drone. The formation control system converges gradually as the learning process repeats. The second layer aims at precise tracking for the individual UAV, so the double closed-loop PD control strategy is designed as shown in Figure 3. Suppose that the directed graph $G$ has a spanning tree, that is, there exists a vertex that is connected to all other vertices through directed paths. This means that the information of UAV r is available to a part of the quadrotor UAVs, and the rest of the UAVs can get the trajectory information from their neighbors. Consider a directed network of $n$ quadrotor UAVs. Each quadrotor UAV obtains information through its onboard sensors and communication device. Since the reference trajectory $Y_r(t)$ is available to only a part of the quadrotor UAVs in the formation system, the following assumptions are made for convenience.
[A1] The reference trajectory $Y_r(t)$ is generated by a virtual leader UAV. The trajectory of the leader can be chosen arbitrarily, which means that it can be predefined according to the task requirements, such as a circle or square trajectory. The information flow is one-way: only the UAVs connected with the virtual leader can get the trajectory information, and the virtual leader cannot get any information from the topology network.
[A2] Assume the nonnegative scalar $r_i \geq 0$ indicates the accessibility of $Y_r(t)$ by quadrotor UAV i. Note that $r_i > 0$ when UAV i can get access to the leader, and $r_i = 0$ when UAV i cannot get access to the leader.
[A3] The initial state satisfies $X_{i,k}(0) = X_{io}$.
It is noted that each UAV can access information from its neighbors according to the communication interaction of the multiple quadrotor UAV system. The objective considered in this article is to find an appropriate ILC algorithm that keeps the formation shape and generates the desired trajectory for each quadrotor UAV in the formation system. Therefore, the ILC controller should make sure that the actual relative distance between UAV i and UAV j converges to the desired relative distance as the learning process repeats, so that the desired formation shape is guaranteed at all times. The ILC law is given by equation (11), where $Y$ is the differential operator, $U_{i,0}(t)$ is the initial input, $M$ is the learning gain matrix, and $l_{ij}$ and $k_i$ are scalar learning gains. The gains $M$, $l_{ij}$, and $k_i$ need to be designed.
Figure 2. The control structure of the formation system.
There are two cases in the ILC control law: (1) If $Y_r$ is available to UAV i, the control law is constructed from two parts. The first part is determined by the relative distance $d_{ij}(t)$ between quadrotor UAV i and quadrotor UAV j, and the second part is determined by the absolute distance $d_i(t)$ between quadrotor UAV i and the virtual leader quadrotor UAV r. (2) If $Y_r$ is not available to UAV i, the control law is constructed only from the relative distance $d_{ij}(t)$.
Theorem 1. Given the multiple quadrotor UAV formation system (9) in the directed graph $G$ and the ILC algorithm (11), the formation objective (10) can be achieved.
Proof. The ILC algorithm (11) can be rewritten in compact form for the whole formation. Because ‖I_n − (L^ H + KP)CM‖ < 1, the formation error converges as the iteration number increases. The proof is completed: the formation control system converges gradually as the learning process repeats.
Thus, the formation shape can be kept and desired trajectories are generated for individual quadrotor UAVs.
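The learning mechanism described in this subsection can be sketched numerically. The block below is only an illustration under simplifying assumptions and is not the article's equation (11): each UAV is modeled as a planar single integrator, a scalar learning gain replaces the gain matrix, the update is a D-type law driven by the finite difference of the neighborhood error, and the four-UAV chain graph in `demo` is hypothetical. Every iteration starts from the same initial state, in line with assumption [A3].

```python
import numpy as np


def run_ilc(H, r, offsets, y_ref, dt, gain, iterations):
    """Simplified D-type ILC for single-integrator agents (illustrative only).

    H       : (n, n) adjacency, H[i, j] > 0 if UAV i hears UAV j (zero diagonal)
    r       : (n,) leader-access weights, r[i] > 0 if UAV i hears the leader
    offsets : (n, 2) desired deviation of each UAV from the reference
    y_ref   : (T + 1, 2) reference trajectory of the virtual leader
    Returns the sup-norm tracking error recorded at every iteration.
    """
    H = np.asarray(H, dtype=float)
    coupling = np.diag(H.sum(axis=1)) - H + np.diag(r)   # Laplacian + leader access
    y_des = y_ref[:, None, :] + offsets[None, :, :]      # (T + 1, n, 2)
    u = np.zeros((y_ref.shape[0] - 1,) + offsets.shape)  # inputs of iteration 0
    sup_errors = []
    for _ in range(iterations):
        # Same initial state every iteration, then y[t + 1] = y[t] + dt * u[t].
        travelled = dt * np.cumsum(u, axis=0)
        y = np.concatenate([y_des[:1], y_des[0] + travelled], axis=0)
        e = y_des - y
        sup_errors.append(np.abs(e).max())
        # Neighborhood error: sum_j h_ij (e_i - e_j) + r_i e_i. Each UAV can
        # form it from relative measurements; e is used here only for brevity.
        xi = np.einsum("ij,tjd->tid", coupling, e)
        # D-type update along the iteration axis.
        u = u + gain * np.diff(xi, axis=0) / dt
    return np.array(sup_errors)


def demo():
    """Hypothetical chain: leader -> UAV 1 -> UAV 2 -> UAV 3 -> UAV 4."""
    H = np.array([[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]], float)
    r = np.array([1.0, 0.0, 0.0, 0.0])
    offsets = np.array([[0.5, 0.0], [0.0, 0.5], [-0.5, 0.0], [0.0, -0.5]])
    t = np.linspace(0.0, 2.0 * np.pi, 201)
    y_ref = np.stack([np.cos(t), np.sin(t)], axis=1)     # circle reference
    return run_ilc(H, r, offsets, y_ref, dt=t[1] - t[0], gain=0.5, iterations=50)
```

In this simplified setting the error increment obeys a linear recursion governed by `I - gain * coupling`, so the error shrinks over iterations when the spectral radius of that matrix is below one. This plays the same role as the contraction condition in Theorem 1, but it should not be read as the same condition.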

Control strategy for individual quadrotor UAV
The ILC law designed for the formation system in the subsection "ILC formation control for multiple quadrotor UAVs" generates the desired motion trajectory $Y_{i,d}$ for each single UAV i. Therefore, the control objective in this subsection is to design an appropriate control algorithm for each UAV to track the desired trajectory, that is, the position and Euler angles should follow the desired points. Considering the characteristics of the quadrotor UAV, the double closed-loop PD control strategy is designed. Figure 3 shows the mechanism of the double closed-loop PD control for an individual UAV. The quadrotor UAV system is divided into the position subsystem and the Euler angle (attitude) subsystem. The outer loop controls the position, and the inner loop controls the Euler angles. The objective of the outer loop controller is to track the desired positions $x_d(t)$, $y_d(t)$, and $z_d(t)$, and the inner loop controller guarantees that the Euler angles $\theta(t)$ and $\phi(t)$ converge to their desired angles $\theta_d(t)$ and $\phi_d(t)$, which are calculated by the outer loop controller. The position PD controller and the Euler angle PD controller generate the control inputs $u_1$, $u_2$, $u_3$, and $u_4$ that keep the position and Euler angles of the quadrotor UAV stable.
To design the PD controller, $\phi_d$ and $\theta_d$ need to be calculated first. Starting from equation (8), auxiliary control variables are defined, which leads to equation (16). Replacing the actual Euler angles with the desired Euler angles gives the desired pitch angle in equation (17). Multiplying both sides of equation (16) by $[\sin\psi_d \;\; -\cos\psi_d]$ gives the desired roll angle in equation (18). The PD control inputs for the quadrotor UAV then follow, where $[p_d \; q_d \; r_d]^T$ can be calculated by equations (3), (17), and (18). $k_{p\cdot}$ and $k_{d\cdot}$ are the proportional and derivative gains, respectively, which should be pre-tuned.
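The step from the outer-loop PD command to the desired roll and pitch angles can be sketched as follows. The sketch assumes a commonly used translational model, with thrust acting along the body z-axis and no drag, which may differ in detail from the article's equation (8). The function names, the gravity constant, and the gains are illustrative assumptions.

```python
import numpy as np

GRAVITY = 9.81  # m/s^2, assumed value


def pd_acceleration(pos, vel, pos_d, vel_d, kp, kd):
    """Outer-loop PD law: commanded acceleration from position/velocity errors."""
    pos, vel = np.asarray(pos, float), np.asarray(vel, float)
    pos_d, vel_d = np.asarray(pos_d, float), np.asarray(vel_d, float)
    return kp * (pos_d - pos) + kd * (vel_d - vel)


def desired_attitude(acc_cmd, psi_d, mass):
    """Map a commanded acceleration to thrust u1 and desired roll/pitch.

    Assumed model (not necessarily the article's equation (8)):
      m * acc = u1 * [c(phi)s(theta)c(psi) + s(phi)s(psi),
                      c(phi)s(theta)s(psi) - s(phi)c(psi),
                      c(phi)c(theta)] - [0, 0, m * g]
    """
    a = np.asarray(acc_cmd, float) + np.array([0.0, 0.0, GRAVITY])
    norm = np.linalg.norm(a)
    u1 = mass * norm
    vx, vy, vz = a / norm                  # unit thrust direction
    # Projecting onto [sin(psi_d), -cos(psi_d)] isolates sin(phi).
    s_phi = np.clip(vx * np.sin(psi_d) - vy * np.cos(psi_d), -1.0, 1.0)
    phi_d = np.arcsin(s_phi)
    # Projecting onto [cos(psi_d), sin(psi_d)] gives cos(phi) * sin(theta).
    theta_d = np.arctan2(vx * np.cos(psi_d) + vy * np.sin(psi_d), vz)
    return u1, phi_d, theta_d
```

In hover, where the commanded acceleration is zero, the sketch returns a thrust equal to the weight and zero roll and pitch, which is a quick sanity check on the sign conventions.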

Simulation results
This section presents the simulation tests of the ILC formation control for multiple quadrotor UAVs formation system and PD control for individual quadrotor UAV.

ILC formation control results for multiple quadrotor UAVs
The multiple quadrotor UAV system consists of six quadrotor UAVs, and Figure 4 shows the directed graph of the formation system. The reference trajectory $Y_r$ is available only to UAV1 and UAV5. Each quadrotor UAV in the formation system keeps a desired deviation from the reference trajectory at all times. The weight $h_{ij} \in [0, 1]$ of a graph edge represents the reliability and importance of the communication interaction between two quadrotor UAVs; it is affected by the relative distance and the communication delay between them.
It is noted that communication resources are limited and that the quality of communication is affected by the practical environment. Moreover, in practice, which neighboring UAVs a UAV communicates with depends on its mission requirements. Considering the limited communication resources and the mission requirements, a directed graph with variable topologies is used to describe the communication interactions among the multiple quadrotor UAVs.
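Whichever topology is used, the spanning-tree assumption from the control strategy section requires that the leader's information can reach every UAV, either directly or through neighbors. A minimal sketch of such a check is given below. The edge weights of Figure 4 are not reproduced in the text, so any adjacency matrix passed to the function is a hypothetical stand-in.

```python
from collections import deque


def leader_reaches_all(H, r):
    """Check that leader information can propagate to every UAV.

    H[i][j] > 0 means UAV i can access information from UAV j;
    r[i] > 0 means UAV i can access the virtual leader directly.
    """
    n = len(H)
    informed = [r_i > 0 for r_i in r]
    queue = deque(i for i in range(n) if informed[i])
    while queue:
        j = queue.popleft()
        for i in range(n):
            if H[i][j] > 0 and not informed[i]:   # UAV i listens to UAV j
                informed[i] = True
                queue.append(i)
    return all(informed)
```

With variable topologies, a check of this kind could be run for each topology separately. Whether the article's convergence result requires the property for every topology is not stated in the text.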
The virtual leader follows a time-varying reference trajectory for the circle formation mission. The ILC formation test results are shown in Figures 5 and 6. Figure 5 shows the formation trajectory in 3D. Not every UAV could get the information from $Y_r$. Figure 7 shows the tracking error convergence of the formation system, and it can be seen that the system error has essentially converged to 0 by about 300 iterations. $\sup_t \| Y_{i,d}(t) - Y_{i,k}(t) \|_\infty$ is the infinity norm of the difference between the desired trajectory $Y_{i,d}$ and the actual trajectory $Y_{i,k}$ along the time axis. Table 1 lists the performance indices of the ILC method, where $s = \mathrm{mean}(e)$ is the average error of the formation system, MSE is the mean squared error, $\mathrm{MSE} = \mathrm{mean}(e^2)$, and MAE is the mean absolute error, $\mathrm{MAE} = \mathrm{mean}(|e|)$. Moreover, the effectiveness of the proposed ILC formation controller is verified by comparison with the leader-follower formation control in Zhao and Wang. 28 The leader-follower formation test results are shown in Figures 8 and 9. Table 2 presents the performance indices of the leader-follower method. The formation system performs well with the ILC method, whereas the formation errors of the leader-follower formation control are larger, because each UAV in the leader-follower strategy can only receive information from the leader and not from neighboring UAVs. The results show that the formation system completes the repeated task through the communication among the quadrotor UAVs together with the ILC process.
Table 3 lists the parameters of the quadrotor UAV. Simulation results of the PD control for the individual quadrotor UAVs are shown in Figures 10 to 13. Figures 10 and 11 show the trajectories of the six individual quadrotor UAVs and the snapshot at $t = 100$ s. Figure 12 shows the evolution of the Euler angles.
Figures 13 and 14 show the tracking errors between the actual trajectories and the desired trajectories of the six individual quadrotor UAVs. The errors fluctuate within a small range, so all six quadrotor UAVs converge to their desired trajectories as time increases. Table 4 lists the performance indices of the PD control. They are all small, which verifies the effectiveness of the PD control for the individual quadrotor UAVs.
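The performance indices reported in the tables can be computed from a sampled error signal as sketched below. The function name and the dictionary keys are illustrative, and the mean absolute error is taken in its usual sense as the mean of $|e|$.

```python
import numpy as np


def performance_indices(e):
    """Return mean error, MSE, MAE, and sup-norm of a sampled error signal."""
    e = np.asarray(e, dtype=float)
    return {
        "mean": e.mean(),            # average error
        "MSE": (e ** 2).mean(),      # mean squared error
        "MAE": np.abs(e).mean(),     # mean absolute error
        "sup": np.abs(e).max(),      # largest absolute error over time
    }
```

A signed mean close to zero does not by itself indicate good tracking, because positive and negative errors cancel; this is why the squared and absolute indices are reported alongside it.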

Conclusions
A double-layer control method for the formation control of a multiple quadrotor UAV system is presented. In the first layer, a formation controller is responsible for keeping the formation shape and generating the desired motion trajectories for the second layer. The ILC method is designed to keep a time-varying shape for the repeated mission. The reference trajectory is generated by a virtual leader quadrotor UAV, and it is not available to all quadrotor UAVs. Compared with leader-follower formation control, iterative learning is carried out in two directions, the time axis and the iteration axis, and the formation shape can be kept well by ILC based on the relative distance to neighbor UAVs and the absolute distance to the virtual leader UAV.
In the second layer, the double closed-loop PD control structure is applied to the tracking of the position subsystem and the Euler angle subsystem of each individual quadrotor UAV. Simulation results show that the formation shape can be kept well and that all individual quadrotor UAVs track their desired trajectories in a finite time.
The generality of the formation system allows the ILC formation control algorithm to be applied to a formation system with more quadrotor UAVs or other agents, so that tasks of higher complexity can be completed. For more complex tasks, the formation system may be composed of different types of UAVs with different control methods, and the underlying controller of each individual UAV does not need to be changed because of the double-layer control of the formation system. A variety of formation shapes can be achieved by neighbor-to-neighbor communication interaction through the iterative learning process. For further application, collision avoidance and topology switching for multiple quadrotor UAV formation systems should be taken into consideration.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the National Natural Science Foundation of China (61973023, 61573050) and Beijing Natural Science Foundation (4202052).