Integrating Local Motion Planning and Robust Decentralized Fault-Tolerant Tracking Control for Search and Rescue Task of Hybrid UAVs and Biped Robots Team System

In this study, we integrate local motion planning with a robust <inline-formula> <tex-math notation="LaTeX">$H_{\infty} $ </tex-math></inline-formula> decentralized observer-based feedforward reference tracking fault-tolerant control (FTC) of a hybrid UAVs and biped robots team system (URTS) for the purpose of search and rescue (S&R). A system architecture for performing S&R tasks for each agent in URTS is proposed to explain how to integrate reference trajectory planning and tracking control in URTS for S&R usage. To optimally allocate tasks to each agent in URTS, a task allocation problem is investigated. To optimally plan a path for each agent in URTS to reach the allocated task locations, a path planning problem is formulated. To deal with complex S&R terrain, we decompose the path planning problem into three steps, i.e., (i) global path planning, (ii) behavior decision and (iii) local motion planning. Through such decomposition, roadmap-based path planning algorithms can be applied to the global path planning of agents in URTS. Through behavior decision, we can decide which behavior is used to follow the global path according to the terrain environment. Next, we focus on the local motion planning problem of flying behavior for the UAV and walking behavior for the robot, and then on the tracking control problem for UAVs and robots in the hybrid team system. With the proposed novel feedforward linearization control scheme, the robust <inline-formula> <tex-math notation="LaTeX">$H_{\infty} $ </tex-math></inline-formula> decentralized observer-based feedforward reference tracking FTC design is significantly simplified for each agent in URTS. A novel smoothing signal model of the fault signal is embedded to achieve active FTC through observer estimation. 
Then, the design of the robust <inline-formula> <tex-math notation="LaTeX">$H_{\infty} $ </tex-math></inline-formula> decentralized observer-based feedforward reference tracking FTC strategy is transformed into a linear matrix inequality (LMI)-constrained optimization problem for each agent. With the help of the MATLAB LMI Toolbox, the robust <inline-formula> <tex-math notation="LaTeX">$H_{\infty} $ </tex-math></inline-formula> decentralized observer-based feedforward reference tracking FTC design problem of each UAV and robot in URTS is effectively solved. Finally, simulation results are used to demonstrate the integration of local motion planning with the S&R tasks of hybrid URTS and to verify the effectiveness of the proposed robust <inline-formula> <tex-math notation="LaTeX">$H_{\infty} $ </tex-math></inline-formula> decentralized observer-based feedforward reference tracking FTC method of hybrid URTS under external disturbance and actuator and sensor faults.

UVs can perform more complex tasks and are more robust due to a large number of agents [2]. However, the cost is that the design of such a multi-agent system (MAS) becomes more intricate because more problems must be resolved, such as formation, collision avoidance between agents, task allocation, and cooperation between agents [3]. In addition to increasing the number of agents, a heterogeneous multi-agent system (HMAS) combining various types of UV has also attracted attention [4]. Compared with a homogeneous MAS, it can adapt to a wider variety of application scenarios because each agent has different aptitudes.
To construct an unmanned HMAS, three key capabilities are required: perception, decision-making and control. Perception obtains information through sensors (e.g., localization or computer vision), decision-making makes decisions based on the sensor information, and control executes the decisions through actuators. To limit the scope of this paper, we focus on the decision-making and control problems only.
Three main problems of decision-making in an unmanned HMAS are task allocation, path planning and collision avoidance. Task allocation optimally assigns tasks to each agent under constraints such as agent capabilities, fuel cost, time cost, etc. [5]. Path planning optimally plans paths for each agent subject to constraints such as agent kinodynamic (combining "kinematics" and "dynamics") properties, distance, obstacle collision, etc. [6]. Collision avoidance prevents collision with obstacles. Although collision avoidance is often considered within path planning, the collision avoidance system is also studied independently because of the safety and reliability requirements of the actual system [7].
Although many types of UVs can make up an unmanned HMAS, the Unmanned Aerial Vehicle (UAV) and the Unmanned Ground Vehicle (UGV) have been the subject of major recent research because of their availability and applicability. Additionally, the complementarity between them gives such a system greater potential [8]. Specifically, the UAV is widely used in reconnaissance due to its high mobility. However, the carrying capacity of a UAV is very low compared to that of a UGV since there is no ground support. In contrast, a UGV has a higher carrying capacity but is easily restricted by ground obstacles and cannot move at high speed. For these reasons, a hybrid UAVs-UGVs team system is more appealing. To make the discussion concrete, we consider a hybrid UAVs-UGVs team system for S&R usage. To meet the need for search mobility, we choose the quadrotor aircraft as the UAV. To deal with the complex terrain of the S&R environment, we choose the biped robot (referred to as "robot" in this article) as the UGV. Even though other types of UGVs, such as wheeled robots and vehicles, are easier to handle than the biped robot, its high degree of freedom and compatibility with human environments still make it a good candidate for the UGV in an S&R system. Since the agent analyzed in the URTS architecture is a real mechanical body, such an HMAS can be directly applied to practical applications. The proposed architecture can also replace biped robots with wheeled robots or wheeled vehicles, but the corresponding behavior layer, local motion planning and reference tracking control blocks would need some modifications.
The main challenges of a cooperative system that includes both UAVs and UGVs are the task allocation problem, the path planning problem and the control problem. Since the URTS is multi-agent and there are multiple tasks and multiple paths in practice, we must allocate tasks and plan paths for each agent to optimize the group benefit of the team. In addition, the estimation and control problems in URTS are affected by coupling effects, such as vortices between UAVs or co-channel interference in communication between agents. The robust decentralized H ∞ observer-based tracking control strategy in the MAS also needs to be considered in the team formation tracking control of the UAVs and biped robots hybrid team system.
To the best of the authors' knowledge, most of the literature focuses on only one specific problem in such an unmanned multi-agent S&R system, such as the task allocation problem, the path planning problem or the control problem. Additionally, few works illustrate the relationship between these problems. This leads us to propose a system architecture for performing S&R tasks for each agent in URTS to illustrate these problems and the relationship between them. The flowchart of the system architecture is given in Fig. 1. We divide it into five main hierarchical processes, i.e., (i) Task Allocation, (ii) Global Path Planning, (iii) Behavior Decision, (iv) Local Motion Planning and (v) Reference Tracking Control. This ordering arises because the URTS must first be able to assign different tasks to agents. After a task is assigned, if the task is to reach a goal location, a path to reach it needs to be planned. To make the agent move along the path, a behavior suited to the environment must be determined. Then, a local motion corresponding to the behavior of the agent needs to be planned. Finally, a controller must be designed to track the trajectory of the motion. To further limit the scope of the study, we focus on the latter two processes. However, to illustrate how the whole system works, the first three processes are also briefly stated.
Local motion planning is the bridge between global path planning and reference tracking control, since the path found by a path planning algorithm and the path that a controller is required to follow are not necessarily the same. The reason is that a path planning algorithm usually treats the agent as a point, while local motion planning treats the actual agent in the physical world as a mechanical system for the tracking control design. A mechanical system is subject to kinodynamic constraints. This makes certain paths impossible for an actual agent to follow, such as paths that are not smooth, have too large a curvature, or require too large a velocity or acceleration. Although some works directly tackle the kinodynamic constraints in the path planning problem [9], this paper splits path planning into three steps, i.e., (i) Global Path Planning, (ii) Behavior Decision and (iii) Local Motion Planning, to deal with complex S&R terrain. Through this decomposition, we can focus on the local motion planning of specific behaviors. The local motion planning of flying behavior for the UAV and walking behavior for the robot is studied in this paper, especially the latter. The local motion planning of biped robot walking, i.e., stable walking pattern generation, is a popular research topic because it is challenging [10].
Reference tracking control drives an agent to follow a desired reference trajectory. There are many reference control strategies for MAS. According to how the controller is designed, they can be divided into centralized control and decentralized control [11]. Centralized control means that a powerful central controller in the MAS gathers the state information of the MAS and sends control commands back to each agent to reach a global goal. Owing to the power of the central controller, control commands can be determined well and quickly. However, when it fails, the whole system is completely paralyzed. In contrast, decentralized control means that each agent has its own controller that collects the agent's own state information and controls the agent. Under this architecture, although the global goal cannot be achieved, the possibility of paralyzing the entire system due to the failure of a controller is avoided.
Besides, the formation control is also a topic in MAS [12]. Its purpose is to keep a MAS in a formation while moving for some allocated tasks. Although formation control provides a simple framework for the reference control of a large number of agents, considering the complexity of the disaster relief environment, formation will make the application of URTS inflexible. It is because we expect that agents in URTS need to organize multiple teams of different scales and types to deal with multiple tasks of different scales and types in a disaster relief environment. In this situation, it is more reasonable to treat each agent in URTS as an independent individual to be controlled to follow a specific trajectory for its allocated task to form a team formation.
To cope with faults in the actual system, fault-tolerant control (FTC) has also been widely studied. According to how the fault is handled, FTC can be divided into passive FTC and active FTC [13]. Passive FTC treats the fault as an unknown system perturbation and designs a control law to tolerate it. In contrast, active FTC first estimates and identifies the fault and then compensates for it through the feedback controller. Despite the extra complexity in controller design, active FTC can outperform passive FTC because of the extra estimation step. Recently, distributed platooning control of automated vehicles subject to replay attacks, based on a proportional-integral observer, was discussed in [14]. The proposed method can also deal with platooning control problems subject to cyber-attacks or faults. However, since the mechanical structure of a vehicle differs from that of a UAV or robot, the local motion must be planned additionally. A consensus control of multi-agent systems using fault-estimation-in-the-loop was developed via a dynamic event-triggered scheme in [15]. Since the decentralized control method in this study is proposed to ensure that the team formation tracking error converges to zero, the consensus control problem is not considered.
Based on the foregoing discussions, a robust H ∞ decentralized observer-based feedforward reference tracking FTC scheme is proposed to deal with the control problem in hybrid URTS.
The contributions of this study are described as follows: 1) The local motion planning, hybrid agent model, feedforward linearization method and smoothing signal model of fault signals are integrated to achieve the robust decentralized H ∞ fault-tolerant observer-based team formation tracking control for search and rescue task of hybrid UAVs and biped robots team formation system. 2) A system architecture of performing S&R tasks for each agent in hybrid URTS is proposed so that the local motion planning and reference tracking control issues involved in each agent of hybrid URTS and their systematic relationships can be defined and resolved. 3) A transformation from the path planned by a roadmap-based path planning algorithm to the reference trajectory required for S&R task of team formation tracking control design is proposed to enable some common roadmap-based path planning algorithms to guide the team formation tracking control of agents in hybrid URTS to reach the goal location without obstacle collision. 4) A general agent dynamic model and a novel feedforward linearization control scheme are proposed so that the robust H ∞ decentralized observer-based feedforward reference tracking FTC problems of the heterogeneous agents in URTS, i.e., UAVs and biped robots, can be solved simultaneously to achieve S&R task under external disturbance and actuator and sensor fault.
The remainder of the paper is organized as follows. In Section II, a system architecture for performing S&R tasks for each agent in hybrid URTS is proposed, and the functions of and relationships among its components, i.e., Task Allocation, Global Path Planning, Behavior Decision, Local Motion Planning and Reference Tracking Control, are described. In Section III, the dynamic models of agents in URTS are given to plan the motion of UAV flying and biped robot walking behavior and the control strategy of agents. In Section IV, a robust H ∞ decentralized observer-based feedforward reference tracking FTC is proposed for the agents in URTS with the help of a general agent dynamic model. In Section V, a simulation example is given to illustrate the operation of the system architecture for performing S&R tasks for each agent in hybrid URTS and to verify the effectiveness of the proposed tracking control method. In Section VI, a conclusion is given.
Notation 1: diag(A 1 , A 2 , . . . , A n ): a block diagonal matrix with main diagonal blocks A 1 , A 2 , . . . , A n . A T : transpose of A. A > 0: a positive definite matrix. (a n ): a sequence. (a k n ): a subsequence of a sequence (a n ). [a j,k ]: A matrix with the entries a j,k in the jth row and kth column. |S|: size of a set S. ⊗: Kronecker product. I n : n-dimension identity matrix.
Sym(A): the sum of a matrix A and its transpose, i.e., Sym(A) = A + A T .

II. PRELIMINARIES OF URTS FOR S&R USAGE
The URTS starts with a given S&R area and ends when the S&R task is completed. The URTS is composed of N T teams and a ground station. Each team contains N A agents, namely 1 UAV and N A − 1 robots. Hence, the jth agent in the ith team is denoted as α i,j , where i = 1, 2, . . . , N T and j = 1, 2, . . . , N A . The UAVs are chosen as the first agents in each team, i.e., α i,1 , i = 1, 2, . . . , N T . Each agent has environmental sensing capability and load capability, while the ground station is responsible for computing and decision-making. In addition, the agents and the ground station communicate over wireless network channels.
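The indexing convention above can be illustrated with a short sketch. The function name `build_urts` and the string labels are hypothetical and serve only to show how α i,j maps to an agent type, assuming 1-based indices as in the text.

```python
def build_urts(num_teams, agents_per_team):
    """Return {(i, j): kind} for N_T teams of N_A agents each.

    Indices are 1-based to match alpha_{i,j}; by the convention in the text,
    the first agent of every team (j == 1) is the UAV and the rest are robots.
    """
    return {
        (i, j): "UAV" if j == 1 else "robot"
        for i in range(1, num_teams + 1)
        for j in range(1, agents_per_team + 1)
    }


# Hypothetical example: N_T = 2 teams with N_A = 3 agents each.
roster = build_urts(num_teams=2, agents_per_team=3)
```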
To complete search tasks, each team is designed to be responsible for a small part of the overall S&R area, and each agent is assigned an appropriate path to cover the unsearched area. To complete rescue tasks, whenever a target (e.g., victims or a disaster area) is found by the machine vision of a nearby agent, the ground station assigns some agents to the location of the target.
If a task is to reach a location of certain goal, we need to find a collision-free path to reach it. Hence, each agent will also sense distance-related information about its surrounding and send it back to the ground station. The ground station will combine this information with the goal location determined by the task content and then plan a path to avoid obstacles and other agents nearby through a path planning algorithm.
Since URTS operates in a complex terrain environment and the agents in URTS have multiple behaviors for interacting with the environment, it is still impossible to use a unified path planning algorithm in URTS. In other words, it is still difficult to find a path that simultaneously avoids obstacles, reaches the goal location, satisfies the kinodynamic constraints induced by the mechanical body of the agents, and can be used as a reference trajectory for the dynamic model of the agents. Therefore, to address this challenge, this article splits path planning into three steps, i.e., (i) Global Path Planning, (ii) Behavior Decision and (iii) Local Motion Planning. Global Path Planning plans a global path by regarding the agent as a point. The global path tells the agent where to go. Behavior Decision decides which behavior is used to follow the global path according to the global path and its surrounding terrain environment (e.g., the biped robot decides to follow it by walking or running). Local Motion Planning plans the local motion for a specific behavior by regarding the agent as a mechanical body. The local motion is a prescribed reference path for an actual mechanical system to follow. After the reference path is converted into a reference trajectory, the agent can follow the global path to reach the goal location by tracking the reference trajectory through the reference tracking controller.
The UAV and the biped robot share the same overall system architecture for performing S&R tasks in URTS, except for some subprocess differences. The article [16] gives an architecture for reaching a destination in a self-driving urban vehicle system. We extend this to an architecture for performing S&R tasks in an unmanned HMAS. Our architecture can handle multiple agents and does not limit the physical entity of the agent to a vehicle. Following the concept in [16], a system architecture for performing S&R tasks for each agent in hybrid URTS is proposed as shown in Fig. 1. The detailed functions of the remaining five blocks, i.e., Task Allocation, Global Path Planning, Behavior Decision, Local Motion Planning and Reference Tracking Control, are explained in the following subsections.
Remark 1: In this paper, we will only state and solve the problems in Local Motion Planning and Reference Tracking Control, i.e., how to plan the local motion of UAV flying and biped robot walking and how to design the robust H ∞ decentralized observer-based feedforward reference tracking FTC strategy for each agent in URTS. Nevertheless, the reason why we still state the problems in Task Allocation, Global Path Planning and Behavior Decision for performing S&R task in hybrid URTS is to clarify the systematic relationship between these blocks, which provides a blueprint for those who want to construct an unmanned HMAS like URTS.

A. TASK ALLOCATION
In the URTS, each agent α i,j is expected to be assigned several specific tasks T k , k ∈ Z, such as searching a specific area or delivering supplies to a disaster area. However, there are multiple agents and multiple tasks, each agent has different capabilities (e.g., moving speed or load capacity) and states (e.g., the relative distance between the agent and the target or the amount of supplies carried), and each task has different characteristics (e.g., urgency, position, or amount of supplies needed). Therefore, the result of task allocation can be good or bad, which leads us to seek the optimal allocation. This problem is referred to as the task allocation problem or the multi-robot task allocation problem. A problem formulation and a mathematical model can be found in [17]. Although many different formulations and models have been employed to solve the task allocation problem, the common goal is to find a set of agent-task pairs (α i,j , T k ) that optimizes a specific cost function. In this paper, we assume that the tasks have been properly assigned and that every agent knows, at every moment, a goal configuration q goal to reach from the task content. q goal is then passed to the next block, i.e., the Global Path Planning block, as shown in Fig. 1.
FIGURE 1. The system architecture of S&R tasks in hybrid URTS. In the figure, the three capabilities required for unmanned systems are marked, i.e., perception, decision-making and control. The two blocks on the left-hand side convert the low-level sensor information into high-level information. The Simultaneous Localization And Mapping (SLAM) block converts sensor information and distance information into the current configuration q start of agents and an occupancy map C of URTS. The visual object recognition block provides distance information and object information through the analysis of sensor information. The object information provides the agent with machine vision that enables it to determine an appropriate behavior (e.g., a robot can see an obstacle and decide to climb over it). The five blocks on the right-hand side form the flowchart of an agent performing an S&R task. From top to bottom, they are the decision of the goal configuration q goal of agents, the planning of the path (σ n ) of agents, the decision of the behavior (β n ) of agents corresponding to the path, the planning of the reference path r[k] of agents corresponding to the behavior, and the low-level reference tracking control of agents. A block with two vertical lines inside in the flowchart represents a predefined process which has more detailed subprocesses. The Behavior Decision block is described in detail in Fig. 2. The Local Motion Planning block, which generates the desired reference, is described in detail in Fig. 3. The Reference Tracking Control block is described in detail in Fig. 5.
Remark 2: The Task Allocation block acts like a commander since it assigns tasks to agents. Thus, if the real S&R system has human experts as commanders, they can replace its job or make decisions together with it to maximize the rescue value.
Remark 3: Although each agent has its own team, agents can also work across teams. For example, if the result given by the task allocation algorithm contains the agent-task pairs (α 1,5 , T 1 ) and (α 2,2 , T 1 ), then the agent α 1,5 in team 1 and the agent α 2,2 in team 2 will execute task T 1 together.
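The idea of searching for agent-task pairs that optimize a cost function can be sketched with a small brute-force search. This is only an illustration under stated assumptions: the cost table, the agent and task labels, and the function name `allocate_tasks` are hypothetical, each task is given to exactly one agent (so the shared-task case of Remark 3 is not covered), and the formulation in [17] is more general.

```python
from itertools import permutations


def allocate_tasks(cost):
    """Return (pairs, total) for the cheapest one-to-one agent-task allocation.

    cost[agent][task] is a hypothetical scalar cost (e.g., travel time).
    Assumes at least as many agents as tasks. The search is exhaustive, so it
    is only sensible for a handful of agents and tasks.
    """
    agents = sorted(cost)
    tasks = sorted({task for row in cost.values() for task in row})
    best_pairs, best_total = None, float("inf")
    # Try every ordered choice of len(tasks) distinct agents, one per task.
    for chosen in permutations(agents, len(tasks)):
        total = sum(cost[agent][task] for agent, task in zip(chosen, tasks))
        if total < best_total:
            best_pairs, best_total = list(zip(chosen, tasks)), total
    return best_pairs, best_total


# Hypothetical costs for three agents and two tasks.
cost = {
    "a11": {"T1": 4.0, "T2": 9.0},
    "a12": {"T1": 6.0, "T2": 3.0},
    "a21": {"T1": 5.0, "T2": 8.0},
}
pairs, total = allocate_tasks(cost)
```

For larger teams, an assignment solver or an auction-style method would replace the exhaustive loop; the exhaustive version is used here only because its correctness is easy to check by hand.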

B. GLOBAL PATH PLANNING
After a goal configuration q goal is assigned to each agent, the next step is to find a collision-free path from the current configuration q start , obtained by SLAM, to q goal . In general, multiple feasible paths exist. As in task allocation, we usually want to find an optimal path. Several path planning algorithms can handle this problem. Because the agent is regarded as a point in this block, and because roadmap-based path planning algorithms are well developed and widely applicable, this paper adopts them as the path planning method in the Global Path Planning of URTS. This method discretizes the search space into interconnected roads and finds the path on them.
According to how the path is found, roadmap-based planners can be divided into multi-query planners and single-query planners [18]. A multi-query planner first constructs a roadmap and then uses a graph search method on it to query the best path; examples are the Probabilistic Roadmap (PRM), the Visibility Graph, and Voronoi Diagrams [19]. A single-query planner completes the pathfinding by constructing and querying simultaneously; examples are the Rapidly-exploring Random Tree (RRT), the Expansive Space Tree (EST), and Ariadne's Clew [18]. However, the environment of URTS is dynamic rather than static, so some extra structures need to be imposed on the aforementioned planners. Some common dynamic planners can also be found in [18], such as PRM with the D* search algorithm, dynamic RRT, and extended RRT.
Remark 4: To avoid agents colliding with each other, the concept of multi-agent path planning is proposed [20]. However, URTS operates in a large environment so the probability of collision is small and the agents have the ability to communicate. Therefore, an alternative solution is to use single-agent path planning together with the mechanism of waiting for the other agent to pass first in the event of a collision.
When a roadmap-based path planning algorithm is treated as a black box, the output is a sequence (of waypoints), and the three inputs are the current configuration q start , the goal configuration q goal , and the configuration space (or occupancy map) C. q start and C are obtained by SLAM, and q goal is obtained from the previous block, i.e., the Task Allocation block, as shown in Fig. 1. C is a space containing all possible configurations of agents and is composed of the free space C free and the obstacle space C obs , where C = C free ∪ C obs and C free ∩ C obs = ∅. For a simpler explanation of how the URTS works, the following assumptions are made.
Assumption 1: The locating ability of the URTS is perfect so every agent can know its current configuration q start .
Assumption 2: A Task Allocation algorithm is already designed so that every agent can know its goal configuration q goal .
Assumption 3: The URTS is supposed to have a perfect real-time mapping ability so a real-time configuration space C can be obtained.
Assumption 4: UAVs do not consider obstacle collision, so the path of UAVs can be directly assigned rather than found by planner. Robots do not consider obstacle collision in the direction perpendicular to the ground.
From above assumptions, a global path of agent can be expressed as a sequence:
where k f is the time step at which the goal is reached. (σ n ) is then passed to the next block, i.e., the Behavior Decision block, as shown in Fig. 1.
Remark 5: Since the path planning is dynamic, (σ n ) is actually composed of multiple segments. Let (σ k n ) be the subsequence of (σ n ), where k n is the time step at which a replanning decision occurs. Then, the path segment resulting from the replanning at time step k n can be expressed as the sequence (σ m ), m ∈ Z ∩ [k n , k n+1 ). For all agents, a replanning decision can be triggered by a goal change made by a human or by the Task Allocation block. For robots, it can also be triggered by a collision detected by a dynamic roadmap-based planner.
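The black-box view of the global planner, with inputs q start , q goal and C and a waypoint sequence as output, can be sketched as follows. This is a minimal illustration, not one of the planners cited above: it uses breadth-first search on a 4-connected occupancy grid as a stand-in for PRM or RRT, and the function name `plan_global_path` and the grid encoding (0 for C free , 1 for C obs ) are assumptions.

```python
from collections import deque


def plan_global_path(q_start, q_goal, occupancy):
    """Breadth-first search over free grid cells.

    occupancy[r][c] == 1 marks a cell of C_obs and 0 a cell of C_free.
    Returns the waypoint list from q_start to q_goal, or None if no
    collision-free path exists.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    parent = {q_start: None}
    queue = deque([q_start])
    while queue:
        cell = queue.popleft()
        if cell == q_goal:
            path = []
            while cell is not None:  # walk the parent links back to q_start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and occupancy[nr][nc] == 0 and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


# Hypothetical 3x3 occupancy map with a wall that forces a detour.
C = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
sigma = plan_global_path((0, 0), (2, 0), C)
```

In the spirit of Remark 5, replanning can be imitated by calling the function again with the agent's current cell as `q_start` whenever the goal or the map changes, which yields the next path segment.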
Remark 6: We treat the agent as a point in Global Path Planning. To avoid collisions due to the actual geometric size of the agent, this size can be taken into account in the obstacle space

be the minimum ball that covers the agent, where o is the geometric center of agent and r is the radius of ball. Then the safety obstacle space is denoted as

C. BEHAVIOR DECISION
The global path (σ n ) tells agents where to go but not how, since it regards the agent as a point. Therefore, agents need to determine an appropriate behavior for following (σ n ) according to the machine vision provided by the object information. Taking the robot as an example, it may walk, run, climb, or jump to follow (σ n ) according to the real scenario. These behaviors, which change the position, are classified as "Moving" behaviors in this paper. Besides, the agents in the hybrid URTS are not always moving. Sometimes they have to pause to take an action (e.g., picking up and putting down supplies, or rotating in place to collect more environment information) or to deal with unexpected situations (e.g., no path is found, or the robot falls). These behaviors, which do not change the position, are classified as "Action" behaviors in this paper. More behaviors can be added so that the agent has more ways to interact with the environment, but there must be a corresponding behavior at every moment; otherwise, the agent will lose control. The sequence of these behaviors can be expressed as: where B is the set of behaviors. This means that the global path (σ n ) is divided into many segments and each segment corresponds to a specific behavior. Using the object information in Fig. 1, an appropriate behavior can be decided. However, this is a rather complicated problem, so this article does not discuss it in detail. The flowchart of Behavior Decision and the behavior set B of agents in URTS are roughly described in Fig. 2. (β n ) is then passed to the next block, i.e., the Local Motion Planning block, as shown in Fig. 1.
The communication between UAVs, robots and the ground station often fails in practice. To cope with this issue, a standby behavior can be added to the behavior set B as shown in Fig. 2(b). When the communication fails, the behavior layer immediately tells the agent to switch to the standby behavior until the communication is restored.
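The requirement that exactly one behavior is active at every moment, with standby as the fallback when communication fails, can be sketched as a small decision function. The behavior names and the three boolean inputs below are hypothetical stand-ins for the object information; they do not reproduce the behavior set or the flowchart of Fig. 2.

```python
from enum import Enum


class Behavior(Enum):
    WALK = "walk"                        # a "Moving" behavior
    CLIMB = "climb"                      # a "Moving" behavior
    HANDLE_SUPPLIES = "handle_supplies"  # an "Action" behavior
    STANDBY = "standby"                  # fallback when communication is lost


def decide_behavior(comm_ok, at_goal, obstacle_ahead):
    """Return exactly one behavior for the current time step.

    Every combination of inputs maps to some behavior, so the agent is never
    left without a defined behavior.
    """
    if not comm_ok:
        return Behavior.STANDBY  # hold until communication is restored
    if at_goal:
        return Behavior.HANDLE_SUPPLIES
    if obstacle_ahead:
        return Behavior.CLIMB
    return Behavior.WALK


# Hypothetical sequence of situations along a path and the resulting behaviors.
situations = [(True, False, False), (True, False, True), (False, False, True), (True, True, False)]
beta = [decide_behavior(*s) for s in situations]
```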

D. LOCAL MOTION PLANNING
After a specific behavior β n ∈ B is determined, the next step is to plan a local motion to achieve the specific behavior. Local Motion Planning block is like Global Path Planning block but with smaller scale and higher resolution and precision. Collision checking is needed since we consider agent as a point in Global Path Planning block but it is a real mechanical body here. Furthermore, the kinodynamic constraint is handled in this block. A local motion is a reference path r[k] (it is also a sequence but this article uses the notation of discrete signal r[k] to represent it) which describes how a mechanical body changes in space over time in the physical world. The flowchart of Local Motion Planning block is shown in Fig. 3.
To limit the scope of this article, we only focus on the local motion planning of flying behavior for UAV and walking behavior for robot. A more detailed description will be given in the next section. r[k] then passes to next block, i.e., Reference Tracking Control block, as shown in Fig. 1.
The proposed architecture is aimed at the needs of search and rescue (S&R), so a biped robot that can cope with complex environments is selected as the UGV. Besides, we consider a multi-team architecture, which allows URTS to be scaled up and used more widely. The proposed architecture can also replace biped robots with wheeled robots, but the corresponding behavior layer, local motion planning and reference tracking control blocks should be re-analyzed.

E. REFERENCE TRACKING CONTROL
To analyze the reference tracking control problem in the continuous time domain, the reference path r[k] is first converted to a continuous signal r ′ (t) by a D/A converter with a timescale, i.e., a sampling period. By analyzing the dynamic model of each agent, a desired reference trajectory r(t) is planned. r(t) describes the position and orientation that need to be reached over time by a mechanical system governed by a dynamic equation. Note that path and trajectory are distinguished in some of the literature. Unlike a path (e.g., (σ n ) or r[k]), a trajectory r(t) takes time in the physical world into account. We also distinguish them in this way in this article.
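The conversion from the discrete reference path r[k] to a continuous signal r ′ (t) can be sketched as follows, assuming a zero-order hold; the source does not specify the hold type of the D/A converter, so this choice, the function name `zero_order_hold` and the sample values are assumptions.

```python
import math


def zero_order_hold(r, sampling_period):
    """Return r'(t), which holds sample r[k] over [k*Ts, (k+1)*Ts).

    Times before 0 map to the first sample and times past the end map to the
    last sample.
    """
    def r_prime(t):
        k = math.floor(t / sampling_period)
        k = min(max(k, 0), len(r) - 1)  # clamp the index to the valid range
        return r[k]
    return r_prime


# Hypothetical planar reference path r[k] and sampling period Ts = 0.5 s.
r = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
r_prime = zero_order_hold(r, sampling_period=0.5)
```

A held signal is piecewise constant, which is one reason the text goes on to plan a reference trajectory r(t) from the agent's dynamic model rather than tracking r ′ (t) directly.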
If each agent in hybrid URTS can track each own reference trajectory r(t), then they can move in the physical world as we expect. To this end, a reference tracking control method needs to be designed. It will be discussed detailly in Section IV. In addition, the sensing information collected by the sensors in this block is not only used for reference tracking control but also fed back to the high-level block as shown in Fig. 1.

III. SYSTEM DESCRIPTION OF UAVs AND BIPED ROBOTS IN URTS
In order to plan a reference trajectory r(t) for the local motion of UAV and robot in URTS, their dynamic models must be given first. After the system description of UAV and robot in URTS, the local motion planning of flying and walking behavior as shown in Fig. 4 will be discussed separately in the next two subsections. Before the discussion of the local motion planning of these two behaviors of UAV and robot in URTS, the following assumption is made.
Assumption 5: The space between obstacles is large enough to eliminate the need for collision checking again, and the average speed of agents is slow enough to ignore the nonholonomic constraints (the velocity and acceleration constraints).
However, for these two behaviors, there exists an inevitable holonomic constraint on the curvature of local motion. Although an accurate reference trajectory that does not break the curvature constraint can be planned, it is not easy to solve this problem. Additionally, it is not necessary for these two behaviors in URTS since they are used to move from one location to another, and the effect of the error caused by breaking the curvature constraint during moving is relatively insignificant. As an alternative, this problem can be handled by curve fitting, which can be regarded as a post-process of the path (σ n ). The result of the post-process, i.e., the smoothed path (σ ′ n ), will not cause collision by Assumption 5. The post-process appears at the beginning of the motion planning process of these two behaviors as shown in Fig. 4.

A. LOCAL MOTION PLANNING OF FLYING OF UAV
A dynamic model describing how a UAV in URTS moves in the physical world is given first. By the Newton-Euler equation, the dynamic model of each UAV in URTS can be formulated as in (3) [21], where g is the gravity acceleration, m and J are the mass and inertia matrix of the UAV, respectively, τ u and F are the total torque and force acting on the UAV, respectively, [φ, θ, ψ] T is the vector of Euler angles in the body frame as shown in Fig. 2.8 in [21] with φ ∈ [−π, π], θ ∈ [−π/2, π/2] and ψ ∈ [−π, π], X is the position of the center of mass (CoM) in the inertial frame, K τ and K F are the aerodynamic damping coefficients, and R(·) is the intrinsic rotation matrix from the body frame to the inertial frame. This model treats the UAV as a mass point whose total force F and total torque τ u can be controlled. For the UAV, the reference trajectory r(t) = [x r , y r , z r , φ r , θ r , ψ r ] T ∈ R 6 is in the task space, where the subscript r denotes the reference.

Remark 8: Since the UAV (quadrotor) has four rotors, we only have four control inputs. To simplify the model, the UAV dynamic model in (3) considers the actuator control input u ′ (t) = [F, τ x , τ y , τ z ] T as the equivalent control input for the four rotors. Besides, since u ′ (t) ∈ R 4 has two fewer degrees of freedom than r(t) ∈ R 6 , the UAV is an underactuated system. Consequently, two degrees of freedom in the reference trajectory of the UAV cannot be assigned arbitrarily; they must be inversely calculated, through the dynamic equation in (3), from the other degrees of freedom that can be assigned arbitrarily.
Now, suppose the UAV flying behavior occurs between time steps k 1 and k 2 , i.e., β n = flying, n ∈ Z ∩ [k 1 , k 2 ]. The corresponding path (σ n ), n ∈ Z ∩ [k 1 , k 2 ] is smoothed first by linear interpolation and then by cubic spline interpolation, which gives the smoothed path (σ ′ n ) with position components [x r [k], y r [k], z r [k]] T . Subsequently, we consider the orientation reference path, i.e., roll angle φ r [k], pitch angle θ r [k] and yaw angle ψ r [k]. ψ r [k] is set to zero since there is no need for spinning when flying. φ r [k] and θ r [k] cannot be planned beforehand since the UAV is an underactuated system, which will be discussed in the next section. Finally, the reference path r[k] of the UAV can be planned by combining them together, i.e., r[k] = [x r [k], y r [k], z r [k], ψ r [k]] T . The flowchart of flying motion planning is shown in Fig. 4 (a).
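The smoothing step described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`densify_linear`, `natural_cubic_spline`, `plan_uav_reference_path`), the use of a natural (zero end-curvature) spline, the knot parameterization by point index, and the choice to set the yaw reference to zero are all assumptions made for the example.

```python
from bisect import bisect_right

def densify_linear(points, per_segment):
    """Linear interpolation: put per_segment evenly spaced points on each segment."""
    dense = []
    for p, q in zip(points[:-1], points[1:]):
        for s in range(per_segment):
            a = s / per_segment
            dense.append(tuple(pi + a * (qi - pi) for pi, qi in zip(p, q)))
    dense.append(tuple(float(c) for c in points[-1]))
    return dense

def natural_cubic_spline(t, y):
    """Return a function evaluating the natural cubic spline through (t[i], y[i])."""
    n = len(t) - 1
    h = [t[i + 1] - t[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives at the knots; m[0] = m[n] = 0
    if n >= 2:
        diag = [2.0 * (h[j] + h[j + 1]) for j in range(n - 1)]
        rhs = [6.0 * ((y[j + 2] - y[j + 1]) / h[j + 1] - (y[j + 1] - y[j]) / h[j])
               for j in range(n - 1)]
        for j in range(1, n - 1):        # forward elimination (Thomas algorithm)
            w = h[j] / diag[j - 1]
            diag[j] -= w * h[j]
            rhs[j] -= w * rhs[j - 1]
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for j in range(n - 3, -1, -1):   # back substitution
            m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]

    def evaluate(s):
        i = min(max(bisect_right(t, s) - 1, 0), n - 1)  # segment containing s
        a, b, hi = t[i + 1] - s, s - t[i], h[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * hi)
                + (y[i] / hi - m[i] * hi / 6.0) * a
                + (y[i + 1] / hi - m[i + 1] * hi / 6.0) * b)
    return evaluate

def plan_uav_reference_path(waypoints, per_segment=2, samples_per_knot=5):
    """Return samples [x_r, y_r, z_r, psi_r]; psi_r = 0 is an assumption of this sketch."""
    dense = densify_linear(waypoints, per_segment)
    knots = [float(i) for i in range(len(dense))]
    splines = [natural_cubic_spline(knots, [p[d] for p in dense]) for d in range(3)]
    total = (len(dense) - 1) * samples_per_knot
    return [[f(k / samples_per_knot) for f in splines] + [0.0] for k in range(total + 1)]
```

For example, `plan_uav_reference_path([(0, 0, 1), (1, 0, 1), (1, 1, 2)])` returns a finer position path that still passes through every densified waypoint, with a zero yaw entry appended to each sample.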

B. LOCAL MOTION PLANNING OF WALKING OF ROBOT
By the Lagrange equation, the dynamic model of a biped robot in URTS can be formulated as:

$$M_R(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) = \tau_R \tag{4}$$

where τ R is the total torque on the revolute joints, $q, \dot{q}, \ddot{q} \in \mathbb{R}^{12}$ are the angular position, angular velocity, and angular acceleration vectors of the revolute joints, M R (q) ∈ R 12×12 is the inertia matrix, $C(q,\dot{q}) \in \mathbb{R}^{12\times 12}$ is the Coriolis and centripetal force matrix and G(q) ∈ R 12 is the gravitational force vector. The detailed kinematic and dynamic parameters can be found in the online source [22]. For the biped robot, the reference trajectory r(t) = q r (t) ∈ R 12 is planned in the joint space. Furthermore, the walking of a biped robot suffers from the falling problem, i.e., how to plan a stable walking pattern to prevent the robot from falling. These factors make the motion planning of the walking behavior of the robot more difficult. In this paper, a Three-Dimensional Linear Inverted Pendulum Model (3D-LIPM) [23] is used for the motion planning of the walking behavior of each biped robot in URTS.
With 3D-LIPM, we can significantly reduce the amount of computation. Let us define the body frame of the biped robot as in [22]. Taking the forward direction of the biped robot as the X b direction, the left direction as the Y b direction, and the torso direction as the Z b direction in the body frame, ''falling'' means the moments on the robot in the X b and Y b directions are not zero. More accurately, the biped robot will not fall if the zero moment point (ZMP) lies in the support polygon, i.e., the convex hull of the contact faces of the supporting feet. The ZMP in the X b direction can be described as in (5) [24] (the ZMP in the Y b direction has the same form), where m i is the mass, x i , z i are the linear position components, I iy is the inertial component, and $\ddot{\Omega}_{iy}$ is the angular acceleration component of link i. However, it is difficult to directly calculate the analytical solution of q r (t) through (5), since there exists a complex coordinate transformation between q r (t) and x i , z i , $\ddot{\Omega}_{iy}$. At the same time, it is necessary to ensure that x zmp falls in the support polygon, which also depends on q r (t). To simplify this complex problem, an approximate solution can be derived through 3D-LIPM. We plan the CoM reference first and then obtain q r (t) by using inverse kinematics (IK) with given step size, step height, step period and CoM height. Many researchers have used this method to avoid complex calculations of the ZMP for the actual robot dynamic model. Although there exists a model error between the actual dynamic model and 3D-LIPM, the planning process becomes simpler.
Following the same motion planning step as for the UAV, the smoothed path (σ ′ n ) for the biped robot is obtained first. For the convenience of explanation, suppose walking occurs between time steps 1 and N , i.e., n ∈ Z ∩ [1, N ]. Note that (σ ′ n ) is not the actual CoM reference in the robot case since the CoM of the robot needs to ''swing'' for balance. Despite that, (σ ′ n ) tells the biped robot the position to go to, so the X b direction can be obtained by taking the finite difference of (σ ′ n ), based on the expectation that the biped robot will move forward (rather than sideways or backward). To keep the torso upright, the Z b direction is equal to the z-axis in the inertial frame Z g . Given X b and Z b , Y b can be obtained through the cross product. The sequence of body frames, i.e., the CoM orientation path (CO n ), can then be planned through the above steps.
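The body-frame construction above can be illustrated with a short sketch. The function name `body_frames` is hypothetical, and two details are assumptions of this example rather than statements from the paper: the heading is taken from the horizontal part of the finite difference (so that X b stays orthogonal to Z b = Z g ), and the last point reuses the previous difference.

```python
import math

def body_frames(path):
    """Return a list of (X_b, Y_b, Z_b) unit vectors, one per path point.

    path: list of (x, y, z) points with at least two entries whose consecutive
    horizontal positions differ.
    """
    z_b = (0.0, 0.0, 1.0)  # torso kept upright: Z_b equals the inertial z-axis
    frames = []
    for k in range(len(path)):
        # Forward difference; the last point uses a backward difference.
        a, b = (path[k], path[k + 1]) if k + 1 < len(path) else (path[k - 1], path[k])
        dx, dy = b[0] - a[0], b[1] - a[1]
        norm = math.hypot(dx, dy)
        x_b = (dx / norm, dy / norm, 0.0)
        # Y_b = Z_b x X_b gives the left direction of a right-handed frame.
        y_b = (z_b[1] * x_b[2] - z_b[2] * x_b[1],
               z_b[2] * x_b[0] - z_b[0] * x_b[2],
               z_b[0] * x_b[1] - z_b[1] * x_b[0])
        frames.append((x_b, y_b, z_b))
    return frames
```

For a path heading along the positive x-axis, this yields X b along x and Y b along y, i.e., the robot's left side, which matches the frame convention stated above.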
Remark 9: A frame (or homogeneous transformation) in R 3 can be determined by giving the ''position'' and ''orientation'' with respect to a reference frame. That is, given the frames of two joints in links with known kinematics, the frames of joints between them can be found by IK. Hence, we need to find the position path and orientation path, which compose the desired path of the frame.
Let us denote the x and y component of σ ′ n in (σ ′ n ) as the sequence (σ 1 n ), σ 1 n ∈ R 2 . The left and right ''envelopes'', (σ 2 n ) and (σ 3 n ), of (σ 1 n ) with a fixed distance L 1 can be planned by (σ 1 n ) and (CO n ) through the geometric relation among (σ i n ), i = 1, 2, 3, where L 1 is the feet width (or shoulder width). Then the x and y component of the left and right foothold paths, (σ 2 k n ) and (σ 3 k n ), respectively, can be planned by a given step size, which are the subsequence of (σ 2 n ) and (σ 3 n ), respectively. Finally, the left and right foothold paths (σ i k n ), σ i k n ∈ R 3 , i = 4, 5 are planned by adding the z component which is given by ground height.
After the foothold paths are planned, the ankle position path can also be planned by the given step height, which is customized by the designer or based on the height of the obstacle to be crossed. To keep the soles of the feet on the ground, the ankle orientation path can be planned by the gradient of the ground. Finally, the left and right ankle paths, (σ 8 n ) and (σ 9 n ), are planned by combining the position and orientation paths together. So far, the remaining work is to find the CoM path and then to combine it with the ankle paths to calculate the joint path through IK.
To plan the CoM position path (CP n ), the ZMP path needs to be planned first. The ZMP path can be planned through the foothold paths (σ 4 k n ) and (σ 5 k n ) since the ZMP needs to lie in the support face and the foothold path indicates when the feet are on the ground. Suppose the CoM height z c of the biped robot is kept constant when walking; then the biped robot model can be regarded as a 3D-LIPM [23]:

$$\ddot{x}_c = \frac{g}{z_c}(x_c - p_x), \qquad \ddot{y}_c = \frac{g}{z_c}(y_c - p_y) \tag{6}$$

where (x c , y c , z c ) is the position of the CoM of the inverted pendulum, g is the gravity acceleration, and (p x , p y ) is the position of the ZMP on the x-y plane. Since z c , g and (p x , p y ) are given, (x c , y c ) can be solved. The dynamic equations in the x and y directions in (6) are decoupled and thus can be calculated separately. Therefore, only the solution in the x direction is given below (the y direction is the same). To solve it, a method is proposed to convert it to the servo problem in (7) [25]. Our goal is to find a control input u s such that the output y s tracks the ZMP reference trajectory p x , so that the state x c , i.e., the solution of the ODE in (6), can be obtained, i.e., the CoM position path can be planned. Unlike conventional methods, the problem is solved by optimal control. The system is discretized first and the discrete LQ optimal tracker is employed to achieve the output tracking. The formulation of the discrete LQ optimal tracker can be found in TABLE 4.4-1 in [26]. The CoM position path (CP n ) can then be obtained by combining x c and y c with z c .
By combining the CoM orientation path (CO n ) with the position path (CP n ), the CoM path can be obtained. Finally, the joint path, i.e., the reference path r[k] ∈ R 12 of the robot, can be found by solving IK. The flowchart of walking motion planning is shown in Fig. 4 (b).
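To make the 3D-LIPM relation between CoM and ZMP concrete, the sketch below integrates the x-direction pendulum equation for a constant ZMP and compares it with the closed-form solution. It only illustrates the forward relation; it is not the discrete LQ optimal tracker used in the paper, and the function names, the integration scheme and all numerical values are assumptions chosen for the example.

```python
import math

G = 9.81  # gravity acceleration [m/s^2]

def simulate_lipm(x0, v0, p_x, z_c, dt, steps):
    """Integrate x_c'' = (g / z_c) * (x_c - p_x) with semi-implicit Euler steps."""
    x, v = x0, v0
    for _ in range(steps):
        a = (G / z_c) * (x - p_x)
        v += a * dt
        x += v * dt
    return x, v

def lipm_closed_form(x0, v0, p_x, z_c, t):
    """Closed-form CoM position for a constant ZMP p_x."""
    tc = math.sqrt(z_c / G)  # pendulum time constant
    return p_x + (x0 - p_x) * math.cosh(t / tc) + tc * v0 * math.sinh(t / tc)

def zmp_from_com(x_c, acc_c, z_c):
    """ZMP implied by a CoM position and acceleration: p_x = x_c - (z_c / g) * x_c''."""
    return x_c - (z_c / G) * acc_c
```

With a CoM height of 0.8 m, a CoM that starts 5 cm ahead of a fixed ZMP diverges away from it within half a second, which is why the ZMP reference has to be moved from foothold to foothold and the CoM path has to be solved for rather than chosen freely.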

IV. REFERENCE TRACKING CONTROL OF EACH AGENT IN HYBRID URTS
Through the previous step, the planning of the reference path r[k] of each agent in hybrid URTS has been completed. However, in order to calculate the unplannable reference paths of the UAV, i.e., φ r [k] and θ r [k], and analyze the reference trajectory tracking problem in the continuous domain, we have to transform r[k] into the reference trajectory r(t). Before converting the planned reference path r[k] to the desired reference trajectory r(t), we first convert the UAV dynamic model in (3) and the robot dynamic model in (4) into a unified form, called the agent dynamic model in a hybrid team, to analyze their team formation tracking control problems together. Through some appropriate variable transformations, we have:

$$M(x(t))\ddot{x}(t) + H(x(t),\dot{x}(t)) = u(t) \tag{8}$$

where u(t) ∈ R n is the control input vector, x(t) ∈ R n is the state vector, M (x(t)) ∈ R n×n is the inertia matrix, and $H(x(t),\dot{x}(t)) \in \mathbb{R}^n$ is the non-inertial force vector. The purpose of converting the dynamic model of each UAV in (3) and each biped robot in (4) in URTS into the unified dynamic model in (8) is to allow their tracking control problems to be analyzed together. The decentralized feedforward linearization control law for an agent in URTS is proposed as:

$$u(t) = M(r(t))(\ddot{r}(t) + u_{fb}(t)) + H(r(t),\dot{r}(t)) \tag{9}$$

where r(t) ∈ R n is the desired reference trajectory, $M(r(t))$, $\ddot{r}(t)$ and $H(r(t),\dot{r}(t))$ are the feedforward control terms for canceling system nonlinearity, and u fb (t) is the feedback control law to be further designed for improving system tracking robustness.
Remark 10: Since this article analyzes each agent α i,j individually, the subscripts i and j of the corresponding variables are omitted for the convenience of notation in the following. For example, x i,j (t) is written as x(t) in (8), r i,j (t) is written as r(t) in (9), etc.
Remark 11: In Section 3.2.2 of [21], the dynamic system of the UAV is divided into an inner loop system and an outer loop system to analyze the reference tracking problem of the UAV. In order to address the problem that the control gains in the four state equations (z, φ, θ and ψ) of the inner loop system and the two states of the outer loop system (x and y) in [21] have no design specifications, we changed the controller design method so that the control gains of the six states of the UAV can be designed together via the given specifications, i.e., we design the feedforward linearization control law u(t) ∈ R 6 to achieve the observer-based tracking specification in (24). This can reduce the time cost of manually adjusting control gains. At the same time, the tracking control problem of UAV and robot can be analyzed simultaneously.
Remark 12: The difference between the feedforward linearization control law u(t) in (9) of this article and the traditional feedback linearization control (or computed torque control) is that we use the ''feedforward'' linearization control, i.e., we use $M(r(t))$ and $H(r(t),\dot{r}(t))$ in (9) instead of $M(x(t))$ and $H(x(t),\dot{x}(t))$. This is because the state x(t) is assumed to be unavailable in this paper, so x(t) cannot be used in the control law u(t). When asymptotic tracking is achieved, i.e., x(t) → r(t), the linearization by feedforward is achieved too.
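The effect of the feedforward linearization law can be checked on a scalar toy plant. In the sketch below, the inertia function `M` and the force function `H` are hypothetical (a pendulum-like example, not the UAV or biped model), and the function names are illustrative only. It shows that the error acceleration equals u fb exactly when the state is on the reference, and that a residual appears otherwise; this residual is what the paper later lumps into the actuator fault signal.

```python
import math

def M(x):
    """Hypothetical configuration-dependent inertia of a 1-DOF plant."""
    return 2.0 + math.cos(x)

def H(x, dx):
    """Hypothetical non-inertial force: viscous damping plus a gravity term."""
    return 0.3 * dx + 9.81 * math.sin(x)

def feedforward_control(r, dr, ddr, u_fb):
    """Feedforward linearization law: u = M(r) (r'' + u_fb) + H(r, r')."""
    return M(r) * (ddr + u_fb) + H(r, dr)

def error_acceleration(x, dx, r, dr, ddr, u_fb):
    """e'' = x'' - r'' for the plant M(x) x'' + H(x, x') = u driven by the law above."""
    u = feedforward_control(r, dr, ddr, u_fb)
    ddx = (u - H(x, dx)) / M(x)
    return ddx - ddr
```

Note that the law never reads x(t): only the reference and its derivatives enter `feedforward_control`, which is the point made in Remark 12.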
To complete the planning of the reference trajectory r(t) of each agent, a D/A converter is used to transform the reference path r[k] (output of local motion planning) into a continuous signal r ′ (t) as shown in Fig. 5. For the UAV, we get r ′ (t) = [x r , y r , z r , ψ r ] T ∈ R 4 . Besides, for a UAV in hybrid URTS, the control input u(t) ∈ R 6 we design in (9) is different from the actuator control input u ′ (t) = [F, τ x , τ y , τ z ] T ∈ R 4 since the UAV is an underactuated system. The two degrees of freedom we reserved in Section III-A, i.e., φ r and θ r , are used to solve this problem. By substituting the designed control input u(t) into the UAV dynamic model in (3), the 3 unknown variables F, φ r and θ r can be found from the resulting 3 force equations using inverse dynamics because the component forces f x , f y and f z , and the yaw angle ψ r are given. Combining φ r and θ r with r ′ (t), we obtain r(t) = [x r , y r , z r , φ r , θ r , ψ r ] T . At the same time, we find u ′ (t). For the biped robot, we directly have r(t) = r ′ (t) ∈ R 12 and u ′ (t) = u(t) ∈ R 12 since the biped robot is a fully actuated system. So far, the transformation from the reference path r[k] to the reference trajectory r(t) for each agent in hybrid URTS is done. We define the process of converting r ′ (t) and u(t) into r(t) and u ′ (t) mentioned above as the reference generation block in Fig. 5.
Remark 13: The solution of φ r and θ r must satisfy the constraints φ r ∈ [−π, π] and θ r ∈ [−π/2, π/2] in the UAV dynamic model in (3). In addition, since φ r and θ r are calculated at each time step and there may be multiple solutions due to the inverse dynamic calculation, the solution of φ r and θ r must be selected to be continuous with the previous time step.
Remark 14: Let r i,j (t) be the reference trajectory r(t) of the agent α i,j , r i,j [k] be the reference path r[k] of the agent α i,j , (β n ) i,j be the behavior sequence (β n ) of the agent α i,j , (σ n ) i,j be the collision-free path (σ n ) of the agent α i,j , q goal,i,j be the goal configuration q goal of the agent α i,j , etc. As long as α i,j can track r i,j (t), the hybrid URTS can work as we expect since the previous blocks, i.e., Task Allocation, Global Path Planning, Behavior Decision and Local Motion Planning, have completed their respective responsibilities and planned their corresponding values, q goal,i,j , (σ n ) i,j , (β n ) i,j and r i,j [k].
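The inverse-dynamic step for the UAV can be sketched under an explicit assumption: the commanded inertial force (f x , f y , f z ) is produced by a thrust F along the body z-axis rotated by Z-Y-X (yaw-pitch-roll) Euler angles. Since the UAV model (3) is not reproduced here, this convention, the function names and the numbers are assumptions of the example and may differ from the model in [21].

```python
import math

def thrust_force_inertial(F, phi, theta, psi):
    """Inertial-frame force of a thrust F along the body z-axis (ZYX Euler angles)."""
    fx = F * (math.cos(psi) * math.sin(theta) * math.cos(phi) + math.sin(psi) * math.sin(phi))
    fy = F * (math.sin(psi) * math.sin(theta) * math.cos(phi) - math.cos(psi) * math.sin(phi))
    fz = F * math.cos(theta) * math.cos(phi)
    return fx, fy, fz

def reference_attitude(fx, fy, fz, psi_r):
    """Solve the three force equations for F, phi_r and theta_r given psi_r.

    asin and atan2 select one branch (|phi_r| <= pi/2, F > 0); other branches
    exist, so continuity with the previous time step still has to be enforced.
    """
    F = math.sqrt(fx * fx + fy * fy + fz * fz)
    phi_r = math.asin((fx * math.sin(psi_r) - fy * math.cos(psi_r)) / F)
    theta_r = math.atan2(fx * math.cos(psi_r) + fy * math.sin(psi_r), fz)
    return F, phi_r, theta_r
```

A round trip through `thrust_force_inertial` and `reference_attitude` recovers the original thrust and angles for small tilt angles, which is a quick consistency check of the three equations in three unknowns.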

That is, r i,j (t) is the reference trajectory that can accomplish the specific task, follow the specific path and perform the specific behavior.
To make the model more realistic, the following disturbances encountered in actual scenarios are considered:
1) For each agent, there exists a coupling effect due to co-channel interference in communication between agents [27].
2) For each agent, there exists cyber-attack on the communication network between agents and the ground station.
3) For each agent, there exists sensor noise.
4) For each UAV, there exists vortex coupling.
5) For each UAV, there exists wind disturbance [28].
6) For each robot, there exists ground reaction force [29].
Let x i,j (t) denote the state vector of agent α i,j where i = 1, 2, . . . , N T , j = 1, 2, . . . , N A . The coupling effect c i,j (t) on each agent from other agents in hybrid URTS can be represented as in (10) for UAV α i,1 and as in (11) for biped robot α i,j where i = 1, 2, . . . , N T and j = 2, 3, . . . , N A [27]. For the convenience of notation, we use c(t) to represent c i,j (t). Since the ground station is responsible for the calculation, the calculated control command in (9) will be transmitted to the agent through the network channel in hybrid URTS. Therefore, the coupling effect due to co-channel interference and the cyber-attack signal will deteriorate the control command. In addition, the wind disturbance and the ground reaction force will apply extra force on an agent in (8). Therefore, through an appropriate conversion, the above disturbances can be made equivalent to two disturbance forces c(t) + d 1 (t) ∈ R n , where d 1 (t) is the non-coupling external disturbance. The nominal system in (8) of an agent in hybrid URTS can then be rewritten as the following real system:

$$M(x(t))\ddot{x}(t) + H(x(t),\dot{x}(t)) = u(t) + c(t) + d_1(t) \tag{12}$$

Remark 15: Under the decentralized control architecture, the coupling effect c(t) between agents becomes their own external disturbance, which greatly reduces the difficulty of controller design.
Now, substituting the feedforward linearization control law u(t) in (9) into (12) and subtracting $M(x(t))\ddot{r}(t)$ from both sides, we have:

$$M(x(t))\ddot{e}(t) = M(r(t))(\ddot{r}(t) + u_{fb}(t)) - M(x(t))\ddot{r}(t) + H(r(t),\dot{r}(t)) - H(x(t),\dot{x}(t)) + c(t) + d_1(t) \tag{13}$$

By multiplying both sides of (13) by $M(x(t))^{-1}$ and rearranging, we have the tracking error differential equation as follows:

$$\ddot{e}(t) = u_{fb}(t) + f_1(t) \tag{14}$$

where $f_1(t) = M(x(t))^{-1}(-\Delta M(\ddot{r}(t) + u_{fb}(t)) - \Delta H + c(t) + d_1(t)) \in \mathbb{R}^n$ is considered as the actuator fault signal, $\Delta M \triangleq M(x(t)) - M(r(t))$ and $\Delta H \triangleq H(x(t),\dot{x}(t)) - H(r(t),\dot{r}(t))$ are the error terms from feedforward compensation, and e(t) = x(t) − r(t) is the tracking error. Let us denote $\mathbf{e}(t) = [\int_0^t e^T(\tau)d\tau,\ e^T(t),\ \dot{e}^T(t)]^T \in \mathbb{R}^{3n}$; the tracking error differential equation in (14) can then be rewritten as the following linear tracking error system:

$$\dot{\mathbf{e}}(t) = A\mathbf{e}(t) + B(u_{fb}(t) + f_1(t)) \tag{15}$$

where

$$A = \begin{bmatrix} 0 & I_n & 0 \\ 0 & 0 & I_n \\ 0 & 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 0 \\ I_n \end{bmatrix}$$

Through the above analysis, the tracking control problem of the nonlinear system of each agent in hybrid URTS with external disturbance in (12) is transformed into the regulation problem of the linear tracking error system in (15) with actuator fault signal f 1 (t) by the feedforward ''linearization'' control law u(t) in (9). The remaining step is to design an appropriate feedback control law u fb (t) in (15) to make the linear tracking error system robustly stable. Since the pair (A, B) in (15) is controllable, the linear tracking error system in (15) can be stabilized through u fb (t).

FIGURE 5. (a) The flowchart of the reference tracking control block in Fig. 1 of each agent in hybrid URTS. The controller block is described in (d). The controller block (the proposed general H ∞ decentralized observer-based feedforward reference tracking FTC scheme) is designed for a fully actuated agent dynamic model in (8) while the UAV is an underactuated system. This makes the designed feedforward linearization control law u(t) ∈ R 6 and the actuator control input u ′ (t) ∈ R 4 different for the UAV. The reference generator block is introduced to deal with this problem. For UAVs, the reference generator block is described in (b). For robots, the reference generator block is described in (c). (b) The reference generator block for each UAV. u ′ (t) and r(t) can be calculated from u(t) and r ′ (t) by inverse dynamics through the UAV dynamic model in (3). (c) The reference generator block for each robot. u ′ (t) = u(t) and r(t) = r ′ (t) since the robot dynamic model in (4) is fully actuated.
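As an illustration of why the integral state is included in the linear tracking error system, the sketch below simulates the scalar case (n = 1) with a constant fault and a plain state-feedback law with integral, proportional and derivative gains. This is not the observer-based H ∞ design of the paper: the gains are hand-picked to place all closed-loop poles at −1, the full error state is assumed measurable, and the function name and values are hypothetical.

```python
def simulate_error_system(k_i, k_p, k_d, fault, e0=1.0, dt=1e-3, t_end=30.0):
    """Euler simulation of the error system for n = 1 with state [int e, e, de].

    Dynamics: d(int e)/dt = e, de/dt = de, d(de)/dt = u_fb + fault,
    with u_fb = -(k_i * int e + k_p * e + k_d * de).
    """
    z, e, de = 0.0, e0, 0.0
    for _ in range(int(round(t_end / dt))):
        u_fb = -(k_i * z + k_p * e + k_d * de)
        dde = u_fb + fault
        z, e, de = z + e * dt, e + de * dt, de + dde * dt
    return z, e, de

# Gains from (s + 1)^3 = s^3 + 3 s^2 + 3 s + 1, i.e. k_d = 3, k_p = 3, k_i = 1.
z_end, e_end, de_end = simulate_error_system(1.0, 3.0, 3.0, fault=0.5)
```

In this toy run the tracking error decays to zero despite the constant fault, while the integral state settles at the value needed to cancel it (fault divided by the integral gain).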
In a real system, the feedback information is measured by sensors, i.e., the state x(t) in (8) is not directly available. At the same time, the sensor noise also needs to be considered as mentioned before. Since the sensor information will be transmitted back to the ground station for calculating the control command through the network channel in URTS, not only the sensor noise but also the cyber-attack signal is of concern. We consider their combined effect as a sensor fault in this paper.
With the augmented state $[\int_0^t x^T(\tau)d\tau,\ x^T(t),\ \dot{x}^T(t)]^T$, the measurement output equation can be described as:

$$y(t) = C\left[\int_0^t x^T(\tau)d\tau,\ x^T(t),\ \dot{x}^T(t)\right]^T + B_2 f_2(t) \tag{16}$$

where y(t) ∈ R l is the output vector, C ∈ R l×3n is the output matrix, and B 2 ∈ R l×o is the input matrix of the sensor fault signal f 2 (t) ∈ R o . The integral term $\int_0^t x(\tau)d\tau$ can be calculated indirectly via the agent's state x(t) and an integrator. For the UAV, x(t) can be measured by GPS and an inertial measurement unit. For the biped robot, x(t) can be measured by the sensors of the motors on the joints.
Let us define $\mathbf{r}(t) = [\int_0^t r^T(\tau)d\tau,\ r^T(t),\ \dot{r}^T(t)]^T$ to modify the output equation in (16). Combining it with (15), we have the following tracking error dynamic system of an agent in the hybrid URTS:

$$\dot{\mathbf{e}}(t) = A\mathbf{e}(t) + B(u_{fb}(t) + f_1(t)), \qquad y(t) = C\mathbf{e}(t) + C\mathbf{r}(t) + B_2 f_2(t) \tag{17}$$

To deal with the fault signals f i (t), i = 1, 2, a smoothing signal model from our previous work is introduced [30]:

$$\dot{F}_i(t) = A_i F_i(t) + v_i(t), \qquad f_i(t) = C_i F_i(t) \tag{18}$$

where $F_i(t) \in \mathbb{R}^{(w_i+1)n_i}$ is the state of the smoothing signal model, v i (t) is the model error, $C_i = [1\ 0\ \ldots\ 0] \otimes I_{n_i}$, and w i is the window size of the smoothing signal model with n 1 = n and n 2 = o. Constructing an exact model for an unknown signal f i (t) is impossible since there is no information about it. However, under the assumption that f i (t) is smooth, i.e., the first derivative of f i (t) exists, its derivative can be approximated by the forward difference method with step size h and a remainder term R 1 (t) ∈ O(h), and the required future sample can be approximated by extrapolation from past samples with an approximation error ε i (t). Combining these two approximations gives (18). Since A i in the smoothing signal model of (18) is a fixed constant matrix, we don't need to update the smoothing signal model in (18). Therefore, it is very suitable for practical applications. It can be constructed by using Lagrange extrapolation as described in [30].
Remark 18: Compared with other signal models, the merits of our model are that it simplifies the process of model building and omits the estimation of model parameters (the model parameters in A i in (18) have fixed values). As long as the first derivative of the signal exists, this signal model can be used; if the first derivative does not exist, the modeling error v i (t) in (18) will become large.
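To give a feeling for why such a signal model needs no parameter estimation, the sketch below computes the fixed Lagrange extrapolation weights that predict the next sample of a signal from its w + 1 most recent samples. It is an illustrative assumption about the kind of extrapolation involved, not the exact construction of the matrix A i in [30]; the function names are hypothetical.

```python
from fractions import Fraction

def lagrange_extrapolation_weights(w):
    """Weights a_j such that f(t + h) is approximated by sum_j a_j * f(t - j*h), j = 0..w.

    The nodes sit at -j (in units of h) and the polynomial is evaluated at +1,
    so the weights depend only on the window size w, not on the signal.
    """
    weights = []
    for j in range(w + 1):
        a = Fraction(1)
        for m in range(w + 1):
            if m != j:
                a *= Fraction(1 + m, m - j)  # (1 - (-m)) / ((-j) - (-m))
        weights.append(a)
    return weights

def predict_next(samples):
    """samples[0] = f(t), samples[1] = f(t - h), ...; returns the extrapolated f(t + h)."""
    weights = lagrange_extrapolation_weights(len(samples) - 1)
    return sum(a * s for a, s in zip(weights, samples))
```

For a window of three samples the weights are 3, −3 and 1, and the prediction is exact for any quadratic signal; for signals that are not smooth, the prediction error grows, which is consistent with the remark on the modeling error above.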
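Substituting (18) into (17), we get the following augmented tracking error system of an agent in hybrid URTS:

$$\dot{\bar{e}}(t) = \bar{A}\bar{e}(t) + \bar{B}u_{fb}(t) + \bar{v}(t), \qquad y(t) = \bar{C}\bar{e}(t) + C\mathbf{r}(t) \tag{19}$$

where $\bar{e}(t) = [\mathbf{e}^T(t),\ F_1^T(t),\ F_2^T(t)]^T$ is the augmented tracking error vector and

$$\bar{A} = \begin{bmatrix} A & BC_1 & 0 \\ 0 & A_1 & 0 \\ 0 & 0 & A_2 \end{bmatrix}, \quad \bar{B} = \begin{bmatrix} B \\ 0 \\ 0 \end{bmatrix}, \quad \bar{C} = \begin{bmatrix} C & 0 & B_2C_2 \end{bmatrix}, \quad \bar{v}(t) = \begin{bmatrix} 0 \\ v_1(t) \\ v_2(t) \end{bmatrix}$$

Since the fault signals become state variables of the augmented tracking error system of an agent in (19), their corruption of the tracking error dynamic system in (17) can be avoided. In addition, the fault signals f i (t), i = 1, 2 no longer appear as external inputs of the augmented tracking error system in (19). Further, F 1 (t) and F 2 (t) in (19) can be estimated by the Luenberger observer in (20) and used by the control law in (21), through the H ∞ observer-based tracking control strategy in (24), to efficiently attenuate the effect of the fault signals f 1 (t) and f 2 (t) in the sequel. This allows the augmented tracking error system in (19) to tolerate fault signals with larger amplitudes.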
A Luenberger observer is proposed to estimate F 1 (t), F 2 (t) and $\mathbf{e}(t)$ simultaneously to achieve an active FTC by the following estimation system:

$$\dot{\hat{e}}(t) = \bar{A}\hat{e}(t) + \bar{B}u_{fb}(t) + L(y(t) - \hat{y}(t)), \qquad \hat{y}(t) = \bar{C}\hat{e}(t) + C\mathbf{r}(t) \tag{20}$$

where $\hat{e}(t)$ is the estimate of $\bar{e}(t)$ and L is the observer gain.
Remark 19: There exists an interconnection problem of the agents' states in the decentralized observer-based control problem in MAS, i.e., the coupling effect c i,j (t) in (10) and (11) in this article. Different from the commonly used analysis approach (e.g., using graph theory [31]), we treat it as the agent's own fault signal f 1 (t) and use a smoothing signal model as a compensator to deal with the interconnection problem. Although this article does not discuss the convergence between the real model of the fault signal and the corresponding compensator, the model error $\bar{v}(t)$ between them is considered and the impact of $\bar{v}(t)$ on team formation tracking and estimation performance is to be attenuated by the proposed robust H ∞ decentralized observer-based team formation tracking control strategy of hybrid URTS in (24).
Assumption 6 ([30]): The augmented tracking error system (19) of an agent is observable, i.e., $\mathrm{rank}\begin{bmatrix} zI - \bar{A} \\ \bar{C} \end{bmatrix} = 3n + (w_1 + 1)n + (w_2 + 1)o,\ \forall z \in \mathrm{eig}(\bar{A})$.
The feedback control law u fb (t) of each agent in hybrid URTS is then designed as follows:

$$u_{fb}(t) = K\hat{e}(t) \tag{21}$$

where K is the control gain.
Remark 20: Let $u_{fb}(t) = K\hat{e}(t) = [K_I, K_P, K_D, K_{F_1}, K_{F_2}]\,[\int_0^t \hat{e}^T(\tau)d\tau,\ \hat{e}^T(t),\ \dot{\hat{e}}^T(t),\ \hat{F}_1^T(t),\ \hat{F}_2^T(t)]^T$. We can find that the control gain K is composed of the PID control gains K I , K P and K D for $\int_0^t \hat{e}(\tau)d\tau$, $\hat{e}(t)$ and $\dot{\hat{e}}(t)$, respectively, and the fault control gains K F i for $\hat{F}_i(t)$ with i = 1, 2. This shows that the feedback control law u fb (t) has fault-tolerant capability and PID control characteristics.
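The idea of estimating a fault as an extra observer state and compensating it in the feedback law can be illustrated on a first-order toy plant. Everything in this sketch is an assumption made for illustration: the plant is scalar, the fault is constant (a much simpler signal model than the smoothing model of the paper), the gains are hand-picked rather than obtained from LMIs, and the function name is hypothetical.

```python
def simulate_observer_ftc(fault=0.5, k=2.0, l1=4.0, l2=4.0, e0=1.0, dt=1e-3, t_end=20.0):
    """Toy active FTC: plant de/dt = u + f, measurement y = e, unknown constant f.

    Observer states: e_hat (error estimate) and f_hat (fault estimate).
    Control: u = -k * e_hat - f_hat, i.e. feedback plus fault compensation.
    """
    e, e_hat, f_hat = e0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        y = e                          # sensor output
        u = -k * e_hat - f_hat         # observer-based feedback law
        innovation = y - e_hat
        de = u + fault                 # true error dynamics
        de_hat = u + f_hat + l1 * innovation
        df_hat = l2 * innovation       # fault model: df/dt = 0 plus correction
        e, e_hat, f_hat = e + de * dt, e_hat + de_hat * dt, f_hat + df_hat * dt
    return e, e_hat, f_hat

e_end, e_hat_end, f_hat_end = simulate_observer_ftc()
```

With observer gains l1 = l2 = 4 the estimation error poles sit at −2, so the fault estimate converges to the true fault and the tracking error is driven to zero even though the fault itself is never measured.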
Let us define the augmented estimation error $\tilde{e}(t) = \bar{e}(t) - \hat{e}(t)$. The augmented estimation error system in (22) can be obtained from (19) and (20). Combining (19), (21), and (22), we have the augmented tracking and estimation error system in (23) of each agent in the hybrid URTS. In order to enable the designed control gain K in (21) and observer gain L in (20) to achieve a specific performance for the augmented system in (23) under the disturbance $\bar{v}(t)$, the robust H ∞ decentralized observer-based team formation tracking control strategy below a prescribed disturbance attenuation level ρ 2 for each agent in hybrid URTS is given in (24), where t f is the final time, Q 1 ≥ 0 is the weighting matrix of the tracking error, Q 2 ≥ 0 is the weighting matrix of the estimation error, R > 0 is the weighting matrix of the control effort, V (x(0)) is the initial condition effect on the augmented tracking and estimation error system in (23), which is to be extracted from the H ∞ decentralized team formation tracking control performance, and $\tilde{v}(t)$ is the total disturbance whose effect on $\bar{e}(t)$, $\tilde{e}(t)$ and u fb (t) needs to be attenuated. If we can find the control gain K and observer gain L such that (24) holds, then the effect of the total disturbance $\tilde{v}(t)$ on the augmented tracking error $\bar{e}(t)$ and the augmented estimation error $\tilde{e}(t)$ can be attenuated to a prescribed level ρ 2 from the viewpoint of energy. Before analyzing the robust H ∞ decentralized observer-based tracking control problem of each agent in (24), the following lemmas are given:
Lemma 1 ([32]): For any matrices X and Y with appropriate dimensions, and matrix R = R T > 0 the following inequality holds:
Lemma 2 (Schur Complement [32]): For the matrices X = X T , Y = Y T and matrix R with appropriate dimensions the following statement is true:
Then, the following theorem is given.
Theorem 1: (i) If there exist matrices P = P T > 0, K and L such that the Riccati-like matrix inequality in (27) holds, then the robust H ∞ decentralized observer-based team formation tracking control strategy in (24) of each agent in the hybrid URTS can be achieved.
Remark 21: Since the Riccati-like inequality in (27) of each agent α i,j in URTS does not involve the system information of other agents (e.g., the coupling term c i,j (t) in (10) and (11)), the robust decentralized team formation tracking control can be achieved.
Although the sufficient condition (27) for the existence of the H ∞ decentralized observer-based tracking control strategy in (24) has been found, it cannot be solved easily since it is a bilinear matrix inequality (BMI) with strong coupling between the design variables K and L. To solve this issue, a two-step design procedure is exploited as follows.
If we want to find the optimal H ∞ decentralized observer-based tracking control strategy for the augmented tracking and estimation error system in (23) of each agent in hybrid URTS, we need to solve the LMIs-constrained optimization problem in (33). The design procedure of the optimal decentralized H ∞ observer-based feedforward reference tracking FTC scheme for each agent in (12) is summarized as follows:
1) Apply the feedforward control in (9) to obtain the linearized tracking error dynamic system in (17) for each agent in hybrid URTS.
2) Construct the smoothing signal models (18) for the actuator fault f 1 (t) and sensor fault f 2 (t). Embed these smoothing signal models into the linearized system (17) to get the augmented tracking error system of each agent in (19).
3) Construct the robust observer-based FTC law in (20) and (21) for each agent.
4) Solve the LMIs-constrained optimization problem (33) by the two-step design procedure to obtain the control gain K and observer gain L = P −1 2 Y 2 for the observer-based controller in (20) and (21) of each agent in the hybrid URTS.
The overall flowchart of reference tracking control of each agent in hybrid URTS is shown in Fig. 5. The reference generator is used to compute the desired reference trajectory r(t) and actuator control input u ′ (t) for each agent according to the continuous signal r ′ (t) obtained by D/A conversion and the feedforward linearization control law u(t) we design in (9). Passing r(t) through the integrator and differentiator, we get $\mathbf{r}(t)$, which is then passed to the observer to calculate the error $\mathbf{e}(t)$. Its derivative, $\dot{\mathbf{r}}(t)$, is then input to the controller for feedforward control. The sensor measures not only the agent's own information (e.g., the position or velocity) but also the environmental information. The former, the measurement output y(t), is passed to the observer in (20) to get the estimate $\hat{e}(t)$ for the feedback control in (21). The latter is passed back to the high-level block for positioning, mapping and object recognition.
Remark 22: By the proposed agent dynamics model in (8) and the introduction of reference generator block in Fig. 5 (b) and (c), a general H ∞ decentralized observer-based feedforward reference tracking FTC scheme for each agent α i,j in hybrid URTS can be designed as shown in the controller block in Fig. 5 (d). The decentralized architecture also ensures the scalability of URTS. More specifically, let us introduce subscripts i and j to the corresponding variables of each agent α i,j , i = 1, 2, . . . , N T , j = 1, 2, . . . , N A (e.g., the state x i,j (t), the reference trajectory r i,j (t), the control gain K i,j , etc.). It can be seen that the number of teams N T > 0 and the number of agents in a team N A > 0 are scalable.
Although the control gain in (21) and observer gain in (20) for each agent in URTS can already be found through the previous steps, the calculation speed of solving the matrix inequality (27) and the online calculation speed of the controller and observer can be further improved by reducing the dimensionality of the system. Observing the matrices A, B, C, B 2 in the linearized system (17), it can be further split into n subsystems for each agent (n = 6 for UAV and n = 12 for robot) if the matrices C, B 2 in the output equation (16) are structured so that each state variable is measured independently. Decomposing the signals in (17) accordingly, where $\mathbf{e}_i(t) \in \mathbb{R}^3$, $u_{fb,i}(t) \in \mathbb{R}$, $f_{1,i}(t) \in \mathbb{R}$, $y_i(t) \in \mathbb{R}^{l_0}$, $f_{2,i}(t) \in \mathbb{R}^{o_0}$ and e i is the ith standard unit column vector in R n , we get the n subsystems for each agent:

$$\dot{\mathbf{e}}_i(t) = A_0\mathbf{e}_i(t) + B_0(u_{fb,i}(t) + f_{1,i}(t)), \qquad y_i(t) = C_0\mathbf{e}_i(t) + C_0\mathbf{r}_i(t) + B_{2,0}f_{2,i}(t) \tag{34}$$

where i = 1, 2, . . . , n.
Remark 23: If the linearized system (17) can be split into $n$ subsystems, the error $\mathbf{e}_i(t)$ associated with each state variable of $x(t)$ of each agent in (8) can be measured independently via $n$ sensors to obtain the independent outputs $y_i(t)$. In actual UAV or biped robot systems, this is usually the case.
By Theorem 1 again, the form of the subsystems in (34) shows that we can find the control gain $K_i \in \mathbb{R}^{1 \times s}$ and the observer gain $L_i \in \mathbb{R}^{s \times l_0}$ of the $i$th subsystem (34) that achieve the $H_\infty$ decentralized observer-based tracking control performance with a prescribed attenuation level $\rho_i$, where $s = 3 + (w_1 + 1) + (w_2 + 1) o_0$. The control gain $K$ of the original agent system can be reconstructed from the subsystem gains $K_i$, $i = 1, 2, \ldots, n$, and the original observer gain $L$ can be reconstructed from the gains $L_i$ in the same way.
In this case, the gains $K$, $L$ of each agent can be found faster since the dimensionality is decreased. Furthermore, the online computation speed of the controller and observer is also improved, since the control gain $K$ and observer gain $L$ found by this method contain more zeros while maintaining the same estimation and tracking robustness. More precisely, the number of elements in the matrix $K$, i.e., the number of scalar gains, changes from $n \times sn$ to $n(1 \times s)$. For $L$, it changes from $sn \times l_0 n$ to $n(s \times l_0)$. In both cases, the number of scalar gains to be designed is reduced by a factor of $n$.
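The reduction in scalar gains can be illustrated with a small numerical sketch. It assumes that the agent state is ordered so that the $n$ subsystems occupy contiguous blocks, in which case $K$ is block diagonal with the row vectors $K_i$ on its diagonal; the reconstruction formula of the paper may use a different ordering. The values chosen for $w_1$, $w_2$, $o_0$ and the random gains are placeholders, and `assemble_block_diagonal` is a hypothetical helper.

```python
import numpy as np

def assemble_block_diagonal(blocks):
    """Place equally shaped 2-D blocks on the diagonal of a zero matrix."""
    rows, cols = blocks[0].shape
    full = np.zeros((rows * len(blocks), cols * len(blocks)))
    for i, blk in enumerate(blocks):
        full[i * rows:(i + 1) * rows, i * cols:(i + 1) * cols] = blk
    return full

n = 6                               # number of subsystems (UAV case in the text)
w1, w2, o0 = 1, 1, 1                # placeholder values for the symbols in the formula for s
s = 3 + (w1 + 1) + (w2 + 1) * o0    # subsystem dimension as given in the text

rng = np.random.default_rng(0)
K_blocks = [rng.standard_normal((1, s)) for _ in range(n)]  # stand-ins for K_i
K = assemble_block_diagonal(K_blocks)

full_count = K.size                              # n * (s * n) entries of a dense K
designed_count = sum(b.size for b in K_blocks)   # n * s entries actually designed
print(K.shape, full_count, designed_count)
```

The same helper applies to the observer gain by passing blocks of shape $s \times l_0$.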

V. SIMULATION RESULTS
In this section, a specific S&R procedure for URTS is given to illustrate the proposed system architecture for performing S&R tasks for each agent in the hybrid URTS and to demonstrate the effectiveness of the local motion planning and reference tracking control strategy in the hybrid URTS. First, an S&R area divided into $N_T$ areas $\text{area}_i$, $i = 1, 2, \ldots, N_T$, is given as shown in Fig. 6. To simplify the description, we focus on the UAV and robot in the $i$th team and the $(i+1)$th team. Suppose each team has 5 agents, i.e., $N_A = 5$; then we can denote the $i$th team as a set, $\text{team}_i = \{\alpha_{i,j} \mid j = 1, 2, \ldots, 5\}$.
At the beginning, the task allocation block assigns the agents in $\text{team}_i$ search tasks in $\text{area}_i$ to build the occupancy map and find targets. The search tasks are assumed to be allocated by dividing the unsearched region as shown in Fig. 7. Representing the search tasks in Fig. 7 as a set $\text{task}_1 = \{T_j \mid j = 1, 2, \ldots, 5\} \cup \{T_6\}$, the proper agent-task pair allocation $\text{allocation}_1 = \{(\alpha_{i,j}, T_j) \mid j = 1, 2, \ldots, 5\} \cup \{(\alpha_{i+1,2}, T_6)\}$ can be obtained through the task allocation block. Suppose a goal is found after a while as shown in Fig. 7. At this point, we have a rescue task $T_7$. The new task list $\text{task}_2 = \text{task}_1 \cup \{T_7\}$ is obtained by updating the old one. If the ground station assigns $\alpha_{i,5}$ and $\alpha_{i+1,2}$ to perform $T_7$ through the task allocation algorithm, then we have the new allocation $\text{allocation}_2 = (\text{allocation}_1 \setminus \{(\alpha_{i,5}, T_5), (\alpha_{i+1,2}, T_6)\}) \cup \{(\alpha_{i,5}, T_7), (\alpha_{i+1,2}, T_7)\}$. The task allocation block continues to work in this way until the S&R task is over.
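The set updates above map directly onto set operations. The sketch below only reproduces the bookkeeping of the task list and the allocation; it does not implement the task allocation algorithm itself. Encoding agents as `(team, index)` tuples and tasks as strings, and fixing the team index to $i = 1$, are illustrative assumptions.

```python
# Agents are (team, index) tuples and tasks are strings; i = 1 is an arbitrary choice.
i = 1
task_1 = {f"T{j}" for j in range(1, 6)} | {"T6"}
allocation_1 = {((i, j), f"T{j}") for j in range(1, 6)} | {((i + 1, 2), "T6")}

# A goal is found, so the rescue task T7 is added to the task list.
task_2 = task_1 | {"T7"}

# Reassign agents (i, 5) and (i + 1, 2) from their search tasks to T7.
released = {((i, 5), "T5"), ((i + 1, 2), "T6")}
reassigned = {((i, 5), "T7"), ((i + 1, 2), "T7")}
allocation_2 = (allocation_1 - released) | reassigned

# Tasks that are still in the list but currently have no agent.
unassigned = task_2 - {task for _, task in allocation_2}
print(sorted(unassigned))
```

Under this update, the search tasks $T_5$ and $T_6$ remain in $\text{task}_2$ without an assigned agent, which follows directly from the set expression for $\text{allocation}_2$.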
FIGURE 6. An example of an S&R area in URTS. This area is divided into $N_T$ areas, and $\text{team}_i$ is responsible for $\text{area}_i$.

FIGURE 7. The allocation of search tasks in the $i$th team and $(i+1)$th team at the beginning. The search tasks $T_j$, $j = 1, 2, \ldots, 6$, are allocated by the task allocation block. Each search task consists of reaching consecutive goals $q_{goal}$ (black dots in the figure) so that the straight paths between the goals $q_{goal}$ cover the unsearched area. For UAVs, the sequence formed by $q_{goal}$ is directly the path $(\sigma_n)$ due to the no-collision assumption (Assumption 4). For robots, $q_{goal}$ is passed to the Global Path Planning block together with the current configuration $q_{start}$ and the occupancy map $C$ to find collision-free paths $(\sigma_n)$.

To illustrate the Global Path Planning block and the Behavior Decision block, we choose the pairs $(\alpha_{i,5}, T_7)$ and $(\alpha_{i,1}, T_1)$ as examples. For the UAV $\alpha_{i,1}$, the path $(\sigma_n)$, $n \in \mathbb{Z} \cap [1, k_f]$, $k_f = 16$, is directly assigned as shown in Fig. 8 without going through the Global Path Planning block, by Assumption 4. The behavior sequence $(\beta_n)$ is set as $\beta_n = \text{flying}$, $n \in \mathbb{Z} \cap [1, k_f]$. For the robot $\alpha_{i,5}$, we have the goal configuration $q_{goal}$ from the task $T_7$. With the current configuration $q_{start}$ and the configuration space $C$ obtained by SLAM, the path $(\sigma_n)$, $n \in \mathbb{Z} \cap [1, k_f]$, $k_f = 27$, can be planned as shown in Fig. 9. The behavior sequence $(\beta_n)$ is set as $\beta_n = \text{walking}$ for $n \in \mathbb{Z} \cap [1, 5]$, $\beta_n = \text{climbing}$ for $n \in \mathbb{Z} \cap [6, 15]$ and $\beta_n = \text{running}$ for $n \in \mathbb{Z} \cap [16, 27]$. We choose the walking behavior $\beta_n$, $n \in \mathbb{Z} \cap [1, 5]$, to illustrate the Local Motion Planning block of robots.
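The behavior sequences of the two examples can be written down as a simple lookup from waypoint index to behavior. The sketch below is only a bookkeeping illustration of the piecewise definition of $(\beta_n)$; the function `build_behavior_sequence` and the segment encoding are hypothetical, and the decision logic that selects a behavior from the terrain is not modeled.

```python
def build_behavior_sequence(segments, k_f):
    """Expand (first_n, last_n, behavior) segments into a dict n -> behavior, n = 1..k_f."""
    sequence = {}
    for first_n, last_n, behavior in segments:
        if not 1 <= first_n <= last_n <= k_f:
            raise ValueError(f"segment [{first_n}, {last_n}] lies outside [1, {k_f}]")
        for n in range(first_n, last_n + 1):
            if n in sequence:
                raise ValueError(f"waypoint {n} assigned twice")
            sequence[n] = behavior
    missing = [n for n in range(1, k_f + 1) if n not in sequence]
    if missing:
        raise ValueError(f"waypoints without a behavior: {missing}")
    return sequence

# Robot example from the text: k_f = 27 waypoints split into three behaviors.
robot_beta = build_behavior_sequence(
    [(1, 5, "walking"), (6, 15, "climbing"), (16, 27, "running")], k_f=27
)
# UAV example from the text: all k_f = 16 waypoints use the flying behavior.
uav_beta = build_behavior_sequence([(1, 16, "flying")], k_f=16)
```

The checks for overlapping or missing indices are a design choice of this sketch, so that every waypoint $\sigma_n$ carries exactly one behavior before local motion planning starts.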
After $(\beta_n)$ is set, we can plan the reference path $r[k]$ by the Local Motion Planning block. Following the procedure in Fig. 4, the results of local motion planning of UAV flying and robot walking are shown in Figs. 8 and 10, respectively.
The simulation results of tracking and estimation in the Reference Tracking Control block of the UAV $\alpha_{1,1}$ and the robot $\alpha_{1,5}$ in $\text{team}_1$ are given as follows:
UAV $\alpha_{1,1}$: The position trajectories of the reference $r(t)$, the state $x(t) = e(t) + r(t)$ and the estimated state $\hat{x}(t) = \hat{e}(t) + r(t)$ are shown in Fig. 11. The estimation of the actuator fault $f_1(t)$ in (15) is shown in Fig. 12. The estimation of the sensor fault $f_2(t)$ in (16) is shown in Fig. 13. The actuator control input $u'(t)$ is shown in Fig. 14.
Robot $\alpha_{1,5}$: The position trajectories of the reference $r(t)$, the state $x(t) = e(t) + r(t)$ and the estimated state $\hat{x}(t) = \hat{e}(t) + r(t)$ are shown in Fig. 15. The estimation of the actuator fault $f_1(t)$ in (15) is shown in Fig. 16. The estimation of the sensor fault $f_2(t)$ in (16) is shown in Fig. 17. The actuator control input $u'(t)$ is shown in Fig. 18.
In Fig. 11, the tracking and estimation errors of the UAV position and attitude all reach the steady state within 2 seconds. There is a brief jitter when the sensor fault signal changes drastically, but the errors return quickly to the steady state. In Fig. 15, the tracking and estimation errors of the robot joint angles immediately reach and maintain the steady state under the influence of fault signals. In Figs. 12 and 16, the results show that the actuator fault $f_1(t)$, i.e., feedforward control errors and disturbances, can be effectively estimated. However, in Figs. 13 and 17, the estimate of the sensor fault $f_2(t)$ overshoots when there is a large change and returns to a steady state after about 2 seconds. This is because the sensor fault $f_2(t)$ in $y(t)$ directly influences the estimation of the Luenberger observer in (20). In Figs. 14 and 18, the actuator control inputs $u'(t)$ have high frequency and high amplitude at the initial instant due to the high-gain characteristic of the robust control. After that, they maintain a sine wave shape to offset the estimated actuator fault. Besides, it can be seen in Fig. 14 that the total force $F$ of the UAV remains constant against gravity.

FIGURE 19. The trajectories of reference $r(t)$, state $x(t)$ and estimated state $\hat{x}(t)$ of the UAV $\alpha_{1,1}$ by the traditional PID computed torque controller without FTC [33].

FIGURE 20. The trajectories of reference $r(t)$, state $x(t)$ and estimated state $\hat{x}(t)$ of the robot $\alpha_{1,5}$ by the traditional PID computed torque controller without FTC [33].

In order to show the effect of active FTC based on the proposed embedded smoothing model method, a traditional PID computed torque controller without FTC is used for comparison [33].
In order to focus only on the effect of FTC, the method in [33] was revised to an observer-based output feedback control law, that is, the control law becomes $u(t) = M(r(t))(\ddot{r}(t) + u_{fb}(t)) + H(r(t), \dot{r}(t))$, where $u_{fb}(t) = [K_I, K_P, K_D]\,[\int_0^t \hat{e}^T(\tau)\,d\tau,\ \hat{e}^T(t),\ \dot{\hat{e}}^T(t)]^T$. The results are shown in Fig. 19 for a UAV and Fig. 20 for a robot. Figs. 19 and 20 clearly reveal the influence of the actuator fault $f_1(t)$ on the tracking error. The influence of the sensor fault $f_2(t)$ is relatively insignificant because its value is much smaller than $f_1(t)$ in our simulation setting. However, the fluctuation caused by $f_2(t)$ with a period of 10 seconds (corresponding to the period of the smoothed square wave) can still be seen in the UAV attitude state variables $x_4, x_5, x_6$ in Fig. 19 and the robot joint state variables $x_5, x_6, x_{11}, x_{12}$ in Fig. 20. Although the closed-loop system remains stable, the tracking performance has deteriorated significantly compared to Fig. 11 and Fig. 15, respectively.
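To make the structure of the comparison control law concrete, the following minimal sketch evaluates it for a single-joint example. The pendulum-like model used for $M$ and $H$, the gain values and the function name `pid_computed_torque` are hypothetical and are not taken from [33] or from the simulation settings of this paper. The gains are chosen negative under the assumption that the error is defined as $e = x - r$, consistent with $x(t) = e(t) + r(t)$.

```python
import numpy as np

def pid_computed_torque(M, H, r, r_dot, r_ddot, e_int, e_hat, e_hat_dot, K_I, K_P, K_D):
    """u = M(r) (r_ddot + u_fb) + H(r, r_dot), with PID feedback on the estimated error."""
    u_fb = K_I @ e_int + K_P @ e_hat + K_D @ e_hat_dot
    return M(r) @ (r_ddot + u_fb) + H(r, r_dot)

# Hypothetical 1-DOF pendulum: inertia m*l^2 and gravity torque m*g*l*sin(q).
m, l, g = 1.0, 0.5, 9.81
M = lambda q: np.array([[m * l**2]])
H = lambda q, q_dot: np.array([m * g * l * np.sin(q[0])])

# Illustrative gains (negative because e = x - r is assumed).
K_I = np.array([[-1.0]])
K_P = np.array([[-20.0]])
K_D = np.array([[-5.0]])

r = np.array([0.3])
r_dot = np.zeros(1)
r_ddot = np.zeros(1)
zero = np.zeros(1)

# With zero estimated error only the feedforward part remains.
u_nominal = pid_computed_torque(M, H, r, r_dot, r_ddot, zero, zero, zero, K_I, K_P, K_D)
# A positive estimated position error reduces the commanded torque.
u_with_error = pid_computed_torque(
    M, H, r, r_dot, r_ddot, zero, np.array([0.1]), zero, K_I, K_P, K_D
)
```

Note that $M$ and $H$ are evaluated at the reference rather than at the measured state, which is what distinguishes this feedforward form from a feedback-linearizing computed torque law.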
The norms of the nonlinear terms $M$ and $H$ of the UAV and robot are shown in Fig. 21. Since these terms are included in the actuator fault $f_1(t)$, their values are estimated and eliminated, so that they decay to a small value after 1 second as shown in Fig. 21.
We use three teams $\text{team}_i$, $i = 1, 2, 3$, to show the team tracking results of the hybrid URTS in Fig. 22. As shown in Fig. 6, the UAVs and robots in URTS are released along the direction of the road ($Y$-axis), and each team is responsible for an area in Fig. 22. First, the respective tasks of each agent are determined by the task allocation algorithm. According to the task content, a goal $q_{goal}$ is determined for each agent. As shown in Fig. 7, this simulation assumes that $\text{team}_i$, $i = 1, 2, 3$, are assigned search tasks. Subsequently, the path $(\sigma_n)$ for each UAV is assigned, and the collision-free path $(\sigma_n)$ for each robot is planned by the path planning algorithm to reach the goal configuration $q_{goal}$. Depending on the terrain environment, an appropriate behavior $(\beta_n)$ is decided to enable the agent to follow the path $(\sigma_n)$. A reference path $r[k]$ corresponding to a behavior is planned via local motion planning to enable an actual mechanical body to perform the behavior. Since this article only gives the motion planning methods for flying (for UAVs) and walking (for biped robots), it is assumed that the appropriate behaviors from $q_{start}$ to $q_{goal}$ are all flying or walking behaviors in this simulation. In Fig. 22, we draw the smoothed path $(\sigma'_n)$ of each agent, which is the intermediate result of local motion planning as shown in Fig. 4. It represents the path that the agent is going to follow in the task space. According to $r[k]$ and the dynamic model of the UAV and robot, the reference trajectory $r(t)$ of each agent is planned. Finally, through the decentralized $H_\infty$ observer-based feedforward reference tracking FTC scheme proposed in this paper, the team tracking results of the agents in $\text{team}_i$, $i = 1, 2, 3$, in URTS are shown. Note that the above process is dynamic. If a higher block in Fig. 1 makes a new decision that produces a new reference plan, the lower blocks must recalculate based on it.
In contrast, this simulation shows a static result, that is, there is no re-decision from $q_{start}$ to $q_{goal}$. The real-time simulation of reference planning and team formation tracking control of URTS is given in [34].
For further verification, we visualized the real-time simulation of reference planning and team formation tracking control in the hybrid URTS, together with the configuration trajectory of the biped robot, in the online resource [34]. The results again demonstrate the effectiveness of the proposed $H_\infty$ decentralized observer-based feedforward reference tracking FTC method for agents in URTS.

VI. CONCLUSION
In this study, a system architecture for performing S&R tasks for each agent in the hybrid URTS is given. The task allocation problem and the path planning problem are investigated and unified into the integration of local motion planning and robust $H_\infty$ decentralized feedforward reference tracking FTC for the hybrid URTS. By decomposing the path planning process into three subprocesses, i.e., Global Path Planning, Behavior Decision and Local Motion Planning, common roadmap-based path planning algorithms can be applied in the global path planning of URTS. Through the behavior decision, the agent can decide the appropriate behavior according to the terrain environment. Next, we focus on the local motion planning of the UAV flying and robot walking behaviors. Besides, the bridging method between the reference path planned by the local motion planning block and the reference trajectory to be followed in the reference tracking control block is also given. With a general nonlinear agent dynamic model, the tracking control problems of the UAV and robot can be analyzed together. Through a novel feedforward linearization control strategy, the nonlinear tracking control problem with external disturbances is transformed into a regulation problem with fault signals. Then, a smoothing signal model is introduced to embed the fault signals into the state vector to avoid their corruption of the agent dynamic model. After that, a robust $H_\infty$ decentralized observer-based feedforward reference tracking FTC strategy is proposed for each agent in the hybrid URTS. To solve the robust $H_\infty$ decentralized observer-based feedforward reference tracking FTC problem, we transform it into an LMI-constrained optimization problem by a two-step design procedure, which can be effectively solved by the MATLAB LMI Toolbox. A simulation example is given to illustrate more concretely how the proposed hybrid URTS architecture actually works.
Finally, the effectiveness of the proposed robust $H_\infty$ decentralized observer-based feedforward reference tracking FTC method is also verified by the simulation results. Future research will focus on robust adaptive $H_\infty$ decentralized observer-based attack-tolerant team formation tracking for the search and rescue task of a hybrid UAVs and biped robots networked control system.