Control of Cooperative Manipulator-Endowed Systems under High-Level Tasks and Uncertain Dynamics

This paper considers the problem of distributed motion- and task-planning of multi-agent and multi-agent-object systems under temporal-logic-based tasks and uncertain dynamics. We focus on manipulator-endowed robotic agents that can interact with their surroundings. We first present continuous control algorithms for multi-agent navigation and cooperative object manipulation that exhibit the following properties. First, they are distributed, in the sense that each agent calculates its own control signal from local interaction with the other agents and the environment. Second, they guarantee safety properties in terms of inter-agent collision avoidance and obstacle avoidance. Third, they adapt on-the-fly to dynamic uncertainties and are robust to exogenous disturbances. The aforementioned algorithms allow the abstraction of the underlying system to a finite-state representation. Inspired by formal-verification techniques, we use such a representation to derive plans for the agents that satisfy the given temporal-logic tasks. Various simulation results and hardware experiments verify the efficiency of the proposed algorithms.

Over the last decade, significant research attention has been devoted to the control of autonomous agents under high-level specifications expressed as temporal logic formulas. Temporal-logic-based motion planning provides a fully automated correct-by-design controller-synthesis approach for autonomous agents. Temporal logics, such as linear temporal logic (LTL), provide formal high-level languages that can describe planning objectives more complex than the well-studied navigation algorithms, and have been used extensively in single- as well as multi-agent setups. The objectives are given as a temporal logic formula with respect to a discretized abstraction of the system (usually a finite transition system), and then a high-level discrete path is found by off-the-shelf model-checking algorithms, given the abstracted system and the task specification. Consider, for instance, a robot operating in a workspace which is partitioned into 6 rooms and a corridor consisting of three regions. A high-level task for the robot might have the following form: "Periodically visit rooms R_1, R_4, R_6, in this order, while avoiding rooms R_2, R_3, and R_5", or "Grab the ball that lies in room R_6 and deliver it in room R_3 between 10 and 20 seconds". The aforementioned specifications include complex tasks where time might play an important role.
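For illustration, the first (untimed) task above could be encoded as an LTL formula over atomic propositions R_k that hold exactly when the robot occupies room k; the following is a hedged sketch of one possible encoding, not the unique one:

```latex
\Phi \;=\; \square \lozenge \big( R_1 \wedge \lozenge ( R_4 \wedge \lozenge R_6 ) \big)
      \;\wedge\; \square \neg ( R_2 \vee R_3 \vee R_5 )
```

Here the conjunct $\square\lozenge(\cdot)$ enforces the ordered visits to recur indefinitely, while $\square\neg(\cdot)$ encodes the avoidance requirement.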
One of the main problems that arise when dealing with high-level tasks based on temporal-logic formulas is the construction of a discrete abstracted representation of the continuous system. More specifically, given a temporal-logic formula over a continuous workspace/state space, how does one partition this space into discrete states? Moreover, given a predefined partition, what are the control inputs of the agents that guarantee well-defined transitions among the discrete states? When multiple robotic agents are concerned, the aforementioned specifications must also incorporate collision-avoidance as well as connectivity-maintenance properties among the agents, which brings the problem of abstraction to a new level of complexity. Furthermore, consider a case where some unactuated objects must undergo a series of processes in a workspace with autonomous agents (e.g., car factories), expressed as high-level temporal-logic specifications. In such cases, the agents, except for satisfying their own motion specifications, are also responsible for coordinating with each other in order to transport the objects around the workspace.
The aforementioned problems become even more challenging when one takes into account system uncertainty. The dynamic model of real robotic systems cannot be accurately known by the user/designer, since it includes terms that might not be easy to identify, e.g., dynamic parameters (mass, inertia), friction, and other external disturbances. This becomes more apparent as the complexity of the considered systems increases (consider, e.g., a mobile robot vs a 6-DoF robotic manipulator). Such uncertainties are expected to affect the performance of the system, and since they cannot be accurately canceled by the control design, the latter must render the closed-loop system adaptable and robust to them [12].
This paper summarizes the main results of the PhD Thesis [13], for which the first author was the awardee of the 2020 EECI European PhD Award on Control for Complex and Heterogeneous Systems. We consider the problem of distributed motion- and task-planning of uncertain multi-agent and multi-agent-object systems under complex task specifications expressed via temporal logic constraints. We present robust abstractions of the continuous system dynamics into a discrete representation (e.g., transition systems) and the application of formal-verification methodologies towards the satisfaction of temporal-logic-based tasks. More specifically, we break down the problem into three main subproblems.
First, motivated by the need of transition design for multi-agent systems, we develop distributed control protocols for safe multi-agent navigation subject to collision and connectivity constraints as well as model uncertainties. Second, in view of the incorporation of unactuated objects in the task specifications, we consider the problem of cooperative object manipulation. We design control protocols for distributed cooperative manipulation of an object grasped by multiple robotic agents subject to dynamic uncertainties. The third part draws from the previous ones to design well-defined discrete abstractions for multi-agent and multi-agent-object systems. In that way, we allow the expression
of complex desired tasks as temporal logic specifications, for which we provide suitable controller-synthesis algorithms.

Multi-Agent Navigation
Multi-agent navigation with collision avoidance, possibly also with workspace obstacles, is a special instance of the motion planning problem [14,15].
Several techniques have been developed in the related literature for robot motion planning with obstacle avoidance, such as discretization of the continuous space and employment of discrete algorithms (e.g., Dijkstra, A*), probabilistic roadmaps, sampling-based motion planning, and feedback-based motion planning [16].
Feedback-based motion planning has been receiving attention for more than two decades. Early works [17,18,19] relied on the Koditschek-Rimon navigation function (KRNF) [20,21], where the robots successfully converge to their goals while avoiding collisions in obstacle-free workspaces from almost all initial conditions (in the sense of a measure-zero set), if a control gain is chosen large enough. The idea of gain tuning has also been employed for an alternative KRNF in [22]. Tuning-free controllers are presented in [23,24]. Other works employ control barrier functions for multi-agent collision avoidance [25] and optimization-based techniques via model predictive control (MPC) [26]; [27] proposes reciprocal collision avoidance via local decision making on the desired velocities of the agents, and sensing uncertainties are taken into account in [28]. The work [29] develops a workspace-decomposition methodology with a hybrid controller.
A common assumption in most of the aforementioned works is that of simplified robot dynamics, i.e., single-integrator/unicycle kinematics, without taking into account any robot dynamic parameters and with the control input being the robot velocity. Hence, indirectly, the schemes depend on an embedded internal system that converts the desired signal to the actual robot actuation command. The above implies that the actual agent trajectories might deviate from the desired ones, jeopardizing safety and possibly resulting in collisions.
Multi-agent navigation can also be achieved by means of sequential leader-follower coordination, where an assigned leader agent has priority over the other agents for goal convergence. A large variety of works, however, do not consider collision-avoidance specifications among the agents [30,31,32,33], which is crucial when considering real physical systems. Moreover, as discussed before, many of the works in the related literature consider simplified/known dynamics ([34,6,22,35,31,36,33,37,38]), which can have crucial effects on the actual behaviour of real robotic systems.

Cooperative Manipulation
Cooperative manipulation is a well-studied topic, with numerous works in the last three decades [39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55]. Impedance and hybrid force/position control are the most common methodologies used in the related literature, where a desired impedance behavior is imposed, potentially with force regulation. Most of the aforementioned works employ force/torque sensors to acquire feedback of the object-robot contact forces/torques, which, however, may result in a performance decline due to sensor noise or mounting difficulties. Recent technological advances allow manipulator grippers to rigidly grasp certain objects (see, e.g., [56]), which can render the use of force/torque sensors unnecessary. Force/torque-sensor-free methodologies can be found in [42,57,49]. Moreover, [51] uses an external force estimator, without employing force sensors, [40] presents a force-sensor-free control protocol with gain tuning, and [44] considers the object regulation problem without force/torque feedback. Finally, force/torque-sensor-free methodologies are developed in [58], where the robot dynamics are not taken into account.
Most works in the related literature consider known dynamic parameters regarding the object and the robotic agents. However, the accurate knowledge of such parameters, such as masses or moments of inertia, can be a challenging issue, especially for complex robotic manipulators; adaptive control protocols are proposed in [41] with a gain-tuning scheme, in [44], where the object regulation problem is considered, and in [42], [52]. An estimation of parameters is included in [58,59], whereas [53] and [54] employ fuzzy mechanisms to compensate for model uncertainties. In [55,60], the authors develop a task-oriented adaptive control protocol using observers. Kinematic uncertainties and joint limits are handled in [61,47] and [62], respectively.
Other works consider internal-force and load-distribution analysis [63], leader-follower schemes [50], or intermittent contact [64]; [65] proposes a kinematic-based multi-robot manipulation scheme, and [66,67] address the problem from a formation-control point of view. Finally, an important property not addressed in the cooperative-manipulation-related works is the establishment of transient constraints for the object and the robotic agents. Such constraints might entail obstacle avoidance, input saturation constraints, or singularities of the agents' Jacobian matrix, which maps the joint velocities of each agent to a 6D vector of generalized velocities.

Abstractions of Multi-Agent Systems
Temporal-logic-based planning has gained significant attention in recent years, as it provides a fully automated correct-by-design controller-synthesis approach for autonomous agents. Temporal logics such as linear temporal logic (LTL) and metric interval temporal logic (MITL) provide formal high-level languages that can describe complex planning objectives. There exists a wide variety of works that employ temporal logic languages for multi-agent systems, e.g., [68,69,70,71,72,73,74,75,76,77,78]. The discretization of a multi-agent system to an abstracted finite transition system necessitates the design of appropriate continuous-time controllers for the transition of the agents among the states of the transition system [79]. Most works in the related literature, however, adopt point-mass agents with perfectly known dynamics, without taking into account inter-agent collision specifications, thus facilitating the control design. Furthermore, the vast majority of temporal-logic-based works consider LTL-based objectives; LTL languages, in contrast to, e.g., MITL, lack the ability to encode time constraints, which allow for a larger and more expressive variety of tasks. Time constraints are efficiently encoded in temporal-logic-based tasks through signal temporal logic (STL) languages [76,77,78]. However, STL tasks are tackled purely with feedback-control algorithms, thus inheriting their drawbacks, such as local minima. Finally, most works in the related literature consider temporal-logic-based motion planning for fully actuated, autonomous agents. In many applications, however, unactuated objects must undergo a series of processes in a workspace with autonomous agents (e.g., car factories).
When the unactuated objects' specifications are expressed using temporal logics, then the abstraction of the agents' behavior becomes much more complex, since it has to take into account the objects' goals.

Notation
The sets of real and natural numbers are denoted by R and N, respectively; ‖x‖ and ‖x‖_1 denote the 2-norm and 1-norm, respectively, of a vector x ∈ R^n.
The open and closed balls with respect to the 2-norm and with radius δ > 0, centered at x ∈ R^n, are denoted by B(x, δ) and B̄(x, δ), respectively; S_s(·) is the skew-symmetric operator defined according to the cross product S_s(a)b = a × b for vectors a, b ∈ R^3. Given the sets S_1, S_2 ⊂ R^n and the matrix A ∈ R^{n×m}, we define the Minkowski addition as S_1 ⊕ S_2 := {s_1 + s_2 ∈ R^n : s_1 ∈ S_1, s_2 ∈ S_2}, the Pontryagin difference as S_1 ⊖ S_2 := {s_1 ∈ R^n : s_1 + s_2 ∈ S_1, ∀s_2 ∈ S_2}, and the matrix-set multiplication as A • S := {a : ∃s ∈ S, a = As}. Finally, the 3D rotation group is denoted by SO(3).
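As a quick illustration of this notation, the following minimal Python sketch (the helper names are ours, not from the paper) implements the skew-symmetric operator and checks the identity S_s(a)b = a × b, together with one-dimensional (interval) instances of the Minkowski sum and Pontryagin difference used later in the MPC section:

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix S_s(a), such that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0, -a[2], a[1]],
                     [a[2], 0, -a[0]],
                     [-a[1], a[0], 0]])

a, b = np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.5, 2.0])
assert np.allclose(skew(a) @ b, np.cross(a, b))

def minkowski_sum(s1, s2):    # S1 (+) S2 for intervals s = (lo, hi)
    return (s1[0] + s2[0], s1[1] + s2[1])

def pontryagin_diff(s1, s2):  # S1 (-) S2: all s with s + S2 inside S1
    return (s1[0] - s2[0], s1[1] - s2[1])

print(minkowski_sum((-1, 1), (-0.2, 0.2)))    # (-1.2, 1.2)
print(pontryagin_diff((-1, 1), (-0.2, 0.2)))  # (-0.8, 0.8)
```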

Task Specification in LTL
We focus on the task specification Φ given as a Linear Temporal Logic (LTL) formula. The basic ingredients of an LTL formula are a set of atomic propositions Ψ and several boolean and temporal operators. LTL formulas are formed according to the following grammar [79]:

Φ ::= true | a | Φ_1 ∧ Φ_2 | ¬Φ | ◯Φ | Φ_1 U Φ_2,

where a ∈ Ψ, Φ_1 and Φ_2 are LTL formulas, and ◯, U are the next and until operators, respectively. Definitions of other useful operators, like □ (always), ♦ (eventually), and ⇒ (implication), are omitted and can be found in [79]. The semantics of LTL are defined over infinite words over 2^Ψ. Intuitively, an atomic proposition ψ ∈ Ψ is satisfied on a word w = w_1 w_2 w_3 . . . if it holds at its first position w_1, i.e., ψ ∈ w_1, denoted as w |= ψ. Formula ◯Φ holds true if Φ is satisfied on the word suffix that begins at the next position w_2, whereas Φ_1 U Φ_2 states that Φ_1 has to be true until Φ_2 becomes true. Finally, ♦Φ and □Φ hold on w if Φ holds at some position and at every position of w, respectively. For a full definition of the LTL semantics, the reader is referred to [79].

Task Specification in MITL
Similar to LTL, the basic ingredients of MITL are a set of atomic propositions Ψ, boolean and temporal operators, and time intervals. MITL formulas are formed according to the following grammar [79]:

φ ::= true | a | φ_1 ∧ φ_2 | ¬φ | ◯_I φ | ♦_I φ | □_I φ | φ_1 U_I φ_2,

where a ∈ Ψ, and ◯, ♦, □, and U are the next, future, always, and until operators, respectively, as in the LTL case; I is a nonempty time interval of the form [i_1, i_2], [i_1, i_2), (i_1, i_2], (i_1, i_2), [i_1, ∞), or (i_1, ∞), with i_1 < i_2. MITL can be interpreted either in continuous or in point-wise semantics. We utilize the latter and interpret MITL formulas over timed runs. The semantics of MITL are defined over infinite timed words over 2^Ψ, i.e., sequences w = (w_1, t_1)(w_2, t_2) . . . of proposition sets paired with increasing time stamps; an atomic proposition ψ ∈ Ψ is satisfied by such a timed word at index j if ψ ∈ w_j. For more details, the reader is referred to [80,81].
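As an example tying back to the timed delivery task of Section 1, a possible MITL encoding (with illustrative atomic propositions grab and deliver, not taken from the cited works) is

```latex
\varphi \;=\; \lozenge \big( \mathrm{grab} \;\wedge\; \lozenge_{[10,20]}\, \mathrm{deliver} \big),
```

which requires that the ball is eventually grabbed and, between 10 and 20 time units after that, delivered.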

Multi-Agent Navigation
Safe, distributed navigation of multiple agents is of paramount importance in numerous applications. In the special case of temporal-logic specifications, each agent's task requires it to navigate around a workspace while avoiding collisions with other agents and potential obstacles. In this section, we address the problem of distributed multi-agent navigation subject to collision constraints and uncertain dynamics. First, we consider a setup where the agents operate in an obstacle-free environment, but have to comply with connectivity constraints among them. Second, we relax the connectivity constraints and generalize our results to obstacle-cluttered environments. Both setups consider a leader-follower setting, where, at every time instant, a leader agent is prioritized over the rest of the agents to navigate to its goal. Such a prioritization can be given a priori to the agents and is used online to guarantee their sequential navigation to the respective goals.

Leader-follower navigation with collision avoidance and connectivity maintenance
Consider N > 1 autonomous agents, with N := {1, . . . , N}, operating in R^n and characterized by the closed spheres B̄_i(p_i, r_i) = {y ∈ R^n : ‖p_i − y‖ ≤ r_i} (ellipsoid-shaped agents are considered in [82]), with p_i ∈ R^n being agent i's center, and r_i > 0 its bounding radius. We consider Lagrangian dynamics for the agents of the form

M_i(p_i)\ddot{p}_i + C_i(p_i, \dot{p}_i)\dot{p}_i + g_i(p_i) + f_i(p_i, \dot{p}_i) + d_i(t) = u_i,   (1)

where M_i : R^n → R^{n×n} are positive definite (and bounded) inertia matrices, C_i : R^{2n} → R^{n×n} are the Coriolis terms, g_i : R^n → R^n are the gravity vectors, f_i : R^{2n} → R^n are unknown vector fields that represent friction-like terms and modeling uncertainties, d_i(t) are unknown external disturbances, and u_i ∈ R^n are the agents' control inputs, ∀i ∈ N. All the aforementioned dynamic terms are continuous in their arguments, and d_i are uniformly bounded.
Moreover, we consider that the dynamic terms M_i, C_i, and g_i include unknown constant dynamic parameters of the agents (e.g., masses, moments of inertia), denoted by the vectors θ_i ∈ R^ℓ, ℓ ∈ N, for all i ∈ N. The Lagrangian system (1) can be linearly parameterized with respect to these parameters, i.e., for any vectors x, y, z, w ∈ R^n, it holds that M_i(x)z + C_i(x, y)w + g_i(x) = Y_i(x, y, z, w)θ_i, where Y_i is a known regressor matrix. We further assume that the terms f_i satisfy a growth condition of the form ‖f_i(p_i, ṗ_i)‖ ≤ α_i‖ṗ_i‖, for unknown positive constants α_i. Each agent has a limited sensing radius ς_i > 0, with ς_i > max_{j∈N}{r_i + r_j}, which implies that the agents can sense each other without colliding. Based on this, we model the topology of the multi-agent network through the undirected graph G(p) := (N, E(p)), with edge set E(p) := {(i, j) ∈ N × N : ‖p_i − p_j‖ < min{ς_i, ς_j}}, where p := [p_1^⊤, . . . , p_N^⊤]^⊤ ∈ R^{nN}. We further denote K(p) := |E(p)|. Given the k-th edge in the edge set E(p), we use the notation (k_1, k_2) ∈ N² for the agent indices that form edge k, where k_1 is the tail and k_2 is the head of edge k, and K(p) := {1, . . . , K(p)} is an arbitrary numbering of the edges E(p).
We consider that the agents aim at safely navigating to their goals, denoted by p_{d_i}, i ∈ N. As mentioned before, we assume that the agents have been assigned a certain prioritization. The first agent in priority is a leader agent, while the rest are the follower agents. The follower agents give priority to the navigation of the leader agent, while avoiding collisions with each other and maintaining connectivity, as defined by the initial graph G(p(0)). Once the leader navigates to its goal, the next agent in priority assumes the role of the leader, and so forth. Without loss of generality, we assume that agent i = 1 corresponds to the leader agent, whereas the agents i > 1 are the followers, which belong to the set N_F := {2, . . . , N}. Hence, the problem at hand is the design of a distributed control protocol such that the leader navigates to a desired position p_d := p_{d_1}, i.e., lim_{t→∞}(p_1(t) − p_d) = 0, while guaranteeing collision avoidance and connectivity maintenance, i.e., B̄_i(p_i(t), r_i) ∩ B̄_j(p_j(t), r_j) = ∅, for all t ≥ 0, i, j ∈ N, i ≠ j, and ‖p_{k_1}(t) − p_{k_2}(t)‖ ≤ min{ς_{k_1}, ς_{k_2}}, for all t ≥ 0 and k ∈ K_0 ⊂ K(p(0)), where K_0 := {1, . . . , K_0} is an edge numbering for the edge set E_0 := E(p(0)) that needs to remain connected. Besides the edge set E_0, with edge numbering K_0 and K_0 edges, we consider also the complete graph Ḡ := (N, Ē), with an edge numbering K̄ for Ē. For collision avoidance and connectivity maintenance, we employ functions β_{c,k} and β_{n,l}, constructed via polynomials that guarantee that β_{c,k} and β_{n,l}, respectively, are twice continuously differentiable, for all k ∈ K̄, l ∈ K_0. The aforementioned functions are smooth switches. In particular, we choose β_{c,k} := β_{c,k}(ι_k), with ι_k := ι_k(p_{k_1}, p_{k_2}) := ‖p_{k_1} − p_{k_2}‖² − (r_{k_1} + r_{k_2})², and β_{n,l} := β_{n,l}(ν_l), with ν_l := ν_l(p_{l_1}, p_{l_2}) := d_{n,l}² − ‖p_{l_1} − p_{l_2}‖², where d_{n,k} := min{ς_{k_1}, ς_{k_2}}, and we also set d̄_{c,k} := d_{n,k}² − (r_{k_1} + r_{k_2})², k ∈ K̄, l ∈ K_0. The saturation values β̄_c, β̄_n of the switches can be any positive constants. Note that β_{c,k} and β_{n,l} take into account the limited sensing capabilities of the agents, and their derivatives vanish at collisions and connectivity breaks, respectively, of the respective edges.
All the parameters for the construction of β_{c,k}, β_{n,l} can be transmitted off-line to the agents.
Regarding the uncertain terms of (1), we define the estimates θ̂_i ∈ R^ℓ and α̂_i ∈ R of θ_i and α_i, respectively. By using adaptive control techniques, these estimates compensate appropriately for the unknown terms, without necessarily converging to the true values.
In addition, we define the leader error signal s_e := p_1 − p_d, and the constants α_{c_{i,k}} and α_{n_{i,l}}, for all k ∈ K̄, l ∈ K_0, i ∈ N, which provide boolean values depending on whether agent i is part (head or tail) of edges k and l, respectively. Finally, we define, for all k ∈ K̄ and l ∈ K_0, reciprocal potential terms (e.g., of the form 1/β_{c,k} and 1/β_{n,l}), which diverge to infinity at a collision and a connectivity break of the agent pairs (k_1, k_2) and (l_1, l_2), respectively. We now propose the following distributed adaptive control protocol. We choose the agents' desired velocities v_{d_i} based on the negated gradients of the aforementioned reciprocal terms, which concern the collision-avoidance and connectivity-maintenance properties, with the extra term γ_e s_e for the leader to guarantee the navigation to p_d; the terms γ_e, k_i are positive constants, for i ∈ N_F. Since v_{d_i} is not the actual velocity of the agents, we define the errors e_{v_i} := ṗ_i − v_{d_i}, for all i ∈ N, and design distributed control laws u_i (see (4)) consisting of velocity-error feedback, with positive gains k_{v_i}, and adaptive compensation of the uncertain dynamic terms. Moreover, we design adaptation signals (see (5)) for the estimates θ̂_i, α̂_i, for all i ∈ N, with arbitrary bounded initial conditions and positive adaptation gains. The following theorem establishes the leader navigation while ensuring collision avoidance and connectivity maintenance.
Theorem 1. Consider a multi-agent system described by the dynamics (1). Then, the application of the control and adaptation laws (4), (5) guarantees: 1) navigation of the leader agent to p_d, 2) connectivity maintenance of the subset E_0 of the initial edges, 3) inter-agent collision avoidance, and 4) boundedness of all closed-loop signals, from all collision-free initial configurations. Moreover, it holds that lim_{t→∞} ṗ_i(t) = 0, i ∈ N.
Once the leader agent converges to its goal, e.g., when ‖p_1 − p_d‖ becomes smaller than a small positive constant, it broadcasts this information and the next agent in priority becomes the leader. Therefore, repetitive application of the aforementioned theorem for all agents guarantees their sequential navigation to their respective goals.
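The sequential leader-switching logic admits a compact implementation. The following Python sketch (our own illustrative names; the low-level controller of Theorem 1 is abstracted away) rotates the leader role through the priority list once the current leader is within a threshold of its goal:

```python
import numpy as np

def update_leader(positions, goals, priority, leader_idx, eps=0.075):
    """Return the index (into `priority`) of the current leader agent.

    positions, goals: (N, n) arrays of agent positions and goal positions.
    priority: list of agent ids, highest priority first.
    leader_idx: index into `priority` of the current leader.
    """
    agent = priority[leader_idx]
    # Leader broadcasts goal convergence; next agent in priority takes over.
    if np.linalg.norm(positions[agent] - goals[agent]) <= eps and \
            leader_idx < len(priority) - 1:
        leader_idx += 1
    return leader_idx
```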

Leader-follower navigation in obstacle-cluttered environments
In this section, we consider the more challenging problem of distributed and safe multi-agent navigation in a bounded workspace with obstacles. More specifically, the workspace is now assumed to be the open ball W := B(0, r_W) = {z ∈ R^n : ‖z‖ < r_W}, with radius r_W > 0. The workspace contains J > 0 closed sets O_j, j ∈ J := {1, . . . , J}, corresponding to obstacles. Each obstacle is a closed ball centered at c_j ∈ R^n, with radius r_{c_j} > 0, i.e., O_j := B̄(c_j, r_{c_j}) = {z ∈ W : ‖z − c_j‖ ≤ r_{c_j}}, for j ∈ J. The agents are now assumed to obey the simplified, with respect to (1), second-order dynamics (6), for all i ∈ N, where m_i is the ith agent's mass, which is considered unknown, g is a gravity constant, and f_i(·) are unknown functions satisfying a growth condition similar to that of the previous section. Without loss of generality, we assume that i = 1 is the leader agent, aiming to safely converge to its goal p_d = p_{d_1}. The large complexity of the considered problem, imposed by the addition of static obstacles, prevents us from considering connectivity properties while keeping the convergence guarantees.
Moreover, in contrast to the previous section, we consider that when the leader agent arrives at its goal, it stays there idle, representing an additional obstacle to the rest of the agents. We further make the following assumption regarding the workspace, the initial states, and the goals.

Assumption 1. The workspace W, the obstacles O_j, j ∈ J, and the destinations p_{d_i}, i ∈ N, satisfy pairwise-distance conditions, holding for all i, ℓ ∈ N, i ≠ ℓ, and j, k ∈ J, j ≠ k, which require the gaps between any two obstacles, between the obstacles and the workspace boundary, and around the destinations to exceed the largest agent diameter by a margin involving an arbitrarily small positive constant ε_d, where r_M := max_{i∈N}{r_i}.
Loosely speaking, the aforementioned assumption states that the pairwise distances among the obstacles and the workspace boundary are large enough so that one agent can always navigate between them. Since the convergence of the agents to their destinations is asymptotic, we incorporate the threshold ε_d, which is the proximity to the destination that we want the agents to achieve.
In view of Assumption 1, we can find a positive constant r̃ that lower-bounds the free space between the obstacles and the workspace boundary. The proposed control algorithm relies on the fact that the agents actively avoid collisions with each other and with the obstacles only when they are sufficiently close to them. Such a property is encoded via a designed switching function.
Before providing it, we define the associated distance metrics of the leader agent with respect to the follower agents, the obstacles, and the workspace boundary, for all j ∈ J, i ∈ N_F, as well as the corresponding follower distance metrics, collected in (8). The aforementioned distances encode the collisions among the agents, with the obstacles, and with the workspace boundary.
Note that the follower agents need to keep a larger distance from each other, the obstacles, and the workspace boundary. This is needed so that the leader agent is able to navigate among them to reach its goal. Intuitively, the follower agents will reach a local minimum arising from the conflicting objectives of driving to their goals and avoiding the other agents and obstacles; the leader will then be able to navigate among them and the obstacles to its goal. Moreover, as mentioned before and unlike the previous section, where the agents were required to stay connected to one another, here we consider that the agents aim to navigate to their goals and remain there. Therefore, the control algorithm uses the distances d_{i,d_ℓ}, i, ℓ ∈ N, i ≠ ℓ, so that a follower agent i does not reach a local minimum too close to a goal p_{d_ℓ} of some agent ℓ ≠ i higher in priority (which would need to navigate to its goal before agent i). Finally, we assume that all the aforementioned distances in (8) are positive at t = 0, and we define the positive constant r̃_d as a minimum over these initial distances. As in the previous section, we consider that each agent i ∈ N has a limited sensing radius ς_i and has access to (p_j, ṗ_j) for all agents j within this radius. The aforementioned specification on ς_i will be needed later for the feasibility of the control protocol.
A critical property for the correctness of the proposed algorithm is the fact that the leader agent actively avoids collisions with the followers and the obstacles only when it is sufficiently close to them. This is encoded through a twice continuously differentiable switching function β.
Let now β be a function that satisfies the aforementioned properties, i.e., a twice continuously differentiable switch that vanishes at collision configurations and becomes constant once the corresponding distance metric exceeds a designed threshold. We define the 2nd-order navigation functions φ_i for the agents by combining goal-attraction terms with β-switched avoidance terms. The convergence of the leader to its goal is based on the fact that it actively avoids collisions with other agents and obstacles only when it is sufficiently close to them. In that way, at most one of the terms appearing in φ_1 has non-zero gradient at each point of the configuration space. We achieve that by setting the threshold τ in the functions β as τ ∈ (0, min{r̃², r̃_d}), as proven in [84]. Regarding the ability of the agents to sense each other when d_{i,ℓ}(p_i, p_ℓ) < τ, which is when they avoid collisions with each other, this is guaranteed by the choice of the sensing radii ς_i. We next define the distributed control law (9), based on the negated gradients of φ_i and adaptive compensation terms, where m̂_i, α̂_i denote the estimates of m_i and α_i, respectively, by agent i, evolving according to the adaptation laws (10), with k_{m_i}, k_{α_i} positive gain constants, α̂_i(t_0) ≥ 0, and arbitrary initial conditions m̂_i(t_0).
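A possible C² switch of this type is the quintic smoothstep sketched below in Python; this is a hedged illustration satisfying the stated requirements (zero value and vanishing first two derivatives at 0, constant value 1 beyond a threshold tau), not the exact polynomial construction of [84]:

```python
import numpy as np

def beta(x, tau):
    """C^2 switch: 0 at x <= 0, 1 at x >= tau, quintic smoothstep in between.

    The quintic 10 t^3 - 15 t^4 + 6 t^5 has zero first and second
    derivatives at both t = 0 and t = 1, so beta is twice continuously
    differentiable on all of R.
    """
    t = np.clip(x / tau, 0.0, 1.0)
    return 10 * t**3 - 15 * t**4 + 6 * t**5

# The avoidance terms are switched on only when the (squared) distance
# metric d drops below the threshold tau (where beta < 1):
for d in [0.0, 0.1, 0.25, 1.0]:
    print(d, beta(d, tau=0.25))
```

The correctness of the proposed control algorithm is given in the subsequent theorem.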

Theorem 2 ([84]). Consider N robots operating in W, subject to the uncertain 2nd-order dynamics (6), and a leader agent i = 1. Then, the control protocol (9), (10) guarantees collision avoidance between the agents, as well as between the agents and the obstacles/workspace boundary, and convergence of p_1 to p_d = p_{d_1}, from almost all initial conditions that satisfy positivity of the distance metrics (8) at t = 0, given a sufficiently small τ and k_{φ_i} > α_i/2, i ∈ N. Moreover, all closed-loop signals remain bounded, ∀t ≥ 0.
Similarly to the previous section, once the leader agent converges close to its goal, i.e., when ‖p_1 − p_d‖ ≤ ε_d, it broadcasts this information and the next agent in priority becomes the leader. This occurs iteratively until all the robotic agents reach their destinations. Finally, the proposed algorithm can be extended to environments with more complicated obstacles (e.g., star-shaped obstacles [85]) in the single-agent case, i.e., when the follower robots are completely immobilized [84]. For more complex workspaces and simultaneous multi-agent navigation, an algorithm combining potential fields and sampling-based motion planning is developed in [86], requiring, however, a central computer unit to monitor and plan the motion for the agents.

Figure 1: (a) The leader signal ‖s_e(t) + e_{v_1}(t)‖, which converges to zero for every navigation objective; (b) the product of the collision and connectivity terms over the edges, which remains bounded, proving the collision and connectivity properties (the zero values stem from the computer's lower numerical limits); (c) the adaptation signals of the agents i ∈ {1, . . . , 6} [84].

Simulation Results
We first demonstrate the control algorithm of Section 4.1. We conduct simulations with N = 6 UAVs in R³ using the realistic robotic simulator Gazebo [87]. We consider bounding radii r_i = 0.35 m and sensing ranges ς_i = 3 m, ∀i ∈ N; the initial positions are omitted here for brevity, with, e.g., x_6(0) = [0.5, −1.5, 0.1]^⊤ m. We also consider that the leader has 4 navigation objectives, that is, to sequentially navigate to four given points. Since this work provides asymptotic results with respect to the error s_e, the leader switches navigation goal each time it gets closer than 0.075 m to the current goal, i.e., when ‖s_e‖ ≤ 0.075 m.
Next, we demonstrate the control algorithm of Section 4.2. We consider 20 agents in a 2D workspace of r_W = 120, populated with 70 obstacles, as depicted in Fig. 2a. The radius of the agents and the obstacles is chosen as r_i = r_{c_j} = 2, ∀i ∈ N, j ∈ J, and the sensing radius of the agents is taken as ς_i = 20, ∀i ∈ N. The masses and the functions f_i(p, ṗ), both unknown to the agents, are taken as m_i = 1 and friction-like terms, respectively, and we set β(x) = 1 for x ≥ 100, with τ = 0.25. The control gains are chosen as k_{1_i} = 0.04, k_{2_i} = 5, k_{v_i} = 20, k_{φ_i} = 1, and k_{m_i} = k_{α_i} = 0.01, for all i ∈ N. The results are depicted in Fig. 2b for 870 seconds, showing the convergence of the distance errors ‖p_i(t) − p_{d_i}‖ to zero, ∀i ∈ N, t ∈ [0, 870]. A video illustrating the simulation can be found on https://vimeo.com/393443782.

Cooperative Manipulation
As discussed in Section 1, we aim to incorporate high-level temporal tasks for unactuated objects. Therefore, apart from multi-agent navigation, it is equally important to establish control algorithms for cooperative object manipulation.
In this section, we address the problem of cooperative manipulation of a single object by multiple robotic agents. We consider that the agents grasp the object by means of rigid contacts. Such an assumption is relaxed in our works [88,89], where rolling contacts are employed, but this is omitted from the current paper for ease of exposition.
We first provide the model of the cooperative manipulation system. We consider N robotic agents rigidly grasping an object, where each agent i ∈ N := {1, . . . , N} is described by n_i joint variables q_i ∈ R^{n_i}. The overall joint configuration is then q := [q_1^⊤, . . . , q_N^⊤]^⊤, q̇ := [q̇_1^⊤, . . . , q̇_N^⊤]^⊤ ∈ R^n, with n := Σ_{i∈N} n_i. In addition, the inertial position and orientation of the ith end-effector, denoted by p_{E_i} and η_{E_i}, respectively, can be derived by the forward kinematics and are smooth functions of q_i.
The generalized velocity of the ith end-effector, v_i := [ṗ_{E_i}^⊤, ω_{E_i}^⊤]^⊤ ∈ R^6, where ω_{E_i} ∈ R^3 is the respective angular velocity, can be considered as a transformed state through the differential kinematics v_i = J_i q̇_i [90], where J_i := J_i(q_i) : R^{n_i} → R^{6×n_i} is a smooth function representing the geometric Jacobian matrix, i ∈ N [90]. We also define the sets S_i := {q_i ∈ R^{n_i} : J_i(q_i) is full row-rank}, which contain the singularity-free configurations. The differential equation describing the task-space dynamics of each agent is of the form [90]

M_i(q_i)\dot{v}_i + C_i(q_i, \dot{q}_i)v_i + g_i(q_i) + d_i(q_i, \dot{q}_i, t) = u_i − h_i,   (11)

where u_i ∈ R^6 is the task-space wrench, representing the control input, and h_i ∈ R^6 is the generalized force exerted by agent i at the grasping point. All the aforementioned dynamic terms are continuous, and d_i(q_i, q̇_i, t) are bounded in t, for all i ∈ N.
Regarding the object, we denote by x_O := [p_O^⊤, η_O^⊤]^⊤ ∈ R^6 and v_O := [ṗ_O^⊤, ω_O^⊤]^⊤ ∈ R^6 the pose and generalized velocity, respectively, of its center of mass. We consider second-order dynamics of the form, derived based on the Newton-Euler formulation:

\dot{x}_O = J_O(η_O) v_O,   (12a)
M_O(x_O)\dot{v}_O + C_O(x_O, v_O)v_O + g_O(x_O) + d_O(x_O, v_O, t) = h_O,   (12b)

where d_O is a bounded vector representing modeling uncertainties and external disturbances, and h_O ∈ R^6 is the vector of generalized forces acting on the object's center of mass. Moreover, J_O is the object's representation Jacobian [90], which is not well-defined when θ_O = ±π/2; this is referred to as a representation singularity. Similarly to the robotic agents, the aforementioned dynamic terms are continuous, and d_O is bounded in t. The pose of the agents and the object's center of mass are related via (13) through the constant distance and orientation offset vectors p_{E_i/O} and η_{E_i/O} between {O} and {E_i}. Following (13), along with the fact that, due to the grasping rigidity, the agent and object velocities are rigidly coupled, it holds that v_i = J_{O_i} v_O (see (14)), where J_{O_i} is the object-to-agent Jacobian matrix, which is always full-rank.
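Although the exact expression of J_{O_i} is not reproduced here, under rigid grasping it coincides with the standard rigid-body velocity transformation; a sketch of this form, using the skew-symmetric operator S_s of Section 3, is

```latex
v_i \;=\; J_{O_i}\, v_O, \qquad
J_{O_i} \;=\;
\begin{bmatrix}
 I_3 & S_s\!\left(p_O - p_{E_i}\right)\\[2pt]
 0_{3\times 3} & I_3
\end{bmatrix},
```

which has unit determinant and is therefore always invertible, in line with the full-rank property stated above.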
The kineto-statics duality, along with the grasp rigidity, suggests that the force h_O acting on the object's center of mass and the generalized forces h_i, i ∈ N, exerted by the agents at the grasping points, are related through

h_O = G(x) h,  h := [h_1^⊤, . . . , h_N^⊤]^⊤,   (16)

where G is the full row-rank grasp matrix, formed by the Jacobians J_{O_i}. By combining (16), (14), (11), and (12), we obtain the coupled object-agent dynamics (17). The problem we consider in this section is the tracking of a desired pose trajectory by the object. That is, we aim to design distributed feedback control laws u := [u_1^⊤, . . . , u_N^⊤]^⊤ such that the object pose x_O(t) converges to a desired trajectory x_d(t) := [p_d(t)^⊤, η_d(t)^⊤]^⊤, for all t ∈ R_{≥0}; η_d is associated with a desired angular velocity ω_d, as in (12a).
We assume that each robotic agent has continuous feedback of its own state q_i, q̇_i and knows the constant offsets p_{E_i/O} and η_{E_i/O}, implying that it can compute the pose and velocity of the object's center of mass via (13) and (14), for i ∈ N.
As previously mentioned, the dynamics of real robotic systems cannot be accurately known; therefore, we consider that the object and robot dynamic terms in (12) and (11) are uncertain, i.e., they are not fully available for feedback in the control design. Furthermore, as mentioned in Section 1, cooperative object manipulation is part of a larger framework that consists of the safe and possibly timed execution of high-level tasks. Therefore, we present in the following two constrained-control methodologies, based on Prescribed Performance Control (PPC) and nonlinear Model Predictive Control (MPC).

Prescribed Performance Control
We first adopt the concepts and techniques of PPC, proposed in [91], in order to achieve a predefined transient and steady-state response for the derived error, as well as to ensure that θ_O(t) evolves in a subset of (−π/2, π/2), for all t ≥ 0. Prescribed performance characterizes the behavior where a signal evolves strictly within a predefined region that is bounded by absolutely decaying functions of time, called performance functions. This signal is represented by the object's pose error e_s := [e_{s_x}, e_{s_y}, e_{s_z}, e_{s_φ}, e_{s_θ}, e_{s_ψ}]^⊤ := x_O − x_d. We further assume here that the robotic agents operate away from their kinematic singularities, i.e., each q_i(t) evolves in a closed subset of S_i, for i ∈ N. Finally, we consider that the dynamic terms and disturbances appearing in the robot and object models (11) and (12), respectively, are completely unknown and cannot be used in the control design.
The mathematical expressions of prescribed performance are given by the following inequalities:

−ρ_{s_k}(t) < e_{s_k}(t) < ρ_{s_k}(t), ∀t ≥ 0,   (19)

where K := {x, y, z, φ, θ, ψ} and ρ_{s_k} : R_{≥0} → R_{>0} are designer-specified, smooth, bounded, and decreasing positive functions of time, e.g., of the form

ρ_{s_k}(t) = (ρ_{s_k,0} − ρ_{s_k,∞}) e^{−l_{s_k} t} + ρ_{s_k,∞},   (20)

with l_{s_k}, ρ_{s_k,∞}, k ∈ K, positive parameters incorporating the desired transient and steady-state performance, respectively. The terms ρ_{s_k,∞} can be set arbitrarily small, achieving thus practical convergence of the errors to zero.
Next, we propose a state-feedback control protocol that does not incorporate any information on the agents' or the object's dynamics or on the external disturbances, and guarantees (19) for all t ≥ 0. More specifically, given the errors (18):

Step I-a. Select the functions ρ_{s_k} as in (20), with the parameters of ρ_{s_θ} chosen such that ρ_{s_θ}(t) ≤ θ*, where θ* is a positive constant satisfying θ* + θ̄ < π/2, with θ̄ a bound associated with the desired pitch trajectory.
Step I-b. Introduce the normalized errors ξ_s := ρ_s^{-1} e_s, where ρ_s := diag{[ρ_{s_k}]_{k∈K}} ∈ R^{6×6}, as well as the transformed state functions ε_s : (−1, 1)^6 → R^6 (defined analogously to ε_v in Step II-b below), and design a reference velocity vector v_r proportional, through a positive gain constant g_s, to the negated transformed errors.
Step II-a. Define the velocity error vector e_v := v_O − v_r and select corresponding positive performance functions ρ_{v_k}, k ∈ K.

Step II-b. Define the normalized velocity error ξ_v := ρ_v^{-1} e_v, where ρ_v := ρ_v(t) := diag{[ρ_{v_k}]_{k∈K}}, as well as the transformed states ε_v : (−1, 1)^6 → R^6 and signals r_v : (−1, 1)^6 → R^{6×6}, and design the distributed feedback control protocol for each agent i ∈ N as in (29), where g_v is a positive constant gain.
Remark 1. The PPC technique guarantees predefined transient and steady-state performance specifications by enforcing the normalized errors ξ_{s_k} and ξ_{v_k}, k ∈ K, to remain strictly in (−1, 1) for all t ≥ 0. Owing to (22) and (27), the proposed control algorithm achieves such a containment simply by maintaining the boundedness of the modulated errors ε_{s_k}, ε_{v_k}, k ∈ K; a careful inspection of (24) and (29) reveals that the PPC algorithm operates similarly to reciprocal barrier functions in constrained optimization, admitting high negative or positive values depending on whether e_{s_k}(t) → ±ρ_{s_k}(t) and e_{v_k}(t) → ±ρ_{v_k}(t), k ∈ K, eventually preventing e_{s_k}(t) and e_{v_k}(t) from reaching the respective boundaries.
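To make the barrier-like mechanism of Remark 1 concrete, the following scalar Python sketch (our own minimal illustration, not the multi-DOF law (29)) applies the PPC transformation to a first-order error with an unknown bounded drift, keeping the error inside the shrinking funnel ±ρ(t):

```python
import numpy as np

def rho(t, rho0=1.0, rho_inf=0.05, l=1.0):
    # Exponentially decaying performance funnel, as in (20).
    return (rho0 - rho_inf) * np.exp(-l * t) + rho_inf

g, dt, e, t = 2.0, 1e-3, 0.8, 0.0     # gain, time step, initial error, time
for _ in range(10000):
    xi = e / rho(t)                            # normalized error in (-1, 1)
    eps = np.log((1 + xi) / (1 - xi))          # transformed (barrier) error
    u = -g * eps                               # PPC feedback
    e_dot = 0.5 * np.sin(3 * t) + u            # unknown bounded drift + input
    e, t = e + dt * e_dot, t + dt
print(abs(e) < rho(t))                         # True: e stays inside funnel
```

Note that the feedback grows unbounded as the error approaches the funnel boundary, which is precisely the reciprocal-barrier behavior described in the remark.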
Remark 2. Notice from (29) that each agent i ∈ N calculates its own control signal, thus rendering the overall control scheme distributed. The terms l_{s_k}, ρ_{s_k,0}, ρ_{s_k,∞}, l_{v_k}, and ρ_{v_k,∞}, k ∈ K, needed for the calculation of the performance functions, can be transmitted off-line to the agents. Moreover, the PPC protocol is robust to model uncertainties and external disturbances. In particular, note that the control laws do not even require the structure of the terms M, C, g, d.
The correctness of the PPC algorithm is established in the corresponding theorem of [13], which shows that the control protocol guarantees (19) for all t ≥ 0, along with boundedness of all closed-loop signals.
The incorporation of θ* in the aforementioned theorem stems from the need to comply with the singularity constraints −π/2 < θ_O(t) < π/2. One can avoid such singularities by using a different representation for the object orientation, such as unit quaternions or rotation matrices. Algorithms for multi-agent control using such representations have been developed in [93,94]. Finally, the developed PPC algorithm can be extended to achieve asymptotic stability in addition to compliance with the performance bounds [95,96].

Model Predictive Control
In this section, we consider a more general problem, where we take into account explicit constraints regarding the object and the agents [97,98,99].
Such constraints consist of inter-robot collision avoidance, collision avoidance with obstacles, singularity avoidance, and control-input saturation constraints.

We consider the workspace and obstacles O_j, j ∈ J, of Section 4, and impose collision-avoidance constraints for the agents and the object for all t ≥ 0, as well as input and velocity magnitude constraints (saturation bounds on u_i and q̇_i). In the following, we assume that the free space is connected, i.e., the set of configurations for which the agents and the object do not intersect the obstacles or each other, for all i, ℓ ∈ N, i ≠ ℓ, j ∈ J, is nonempty and connected, and that each agent can continuously communicate with the other agents and transmit appropriate information.
We further define the sets associated with the desired collision-avoidance properties, where q_{−i} := [q_1^⊤, . . . , q_{i−1}^⊤, q_{i+1}^⊤, . . . , q_N^⊤]^⊤ denotes the configuration of all agents except agent i. The results presented in this section concern a constant reference pose x_d, but can be extended to account for time-varying reference trajectories. Further, we consider that the dynamic uncertainties are represented by the state- and time-dependent functions d_i(·) and d_O(·) of (11) and (12), respectively, while the rest of the dynamic terms are assumed known.
Finally, for ease of exposition, we adopt agents with 6 degrees of freedom, i.e., n_i = 6, i ∈ N; the analysis can be extended, however, to redundant agents (n_i > 6).
In order to develop a distributed MPC algorithm, we modify the representation of the coupled object-agent dynamics presented so far. We first decouple the dynamics (17) for each agent's MPC: we express, via the rigid grasping constraints, the object pose and velocity as functions of each agent's configuration, x_{O_i}(q_i) and v_{O_i}(q_i, q̇_i) (see (30), (31)). Consider now the constants c_i, with 0 < c_i < 1 and Σ_{i∈N} c_i = 1, that play the role of load-sharing coefficients for the agents. Then, the object dynamics (12) can be split into N components weighted by the c_i, from which, by employing the grasp coupling (16), the differential kinematics of the agents, and (31), and after straightforward algebraic manipulations, we obtain the agent-wise coupled dynamics (32). To design a distributed NMPC control scheme, we employ a leader-follower perspective. More specifically, as will be explained in the sequel, at each sampling time, a leader agent solves part of the coupled dynamics (32) via an NMPC scheme, and transmits its predicted variables to the rest of the agents. Assume, without loss of generality, that the leader corresponds to agent i = 1. Loosely speaking, the proposed solution proceeds as follows: agent 1 solves, at each sampling time step, a receding-horizon model predictive control problem subject to the forward nominal dynamics (33) and a number of inequality constraints, as will be clarified later. After obtaining a control-input sequence and a set of predicted variables for q_1, q̇_1, it transmits the corresponding predicted state of the object over the control horizon to the other agents {2, . . . , N}. Then, the followers solve their receding-horizon NMPC problems subject to the forward nominal dynamics (34) and the state equality constraints (35), for i ∈ {2, . . . , N}, as well as a number of inequality constraints that incorporate obstacle and inter-agent collision avoidance. More specifically, we consider that there is a priority sequence among the agents, which we assume, without loss of generality, to be defined by {1, . . . , N}. Each agent, after solving its optimization problem, transmits its calculated predicted variables to the agents of lower priority, which take them into account for collision avoidance. Note that the coupled object-agent dynamics are implicitly taken into account in equations (33), (34) in the following sense: although the coupled model (32) does not imply that each one of these equations is satisfied, by forcing each agent to comply with the specific dynamics through the optimization procedure, we guarantee that (32) is satisfied, since it is the result of the addition of (33) and (34), for i = 1 and every i ∈ {2, . . . , N}, respectively. Intuitively, the leader agent is the one that determines the path along which the object will be navigated, and the rest of the agents are the followers that contribute to the transportation.
Moreover, the equality constraints (35) guarantee that the predicted variables of the agents {2, . . . , N } will comply with the rigidity at the grasping points.
By using the notation x_{q_i} := [x_{q_i,1}^⊤, x_{q_i,2}^⊤]^⊤ := [q_i^⊤, q̇_i^⊤]^⊤ ∈ R^{2n_i}, i ∈ N, the nonlinear dynamics of each agent can be written in the form (36); the nominal dynamics of each agent are obtained by omitting the disturbance d_{q_i}(·), and the bar notation (ū, x̄, etc.) is used to denote the nominal signals. It can be concluded that the involved matrices satisfy suitable regularity properties, and that the linearized nominal systems are stabilizable around x_{q_i} = 0.
We now define the error e_{D_1} and the nominal error ē_{D_1} (see (38)), whose dynamics (39) follow by employing the agent dynamics and the object dynamics. The input-constraint sets U_{D_i} are defined via bounds involving σ_min(·), the minimum singular value of the respective matrices. We also define the sets X_{D_i}, for all i ∈ {2, . . . , N}. The sets X_{D_i} capture all the state constraints of the system dynamics (36), i.e., representation- and singularity-avoidance, collision avoidance among the agents and the obstacles, and collision avoidance of the object with the obstacles. Such constraints are assigned to the leader agent only. We further define the set E_{D_1}(q_{−1}), which represents the constraint set for the NMPC scheme of the leader.
The main problem at hand is the design of a feedback control law u_1 ∈ U_{D_1} for agent 1 that guarantees that the error signal e_{D_1}, with dynamics given in (39), satisfies lim_{t→∞} e_{D_1}(x_{q_1}(t)) = 0, while ensuring the aforementioned constraints. The role of the followers {2, . . . , N} is, through the load-sharing coefficients c_2, . . . , c_N in (32), to contribute to the object-trajectory execution, as derived by the leader agent 1, while also avoiding collisions. In order to solve the aforementioned problem, we propose an NMPC scheme, which is presented hereafter.
The control law for the leader agent consists of a nominal control action ū_1 and a state-feedback law κ_1(e_{D_1}, ē_{D_1}). As will be presented hereafter, ū_1 is the outcome of a nominal finite-horizon optimal control problem (FHOCP), solved at each sampling time instant, and the feedback law κ_1(e_{D_1}, ē_{D_1}) is used to guarantee that the real trajectory e_{D_1}(x_{q_1}(t)) remains in bounded hyper-tubes centered at the nominal trajectories ē_{D_1}(x_{q_1}(t)), for all times. The volume of the hyper-tubes depends on the upper bound of the disturbances d_{D_1} as well as on the agent dynamics. The state-feedback law is designed as κ_1(e_{D_1}, ē_{D_1}) := −k_{D_1}(e_{D_1} − ē_{D_1}), where k_{D_1} is a positive control gain. By appropriately setting k_{D_1}, we guarantee that e_{D_1} − ē_{D_1} ∈ Z_1, where Z_1 is a closed ball in R^{12}. More details are provided in [100].
Next, we describe the design of the component ū_1. Consider a sequence of sampling times {t_j}, j ∈ N, with t_{j+1} = t_j + h_s, h_s ∈ (0, T_p), and T_p the respective horizon. For agent 1, the open-loop input signal applied between the sampling instants is given by the solution of the FHOCP (40a)-(40e), which minimizes a finite-horizon cost subject to the nominal dynamics (33) and tightened state and input constraints. At a generic time t_j, agent 1 solves the aforementioned FHOCP. The functions F_{D_1} : R^{12} × R^6 → R_{≥0} and V_{D_1} : R^{12} → R_{≥0} stand for the running cost and the terminal penalty cost, respectively, and they are defined as F_{D_1}(e_{D_1}, u_1) := e_{D_1}^⊤ Q_{D_1} e_{D_1} + u_1^⊤ R_{D_1} u_1 and V_{D_1}(e_{D_1}) := e_{D_1}^⊤ P_{D_1} e_{D_1}, where R_{D_1} ∈ R^{6×6} and P_{D_1} ∈ R^{12×12} are symmetric and positive definite gain matrices, Q_{D_1} ∈ R^{12×12} is a symmetric and positive semi-definite gain matrix, and E_{D_1} is the terminal set used to enforce stability of the closed-loop system. We explain hereafter the form of the sets Ē_{D_1}(q_{−1}(t_j)) and Ū_{D_1}. In order to guarantee that, while agent 1 is solving the FHOCP (40a)-(40e) for its nominal system, the real trajectory e_{D_1}(x_{q_1}(t)) and the control inputs u_1(t) satisfy the state and input constraints E_{D_1}(q_{−1}(t_j)) and U_{D_1}, respectively, the sets E_{D_1}(q_{−1}(t_j)) and U_{D_1} need to be properly modified. In view of the fact that e_{D_1} − ē_{D_1} ∈ Z_1 under the feedback control law κ_1(e_{D_1}, ē_{D_1}), it holds that e_{D_1}(x_{q_1}(s)) ∈ Z_1 ⊕ {ē_{D_1}(x_{q_1}(s))}, for s ∈ [t_j, t_j + T_p]. Therefore, Ē_{D_1}(q_{−1}(t_j)) is defined as Ē_{D_1}(q_{−1}(t_j)) := E_{D_1}(q_{−1}(t_j)) ⊖ Z_1. By a similar reasoning, Ū_{D_1} is defined as Ū_{D_1} := U_{D_1} ⊖ [(−k_{D_1}) • Z_1]. Intuitively, the sets E_{D_1}(q_{−1}(t_j)) and U_{D_1} are tightened to account for the difference between e_{D_1}(x_{q_1}(t)) and ē_{D_1}(x_{q_1}(t)).
The solution of the FHOCP (40a)-(40e) starting at time t_j provides an optimal control input, denoted by ū_1(s; ē_{D_1}(x_{q_1}(t_j)), x_q(t_j)), s ∈ [t_j, t_j + T_p], with x_q := [x_{q_1}^⊤, . . . , x_{q_N}^⊤]^⊤. This control input is then applied to the system until the next sampling instant t_{j+1}:

u_1(s; e_{D_1}(x_{q_1}(t_j)), x_q(t_j)) = ū_1(s; ē_{D_1}(x_{q_1}(t_j)), x_q(t_j)), ∀s ∈ [t_j, t_{j+1}).   (41)

At time t_{j+1} = t_j + h_s, a new FHOCP is solved in the same manner, leading to a receding-horizon approach. The control input ū_1(·) is of feedback form, since it is recalculated at each sampling instant based on the then-current state.
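For intuition on this solve-apply-resolve loop, the following Python sketch implements a receding-horizon controller for a toy double integrator standing in for the nominal dynamics (33); it is our own minimal illustration using scipy (the names dt, H, fhocp are ours), not the tube-based NMPC of [100], and it omits the terminal set and constraint tightening:

```python
import numpy as np
from scipy.optimize import minimize

dt, H = 0.1, 10                # sampling step and horizon length (T_p = H*dt)
Q, R, u_max = 1.0, 0.1, 2.0    # cost weights and input bound

def rollout(x0, u_seq):
    """Forward-simulate the nominal double integrator x = (pos, vel)."""
    states, x = [], np.array(x0, dtype=float)
    for u in u_seq:
        x = x + dt * np.array([x[1], u])   # Euler step of the nominal model
        states.append(x.copy())
    return states

def fhocp(x0):
    """Toy FHOCP: quadratic running cost, box input constraints."""
    def cost(u_seq):
        states = rollout(x0, u_seq)
        return sum(Q * s[0]**2 + R * u**2 for s, u in zip(states, u_seq))
    res = minimize(cost, np.zeros(H),
                   bounds=[(-u_max, u_max)] * H, method="SLSQP")
    return res.x

x = np.array([1.0, 0.0])                       # initial position and velocity
for _ in range(50):                            # receding-horizon loop
    u_star = fhocp(x)                          # solve the FHOCP at t_j ...
    x = x + dt * np.array([x[1], u_star[0]])   # ... apply only its first input
print(np.round(x, 3))                          # state driven toward the origin
```

Each outer iteration solves the finite-horizon problem anew but applies only the first portion of the optimal input, mirroring (41).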
After the solution of the FHOCP and the calculation of the predicted states x̄_{q_1}(s), s ∈ [t_j, t_j + T_p], at each time instant t_j, agent 1 transmits the values q̄_1(s), q̄̇_1(s), as well as x_{O_1}(q̄_1(s)) and v_{O_1}(q̄_1(s), q̄̇_1(s)), computed by (30), (31), s ∈ [t_j, t_j + T_p], to the rest of the agents {2, . . . , N}. The rest of the agents then proceed as follows. Each agent i ∈ {2, . . . , N} applies the control input u_i = ū_i + κ_i(·), similarly to the leader; κ_i(·) is a closed-form state-feedback law associated with some error function, such as e_{D_i} defined as in (38), and ū_i is the solution of the FHOCP (42a)-(42f), solved for s ∈ [t_j, t_j + T_p] at every sampling time t_j, where J_{D_i} is an associated cost function. The constraint (42c) guarantees that agent i will obtain a trajectory that does not collide with the predicted trajectories of the agents higher in priority, or the agents lower in priority at t_j. Note that, through the equality constraints (42d), (42e), the follower agents must comply with the trajectory q̄_1(s), q̄̇_1(s) computed by the leader. We further assume that the aforementioned constraints are feasible, i.e., the corresponding constraint sets are nonempty. Next, similarly to the leader agent, agent i > 1 calculates the predicted states q̄_i(s), q̄̇_i(s), s ∈ [t_j, t_j + T_p], which it then transmits to the agents {i + 1, . . . , N}. In that way, at each time instant t_j, each agent i ∈ {2, . . . , N} receives the other agents' states, incorporates the constraint (42c) for the agents {i + 1, . . . , N}, receives the predicted states q̄_ℓ(s), q̄̇_ℓ(s) from the agents ℓ ∈ {2, . . . , i − 1}, and incorporates the collision-avoidance constraint (42c) for the entire horizon. Loosely speaking, each agent i ∈ N takes into account the first state of the next agents in priority (q_ℓ(t_j), ℓ ∈ {i + 1, . . . , N}), as well as the transmitted predicted variables q̄_ℓ(s), ℓ ∈ {1, . . . , i − 1}, of the previous agents in priority, for collision avoidance. Intuitively, the leader agent executes the planning for the followed trajectory of the object's center of mass (through the solution of the FHOCP (40a)-(40e)), the follower agents contribute to executing this trajectory through the load-sharing coefficients c_i (as indicated in the coupled model (32)), and the agents lower in priority are responsible for collision avoidance with the agents of higher priority. Moreover, the aforementioned equality constraints (42d), (42e), as well as the forward dynamics (42a), guarantee the compliance of all the followers with the model (32).
Given the constrained FHOCP (42a)-(42f), the solution of the problem lies in the capability of the leader agent to produce a state trajectory that guarantees x_{O_1}(q_1(t)) → x_d, by solving the FHOCP (40a)-(40e) [98]. A formal proof is similar to the ones presented in [101,100] and is therefore omitted.

Energy-Optimal Cooperative Manipulation
An important aspect in cooperative manipulation systems is the regulation of internal forces. Internal forces are forces exerted by the agents at the grasping points that do not contribute to the motion of the object. While a certain amount of such forces is required in many cases (e.g., to avoid contact loss in multi-fingered manipulation), they need to be minimized in order to prevent object damage and unnecessary effort of the agents. In particular, the forces h = [h_1^⊤, . . . , h_N^⊤]^⊤ of (11) between the object and the agents can be decoupled into motion-induced and internal forces, h = h_m + h_{int}. The internal forces h_{int} are squeezing forces that the agents exert on the object and belong to the nullspace of G(x) (i.e., G(x)h_{int} = 0). When h = h_{int}, it holds that G(x)(u − M\dot{v} − Cv − g − d) = 0 and the object moves according to its unforced dynamics (12) with h_O = 0. Hence, h_{int} does not contribute to the object's motion and results in internal stresses that might damage it.
In cooperative manipulation schemes, the most energy-efficient way of transporting an object is to exploit the full potential of the cooperating robotic agents, i.e., each agent does not exert less effort at the expense of other agents, which might then potentially exert more effort than necessary. For instance, consider a rigid cooperative manipulation scheme, with only one agent (a leader) working towards bringing the object to a desired location, whereas the other agents have zero inputs. Since the grasps are rigid, if the leader is equipped with sufficiently powerful actuators, it will achieve the task by "dragging" the rest of the agents, compensating for their dynamics, and creating non-negligible internal forces. In such cases, when the cooperative manipulation system is rigid (i.e., the grasps are considered to be rigid), the optimal strategy of transporting an object is achieved by regulating the internal forces to zero. Therefore, from a control perspective, the goal of a rigid cooperative manipulation system is to design a control protocol that achieves a desired cooperative manipulation task, while guaranteeing that the internal forces remain zero.
By employing Gauss' principle for constrained motion and exploiting the grasp rigidity, we obtain a closed-form expression for the internal forces h_{int} [102,103]. Based on this expression, we derive new results on the structure of h_{int}, as well as on the internal-force-free distribution of an object force to the agents.
Theorem 4 ([102,103]). Consider N robotic agents rigidly grasping an object. Then the internal forces h_{int} admit a closed-form expression in terms of the grasp matrix and the applied forces. Further, let a desired force h_{O,d} ∈ R^6 be applied to the object, distributed to the agents' desired forces as h_d = G* h_{O,d}, where G* is a right-inverse of G, i.e., GG* = I_6. Then the resulting internal forces vanish for the inertia-weighted choice of G* given below.

Based on the aforementioned result, we next provide a control protocol that guarantees convergence to a reference x_d while guaranteeing regulation of the internal forces to zero. In order to avoid the representation singularity at θ_O = ±π/2 (see (12a)), we use the orientation error metric e_o := (1/2) tr(I_3 − R_d^⊤ R_O), where R_O ∈ SO(3) is the object's rotation matrix and R_d ∈ SO(3) is a desired rotation matrix, corresponding to η_d and evolving according to Ṙ_d = S_s(ω_d)R_d; we desire to drive e_o to zero [102]. The aforementioned singularity is now translated to an undesired equilibrium located at the configuration that satisfies e_o = 2. The proposed control protocol guarantees e_o(t) < 2 for all t ≥ 0.
Corollary 1 ([102,94]). Consider N robotic agents rigidly grasping an object, with coupled dynamics (17). Further assume that d_O(·) = d_i(·) = 0, for all i ∈ N, and e_o(0) < 2. Consider an inverse-dynamics control law with feedback gains K_{p_1}, K_d (positive definite matrices) and k_{p_2} ∈ R_{>0} (a positive constant), distributed to the agents via the right-inverse G* = M G^⊤(G M G^⊤)^{-1}. Then the solution of the closed-loop coupled system converges to the desired object pose with zero internal forces. Note that the employed inverse-dynamics controller requires knowledge of the agent and object dynamics. In case of dynamic-parameter uncertainty, standard adaptive and/or robust control schemes (like the ones of the previous sections) that attempt to compensate for potential uncertainties in the model would intrinsically create internal forces, since the dynamics of the system would not be explicitly accounted for. Further, note that G* = M G^⊤(G M G^⊤)^{-1} induces an implicit and natural load-sharing scheme via the incorporation of M. More specifically, in the force distribution to the robotic agents via h_d = G* h_{O,d}, larger values of M_i will produce larger inputs for agent i, implying that agents with larger inertia characteristics will take on a larger share of the object load. Note that this is also a desired load-sharing scheme, since larger dynamic values usually imply more powerful robotic agents.
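The load-distribution property is easy to check numerically. The following Python sketch (our own illustration; the grasp matrix is a random full-row-rank placeholder, and the internal component is computed with the weighted projector I − G*G suggested by the closed form above) verifies that h_d = G* h_{O,d} reproduces h_{O,d} on the object and carries no internal component:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3                                  # number of agents
G = rng.standard_normal((6, 6 * N))    # placeholder full-row-rank grasp matrix
M = np.kron(np.diag([1.0, 2.0, 4.0]), np.eye(6))   # agent inertia blocks

# Inertia-weighted right-inverse, satisfying G @ G_star == I_6:
G_star = M @ G.T @ np.linalg.inv(G @ M @ G.T)

h_Od = rng.standard_normal(6)          # desired wrench on the object
h_d = G_star @ h_Od                    # distributed agent wrenches

assert np.allclose(G @ h_d, h_Od)      # the object feels exactly h_Od
# Internal component w.r.t. the weighted decomposition: zero for h_d ...
print(np.linalg.norm((np.eye(6 * N) - G_star @ G) @ h_d))   # ~1e-15
# ... but nonzero for a generic wrench profile h:
h = rng.standard_normal(6 * N)
print(np.linalg.norm((np.eye(6 * N) - G_star @ G) @ h))     # > 0
```

Since G(I − G*G) = 0 for any right-inverse, the projected component indeed produces no net wrench on the object, consistent with the nullspace characterization of h_{int}.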

Simulation and Experimental Results
We first present experimental results for the PPC scheme of Section 5.1, with the gains g_s and g_v set to 0.05 and 10, respectively. The experimental results are depicted in Fig. 3, which shows the pose and velocity errors e_s(t), e_v(t), along with the respective performance functions, for t ∈ [0, 70] seconds. A video illustrating the experimental results can be found on https://youtu.be/jJWeI5ZvQPY. Next, we demonstrate the MPC scheme of Section 5.2. We consider a simulation example with N = 3 ground vehicles equipped with 2-DOF manipulators, rigidly grasping an object, with n_1 = n_2 = n_3 = 4. The states of the agents comprise the vehicles' positions and α_i = [α_{i1}, α_{i2}]^⊤ ∈ R², the manipulators' joint angles, i ∈ N = {1, 2, 3}. The manipulators become singular when sin(α_{i1}) = 0, i ∈ N; thus the state constraints for the manipulators are set to ε ≤ α_{i1} ≤ π/2 − ε and −π/2 + ε ≤ α_{i2} ≤ π/2 − ε, i ∈ N, for a small ε > 0. We also consider the input constraints −10 ≤ u_{i,j}(t) ≤ 10, i ∈ N, j ∈ {1, . . . , 4}. The initial conditions of the agents and the object are set such that q_1(0) = [0.5, 0, π/4, π/4]^⊤ (the remaining ones are omitted here). The results are depicted in Fig. 4, which shows the error states of the agents successfully converging to zero. The simulation was carried out by using the NMPC toolbox given in [104] and took 13450 seconds. Finally, we demonstrate the internal-force-regulation scheme of Section 5.3, with weights w_p = w_φ = w_ψ = 1, w_θ = 0.5, and η_d(t) transformed to the respective R_d(t). The control gains are set as K_{p_1} = 15, k_{p_2} = 75, and K_d = 40 I_6.
The results are given in Figs. 5 and 6 for 15 seconds. Fig. 5 depicts the pose and velocity errors e_p(t), e_O(t), e_v(t), which converge to zero for both choices of G*, as expected. Fig. 6 depicts the norm of the internal forces, ∥h_int(t)∥. It is clear that G*_2 yields significantly larger internal forces, whereas G*_1 keeps them very close to zero, as proven in the theoretical analysis. A video illustrating the aforementioned simulation can be found on https://youtu.be/a31LTBBkE-Q.

Abstractions of Multi-Agent and Multi-Agent-Object Systems
The control algorithms of the previous sections offer a way to abstract the underlying continuous system to a discrete representation. Such a discrete representation can be used to synthesize a higher-level controller such that the system satisfies a temporal-logic task. In the following, we present such an abstraction and controller synthesis for both multi-agent and multi-agent-object systems.

Multi-Agent Systems
We first consider the distributed control-synthesis problem for a multi-agent system subject to linear temporal logic (LTL) tasks [106,107]. Formally, we consider N agents, characterized by p_i ∈ R^n, i ∈ N = {1, . . . , N}, operating in a static workspace W. Similarly to Section 4, we consider that the agents evolve subject to 2nd-order uncertain dynamics and that their physical volume is bounded by the closed spheres B̄(p_i, r_i), where r_i is the radius of agent i ∈ N. The workspace is populated with J ≥ 0 obstacles at c_j ∈ R^n, represented by the closed balls B̄(c_j, r_{c_j}), j ∈ J = {1, . . . , J}. We further assume that there exist K_R > 1 points in W, denoted by p_{π_k} ∈ R^n, corresponding to certain properties of interest (e.g., gas station, repairing area, etc.), with K_R := {1, . . . , K_R}. Since, in practice, these properties are naturally inherited by some neighbourhood of the respective point of interest, we define for each k ∈ K_R the region of interest π_k, corresponding to p_{π_k}, as the closed ball π_k := B̄(p_{π_k}, r_{π_k}) with radius r_{π_k} > 0, and we set Π := {π_1, . . . , π_{K_R}}. The properties of interest are expressed as boolean variables via finite, disjoint sets of atomic propositions Ψ_i, i ∈ N. The properties satisfied at each region π_k are provided by the labelling functions L_i : Π → 2^{Ψ_i}, which assign to each region π_k the subset of the atomic propositions Ψ_i that are true in that region. We provide now some tools that relate the continuous multi-agent system to the navigation of the agents among the regions of interest.
Definition 1. Let B̄(p_i(t_0), r_i) ⊂ π_{k_0}, i.e., agent i is in region π_{k_0}, k_0 ∈ K_R, for some t_0 ≥ 0. Then, agent i executes a transition from π_{k_0} to π_{k_1}, k_1 ∈ K_R\{k_0}, denoted by π_{k_0} →_i π_{k_1}, if there exists a finite time instant t_1 ≥ t_0 such that B̄(p_i(t_1), r_i) ⊂ π_{k_1} and, for all t ∈ [t_0, t_1], B̄(p_i(t), r_i) ∩ ⋃_{ℓ∈N\{i}} B̄(p_ℓ(t), r_ℓ) = B̄(p_i(t), r_i) ∩ ⋃_{j∈J} B̄(c_j, r_{c_j}) = ∅ and r_W > ∥p_i(t)∥ + r_i. Loosely speaking, agent i can transit between two regions of Π if it can navigate between them while avoiding collisions with the other agents and the obstacles, and while remaining in the workspace. We note that, if the agent's task requires it to avoid a certain region of Π, then the latter can be considered as an obstacle.
Our goal here is to control the multi-agent system so that each agent's behaviour obeys a given temporal-logic specification over its atomic propositions Ψ_i. Formally, the behaviour of agent i over its trajectory p_i is the infinite sequence b_i(ψ_i) := (p_i(t_{i,1}), ψ_{i,1})(p_i(t_{i,2}), ψ_{i,2}) . . . , with ψ_{i,m} ∈ 2^{Ψ_i}, B̄(p_i(t_{i,m}), r_i) ⊂ π_{k_m}, and ψ_{i,m} ∈ L_i(π_{k_m}), with k_m ∈ K_R and an increasing sequence of positive time instants t_{i,m}, m ∈ N. The control objectives are given for each agent separately as LTL formulas Φ_i over Ψ_i, i ∈ N (see Section 3); agent i satisfies Φ_i if its behaviour b_i(ψ_i) satisfies Φ_i. By using the control algorithm of Section 4, one guarantees the transitions π_k →_i π_{k'}, for all k, k' ∈ K_R and agents i ∈ N, which gives rise to the finite transition system T_i := (Π, π_i^{init}, →_i, Ψ_i, L_i) that encodes the motion of agent i in the workspace, where π_i^{init} ∈ Π denotes the ith agent's initial region. After defining the transition systems T_i, the problem at hand reduces to deriving high-level paths for the agents, as sequences of regions of T_i to be visited, that satisfy their individual specifications encoded by Φ_i. In order to derive such paths, we follow an algorithm inspired by formal-verification methodologies [79]. The algorithm translates each formula Φ_i to a Büchi Automaton C_i and computes the product T̃_i := T_i × C_i. One can compute the accepting runs of T̃_i by viewing it as a weighted digraph and using standard graph-search methods. Such accepting runs satisfy Φ_i and are directly projected to a sequence of waypoints to be visited, providing therefore a desired path for agent i. Although the semantics of LTL is defined over infinite sequences of atomic propositions, it can be proven that there always exists a satisfying high-level path in the form of a finite state sequence (prefix) followed by the infinite repetition of another finite state sequence (suffix). For more details on this technique, we refer the reader to the related literature, e.g., [79].
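As an illustration of the product-based search, the following Python sketch builds a small transition system over four regions, composes it with a hand-coded Büchi automaton for the (assumed) task "visit π_1 and π_4 infinitely often", and extracts a prefix-suffix plan. In practice, the automaton would be produced by an LTL-to-Büchi translator (e.g., LTL2BA or Spot) and the transitions certified by the controllers of Section 4.

```python
from collections import deque

# Regions of interest and certified transitions (assume the controller of
# Section 4 certifies navigation between every pair of regions).
regions = ["pi1", "pi2", "pi3", "pi4"]
ts_edges = {r: [s for s in regions if s != r] for r in regions}
labels = {"pi1": {"a"}, "pi2": set(), "pi3": set(), "pi4": {"b"}}

# Hand-coded 3-state Buchi automaton for "GF a and GF b"; state 2 is the
# accepting state, visited each time 'a' and then 'b' have been observed.
def delta(q, L):
    if q in (0, 2):                  # waiting to observe 'a'
        return 1 if "a" in L else 0
    return 2 if "b" in L else 1      # q == 1: waiting to observe 'b'

def successors(state):
    """Successors in the product T_i x C_i (automaton reads target label)."""
    r, q = state
    return [(r2, delta(q, labels[r2])) for r2 in ts_edges[r]]

def shortest_path(src, goals):
    """BFS over the product, returning a path from src to any goal state."""
    parent, queue = {src: None}, deque([src])
    while queue:
        u = queue.popleft()
        if u in goals:
            path = []
            while u is not None:
                path.append(u); u = parent[u]
            return path[::-1]
        for v in successors(u):
            if v not in parent:
                parent[v] = u; queue.append(v)
    return None

init = ("pi2", 0)
accepting = {(r, 2) for r in regions}
prefix = shortest_path(init, accepting)      # reach an accepting state
acc = prefix[-1]
suffix = None                                # then find a cycle back to it
for s in successors(acc):
    back = shortest_path(s, {acc})
    if back:
        suffix = [acc] + back
        break
print("prefix:", [r for r, _ in prefix])
print("suffix (repeat forever):", [r for r, _ in suffix])
```

Projecting the prefix and suffix onto the regions yields a waypoint sequence whose infinite repetition of the suffix visits π_1 and π_4 infinitely often, i.e., a plan of the prefix-suffix form discussed above.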

Multi-Agent-Object Systems
As mentioned in Section 1, numerous applications require that, apart from the agents, unactuated objects undergo certain processes. Therefore, this section considers the control-synthesis problem for a team of robotic agents and a set of unactuated objects under tasks expressed as temporal-logic specifications [108,109].
The problem setup follows Section 6.1, i.e., we consider a set of N agents operating in a workspace W ⊂ R^n, with a set Π of K_R spherical regions around the points of interest p_{π_k}, k ∈ K_R, and the respective atomic-proposition sets Ψ_i and labelling functions L_i, i ∈ N. We further consider M_R unactuated objects, characterized by the variables p_{o_j} (e.g., center of mass), for j ∈ M_R := {1, . . . , M_R}. The objects have their own sets of properties of interest over Π, expressed by the disjoint sets Ψ_{o_j} of atomic propositions and the respective labelling functions L_{o_j} : Π → 2^{Ψ_{o_j}}, j ∈ M_R. Similarly to the agents, the objects' volumes are modelled by the spheres B̄(p_{o_j}, r_{o_j}), j ∈ M_R. We note that the agents are equipped with suitable equipment, e.g., robotic manipulators, in order to be able to interact with the objects. For ease of exposition, however, we consider that each agent's volume is represented by the sphere B̄(p_i, r_i), where p_i denotes a point on agent i (e.g., its center of mass).
We further introduce some additional tools for the interaction between the agents and the objects. First, we consider that the agents have specific power capabilities, which, for simplicity, we encode via positive integers ζ_i > 0, i ∈ N (larger values corresponding to more powerful agents). Based on these, we define a function Λ : M_R × 2^N → {True, False} that outputs whether a set of agents that potentially grasp an object is able to transport it; for instance, Λ(j, A) = True implies that the agents with indices in A ⊆ N have sufficient power capabilities to (cooperatively) transport object j. Second, we define the functions AG_j : R_{≥0} → 2^{N_0}, with N_0 := N ∪ {0}, to denote which agents grasp object j ∈ M_R at a specific time instant; AG_j(t_s) = {0} means that no agent grasps object j at time t_s. Note also that i ∈ AG_j(t_s) implies i ∉ AG_{j'}(t_s), for all j' ∈ M_R\{j}, i.e., agent i can grasp at most one object at a time. We further denote AG := [AG_1, . . . , AG_{M_R}] ∈ (2^{N_0})^{M_R}. Finally, we represent the volume of a set of agents A ⊆ N grasping an object j ∈ M_R by the sphere B̄(p_{o_j}, r_{A,j}), with r_{A,j} being a bounding radius, large enough to cover the volume of the coupled system.
We now define action primitives for the multi-agent-object system, similarly to Definition 1. In these definitions, we use the term "entity" to refer to a single agent, an object, or a system comprised of agents grasping an object (agents-object system). The number of such entities depends on the variables AG. Given a grasping configuration AG ∈ (2^{N_0})^{M_R}, consider T̄_R(AG) entities, indexed by the set T_R(AG) := {1, . . . , T̄_R(AG)}. Each entity (agent, object, or coupled agents-object system) is characterized by its respective configuration (p_i, p_{o_j}, or p_{o_j}, respectively) and radius (r_i, r_{o_j}, or r_{A,j}, respectively), which we denote for simplicity by the generic variables p^e_i, r^e_i, for all i ∈ T_R(AG).
Definition 2 (Navigation). Let AG(t_0) ∈ (2^{N_0})^{M_R} and i ∈ T_R(AG(t_0)) such that B̄(p^e_i(t_0), r^e_i) ⊂ π_{k_0}, for some k_0 ∈ K_R and t_0 ≥ 0. Then, entity i executes a transition from π_{k_0} to π_{k_1}, with k_1 ∈ K_R, if there exists a finite time instant t_1 ≥ t_0 such that B̄(p^e_i(t_1), r^e_i) ⊂ π_{k_1} and, for all t ∈ [t_0, t_1], B̄(p^e_i(t), r^e_i) ∩ ⋃_{ℓ∈T_R(AG(t_0))\{i}} B̄(p^e_ℓ(t), r^e_ℓ) = B̄(p^e_i(t), r^e_i) ∩ ⋃_{j∈J} B̄(c_j, r_{c_j}) = ∅ and r_W > ∥p^e_i(t)∥ + r^e_i.
Definition 2 corresponds to the action primitives of agent navigation and (cooperative) object transportation, depending on what entity i is, from π_{k_0} to π_{k_1}. When the transition corresponds to a single-agent navigation of agent i ∈ N, we denote it by π_{k_0} →_i π_{k_1}; when it corresponds to a transportation of object j ∈ M_R by a subset of agents A ⊂ N, we denote it by π_{k_0} →^T_{A,j} π_{k_1}. We further need the definition of grasping an object.
Definition 3 (Grasping). Let AG(t_0) ∈ (2^{N_0})^{M_R} and i ∈ N, j ∈ M_R such that i ∉ AG_j(t_0). Then, agent i grasps object j, denoted by i →^g j, if there exists a finite time instant t_1 ≥ t_0 such that i ∈ AG_j(t_1) and, for all t ∈ [t_0, t_1], r_W > ∥p^e_i(t)∥ + r^e_i and B̄(p^e_i(t), r^e_i) ∩ ⋃_{ℓ∈T_R(AG(t_0))\{i}} B̄(p^e_ℓ(t), r^e_ℓ) = B̄(p^e_i(t), r^e_i) ∩ ⋃_{j'∈J} B̄(c_{j'}, r_{c_{j'}}) = ∅.
Similarly, we can define the releasing action i →^r j for an agent i ∈ N and an object j ∈ M_R.
Next, we assume the existence of a procedure P_s that outputs whether or not a set of non-intersecting spheres fits in a larger sphere, as well as possible positions of the spheres in case they fit. More specifically, given a region of interest π_k and a number N ∈ N of sphere radii (of robots, objects, or object-robot systems), the procedure can be seen as a function P_s := [P_{s,0}, P_{s,1}], where P_{s,0} : R^{N+1}_{≥0} → {True, False} outputs whether the spheres fit in the region π_k, whereas P_{s,1} provides possible configurations of the robots and the objects, or 0 in case the spheres do not fit. For instance, P_{s,0}(r_{π_2}, r_1, r_3, r_{o_1}, r_{o_5}) determines whether the robots 1, 3 and the objects 1, 5 fit in region π_2 without colliding with each other; (p^e_1, p^e_2, p^e_3, p^e_4) = (p_1, p_3, p_{o_1}, p_{o_5}) = P_{s,1}(r_{π_2}, r^e_1, r^e_2, r^e_3, r^e_4) = P_{s,1}(r_{π_2}, r_1, r_3, r_{o_1}, r_{o_5}) provides a set of configurations such that the pairwise intersections among B̄(p^e_i, r^e_i), i ∈ {1, . . . , 4}, are empty. The problem of finding such a procedure P_s is a special case of the sphere-packing problem [110]. Note, however, that we are not interested in finding the maximum number of spheres that can be packed in a larger sphere but, rather, in the simpler problem of determining whether a given set of spheres can be packed in a larger sphere.
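A simple realization of P_s, sketched below in Python, checks feasibility by greedy rejection sampling: it attempts to place the spheres, largest first, at random candidate centers inside the region. The routine and its parameters are illustrative assumptions; it is conservative, in the sense that it may report False even though a feasible packing exists, and exact or optimization-based packing routines [110] could be substituted.

```python
import numpy as np

def P_s(R, radii, samples=4000, seed=0):
    """Greedy feasibility check: place spheres of the given radii (largest
    first) inside a sphere of radius R centered at the origin, by rejection
    sampling of candidate centers. Returns (True, centers) on success and
    (False, None) otherwise; False may be conservative."""
    rng = np.random.default_rng(seed)
    order = list(np.argsort(radii)[::-1])
    centers = {}                                # index -> placed center
    for idx in order:
        r = radii[idx]
        for _ in range(samples):
            c = rng.uniform(-R, R, size=3)
            if np.linalg.norm(c) > R - r:       # sphere must lie in the region
                continue
            if all(np.linalg.norm(c - centers[j]) >= r + radii[j]
                   for j in centers):           # no overlap with placed spheres
                centers[idx] = c
                break
        else:
            return False, None                  # no admissible center found
    return True, [centers[i] for i in range(len(radii))]

# E.g., do two robots and two objects with the (hypothetical) radii below
# fit in a region of radius 1.0?
fits, config = P_s(1.0, [0.3, 0.25, 0.2, 0.2])
print(fits)
```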
Our goal is to control the multi-agent-object system defined above such that the agents and the objects obey given specifications over their atomic propositions Ψ_i, Ψ_{o_j}, for all i ∈ N, j ∈ M_R. Similarly to the agents' behaviour, defined in Section 6.1, we define the behaviour of object j, over its trajectory p_{o_j}, as the infinite sequence b_{o_j}(ψ_{o_j}) := (p_{o_j}(t_{o_j,1}), ψ_{o_j,1})(p_{o_j}(t_{o_j,2}), ψ_{o_j,2}) . . . , with ψ_{o_j,m} ∈ 2^{Ψ_{o_j}}, B̄(p_{o_j}(t_{o_j,m}), r_{o_j}) ⊂ π_{k_m}, and ψ_{o_j,m} ∈ L_{o_j}(π_{k_m}), with k_m ∈ K_R and an increasing sequence of time instants t_{o_j,m}, m ∈ N.
As in Section 6.1, the control objectives are given as LTL formulas Φ_i, Φ_{o_j} over Ψ_i, Ψ_{o_j}, respectively, for all i ∈ N, j ∈ M_R. The control-synthesis problem is the derivation of actions for the agents such that each agent i ∈ N and each object j ∈ M_R satisfy their LTL formulas Φ_i and Φ_{o_j}, respectively, i.e., actions that produce behaviours b_i(ψ_i) and b_{o_j}(ψ_{o_j}) satisfying Φ_i and Φ_{o_j}. As in Section 6.1, we abstract the continuous multi-agent-object system into a discrete transition system. For the execution of the navigation primitives (Def. 2), one can use the multi-agent navigation and cooperative-manipulation control algorithms of Sections 4 and 5, respectively. For the execution of the grasping/releasing primitives (Def. 3), one can employ existing methodologies that derive the respective control actions (e.g., [111]). Therefore, we can abstract the behaviour of the multi-agent-object system by the finite transition system T_O = (Π_s, Π^init_s, →_s, AG, Ψ, L, Λ, P_s, χ), where:
1. Π_s ⊂ Π̄ × Π̄_o × (2^{N_0})^{M_R} is the set of states; Π̄ := Π_1 × · · · × Π_N and Π̄_o := Π_{o_1} × · · · × Π_{o_{M_R}} are the sets of states-regions that the agents and the objects can be at, with Π_i = Π_{o_j} = Π, for all i ∈ N, j ∈ M_R. By defining π̄ := (π_{k_1}, . . . , π_{k_N}), π̄_o := (π_{k_{o,1}}, . . . , π_{k_{o,M_R}}), the coupled state π_s := (π̄, π̄_o, AG) belongs to Π_s, i.e., (π̄, π̄_o, AG) ∈ Π_s, if
(a) P_{s,0}(r_{π_k}, [r_i]_{i∈{i∈N : k_i=k}}, [r_{o_j}]_{j∈{j∈M_R : k_{o,j}=k}}) = True, i.e., the respective agents and objects fit in the region, for all k ∈ K_R,
(b) k_i = k_{o,j} for all i ∈ N, j ∈ M_R such that i ∈ AG_j, i.e., an agent must be in the same region as the object it grasps.
2. Π^init_s ∈ Π_s is the initial state at t = 0.
3. →_s ⊂ Π_s × Π_s is a transition relation defined as follows: given the states π_s, π'_s ∈ Π_s, with π_s := (π̄, π̄_o, AG) := (π_{k_1}, . . . , π_{k_N}, π_{k_{o,1}}, . . . , π_{k_{o,M_R}}, AG_1, . . . , AG_{M_R}) and π'_s := (π̄', π̄'_o, AG') := (π_{k'_1}, . . . , π_{k'_N}, π_{k'_{o,1}}, . . . , π_{k'_{o,M_R}}, AG'_1, . . . , AG'_{M_R}), a transition π_s →_s π'_s occurs if all the following hold (see the sketch after this list):
(a) for every i ∈ N, j ∈ M_R such that i ∈ AG_j, i ∉ AG'_j (or i ∉ AG_j, i ∈ AG'_j), it holds that k_i = k'_i, i.e., there are no simultaneous grasp/release and navigation actions,
(b) for every i ∈ N, j ∈ M_R such that i ∈ AG_j, i ∉ AG'_j (or i ∉ AG_j, i ∈ AG'_j), it holds that k_{o,j} = k'_{o,j}, i.e., there are no simultaneous grasp/release and transportation actions,
(c) there do not exist i ∈ N and j, j' ∈ M_R, with j ≠ j', such that i ∈ AG_j and i ∈ AG'_{j'}, i.e., there are no simultaneous grasp and release actions,
(d) there does not exist j ∈ M_R such that k_{o,j} ≠ k'_{o,j} and i ∉ AG_j, for all i ∈ N (or i ∉ AG'_j, for all i ∈ N), i.e., there is no transportation of a non-grasped object,
(e) for every j ∈ M_R and T ⊆ N such that k_{o,j} ≠ k'_{o,j} and (i ∈ AG_j, i ∈ AG'_j) ⇔ i ∈ T, it holds that Λ(j, T) = True, i.e., the agents grasping an object are powerful enough to transport it.
4. Ψ := Ψ̄ ∪ Ψ̄_o, with Ψ̄ := ⋃_{i∈N} Ψ_i and Ψ̄_o := ⋃_{j∈M_R} Ψ_{o_j}, are the atomic propositions of the agents and the objects, respectively.
5. L : Π_s → 2^Ψ is a labelling function defined as follows: given a state π_s as in item 3 and ψ_s := ⋃_{i∈N} ψ_i ∪ ⋃_{j∈M_R} ψ_{o_j}, with ψ_i ∈ 2^{Ψ_i}, ψ_{o_j} ∈ 2^{Ψ_{o_j}}, then ψ_s ∈ L(π_s) if ψ_i ∈ L_i(π_{k_i}) and ψ_{o_j} ∈ L_{o_j}(π_{k_{o,j}}), for all i ∈ N, j ∈ M_R.
6. Λ and P_s are as defined previously.
7. χ : (→_s) → R_{≥0} is a function that assigns a cost to each transition π_s →_s π'_s. This cost might be related to the distance of the robots' regions in π_s to the ones in π'_s, combined with the cost efficiency of the robots involved in the transport tasks (according to ζ_i, i ∈ N).
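The transition conditions (a)-(e) are simple set-membership checks and can be implemented directly. The following Python sketch illustrates them, using assumed, simplified data structures (region indices per agent and per object, grasp sets per object, and the empty set instead of {0} for "ungrasped"):

```python
def valid_transition(k, k2, ko, ko2, AG, AG2, Lambda):
    """Check conditions (a)-(e) for a candidate transition of T_O.
    k, k2: region indices per agent (current/next);
    ko, ko2: region indices per object (current/next);
    AG, AG2: sets of agent indices grasping each object (current/next);
    Lambda(j, A): True if the agents in A can transport object j."""
    n_agents, n_objects = len(k), len(ko)
    for j in range(n_objects):
        changed = AG[j] ^ AG2[j]   # agents grasping or releasing object j
        # (a) no simultaneous grasp/release and navigation
        if any(k[i] != k2[i] for i in changed):
            return False
        # (b) no simultaneous grasp/release and transportation
        if changed and ko[j] != ko2[j]:
            return False
        if ko[j] != ko2[j]:
            # (d) no transportation of a non-grasped object
            if not AG[j] or not AG2[j]:
                return False
            # (e) the grasping agents must be powerful enough
            if not Lambda(j, AG[j] & AG2[j]):
                return False
    # (c) no simultaneous release of one object and grasp of another
    for i in range(n_agents):
        held = {j for j in range(n_objects) if i in AG[j]}
        held2 = {j for j in range(n_objects) if i in AG2[j]}
        if held and held2 and held != held2:
            return False
    return True

# One agent (index 0) transports object 0 from region 1 to region 2.
ok = valid_transition(k=[1], k2=[2], ko=[1], ko2=[2],
                      AG=[{0}], AG2=[{0}], Lambda=lambda j, A: True)
print(ok)  # True
```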
Next, we proceed as in Section 6.1: we form the global LTL formula Φ := (∧_{i∈N} Φ_i) ∧ (∧_{j∈M_R} Φ_{o_j}) over the set Ψ, and we translate it to a Büchi Automaton C_O. We then compute the product T̃_O := T_O × C_O and use graph-search-based methodologies to find an accepting path that satisfies Φ. This path is projected onto the original transition system T_O and provides a sequence of action primitives for the agents to execute (navigation, grasping, releasing, or cooperative transportation). Note that, unlike Section 6.1, where the tasks were specified independently for each agent, the object-related LTL tasks require the coupling and collaboration of more than one agent. Such a coupling results in a centralized path-planning procedure, i.e., a central computer unit computes T̃_O and the accepting path for all the agents. We note that such a centralized procedure might yield significant computational complexity, in terms of computer memory and running time. One can, however, employ procedures more sophisticated than simple graph search to compute an accepting path of T̃_O in order to reduce the computational complexity (such as STyLuS* [109]).

Timed Abstractions for Cooperative Manipulation
In the previous sections, we considered the motion planning and control of multi-agent and multi-agent-object systems subject to tasks expressed as LTL constraints. In this section, we consider the problem of controlling a cooperatively manipulated object such that it satisfies a task expressed as metric interval temporal logic (MITL) constraints [112,113]; such constraints allow the incorporation of time constraints (deadlines), thus enriching the range of available tasks (see Section 3).
We consider here N robotic agents rigidly grasping an object in a bounded workspace W ⊂ R^3, as in Section 5. The agent, object, and coupled dynamics are given by (11), (12), and (17), respectively. In contrast to Sections 6.1 and 6.2, where the LTL tasks were specified over certain "regions of interest" π_k, we consider here a different discretization of the workspace, described next.
We denote by S_q the set that consists of all points p_s ∈ W that physically belong to the coupled system, i.e., they form part of either the volume of the agents or the volume of the object. Note that these points depend on the agents' configuration variables q. We further define the constant L̄ ≥ sup_{q∈R^n} sup_{p_s∈S_q} ∥p_s − p_O(q)∥, where we express p_O as a function of q. Note that, although the explicit computation of S_q may not be possible, L̄ is an upper bound of the maximum distance between the object's center of mass and a point in the coupled system's volume over all possible configurations q, and thus it can be measured. For instance, Fig. 7 shows L̄ for a system comprised of two robotic agents and an object. It is straightforward to conclude that S_q ⊂ B̄(p_O(q), L̄), ∀q ∈ R^n. Next, we partition the workspace W into R_W equally sized rectangular regions Π = {π_1, . . . , π_{R_W}}, whose geometric centers are denoted by p^c_{π_j} ∈ W, j ∈ {1, . . . , R_W}. The length of the region sides is set to D = 2L̄ + 2l_0, where l_0 is an arbitrary positive constant. Hence, each region π_j can be defined as the axis-aligned cube of side length D centered at p^c_{π_j}, with ∥p^c_{π_{j+1}} − p^c_{π_j}∥ = 2L̄ + 2l_0, ∀j ∈ {1, . . . , R_W − 1}, and (p^c_{π_j})_z := L̄ + l_0, ∀j ∈ {1, . . . , R_W}, where (·)_k, k ∈ {x, y, z}, denotes the k-th coordinate. An illustration of the partition is given in Fig. 8.
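To fix ideas, the following minimal Python sketch (with hypothetical values for L̄ and l_0) constructs the cell centers and a membership test for this partition, treating each cell as an axis-aligned cube of side D = 2L̄ + 2l_0, as described above:

```python
import numpy as np

L_bar, l0 = 0.4, 0.1          # hypothetical bound L_bar and clearance l0
D = 2 * L_bar + 2 * l0        # side length of each rectangular region

def cell_center(ix, iy):
    """Geometric center of cell (ix, iy): centers are spaced by D and the
    z-coordinate is fixed to L_bar + l0, matching the partition above."""
    return np.array([ix * D, iy * D, L_bar + l0])

def in_cell(p, center):
    """True if point p lies in the axis-aligned cube of side D at center."""
    return bool(np.all(np.abs(np.asarray(p) - center) <= D / 2.0))

# Key containment used in the text: if the object's CoM stays l0-close to a
# cell center, the whole ball B(p_O, L_bar) stays within distance
# L_bar + l0 = D/2 of that center, hence inside the cell.
p_O = cell_center(0, 0) + np.array([0.05, 0.0, 0.0])   # within l0 of center
assert np.linalg.norm(p_O - cell_center(0, 0)) < l0
print(in_cell(p_O, cell_center(0, 0)))                  # True
```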
Intuitively, if we guarantee that the object's center of mass stays l_0-close to a time-varying point p_{j,j'}(t) travelling from p^c_{π_j} to p^c_{π_{j'}}, i.e., ∥p_O(t) − p_{j,j'}(t)∥ < l_0 for all t ≥ t_0, then ∥p_O(t_0 + δt_{j,j'}) − p^c_{π_{j'}}∥ < l_0 and S_{q(t)} ⊂ B̄(p_O(t), L̄) ⊂ B̄(p_{j,j'}(t), L̄ + l_0) ⊂ π_j ∪ π_{j'}, for all t ≥ t_0 (and therefore for all t ∈ [t_0, t_0 + δt_{j,j'}]), and thus the requirements of Definition 4 for the transition relation are met. Fig. 9 illustrates the aforementioned reasoning.
Such tracking can be achieved by the PPC methodology of Section 5, which guarantees ∥e_{s,k}(t)∥ < ρ_{s,k}(t) for user-defined performance functions ρ_{s,k}(t) and all k ∈ {x, y, z, φ, θ, ψ}. Now, by selecting ρ_{s,k}(t_0) = l_0 ≥ ρ_{s,k}(t), for t ≥ t_0 and k ∈ {x, y, z}, the PPC algorithm guarantees that p_O(q(t)) ∈ B(p_{j,j'}(t), l_0), ∀t ≥ t_0 and, consequently, p_O(q(t_0 + δt_{j,j'})) ∈ B(p^c_{π_{j'}}, l_0), since p_{j,j'}(t_0 + δt_{j,j'}) = p^c_{π_{j'}}. Moreover, since p_O(q(t)) ∈ B(p_{j,j'}(t), l_0), we deduce that B(p_O(q(t)), L̄) ⊂ B(p_{j,j'}(t), l_0 + L̄) and S_{q(t)} ⊂ π_j ∪ π_{j'}, ∀t ∈ [t_0, t_0 + δt_{j,j'}] ⊂ [t_0, ∞), and therefore a transition relation with time duration δt_{j,j'} is successfully established. More details can be found in [112,113]. Therefore, we can abstract the motion of the coupled agents-object system as a timed transition system, where, in addition to the elements of the transition systems of the previous sections, we introduce the map γ_T : (→_T) → R_{≥0} that assigns to each transition its time duration, i.e., γ_T(π_j →_T π_{j'}) = δt_{j,j'}. The next step of the algorithm follows a procedure similar to that of Sections 6.1 and 6.2. Namely, we translate the MITL formula Φ into a Timed Büchi Automaton C^t_Φ [114], compute the product T̃ = T ⊗ C^t_Φ [79], and find its accepting runs. The projection of these accepting runs onto T provides an infinite sequence of regions in Π to be visited at specific time instants; such a sequence satisfies Φ. More details on the technique can be found in [79,115,114].
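As a simple illustration of planning over such a timed abstraction, the following Python sketch runs an earliest-arrival (Dijkstra-style) search over a small timed transition system with durations γ_T, for a hypothetical deadline task of the form "reach π_4 within 10 seconds"; a full MITL treatment would instead go through the timed-automaton product described above.

```python
import heapq

def timed_plan(edges, gamma, init, goal, deadline):
    """Earliest-arrival search over a timed transition system:
    edges[pi] lists successor regions, gamma[(pi, pi2)] is the transition
    duration dt_{j,j'}. Returns (path, arrival_time) if the goal is
    reachable within the deadline, and None otherwise."""
    queue, best = [(0.0, init, [init])], {}
    while queue:
        t, pi, path = heapq.heappop(queue)
        if t > deadline:     # heap is ordered by time: all remaining fail
            return None
        if pi == goal:
            return path, t
        if best.get(pi, float("inf")) <= t:
            continue         # region already reached at an earlier time
        best[pi] = t
        for pi2 in edges[pi]:
            heapq.heappush(queue, (t + gamma[(pi, pi2)], pi2, path + [pi2]))
    return None

# A 1-D corridor of four cells with uniform transition durations of 2 s.
edges = {"pi1": ["pi2"], "pi2": ["pi1", "pi3"],
         "pi3": ["pi2", "pi4"], "pi4": ["pi3"]}
gamma = {(a, b): 2.0 for a in edges for b in edges[a]}
print(timed_plan(edges, gamma, "pi1", "pi4", deadline=10.0))
# -> (['pi1', 'pi2', 'pi3', 'pi4'], 6.0)
```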
The workspace discretization described above yields a complete partition of the state space, ensuring timed navigation among neighbouring cells. More general workspaces with arbitrarily-shaped obstacles can be found in [120,121,122] for single-agent systems. The extension to multiple agents, however, is significantly challenging, since the inter-agent interactions and potential actuation constraints may prevent the agents from achieving navigation in pre-defined time intervals. Finally, the presented algorithms focus on the safety of the underlying system and on the adaptability and robustness to dynamic uncertainties and exogenous disturbances. Safety is a crucial property when it comes to physical dynamical systems that interact with each other, such as robots. Being an invariance-like property, however, safety concerns a much wider range of systems, including for instance sensor networks, biological systems, and power systems, where several state variables must remain within certain regions of operation. Robustness and adaptability to model uncertainties is an equally important property; the vast majority of dynamical systems cannot be modelled accurately and suffer, in many cases, from unknown disturbances or unexpected faults.
One can identify numerous future directions of the presented results. Regarding the control algorithms of Sections 4 and 5, an important direction consists of the incorporation of explicit control-input constraints. For the multi-agent navigation (Section 4), an open problem is the simultaneous and timed navigation of the agents in complex environments. Such a problem is also interesting for the cooperative manipulation problem (Section 5), since the complex structure of robotic manipulators brings significant challenges when it comes to collision avoidance. Regarding the planning algorithms of Section 6, an important direction is the development of reconfigurable and more decentralized procedures for the derivation of high-level paths in multi-agent-object systems.