Autonomous collision avoidance for wheeled mobile robots using a differential game approach

A multi-agent system consisting of N wheeled mobile robots is considered. The robots are modeled by unicycle dynamics and the multi-agent collision avoidance problem , which lies in steering each robot from its initial position to a desired target position while avoiding collisions with obstacles and other agents is considered. The problem is solved in two steps. First, exploiting a differential game formulation, collision-free trajectories are generated for virtual agents satisfying single-integrator dynamics. Second, the previous step is used to construct dynamic feedback strategies for the wheeled mobile robots satisfying unicy-cle dynamics which ensure each of the robots reaches its target without collisions occurring. A numerical study of the proposed methodology is provided through a series of simulations.


Introduction
With several important areas of application, such as mobile sensor networks [2,9,10,17,18,36] , air traffic management [7,35] , power systems [19] and distributed robotics, including collision avoidance and formation flying [15,20,22] , multi-agent systems have gained much interest in recent years. Essentially a multiagent system is a system consisting of several, interacting subsystems, which we refer to as agents. While a system of several simple agents can perform tasks more efficiently and reliably than a single (possibly more complex) agent, often such systems and their applications call for the development of novel control methodologies.
In the context of robot navigation many approaches available in the literature are based on a so-called navigation function (see, for instance [3,6,31,33,34] ) which is constructed from the geometric information on the considered topology and employed to define gradient descent control laws. Recognising that multi-agent systems are ubiquitous in the nature, other approaches are inspired by naturally occurring systems, such as schools of fish, migrating birds and swarms of bees [8,12,[25][26][27]29,32] . In the presence of communication constraints, the communication topology is often represented by a graph and control strategies are designed using notions from graph theory [15,28] . The use of game theory in the context of multi-agent systems has also been explored (see, for instance, [1,13,14,24] ).
The multi-agent collision avoidance problem for a system of N wheeled mobile robots (WMRs) is considered in this paper, i.e. assuming the WMRs satisfy unicycle dynamics we consider the problem of designing feedback controls such that the robots autonomously reach predefined targets while avoiding collisions with obstacles and other robots. A Lyapunov-based approach for avoidance control has been considered in [11] , where the authors recognise that the main challenge associated with the approach lies in determining Lyapunov functions. In [22,24] the multi-agent collision avoidance has been considered for agents satisfying singleintegrator dynamics. Exploiting the single-integrator dynamics of the agents, feedback strategies which approximate the Nash equilibrium solution of the game are readily constructed, thus providing a systematic manner of constructing Lyapunov functions (see [22,24] for details).
In this paper we extend the results of [24] to WMRs. We also present several interesting and challenging case studies. For instance, we consider situations which include several sources of potential dead-locks which, to the best of our knowledge, are not readily solved using other approaches. The multi-agent collision avoidance problem is solved in two steps. First, virtual agents satisfying single-integrator dynamics are introduced and the multiagent collision avoidance problem for the virtual agents is solved https://doi.org/10.1016/j.ejcon.2017. 11.005 0947-3580/© 2017 European Control Association. Published by Elsevier Ltd. All rights reserved. using a differential game formulation of the problem and exploiting the main results of [23,24] . Second, the solution of the differential game involving virtual agents is used to design feedback strategies for the WMRs which ensure that the robots reach their targets while avoiding collisions. The novel contributions presented in this paper can be roughly summarised as follows. The introduction of virtual agents allows to use the main result of [24] to generate a trajectory plan for the WMRs, which can then be tracked with zero error. Moreover, the case studies provide useful insights and validate the theoretical results obtained both in this paper and in [24] through simulations of complex and challenging scenarios as well as a sensitivity analysis with respect to certain control parameters.
The remainder of the paper is structured as follows. The multiagent collision avoidance problem is introduced in Section 2 . A method of generating collision-free trajectories exploiting the virtual agents is presented in Section 3 (see [24] for details). In Section 4 the virtual agents are used to solve the multi-agent collision avoidance problem for the WMRs. A series of illustrative numerical examples are then provided in Section 5 before some concluding remarks and directions for future research are given in Section 6 .
Notation: Standard notation is used in the remainder of the paper. The set of real numbers is denoted by R , the set of complex numbers is denoted by C and C − denotes the open lefthalf plane of C . Given a vector x ∈ R n , the gradient of a func- The empty set is denoted by ∅ .

Problem formulation
In this section the multi-agent collision avoidance problem is introduced, formulated and studied in a centralised setting, i.e. the positions of each agent are available to the remaining members of the group at all times. We consider a team of N WMRs moving on the ground (Euclidean plane), possibly characterised by the presence of (static) obstacles. In particular, each WMR is described by a unicycle-like model with a passive wheel, the dynamics of which are defined (see, for instance, [4,5] ) by the equations the position on the Euclidean plane of the middle point between the two actuated wheels along the direction of the axle connecting the wheels, θ i denotes the orientation of the i -th robot and v i and ω i represent the longitudinal and the angular velocity controls, respectively. The constant a i describes the distance between the point ( X i , Y i ) and the centre of mass of the i -th robot (x c,i , y c,i ) ∈ R 2 , along the segment connecting ( X i , Y i ) and the (forward) passive wheel (see Fig. 1 ). The control task may be informally stated as follows: given a target location (x * c,i , y * c,i ) ∈ R 2 for the centre of mass of each agent, determine control inputs v i and ω i such that the i -th agent is steered towards the desired point while avoiding collisions with other members of the team and with static obstacles. ω i , i = 1 , . . . , N, that steer each agent from its initial position to a predefined target while avoiding collisions between agents and among agents and static obstacles.

Collision-free trajectory planning
Problem 1 is solved in two steps. First, N virtual agents (one for each robot) satisfying single-integrator dynamics are introduced and collision-free trajectories are planned for the virtual agents. This result is one of the main contributions of [24] and it is based on a formulation of the multi-agent collision avoidance problem in terms of a nonzero-sum differential game for which a closed-form (approximate) solution is provided. Second, control inputs v i and ω i are determined to steer the i -th robot with dynamics (1) along the collision-free trajectories, thus solving Problem 1 . The first step, namely collision-free trajectory planning , is considered in this section. The reader is referred to [24] for a more detailed discussion on the results presented in this section.
We consider the case in which the movement of each virtual agent is described by single-integrator dynamics, i.e.
where u i ∈ R 2 is the control input of the i -th agent and the position of the i -th agent is denoted by x i ∈ R 2 . Each virtual agent is associated with a desired goal, namely a target position x i denotes the error variable between the current position of the i -th agent and its corresponding target position. Moreover, each agent i is associated with a parameter r i > 0, which plays the role of safety radius , which may differ from one agent to another. These values take into account the fact that the agents are WMRs with dynamics defined by Eqs.
(1) , rather than point masses with the single-integrator dynamics (2) , i.e. they account for the physical dimensions of the unicycle.
A static obstacle is represented by its centre of mass p c j ∈ R 2 and the region of the Euclidean plane that it occupies P j ⊂ R 2 , with j = 1 , . . . , m, where m ≥ 0 denotes the number of static obstacles present. The boundary of the region P j is denoted by ∂P j and in what follows elliptical obstacles are considered, i.e.
where ρ j > 0 and E j = E j > 0 . There is a one-to-one relation between the point p j , ρ j and E j , and the physical parameters of the ellipse, i.e. the centre of mass p c j and the lengths of the semi-axes. The obstacle avoidance region and the agent avoidance region of the i -th agent are defined as follows.

Definition 1. Consider the open sets
The obstacle avoidance region , denoted by S, is defined as Let D t i and S denote the complement of the sets D t i and S, respectively. A collision-free trajectory for the i -th agent is defined in the following statement.
Let ρ j (φ) denote the radius of the ellipse P j in polar coordinates as a function of the angle φ of the segment connecting x i ( t ) and p c j , relative to the polar description of p c 1 We assume that sufficient time is provided to accomplish the task of steering each virtual agent from its initial position x i (0) to its corresponding target position x * i and that the following conditions are satisfied.

Assumption 1.
i) Obstacle collision-free initial deployment : the initial positions of the agents satisfy for all i = 1 , . . . , N, j = 1 , . . . , m . ii) Agent collision-free initial deployment : the initial positions of the agents satisfy for all i = 1 , . . . , N and j = 1 , . . . , N, j = i . iii) Obstacle collision-free desired deployment : the target positions of the agents satisfy for all i = 1 , . . . , N, j = 1 , . . . , m . iv) Agent collision-free desired deployment : the target positions for each agent satisfy for all i = 1 , . . . , N and j = 1 , . . . , N, j = i . v) Configuration feasibility: the static obstacles do not form an impermeable boundary about targets of one or more of the agents. Namely, there exists a continuous path 2 l i connecting the initial condition x i (0) and the target x * i satisfying The problem of planning a collision-free trajectory for the virtual agents is defined in the following statement.

Problem 2.
Consider a multi-agent system consisting of N > 1 agents with dynamics (2) , for i = 1 , . . . N. The multi-agent collision 1 Given an ellipse P j , the function ρ j (φ) can be computed by straightforward avoidance problem consists in determining feedback control strategies u i , i = 1 , . . . , N, that steer each agent from its initial position to a predefined target while avoiding collisions between agents and among agents and static obstacles.
As will be seen in Section 4 , the solution of Problem 2 , which involves the virtual agents, is instrumental to the systematic construction of a solution to Problem 1 , which concerns the WMRs.
To streamline the presentation the following notation is intro- where P i with the so-called collision avoidance functions defined as 3 for c ≥ 1, and In [22,24] it is demonstrated that the multi-agent collision avoidance problem for the virtual agents, i.e. Problem 2 , can be formulated as a differential game for which (approximate) solutions can be constructed in closed-form.
In particular, it can be demonstrated that the functions allow us to systematically construct so-called α -Nash equilibrium strategies (see [23] for details) of the differential game defined by the dynamics (2) , i = 1 , . . . , N, and individual cost functionals which each agent i , i = 1 , . . . , N, seeks to minimise. Namely, using only the matrix-valued mappings P i , for i = 1 , . . . , N, we can define the functions (12) , i = 1 , . . . , N, and thus construct dynamic strate- for all strategies describing the linearisation of the closed-loop system around the origin, u j = i is the set of strategies { u j } for j = 1 , . . . , N and j = i and x 0 ,α ≥ 0 . This, in turn, provides a solution of Problem 2 as recalled in the following statement which summarises the main result of [24] .
Then there exist k ≥ 0 , R i , i = 1 , . . . , N, and a neighbourhood ⊆ R 2 N × R 2 N of the origin such that the dynamic strategies   (12) only require the computation of the simple block matrices P i given in (9) . Evaluated at the initial conditions ˜ x i (0) and ξ (0) the value taken by the function V i corresponds to the costs associated with the individual agent i . Therefore, differently from what would be the case if an optimal control formulation were adopted, the differential game formulation allows to evaluate the cost associated with each individual agent. Thus, "high-risk" agents can easily be identified.

Remark 2.
The control strategies provided in this paper are centralised. However, the use of game theory in the context of distributed control has gained interest in recent years (see, for instance, [14] and [30] ). The differential game framework may be adapted to take into account communication constraints. Some preliminary results on distributed differential games can be found in [21] and [16] . However, further developments of these results are needed to solve the multi-agent collision avoidance problem in a fully distributed setting.

Collision avoidance for wheeled mobile robots
The following result utilises the solution of Problem 2 to provide a dynamic control law that solves the centralised multi-agent collision avoidance problem for the WMRs described by (1) . Let  for all i = 1 , . . . , N and suppose Assumption 1 holds. 4 Then, the dynamic control law (16) with κ > 0, solves the multi-agent collision avoidance problem for the WMRs. 4 This assumption merely implies that the centres of mass of the robots coincide with the initial positions of the virtual agents. This condition is easily satisfied since the virtual agents can be interpreted as a design tool for the dynamics and initial conditions can be freely assigned.

Theorem 2. Consider a team of N agents described by the dynamics (1) and a set of desired target locations p
Proof. To begin with, note that the coordinates of the centre of mass of each robot may be written as p i = (X i + a i cos (θ i ) , Y i + a i sin (θ i )) . Differentiating the above relation with respect to time, the dynamics of the centre of mass of each agent is described by the equations The selection v i and ω i as in (16)  However, since ˙ e i = −κe i , if a "buffer" δr i is included in the safety radius, i.e. r i = r i + δr i , any initial conditions satisfying e i (0) < δr i , i = 1 , . . . , N, is such that the WMRs reach their targets without collisions occurring. This may be useful for scenarios in which the positional measurements are imperfect.

Remark 4.
WMRs described by unicycle dynamics are considered in this paper. However, the virtual agents can be exploited to generate a collision-free trajectory plan which can be tracked with zero error through the design of controllers similar to those in Theorem 2 for any mobile robots with dynamics satisfying certain (possibly partial) feedback linearisability assumptions.

Simulations
A variety of illustrative examples are presented in this section: each example is considered separately in what follows. 5 In all cases the differential game corresponding to the problem associated with the virtual agents is solved using Theorem 1 . This is then used to construct the dynamic controllers for the WMRs

Two robots exchanging positions
In this example we consider the simple scenario in which two WMRs, i.e. N = 2 , on a plane, without any obstacles, are to switch positions without colliding into one another. Consider the

Avoiding deadlock
Consider now the case in which N = 3 , This is a setting similar to the one considered in [33] . Similarly to the previous example it is a scenario in which, intuitively, one may expect a deadlock to occur, as is observed in [33] . The remaining parameters are The trajectories of the first (solid line), the second (dashed line) and the third (dotted line) WMRs are shown in Fig. 6 and it is clear that the WMRs avoid reaching a deadlock and instead converge to their desired targets.

Two robots avoiding static obstacles
Consider now the scenario in which two WMRs ( i.e. N = 2 ) have to perform their manoeuvres in the presence of a static obstacle. The static obstacle is taken to occupy a circular region centred at (0, 0) with radius 2. The parameters relating to the The circular markers indicate the initial positions of the WMRs.
for the dynamic strategies are = 1 , The grey circular region in Fig. 7 indicates the static obstacle, whereas the trajectories of first and second WMR are shown by the black and grey lines, respectively. The dotted circles illustrate the safety radius about the points at which the WMRs are closest to one another and the arrows indicate the direction of travel. As can be seen the WMRs manoeuvre around the obstacle while avoiding collisions with one another.

Overtaking
Consider again a scenario with N = 2 . However, consider the case in which α 1 = α 2 , i.e. the penalty on the control effort s of the two WMRs are not identical and consequently one WMR is faster than the other. In particular we consider the case in which α 1 = 10 and α 2 = 0 . 01 , namely robot 2 is faster than robot 1. Furthermore β 1 = β 2 = 1 and r 1 = 1 . 5 and r 1 = 0 . 5 . The initial po- The trajectories of the first (black line) and second (grey line) WMR are shown in Fig. 8 . The dotted circles indicate the safety radius at the points along the trajectories at which the WMRs are closest to one another, whereas the arrows indicate the direction of travel. Note that robot 2, which is the fast agent overtakes robot 1 as expected.       The initial and target positions are selected as

Robots forming a cordon
In this final example consider the case in which N = 9 . Consider the scenario in which robots 1 , . . . , 8 , initially at their target positions, form a circular cordon and the ninth WMR, initially outside this circle, has its target position at the centre of the cordon. In particular the initial and target positions of The trajectories of the robots are shown in Fig. 11 in which the numbering indicates the individual WMRs. Robots 1 , . . . , 6 remain almost stationary whereas the seventh and eighth robots (dark grey and light grey lines, respectively) move to make room for the ninth robot (black line). The black dashed circles denote the radii of the stationary robots. The dark grey dotted circles denote the radii of the seventh and ninth robots at the points along their trajectories at which the distance between the two robots is at a minimum. The light grey dotted circle denote the same quantity for the eighth and ninth robots. The solid circular markers denote the initial and final positions of robots 1 , . . . , 8 , and the final position of the ninth robot, whereas the solid square marker denotes Fig. 13. The positions of each WMR at six successive time instants. Small circular markers denote the positions of robots 1 , . . . , 6 , whereas the larger dark grey, light grey and black circular markers denote the positions of the seventh, eigth and ninth robots, respectively. the initial position of the ninth robot. The arrows indicate the direction of travel. The time histories of the x -components (top) and y -components (bottom) of the positions of the seventh (dark grey lines), eighth (light grey lines) and ninth (black lines) are shown in Fig. 12 . The positions of the WMRs at six successive time instants are shown in Fig. 13 where the positions of robots 1 , . . . , 6 are denoted by the small black circular markers whereas the positions of robots 7 (dark grey marker), 8 (light grey marker) and 9 (black marker) are denoted by the larger circular markers. The dashed circular lines show the radii of each WMR.

Conclusion
The multi-agent collision avoidance problem for a team of WMRs satisfying unicycle dynamics is considered in this paper. The problem of manoeuvreing the robots from their initial positions to predefined targets while avoiding collisions is solved in two steps through the introduction of virtual agents and the formulation of a differential game. The approach allows for the systematic construction of feedback stratagies for the WMRs which ensure (locally) that the robots reach their targets while avoiding collisions, provided the initial and target configurations are feasible. The approach is illustrated on a series of simulations including challenging scenarios involving potential deadlocks. A sensitivity analysis with respect to the selection of the initial condition ξ (0) is also provided.
Directions of future research include taking into consideration distributed settings in which the communication between agents is limited, possibly through further developments based on the preliminary results relating to distributed differential games presented in [21] and [16] as mentioned in Remark 2 . Experimental validation of the results will also be sought through the implementation of the control laws on teams of small-scale robots. It is of interest to extend the results presented herein to enable operation in uncertain environments in which the agents should create a map of a partially or fully unknown environment while avoiding collisions. It is also of interest to take into consideration dynamic targets for the agents. Robust collision avoidance, where inter-agent collisions should be avoided in the presence of disturbances acting on the robots, may also be considered.