Mean Field Games of Controls: Finite Difference Approximations

We consider a class of mean field games in which the agents interact through both their states and controls, and we focus on situations in which a generic agent tries to adjust her speed (control) to an average speed (the average is made in a neighborhood in the state space). In such cases, the monotonicity assumptions that are frequently made in the theory of mean field games do not hold, and uniqueness cannot be expected in general. Such model lead to systems of forward-backward nonlinear nonlocal parabolic equations; the latter are supplemented with various kinds of boundary conditions, in particular Neumann-like boundary conditions stemming from reflection conditions on the underlying controled stochastic processes. The present work deals with numerical approximations of the above mentioned systems. After describing the finite difference scheme, we propose an iterative method for solving the systems of nonlinear equations that arise in the discrete setting; it combines a continuation method, Newton iterations and inner loops of a bigradient like solver. The numerical method is used for simulating two examples. We also make experiments on the behaviour of the iterative algorithm when the parameters of the model vary. The theory of mean field games, (MFGs for short), aims at studying deterministic or stochastic differential games (Nash equilibria) as the number of agents tends to infinity. It supposes that the rational agents are indistinguishable and individually have a negligible influence on the game, and that each individual strategy is influenced by some averages of quantities depending on the states (or the controls as in the present work) of the other agents. MFGs have been introduced in the pioneering works of J-M. Lasry and P-L. Lions [17, 18, 19]. Independently and at approximately the same time, the notion of mean field games arose in the engineering literature, see the works of M.Y. Huang, P.E. Caines and R.Malham{\'e} [14, 15]. The present work deals with numerical approximations of mean field games in which the agents interact through both their states and controls; it follows a more theoretical work by the second author, [16], which is devoted to the mathematical analysis of the related systems of nonlocal partial differential equations. There is not much literature on MFGs in which the agents also interact through their controls, see [13, 12, 8, 10, 7, 16]. To stress the fact that the latter situation is considered, we will sometimes use the terminology mean field games of control and the acronym MFGC.


-Introduction
The theory of mean field games, (MFGs for short), aims at studying deterministic or stochastic differential games (Nash equilibria) as the number of agents tends to infinity. It supposes that the rational agents are indistinguishable and individually have a negligible influence on the game, and that each individual strategy is influenced by some averages of quantities depending on the states (or the controls as in the present work) of the other agents. MFGs have been introduced in the pioneering works of J-M. Lasry and P-L. Lions [17,18,19]. Independently and at approximately the same time, the notion of mean field games arose in the engineering literature, see the works of M.Y. Huang, P.E. Caines and R.Malhamé [14,15].
The present work deals with numerical approximations of mean field games in which the agents interact through both their states and controls; it follows a more theoretical work by the second author, [16], which is devoted to the mathematical analysis of the related systems of nonlocal partial differential equations. There is not much literature on MFGs in which the agents also interact through their controls, see [13,12,8,10,7,16]. To stress the fact that the latter situation is considered, we will sometimes use the terminology mean field games of control and the acronym MFGC.
Here, the state space Ω is a bounded domain of R d with a piecewise smooth boundary, while the controls are vectors of R d . The time horizon is T > 0, and the parameter ν > 0 is linked to the level of noise in the trajectories (the dynamics of a given agent is described by dX t = α t dt+ √ 2νdB t where (B t ) is a standard d-dimensional Brownian motion and α t ∈ R d is the control at time t). A special feature of the present model is that the third argument of the Hamiltonian H is a probability measure µ(t) ∈ P Ω × R d , which stands for the joint probability of the states and optimal controls of the agents at time t. In (1.1), u : [0, T ] × Ω → R and m : [0, T ] × Ω → R + respectively stand for the value function of a representative agent and the distribution of states. The first, respectively second line in (1.1) is the Hamilton-Jacobi-Bellman equation (HJB for short) which leads to the optimal control of a representative agent, respectively the Fokker-Planck-Kolmogorov equation (FP for short) which describes the transport-diffusion of the distribution of states by the optimal control law. The HJB equation is a backward w.r.t. time parabolic equation and is supplemented with a condition at t = T which involves the terminal cost function φ : Ω → R, whereas the FP equation is forward w.r.t. time and is supplemented with an initial condition, which translates the fact that the initial distribution of states is known. The HJB equation also involves the so-called coupling function f : Ω × R + → R, or, in other words, f (X t , m(t, X(t))) is part of the running cost of a representative agent. We shall discuss later the boundary conditions on (0, T ) × ∂Ω associated with the first two equations in (1.1). The third equation in (1.1) gives the connection between µ(t) and m(t), and shows in particular that µ(t) is supported by a d-dimensional geometrical object, namely the graph of the feedback law: Ω → R d , x → −H p (x, ∇u(t, x), µ(t)).

A brief discussion on the mathematical analysis of (1.1)
Recall that the Hamiltonian of the problem is (x, p, µ) → H(x, p, µ), (x, p, µ) ∈ Ω × R d × P Ω × R d .
From the viewpoint of mathematical analysis, a priori estimates for (1.1) are more difficult to obtain than in the case when the agents interact only via the distribution of states m. Indeed, in the latter case, if for example the costs f and φ are uniformly bounded, then a priori estimates on u ∞ stem from the maximum principle for second-order parabolic equations. By contrast, since the Hamiltonian in (1.1) depends non-locally on ∇ x u, the maximum principle applied to the HJB equation only permits to bound u ∞ by a quantity which depends (quadratically under standard assumptions on H) on ∇ x u ∞ , and this information may be useless without additional arguments.
If the agents interact only through the distribution of states and if the Hamiltonian depends separately on p and m, a natural assumption is that the latter is monotone with respect to m, see [17,18,19]; it implies existence and uniqueness of solutions, see in [20,19]. Such an assumption is quite sensible in many situations, since it models the aversion of the agents to highly crowded regions of the state space. It is possible to extend these arguments to MFGCs, see [10] for a probabilistic point of view and [8] for a PDE point of view, and the monotonicity assumption then means that the agents favor controls that are opposite to the main stream. In [16] and in the present work, we prefer to avoid such an assumption, because it is generally not satisfied, at least in models of crowd motion: indeed, in models of traffic or pedestrian flows, a generic agent would rather try to adjust her speed (control) to the average speed in a neighborhood of her position.
The third equation in (1.1) can be seen as a fixed point problem for µ given u and m, which turns to be well-posed under the Lasry-Lions monotonicity assumption adapted to MFGC, provided that u and m are smooth enough. We shall replace this assumption by a new structural condition which has been introduced in [16], namely that H p depends linearly on the variable µ and is a contraction with respect to µ (using a suitable distance on probability measures). In the context of crowd motion, this structural condition is satisfied if the representative agent targets controls that are proportional to an average of the controls chosen by the other agents nearby, with a positive proportionality coefficient smaller than one. Were this coefficient equal to or larger than one, it would be easy to cook up examples in which there is no solution to (1.1) or even to the N -agent game, see Remark 4.3 below.
In [16], the focus is put on the existence of solutions rather than on their uniqueness. Indeed, without the monotonicity condition, uniqueness is unlikely in general if T is not small. Consider for example a game in which the function φ has two perfectly symmetrical minima (the targets), f = 0 and where the initial distribution of states has the same symmetry. If H depends on |∇u(t, x)| only (no interaction through the controls), then a representative agent will simply travel to the minimum which is closest to her initial position. On the contrary, if the representative agent favors a control close to the average one, then there are at least two symmetrical solutions in which the whole distribution moves toward one of the two minima.
Going back to existence results, it is now frequent in the MFG literature to obtain energy estimates by testing the HJB equation by m, the FP equation by u, summing the resulting equations, integrating in the time and state variables then making suitable integrations by parts. If the value function is uniformly bounded from below (which is often the case even if there are interactions through the controls), this results in a relationship between the L ∞ [0, T ]; L 1 (Ω) -norm of the positive part of u and the L 2 m [0, T ] × Ω; R d -norm of ∇ x u; this observation can then be used to obtain additional a priori estimates, which may be combined with the ones obtained from the maximum principle and discussed above, and finally with Bernstein method. This strategy has been implemented in [16] and leads to the existence of a solution to (1.1) under suitable assumptions. By and large, existence was proved in [16] for periodic problems in any of the following cases: • short time horizon • monotonicity • small enough parameters (in particular the contraction factor mentioned above, see the parameters λ and θ in (1.7)-(1.10) below) • weak dependency of the average drift on the state variable Note that in [11], existence and uniqueness have been proved with probabilistic arguments in the case where the Hamiltonian depends separately on p and µ.

A more detailed description of the considered class of MFGCs
For any x ∈ ∂Ω, let n(x) be the outward pointing unit normal vector to ∂Ω at x. The dynamics of a representative agent is given by is the control, a stochastic process adapted to (B t ) with values in R d , L t is the local time of the process X t on ∂Ω. We assume that X |t=0 is a random variable in Ω, independent of (B t ) for all t > 0, whose law is L X |t=0 = m 0 .
In what follows, the part of the running cost which models the interactions via the controls will involve an average drift V ∈ R d : for (t, x) ∈ [0, T ]×Ω, where K is a nonnegative kernel, Z(t, x) = Ω K(x, y)dm(t, y) is a normalization factor, µ(t, ·, ·) is the law of the joint distribution of the states and the controls, and m(t, ·) is the law of the distribution of states. Hereafter, we are going to focus in MFGCs in which the cost to be minimized by a representative agent is where λ, θ are real numbers such that −1 < λ < 1 and 0 < θ ≤ 1. This leads us to define the Lagrangian which is convex with respect to α, and its Legendre transform (with respect to α): With this definition of the running cost, the first three lines of (1.1) can be written as follows: in Ω, (1.11) in Ω. (1.12) We can now specify the boundary conditions on (0, T )×∂Ω. First, assuming that u is smooth enough, the optimal control of a representative player is given by the feedback law: The reflection condition at the boundary translates into the Neumann boundary conditions: test-function ψ such that ∂ψ ∂n (x) = 0 on ∂Ω. Taking the time derivative of the latter equality, and using (1.13), we deduce that Note that (1.8), (1.13) and (1.14) imply that the total mass Ω m(t, x)dx is conserved. We have studied or will sometimes consider other kinds of boundary conditions on some parts of the boundary, for example: • periodic conditions, i.e. we set (1.7)-(1.12)) in the flat torus Ω = R d /Z d • At a part of the boundary standing for an exit, there is an exit cost. This yields a Dirichlet condition on u. The Dirichlet condition on m: m = 0 at exits, means that the agents stop taking part in the mean field game as soon as they reach the exit. Such boundary conditions will be simulated numerically in paragraph 4.2 below.
• If a part of the boundary stands for an entrance, then it is natural to impose a Dirichlet condition on u to prevent the agents from crossing the entrance outward, and a flux-like condition on m to specify the entering flux of agents, see also paragraph 4.2 below.

Organization of the paper
Section 2 is devoted to the description of the finite difference scheme; it is based on a monotone upwind scheme for the HJB equation, and a scheme for the FP equation which is obtained by differentiating the discrete HJB equation and taking the adjoint. It is an extension of the finite difference schemes proposed and studied by the first author and I. Capuzzo-Dolcetta in [3,2] for simpler MFGs. Designing an efficient strategy for solving the system of non linear equations arising in the discrete version of MFGCs is a challenging task, in particular because • the underlying system of PDEs is forward-backward and cannot be solved by simply marching in time • there is no underlying variational structure • Equation (1.9) is non local.
In Section 3, we propose a strategy for solving the system of non linear equations: it is a continuation method in which the viscosity parameter (for instance) is progressively decreased to the desired value, and for each value of the latter parameter, the system of non linear equations is solved by a Newton method with inner iterations of a bi-gradient like algorithm. Finally, in Section 4, we discuss the results of numerical simulations in two cases. In the first example, we show in particular that there exist multiple solutions and we discuss how the iteration count of the solver is affected by the variation of the parameters in the model. The second case is a model for a crowd of agents crossing a rectangular hall from an entrance to an exit: we consider situations in which queues occur, and we show how the solution depends on the parameters.

-Finite difference methods
In order to approximate (1.7)-(1.12) numerically, we propose a finite difference method reminiscent of that introduced in [3,2] for MFGs without coupling through the controls. The important features of such methods are as follows • they are based on monotone finite difference schemes for (1.7). Hence, a comparison principle still holds at the discrete level.
• the special structure of (1.7)-(1.8) is preserved at the discrete level, namely that the FP equation can be obtained by differentiating the HJB equation w.r.t. u and by taking the adjoint of the resulting equation. This results in a monotone approximation of (1.8), which ensures that the discrete version of m remains non negative at all time if m 0 is non negative. The discrete FP equation will also preserve mass.
To simplify the discussion, let us focus on the case when d = 2 and Ω = (0, 1) 2 . More complex domains (even with holes) can be handled by the present method, but an additional effort would be needed if the domain had boundaries not aligned with the axes of the underlying grid that will be introduced soon. We also assume for simplicity that the boundary conditions are of the type (1.13)-(1.14) on the whole ∂Ω.

Notations and definitions
For two positive integers N T , N h , we set ∆t = T /N T , the time step, and h = 1/N h , the step related to the state variables. Consider the set of discrete times T ∆t = {t k = k∆t, k = 0, . . . , N T } and the The goal will be to approximate u (t k , x i,j ) and m (t k , x i,j ) respectively by u k i,j and m k i,j , for all k ∈ {0, . . . , N T } and (i, j) ∈ {0, . . . , N h } 2 , by solving the discrete version of (1.7)-(1.14) arising from the finite difference scheme.
Let us define the discrete time derivative of y : T ∆t × Ω h → R by the collection of real numbers: Given a grid function y : Ω h → R, we introduce the first order right sided difference operators and the five point discrete laplace operator: Note that the definition of these operators at boundary nodes of Ω h needs to extend the grid function y on a layer of nodes outside Ω h . This is done by using discrete versions of the Neumann conditions (1.13)-(1.14): assume that the boundary condition for y is of the type ∂y ∂n = z, where z is a given continuous function. Let (z k i,j ) is a suitable grid function approximating z. We then choose the discrete version of the latter equation (first order scheme): Remark 2.1. We have chosen a first order scheme for the boundary condition in order to preserve the monotonicity of the discrete Hamiltonian at boundary nodes, see later. Since the overall scheme is monotone as it was already mentioned above, thus first order, it would be pointless to choose a higher order scheme for the boundary conditions. Remark 2.2. Note that the previously mentioned discrete operators are not needed at the nodes of ∂Ω h at which Dirichlet boundary conditions are imposed.
For a grid function V : Ω h → R, we define the value of V at half-integer indices by linear interpolation: In order to define the Godunov scheme for (1.7), we introduce the map Φ: for two grid functions y : Finally, let us introduce the function g : R 4 × R 2 → R:

The scheme
With the ingredients defined in paragraph 2.1, we are ready to propose the discrete version of (1.7)-(1.10): for any i, j = 0, . . . , N h . Note that (2.6) is not enough in order to fully determine T(Φ(u, V ), m) at the boundary nodes. We also need the following quantities: for i, j = 0, . . . , N h , where the last identity in each line comes from (2.16) below. The discrete version of (1.11) and (1.12) is The discrete Hamiltonian introduced in (2.7) g : , has the following properties: H1 (monotonicity) if q = (q 1 , q 2 , q 3 , q 4 ), then g is nonincreasing with respect to q 1 and q 3 and nondecreasing with respect to q 2 and q 4 Remark 2.3. We aim at using Newton iterations in order to solve the system of nonlinear equations arising from the discrete scheme. For that purpose, we need to linearize the discrete version of the Fokker-Planck equation. Therefore, for another small positive parameter ε, we may replace the definition (2.6) of Φ by the following: where the C 1 approximation a → a +,ε of a → a + = a1 a>0 is defined by and a −,ε = a +,ε − a. We may modify (2.14) and (2.17) accordingly. The total mass of m is preserved, i.e. (2.20) for any k = 0, · · · , N T .

Solving the discrete version of the Hamilton-Jacobi-Bellman equation
For brevity, we will use the notation (y k i,j ) for a grid function, omitting that the indices i, j take their values in {0, . . . , N h } and that k takes its values in {0, . . . , N T }. This paragraph is devoted to solving the system of nonlinear equations satisfied by the grid function (u k i,j ), given the grid functions (m k i,j ) and (V k i,j ). Let f = ( f k i,j ) denote the grid function defined by We may then rewrite (2.8),(2.16) and the first identity in (2.15) in the compact form Finding u given m and V amounts to solving a discrete version of a nonlinear parabolic equation posed backward in time with Neumann boundary conditions. This is much simpler than solving the complete forward-backward system for (u, m, V ), because a backward time-marching procedure can be used. Since, as it was already observed above, the scheme is implicit, each time step consists of solving the discrete version of a nonlinear elliptic partial differential equation which is local because V is given. Starting from the terminal time step N T for which (2.15) gives an explicit formula for u N T , the k-th step of the backward loop consists of computing u k by solving (2.8) and (2.16) given u k+1 , m k+1 and V k . This is done by means of Newton iterations. For completeness, let us give a few details on Newton iterations. We introduce the operator R u , for u , v , f : Ω h → R, V : Ω h → R 2 , and i, j = 0, . . . , N h . Note that in (2.25), u is extended outside Ω h thanks to (2.16). We aim at approximating the solution of Starting from an initial guess noted u k,0 : Ω h → R, the Newton iterations consist of computing by induction a sequence u k, of approximations of the solution to (2.26): given u k, , u k, +1 is found by solving the system of linear equations In the latter equation, the Jacobian matrix d u R u u k, , u k+1 , f k , V k is sparse since the PDE is local, and invertible since the scheme is monotone. Note that d u R u u k, , u k+1 , f k , V k depends neither on u k+1 nor on f k , so we can write it d u R u u k, , ·, ·, V k . The system of linear equations is solved by using special algorithms for sparse matrices. In the numerical simulations presented below, we use the C-library UMFPACK, see [1], implementing unsymmetric multifrontal method and direct sparse LU factorization. Note that one may choose the initial guess u k,0 = u k+1 . The Newton iterations are stopped when the residual |R u (u k, , u k+1 ,f k , V k )| is small enough, say for = 0 . Then we set u k = u k, 0 . It is well known that the Newton algorithm for (2.8) is equivalent to an optimal policy iteration algorithm, that it is convergent for any initial guess, and that the convergence is quadratic.

Solving the discrete version of the Fokker-Planck-Kolmogorov equation
This paragraph is devoted to solving the system of linear equations satisfied by the grid function (m k i,j ), given the grid functions (u k i,j ) and (V k i,j ). We may then rewrite (2.9), (2.17) and the second identity in (2.15) in the compact form Finding m given u and V amounts to solving a discrete version of an linear parabolic equation posed forward in time with Neumann boundary conditions. Since the scheme is implicit, each time step consists of solving the discrete version of a linear elliptic partial differential equation which is local. Starting from the terminal time step 0 for which (2.15) gives an explicit formula for m 0 , the k-th step of the forward loop consists of computing m k+1 by solving (2.9), (2.17) given m k , u k+1 and V k . It is easy to see that the matrix of the latter system of linear equations is exactly the transposed of d u R u u k , ·, ·, V k which has been introduced in the previous paragraph. Therefore, it is sparse and invertible. In the numerical simulations presented below, we use the C-library UMFPACK again.

-Newton algorithms for solving the whole system (2.8)-(2.17)
The solution of the system of nonlinear equations (2.8)-(2.17 is not easy because 1. the system is the discrete version of a system of forward-backward equations, which precludes a simple time marching algorithm 2. The easiest instances of MFGs can be interpreted as optimal control problems driven by a partial differential equation, which opens the way to algorithms based on the variational structure. In the MFGC considered here, there is no underlying variational principle and these algorithms cannot be used. Following previous works of the first author, see [2], we choose to use a continuation method (for example with respect to the viscosity parameter ν) in which every system of nonlinear equations (given the parameter of the continuation method) is solved by means of Newton iterations. With Newton algorithm, it is important to have a good initial guess of the solution; for that, we take advantage of the continuation method by choosing the initial guess as the solution obtained with the previous value of the parameter. Alternatively, we have sometimes taken the initial guess from the simulation of the same problem on a coarser grid, using interpolation. It is also important to implement the Newton algorithm on a "well conditioned" system. Therefore, we shall not directly address (2.8)-(2.17), but we shall rather eliminate the unknowns u and m by using the time-marching loops described in paragraphs 2.3 and 2.4 and see (2.8)-(2.17) as a fixed point problem for (f , V ).
Before describing the algorithm, we need to provide a numerical method for obtaining the average drift given u and m, i.e. for solving (2.10)-(2.11) at least approximately.

The coupling cost and the average drift
Let us introduce F f which maps the grid function m defined on T ∆t × Ω h to the grid function f given in (2.23). We also define the maps Z and V by It can be proved exactly in the same manner as in the continuous case, see [16,Lemma 2.4], that V → V(u , m , V ) is a contraction in the maximum-norm for instance, with a contraction factor |λ|θ.
For a positive integer L, we define the map F V by : for k = 0, . . . , N T − 1, (3.5) where V k,0 = V k and the sequence V k, is defined by the following induction: Remark 3.1.
1. With a slight abuse of notation, if L = ∞, then V k is the fixed point of the map V → V u k , m k+1 , V .
2. The induction (3.6) corresponds to Jacobi iterations and can be easily parallelised. It is also possible to implement Gauss-Seidel iterations, which are a little more complex to write. They consist of using the components of V k, as soon as they are obtained (instead of those of V k, −1 ) in order to compute the components of V k, which have not been obtained yet . In our implementation, we have in fact used the Gauss-Seidel iterations with a lexicographic ordering of the components of the grid functions.

Notation
Let u, m, f : T ∆h × Ω h → R be generic grid functions standing respectively for discrete versions of the value function, the law of the distribution of states, the right hand side of the discrete HJB equation (2.8). Let V : T ∆h × Ω h → R 2 be a generic grid function standing for the average drift. Let us introduce the operators obtained by differentiation of the maps F u , F m and F V : In the three paragraphs below, we explain how these differential operators can be computed.

Linearized Hamilton-Jacobi-Bellman equation
The variation du of u = F u (f, V ) induced by variations df and dV of f and V is given by More explicitly, du is obtained by linearizing (2.8), (2.16) and (2.15). It satisfies for i, j = 0, . . . , N h and k = 0, . . . , N T . The first (respectively second) inner product appearing in (3.9) takes place in R 4 (respectively R 2 ). The set of equations (3.9) is supplemented with the terminal condition (3.10) du N T i,j = 0, and the boundary conditions Given df and dV , the variation du is found by solving (3.9),(3.10) and (3.11). This is done by marching backward in time and at each time step solving a system of linear equations with a sparse and invertible matrix (of the same form as in the Newton iterations described in 2.3). Again, we use the library UMFPACK for that.

Linearized Kolmogorov-Fokker-Planck equation
The variation dm of m = F m (u, V ) induced by variations du and dV of u and V is given by The grid function dm is obtained by linearizing (2.9), (2.17) and (2.15). It satisfies (3.13) for i, j = 0, . . . , N h and k = 0, . . . , N T . The set of equations (3.13) is supplemented with the initial condition (3.14) dm 0 i,j = 0, i, j = 0, . . . , N h , and the boundary conditions Here again, (3.13), (3.14) and (3.15) is solved by marching in time and solving a system of linear equations at each time step (using UMFPACK).

Linearized coupling costs and average drifts
The variation of f = F f (m) induced by a variation dm of m is obviously given by d f k i,j = cdm k i,j , hence C f,m dm = cdm. Let us turn to the variation of V = F V (u, m, V ) induced by the variations du, dm and dV of u, m and V . Differentiating (3.4)-(3.6) leads to This permits to compute To sumarize,

The algorithm for solving (2.8)-(2.17)
We see (2.8)-(2.17) as a fixed point problem for the pair (f, V ), which we write Remark 3.3. Given u and m, V → W = F V (u, m, V ) is obtained as follows: for each k, computing W k consists of iterating W k × V L (u k , m k+1 , W k ) L times starting from W k = V k . Recall that V(u k , m k+1 , ·) is in (3.2) and (3.3), and is a contraction with a unique fixed point. Therefore, F V (u, m, ·) has also a unique fixed point which does not depend on L. Since F u and F m do not depend on L, neither do the the solutions of (3.16).

The Newton iterations involve the Jacobian
and the blocks of the Jacobian A(f, V ) are defined by: In an equivalent manner, we can write (3.24) Every Newton iteration for solving (3.16) consists of solving a system of linear equations of the form In our implementation, this system is solved iteratively by BiCGStab algorithm, see [21]. Note that BiCGStab algorithm only requires a function that computes for any grid functionsf : Ths is done using (3.25) and combining the methods described in paragraph 3.2. The assembly of the Jacobian matrix is not needed.

Description of the model
The state space is the square Ω = (−0.5, 0.5) 2 and the time horizon is T = 1. We consider the MFGC described by (1.7)-(1.14), in which 1. The kernel k in (1.9) is radial, i.e. of the form K(x, y) = K ρ (|x − y|). Here r → K ρ (r) is a non-increasing C 1 function defined on R + , with for a positive number ρ > 0, which is the radius of the disc in the state space that a reprensative agent uses for computing the average of the controls.  Remark 4.2. The two parts of the running cost: cm(t, X t ) and θ 2 |α − λV (t, X t )| 2 may have competing effects: indeed, the former cost may incitate the agents to spread all over the state space, whereas the latter may result in the agents selecting the same controls therefore staying grouped. Although the constant c = 10 −3 seems small, it is chosen in such a way that the above-mentioned two costs have the same orders of magnitude for e.g. θ = 1 and λ = 0.9.

The results of the simulations: discussion
In the present simulation, we choose the parameters as follows: On Figure 2, we display snapshots of m, u and the optimal feedback law at several times. Since both the problem and the grid are symmetric with respect to the two diagonals, m, u and the feedback have the same symmetry for all times. Let us describe the evolution of the distribution of states, that we can name "gathering-kissing-splitting" referring to the "drafing-kissing-tumbling" phenomenon in fluid mechanics (for the interaction of particles in a fluid).
• The term cm(t, X t ) in the running cost prevents the part of the distribution initially supported in one of the opposite corners (say the bottom-left corner) to travel directly to one of the targets (say the bottom-right corner). On the other hand, the interaction through controls prevents this part of the distribution to split into two equal parts which would travel directly to the two targets, because the agents favor controls close to the local average. Therefore, the part of the distribution initially supported in one of the opposite corners first travels to the center of the domain. This is the gathering phase of the evolution.
• The two parts of the distribution reach the center of the domain, where the local average of the velocity becomes small; the dominating cost then becomes the one which attracts the agents to the bottom-right and top-left corners. Because of the repulsive effect due to the part cm(t, X t ) of the running cost, the part of the distribution initially supported in one of the opposite corners (say the bottom-left corner) and having traveled toward the center of the domains splits into two parts which make each a ninety-degrees turn. This is the kissing phase of the evolution.
• After the kissing phase, the distribution of states splits into parts which travel to the targets.
Finally, we see that the paths followed by the agents is far from being a shortest path to the targets.
Let us compare these results with a simulation of the MFG obtained by cancelling either θ or λ in (1.7)-(1.14) while keeping all the other parameters and the grid unchanged. Since the problems remains symmetric with respect to the two diagonals, the obtained results have the same symmetry. On Figure 3, we display snapshots of m, u and the optimal feedback law at several times. We see that the evolution is quite different from that displayed on Figure 2, since the two parts of the distribution initially supported in two opposite corners of the domain symmetrically split into two parts each, which travels directly to the targets. The initial splitting phase is due to the coupling cost cm(t, X t ) which favors the spread of the distribution. The path followed by the agents is close to a shortest path to the targets.

Non-uniqueness of solutions
The solution of the MFGC displayed on Figure 2 is likely to be unstable because the paths followed by the agents are significantly longer than the shortest paths to the targets. We expect that there are other solutions. We are going to see that this is indeed the case. For that purpose, we are going to introduce a vanishing perturbation of the initial distribution which breaks the symmetry of the problem. This will lead to different solutions to (1.7)-(1.14). An example of perturbation is displayed on Figure 4. It consists of adding very little mass at the top and the bottom of the domain; the perturbed distribution is no longer symmetric with respect to the diagonals. We expect that the agents initially distributed near the bottom target will go to the right, and that all the agents initially distributed at the bottom of the domain will follow them, because of the interaction through controls. Similarly, we expect that all the agents initially located at the top of the domain will go to the left. In our simulations, we use a continuation method, i.e. we consider perturbations corresponding to a decreasing sequence of nonnegative parameters (π n ) n∈0,...,N . The last value π N = 0 corresponds to the distribution displayed on the left of Figure 1. For each new value π n+1 of the parameter, we initialize the Newton iterations described in Section 3 by the solution corresponding to the preceding value π n . For positive values of π n , the simulated solution is not symmetric with respect to the diagonals, and this property is preserved for the last value π N = 0. On Figure 5, we display three different solutions (the distribution of states at different times) obtained with three different sequences of vanishing perturbations of the initial distribution displayed on Figure 1: the solution displayed on the left corresponds to the sequence of perturbations displayed on Figure 4; applying to the pertubation a symmetry with respect to a diagonal, we obtain the solution displayed on the center; the solution displayed on the right is obtained by another kind of perturbation symmetric with respect to one diagonal only, located at the top and at the left of the domain: for this solution, we again notice gathering and kissing phases, and that the paths followed by the agents deviate from the shortest ones.

Behaviour of the algorithm
In this paragraph, we investigate how the behaviour of the algorithm described in Section 3 is affected by the variations of the parameters in the model. Recall that the algorithm is based on Newton iterations with a BiCGStab inner loop at each step.
We start by the grid size and the viscosity parameter ν: in the experiments reported below, the stopping criterion for the outer Newton iteration is that the normalized Euclidean norm of the residual is smaller than 10 −8 . The stopping criterion for the inner BiCGStab iterations is that the ratio of the norms of the current and the initial residuals is smaller than 10 −7 (hence, it involves a relative error). The parameters of the model are set to θ = 1, λ = 0.9, c = 10 −3 , ρ = 0.2, L = 1. Table 1 displays the iterations counts for the outer Newton and the inner BiCGStab loops for different viscosities and grid sizes. We notice that the number of iterations is not very sensitive to the grid size, as expected because the map G − I d , with G defined in (3.17) (3.18), is the discrete version of a map which has compactness properties. The iteration counts increase as ν is decreased. In fact, the numbers in Table 2 are obtained by running a continuation method in θ and λ: for each cell of the table, the solution corresponding either to the left or the upper neighboring cell is used as an initial guess for the Newton iterations. If a cell has two such neighbors, the choice of the initial guess is made as follows: in the top-right triangular part of the table, strictly above the diagonal, we choose the initial guess corresponding to the left neighboring cell. In the bottom-left triangular part of the table, including the diagonal, we choose the initial guess corresponding to the cell immediately above. We see that changing λ and θ mostly impacts the number of iterations of the inner loop. A reason for that is that V is obtained by solving a linear fixed point problem whose contraction factor is λθ. Hence, it is sensible that the number of iterations necessary to solve the systems of linear equations arising in the Newton steps increases as λθ tends to 1.
Recall that we use a continuation method, consisting of decreasing progressively ν until it reaches the desired value; in Figure 6 (respectively Figure 7), we plot the average number of BiCGStab iterations versus ν, (respectively the number of Newton iterations versus ν) for different values of λ, θ and c. The stopping criteria have been described above. The grid contains 101 × 101 nodes and there are 101 time steps. We recover the information contained in Tables 1 and 2, i.e. that ν mostly impacts the number of iterations in the inner loop. We also notice that when ν is large, the number of iterations seems to depend more on λ and θ than on c and that it becomes highly sensitive to c when ν is small, but we do not really know how to explain this. The number of iterations in Table 2: Sensitivity to λ and θ.  Finally, we investigate the sensitivity of the algorithm to L, the number of fixed point iterations yielding a proxy of V , see paragraph 3.1. Table 3 contains the iteration counts of the inner and outer 22 loops for different values of ν and L, and two choices of (λ, θ) (all the other parameters are fixed c = 10 −3 , ρ = 0.2). We notice that the choice of L seems to have little impact on the iteration count. For these reasons, L = 1 appears to be a good choice.

Second example
As a second example, we consider a model for a crowd crossing a hall, with an entrance at the left and an exit at the right. Roughly speaking, a generic agent will interact with those located in a cone ahead of her.

Description of the model
The state space is the rectangle Ω = (−1, 1) × (−0.1, 0.1) and the time horizon is T = 8. Let denote the length of the bottom and upper sides of ∂Ω: = 2. The boundary of Ω is split into three parts Γ N D , Γ N N and Γ DD which respectively stand for an entrance, some walls and an exit: and we consider the following boundary conditions: where H is defined below.
The system of PDEs is still (1.7)-(1.10), with the following new features: Comparison with the one-shot one-dimensional problem We have made numerical simulations of the MFGC described in paragraph 4.2.1 with f 0 (x) = F = 1 for all x ∈ Ω, a = a(1 − λ 2 θ) −1 and a = 2. We wish to compare the results with the explicit formulas obtained above for the one-shot one-dimensional MFG. If the latter problem is a good approximation of the former one, we should find α 1 close to 2F a = 1, therefore mostly independent of λ and θ. Note that the approximation by the one-shot one-dimension mean field game is sensible only if the time remaining to the horizon is large enough, i.e. significantly larger than /α * = . If T − t is small, then the optimal strategy for a representative player located on the left part of the domain is not to move.
On Figure 8, we display the norm of the optimal feedback for several values of λ and θ. We see that the optimal drift is close to 1 in agreement with the explicit formula found for the simplified one-shot one-dimensional MFG. We also see that the agents located at the bottom and top of Ω head toward the center of the domain (i.e. x 2 = 0) before reaching the exit. Thus, the second coordinate of α gets large and there are singularities on both sides of the exit, whose amplitude increases as λθ tends to 1.

Queues
Here, we choose f as follows: hence, an agent pays a cost for staying in Ω which is higher in the left part of the domain.
The grid has 101 × 21 nodes and there are 101 time steps. The evolution of the distribution of states and of the optimal feedback is displayed on Figure 9. We compare these results with a simulation of the MFG obtained by cancelling either θ or λ while keeping all the other parameters and the grid unchanged, see Figure 10. On Figure 9, we see that a well distributed queue takes place from the entrance to the exit, by contrast with Figure 10 where we see that the agent rush and accumulate in the right part of the domain. Similarly, the deceleration is much stiffer in the latter case. It is not surprising that the interactions through controls have the effect of smoothing the distribution of states and the optimal feedback law. On the bottom of Figure  10, we see that when t is close to the horizon, the distribution is mainly concentrated near the middle of the domain but slightly on the right: this corresponds to the agents that have reached the zone where f 0 has the smaller value, i.e. 1, but for which reaching the exit before T becomes too costly. There is also a smaller bump near the entrance corresponding to agents that would pay too high a cost to reach the right part of the domain before T . These phenomena are clearly a side effects due to the finite horizon. These side effects are also present on Figure 9, but they are attenuated by the interaction through the controls.
Remark 4.5. In models accounting for congestion effects, the cost of motion depends on the density of the distribution of states and gets higher in the crowded regions. We refer to [20] for a pioneering discussion of MFG models including congestion, to [5] for the analysis of the system of PDEs that arise with these models, and to [4] for numerical simulations. Such models also permit to describe queueing phenomena, because the agents located in crowded regions pay a large cost for moving. In [4], since the congestion effects are taken into account in a local manner, the queues take place in small regions of the state domain. By contrast, the present nonlocal model accounts for the fact that the agents anticipate low speed regions, making the traffic more fluid and the distribution of state smoother. Although it is quite possible to do, we have not incorporated congestion effects in the present model. We plan to do it in forthcoming works.

Stationary regime
We now look for a stationary equilibrium. For that, we use an iterative method in order to progressively diminish the effects of the initial and terminal conditions: starting from (u 0 , m 0 , V 0 ), the numerical solution of the finite horizon problem described above, we construct a sequence of approximate solutions (u , m , V ) ≥1 by the following induction: (u +1 , m +1 , V +1 ) is the solution of the finite horizon problem with the same system of PDEs in (0, T ) × Ω, the same boundary conditions on (0, T ) × ∂Ω, and the new initial and terminal conditions as follows: x , x ∈ Ω, (4.13) m +1 (0, x) = m T 2 , x , x ∈ Ω. (4.14) As tends to +∞, we observe that (u , m , V ) converge to time-independent functions. At the limit, we obtain a steady solution of −ν∆u + a