A model problem for Mean Field Games on networks

In [14], Gueant, Lasry and Lions considered the model problem ``What time does meeting start?'' as a prototype for a general class of optimization problems with a continuum of players, called Mean Field Games problems. In this paper we consider a similar model, but with the dynamics of the agents defined on a network. We discuss appropriate transition conditions at the vertices which give a well posed problem and we present some numerical results.


Introduction
The study of pedestrian flow in a crowd environment is attracting an increasing interest and some models based on optimization principles have been recently proposed, see for example [4,8,20]. In some applications (crowd motion in shopping centers, stations, airports) the dynamics of the population is defined on a network rather than in an Euclidean domain. There is a large literature concerning vehicular traffic on road networks (see [11] and reference therein). These models are based on a fluido-dynamical approach with the dynamics described by some nonlinear conservation law and appropriate transition conditions at the junctions modelling the interactions of the cars coming from different roads. Vehicular traffic models do not seem to be adequate to reproduce the pedestrian flow since they do not take into account the interactions and the goaldirected decisions of the agents. Aim of this paper is to study a simple optimization model for the evolution of a large number of agents moving on a network. The model is based on the one described in [14], titled "What time does meeting start?", and consists in finding the optimal arrival time at a place where the meeting is being held with the starting time defined by means of a quorum rule. This problem can be considered as a prototype for a large class of optimization problems based on the Mean Field Game (MFG) theory. This theory has been introduced by Lasry and Lions [17] (see also [1], [6], [12]) with the aim of describing the behavior of very large number of agents who take decisions in a context of strategic interactions. The main difficulties in our approach is to deal with the transition conditions at the internal vertices to obtain a well posed MFG problem. It is known that a parabolic equation on a network has to be complemented with the usual initial-boundary conditions and some transition conditions at the internal vertices (see [3,18]). In fact, in our model the stochastic differential equation describing the motion of the agent inside the arcs is coupled with a condition prescribing the probability that it enter in a given edge when it occupies a transition vertex; this fact give rise to a Kirchhoff type condition (see [9]). Using an appropriate change of variable we transform the original MFG system in a forward-backward system of two heat equations coupled via the initial datum. Relying on classical results for the heat equation on networks and some appropriate estimates for the specific problem, we prove the well-posedness of the heat system and the existence of a mean field for the quorum problem.
Going back to the original MFG problem we obtain existence and uniqueness of the solution to a system composed by a backward Hamilton-Jacobi-Bellman equation and a forward Fokker-Planck on the arcs with transitions conditions expressing respectively the probability that a single agent enters a given arc and the conservation of the density of the agents through a vertex.
The paper is organized as follows. In Section 2 we describe the model problem. In 3 we prove some technical results concerning the heat equation on the network which are used in Section 4 to show the existence of the mean field. in Section 5, we illustrate the problem with some numerical examples. Finally, the Appendix contains some technical proofs.
Notations: A network is a finite collection of points V := {v i } i∈I in R n connected by continuous, non self-intersecting arcs E := {e j } j∈J . Each arc e j is parametrized by a smooth function π j : [0, l j ] → R n , l j > 0. For i ∈ I we set Inc i := {j ∈ J | e j is incident to v i }. We denote by I B := {i ∈ I | #Inc i = 1}, I T := I \ I B , by ∂Γ := {v i ∈ V | i ∈ I B }, the set of boundary vertices of Γ, and by Γ T := {v i | i ∈ I T }, the set of transition vertices. The network is not oriented, but the parametrization of the arcs induces an orientation which can be expressed by the signed incidence matrix A = {a ij } with In the following we always identify x ∈ e j with y = π −1 j (x) ∈ [0, l j ]. For any function u : Γ → R and each j ∈ J we denote by u j : [0, l j ] → R the restriction of u to e j , i.e. u j (y) = u(π j (y)) for y ∈ [0, l j ]. For γ ∈ N, we define differentiation along an edge e j by and at a vertex v i by

The model problem
Following [14], we describe the model "What time does meeting start?" with the variant that the dynamics of the agents are defined on a network Γ. For the sake of simplicity, we assume that the place where the meeting is being held is the unique boundary vertex, namely ∂Γ = {v 0 }; the general case can be dealt with by using easy adaptations. The meeting is scheduled at a certain time t 0 but the common experience says that in general it starts at a time T greater than t 0 , when a certain rule is reached, for example the presence of a certain percentage of participants. At the initial time there is a continuum of indistinguishable players distributed according to a distribution function m 0 : Γ → R. The player's dynamics is subject to random perturbations. We assume that, inside each edge e j , the generic agent moves according to the process where the drift a is the control variable (and it coincides with the speed), σ = (σ j ) j∈J with σ j > 0 and W is a Brownian process, which is an independent disturbance for each player. Moreover we assume that, at each transition vertex v i , it spends zero time a.s. and it enters in one of the incident edge e j with probability 1/#(Inc i ) (see [9,10] for stochastic differential equations on networks). We denote by τ the random time the agent reaches v 0 , i.e.
Moreover each player wants to optimize its arrival time τ taking into account various parameters, which are encoded in the cost functional where 1 2 a 2 (t) is the actual cost of moving along the network at the velocity a while c is the final cost and T max ∈ R is a time which cannot be exceeded for the end of the meeting. The cost function c : [0, T max ] → R is given by where c i : R → R, i = 1, 2, 3 are smooth functions such that c i (s) = 0 for s ≤ 0 and c i (s) > 0 for s > 0. The term c 1 (s − t 0 ) represents a reputation cost of lateness in relation to scheduled time t 0 ; the term c 2 (s−T ) a cost of lateness in relation to actual starting time of the meeting T ; c 3 (T − s) a waiting time cost which corresponds to the time lost waiting the starting of the meeting. It is worth noticing that the cost c depends on T via the cost of lateness and the cost of waiting; hence, in order to display this dependence, from now on we write c T . Nash equilibrium theory assumes that each player want to optimize the arrival time by assuming that actual time T the meeting starts is known. Hence each agent has to solve the optimization problem Note that max a∈R {−ap + 1 2 |a| 2 } = − 1 2 |p| 2 for a = −p and the optimal control in feed-back form is given by a * (x, t) = −∂ x u(x, t). By an application of the Dynamic Programming Principle the value function, if it is assumed to be regular, formally solves the Hamilton-Jacobi-Bellman equation where ν = σ 2 /2 (i.e., ν j = σ 2 j /2 ∀j ∈ J), with final-boundary conditions and transition on internal vertices (Kirchhoff condition) On the other hand, by duality, the dynamic of the agents, i.e. the evolution of the initial distribution m 0 , is governed inside each edge by the Fokker-Planck equation T max ) and we assume the initial-boundary condition (with a "smooth fit") and a Kirchhoff condition on internal vertices Observe that the previous Kirchhoff condition implies that the parabolic flux of the agents is null at the junctions, giving the conservation of the total mass (see [7] for similar assumptions). The flow of participants reaching v 0 is given by s → ∂ x m(v 0 , s), hence the cumulative distribution F of the arrival times is The actual starting time T is fixed by a quorum rule, which means that the meeting starts when a given percentage θ of the participants has reached the meeting place v 0 . Given m, we set Note that T is the mean field, i.e. the information that the single agent has about the behavior of the other agents: the starting rule induces a strategic interactions among the participants and T influences as an external field the decisions of the agents. The main point is to prove the existence and the uniqueness of a time T which is coherent with the expectations of the participants. As in [14], this can be done by proving that the scheme: with T * defined by (2.5), has a fixed point in [t 0 , T max ]. To this end, it is important to study existence and uniqueness of a solution to the forward-backward system (2.7) For the sake of simplicity, from now on we assume As in [14,15] we apply a change of variable which transforms system (2.7) into a forward-backward system of heat equations coupled through the initial conditions.
is a solution of system (2.7).
Proof Let (φ, ψ) and (u, m) be defined as in the statement. The proofs that (u, m) is a solution to the PDEs and to initial-final-boundary conditions of (2.7) follow by easy calculations; hence, we shall omit them. Let us prove that (u, m) verifies the transitions condition of (2.7). Since φ = e u , we get which amounts to the first transition condition in (2.7). On the other hand, since Taking into account the previous relation, we obtain the second transition condition in (2.7). ✷ Remark 2.1 It is worth to observe that, by similar arguments, one can linearize a more general class of MFG systems (see [15]). Actually, assume that ν j are positive constants and that the cost J in (2.2) includes a potential term depending on the distribution of other players, i.e.
In this case, in the system (2.7) the Hamilton-Jacobi-Bellman equation is while the Fokker-Planck equation and the boundary-transition conditions are left

The heat equation on a network
In this section we collect some technical results about existence, uniqueness and a priori estimates for classical solutions to (2.9). These results will be used in the next section to prove the existence of the mean field T . We introduce some functional spaces on the network. We recall that the Sobolev space W 2,1 q,(a,b)×(0,T ) (with q ≥ 1) consists of the elements of L q ((a, b) × (0, T )) having generalized derivatives of the form ∂ r t ∂ s x with 2r + s = 2 and it is endowed with its usual norm (see [16]). For q ∈ N and α ∈ (0, 1), C (q+α) ([a, b]) stands for the Banach space of q times differentiable functions on [a, b], whose q-th derivative is Hölder continuous with exponent α and it is endowed with the usual Hölder norm which is a Banach space with respect to its norm |u| ii) For α ∈ (0, 1), we set which is a Banach space with respect to its norm In the next proposition we establish the well-posedness of the initial-boundary problem for the heat equation obtained by the Hamilton-Jacobi-Bellman equation of (2.7) via the change of variable (2.10).
Moreover, the following estimate holds where K 0 is a constant independent of w 0 . Finally, for Proof The statement is an immediate consequence of the result in [2]. Let us just note that the compatibility conditions in [2] are obviously satisfied because the terminal condition is constant and the right-hand side of the Kirchhoff condition is null. Moreover the strict positivity of w is a consequence of the comparison principle for classical solution of the heat equation (see [3]). We observe that it can be proved using the same arguments of [15,Proposition 2]. ✷ Since v 0 is a boundary vertex, there exists a unique edge, say e 0 incident to it. Without any loss of generality, we denote v 1 the other endpoint of e 0 and we assume that the parametrization of e 0 fulfills: For λ ∈ (0, 1), we set namely, v ′ λ is a point in the edge e 0 while e 0,λ is the part of e 0 between v 0 and v ′ λ . In the next proposition, we establish existence and uniqueness of a classical solution to the heat equation obtained by the Fokker-Planck equation of (2.7) via (2.10). Moreover we show a "weak" continuous dependence estimate in the sub-edge e 0,1/2 with respect to the initial datum µ(·)/w(·, 0) where w is the solution of (3.1).
Proposition 3.2 Let w be the solution of problem (3.1) and assume Then there exists a unique solution µ ∈ C 2, Moreover, for every q ≥ 1, the following estimate holds where K 1 is a constant independent of µ 0 and w.
The proof is postponed in the Appendix.
In the next proposition, we establish two continuous dependence estimates for the solution of problem (3.6) with respect to the initial datum: the former is a "strong" estimate in the sub-edge e 0,1/2 while the latter is the classical estimate in the whole network. Proposition 3.3 Let µ be the solution to (3.6). Besides the hypotheses of Proposi- where K 2 is a constant independent of µ 0 and w.
ii) Under the further assumption where K 3 is a constant independent of µ 0 and w.
The proof is postponed in the Appendix. Let us now establish a well-posedness result for the system (2.9).
Theorem 3.1 Assume that, for some α ∈ (0, 1), there holds Then, there exists a unique classical solution (φ, ψ) to the system (2.9) with φ > 0. Moreover, the following estimates hold Being a straightforward consequence of the previous theorem, the proof of this result is omitted.

The Mean Field Game result
We prove the existence of a starting time T consistent with the corresponding flux of participants ∂ x m. To this end we show that the map from [t 0 , T max ] into itself, defined by the scheme (2.6) is continuous and therefore it admits a fixed point by the Brouwer's Theorem. For simplicity, we shall recast it in terms of couple (φ, ψ) solution of (2.9). Consider the function Ψ : where T * is defined as in (2.5) with In this section we assume the hypotheses of Theorem (3.1) and that the map is continuous. A crucial step to prove the existence of the mean field T is to establish some bounds for ∂ x ψ(v 0 , ·). In order to get such an estimate, we consider in the next Lemma two complementary assumptions.  then, there exists a value ε > 0, independent of T , such that (b) If m 0 fulfills (3.8), then there holds: In particular, there exists a constant ε T such that Proof (a). Owing to (3.11), the function m 0 satisfies: [0,l 0 ] is bounded independently of T . We infer that there exist ξ 0 ∈ (0, l 0 ) and a sufficiently small a > 0 such that, for every T ∈ [0, T max ] there holds One can easily check that the function v(x, t) := ae bt sin(xπ/ξ 0 ), with b := −π 2 /ξ 2 0 solves the initial-boundary value problem while the function ψ is a supersolution to this problem. By the standard comparison principle, we infer: where all the constants are independent of T .
(b). Being nonnegative, the function ψ attains a global minimum at each point (v 0 , t) with t ∈ (0, T max ]. The Hopf Lemma prevents that ∂ x ψ(v 0 , t 0 ) ≤ 0 in these points. Hence, there holds: ∂ x ψ(v 0 , t) > 0 in (0, T max ]. The second part of the statement follows by continuity. ✷ We shall establish the existence of a fixed point provided that m 0 fulfills either (4.4) or (3.8). We cope with these two cases separately in the next two statements. Proof We shall follow the arguments of [14, Lemma 2.6]. In order to apply the Brouwer fixed point Theorem, we need to prove that the function Ψ defined in (4.1) is continuous. We consider two admissible flowsψ T 1 ,ψ T 2 (see equation (4.2) for their definition) and, without any loss of generality, we assume Ψ(T 1 ) ≤ Ψ(T 2 ). If Ψ(T 1 ), Ψ(T 2 ) ∈ (t 0 , T max ), we have (where the first equality is due to the fact that both integrals are equal to θ). Taking into account Lemma 4.1-(a), we obtain The estimates in Theorem 3.1-(i) and the trace theorem (for instance, see [16, Theorem II.2.3]) yield Taking into account assumption (4.3), we obtain that in this case the function Ψ is continuous.
When Ψ(T 1 ) = t 0 (respectively, Ψ(T 2 ) = T max ), we have indeed, eitherψ T 1 is a flux which reaches θ at most at time Ψ(T 1 ) orψ T 2 is a flux which does not reach the value θ before time T max ; in other words, the former integral is ≥ θ (respectively, the latter one is ≤ θ). Hence we can conclude by the same arguments as before. Therefore, the continuity of Ψ is achieved. ✷ Theorem 4.2 Assume the hypotheses of Theorem 3.1-(ii). Then the map Ψ defined by (4.1) admits a fixed point.

Proof
We shall argue adapting the arguments of Theorem 4.1: hence, our purpose is to prove that Ψ is continuous on [t 0 , T max ]. To this end, let us fix T ∈ [t 0 , T max ]. For every T 1 ∈ [t 0 , T max ] such that Ψ(T ) = Ψ(T 1 ), there is nothing to prove. We split the arguments according to the fact that Ψ(T ) belongs to and observe that Ψ(T 1 ) = max{t 0 , τ }. Then, we have τψ T (the first equality is due to the fact that both the integrals are equal to θ). By Lemma 4.1-(b), we infer Arguing as before, we deduce that there exists a constantK (depending on T ) such that where the inequality is due to the fact that the first integral is equal to θ while the second one is less or equal to θ. Again by Lemma 4.1-(b), we infer Arguing as before, for some constantK ′ (depending on T ), we get By this relation and (4.6), the proof of the continuity of Ψ in T is accomplished. Case2: Ψ(T ) = T max . For T 1 ∈ [t 0 , T max ] with Ψ(T 1 ) = T max , there is nothing to prove; hence, without any loss of generality, we assume that Ψ(T 1 ) < T max . We have Arguing as before, we accomplish the proof in this case. Case3: there is nothing to prove; hence, without any loss of generality, we assume that Ψ(T 1 ) > t 0 . We have By the same arguments as those used before, we accomplish the proof. ✷ We conclude with a uniqueness result for the fixed point under some monotonicity condition on the cost c T . Assume by contradiction that there exist T 1 , T 2 ∈ [0, T max ] with T 1 > T 2 such that T i = Ψ(T i ). Let c T i and (φ i , ψ i ) be the costs and the solutions of (2.9) corresponding to T i , i = 1, 2. Then, (φ, ψ) := (φ 1 − φ 2 , ψ 1 − ψ 2 ) satisfies (2.9) with m 0 /φ(·, 0), e c T (Tmax) and e c T (·) replaced respectively by m 0 /φ 1 (·, 0) − m 0 /φ 2 (x, 0), e c T 1 (Tmax) − e c T 2 (Tmax) and e c T −1 (·) − e c T 2 (·) . We have (the term −a ij takes into account the orientation of the edge e j ). Similarly Subtracting the previous inequality and using the transition conditions at the internal nodes we get (recall that e 0 is the unique arc incident to v 0 parameterized in such a way that v 0 is the initial point).
The first term in the previous inequality is negative. By the assumption on c T , the map T → c T is increasing in T and c T 1 (T max ) = c T 2 (T max ). Hence the second term is null. Moreover, since T 1 > T 2 and therefore It follows that also the third term is negative, hence φ 1 (x, 0) = φ 2 (x, 0) for x ∈ Γ and therefore a contradiction to c T 1 > c T 2 . ✷

Numerical simulation
In this section we propose a numerical method to compute the mean field T . The scheme is based on a finite difference approximation of the system (2.9) with an iterative procedure to solve the fixed point map (4.1). On each interval [0, l j ], j ∈ J, it is defined an uniform partition y k = kh j with space step h j = l j M j and k = 0, . . . , M j . In this way a spatial grid G(Γ) = {x j,k = π j (y k ), j ∈ J, k = 0, . . . , M j } is defined on the network Γ. A time step ∆t is also introduced to obtain a uniform grid t n = n∆t, n = 0, 1, . . . , N max with N max = [T max /∆t], on the time interval [0, T max ]. We will approximate the solution (φ, ψ) of (2.9) by two sequences {φ n } n and {ψ n } n , where, for each n = 0, . . . , N max , φ n , ψ n : G(Γ) → R and φ n j,k ≃ φ(x j,k , t n ), ψ n j,k ≃ ψ(x j,k , t n ). The discrete functions {φ n } n and {ψ n } n are computed by the following forward-backward explicit finite difference scheme: ψ n j,k+1 − 2ψ n j,k + ψ n j,k−1 , n = 0, . . . , N max − 1 for k = 1, . . . , M j − 1 and j ∈ J. (5.1) At each time iteration n, to compute {φ n } n and {ψ n } n it is necessary to fix the values of these functions at the boundary of the arcs e j , j ∈ J, i.e. at the transition vertices v i , i ∈ I T . We define an approximation of the Kirchhoff's condition which together with the continuity condition across the vertices will give the #(Inc i ) conditions necessary to determine in a unique way the value of the functions φ n and ψ n at v i . We introduce two sets of indices Inc + i = {j ∈ J |a ij = 1} and Inc − i = {j ∈ J |a ij = −1}. Moreover we denote by φ n (v i ), ψ n (v i ) the values of the functions φ n , We define the following finite differences approximations of the derivatives at v i along an edge e j : We rewrite the transition conditions in (2.9) as and we consider the following finite difference approximation Given a discrete function f : G(Γ) → R, we consider a continuous piecewise linear reconstruction I[f ] : Γ → R such that I[f ] | (x j,k ,x j,k+1 ) is linear for all j ∈ J and k = 0, . . . , M j − 1 and I[f ](x j,k ) = f j,k . To guarantee the continuity on Γ of the linear interpolation I[·] applied to the discrete function φ n and ψ n , we need to impose the following continuity conditions: At each time step t n , the #(Inc i ) − 1 conditions given by (5.6)-(5.7) coupled with (5.4)-(5.5) give #(Inc i ) relations which uniquely determine φ n (v i ) and ψ n (v i ). Summarizing, we approximate (2.9) by computing the couple of discrete functions {(φ n , ψ n )} n which solve the finite difference scheme (5.1) together with iii) the initial and terminal conditions: Defined a function {ψ n } n by means of the previous scheme, we consider the following approximation of the cumulative distribution (4.2) where e 0 denotes the edge incident v 0 with π 0 (0) = v 0 and by the boundary condition ψ k 0,0 = ψ k (v 0 ) = 0. To approximate the fixed point of the map Ψ defined in (4.1) we apply the following Algorithm 1. Given an initial guess T 1 and denoted by T 2 an initial value to enter the loop and by τ as threshold for the stopping criteria, we consider

Algorithm 1: Fixed Point Iterations
Data: initial guess T 1 , T 2 , threshold value τ Result: approximated mean field

Example 1: a simple graph
We consider a simple graph with four vertexes and four edges, as shown in Fig.1. The initial mass distribution is given by and the percentage value of the expected players is θ = 0.5. For each arc j ∈ J, we consider the same space step h j = h and we run a series of numerical tests varying the space step according to the first column of  9) where N is such that T 2 = N∆t in Algorithm 1. Since θ represents the percentage of player exited from the boundary vertex v 0 , then 1 − θ represents the percentage of the residual population and the term on the right side of (5.9) should be zero. This error is shown in the second column of Table 5.1. In the third and fourth columns we show the computed mean time T 2 , and the number of iterations needed by the Algorithm 1 to converge when τ = 10 −4 and T 1 = 10. Table 5.1 shows small values for E h (T ) and, even if we do not observe a monotone behavior, the smallest value is attained with the finer space grid. The graph on the right of Figure 1 shows the convergence of the approximated mean field time T 2 , computed by Algorithm 1 with space step h = 2.50 · 10 −2 . On the horizontal axis are the iterations of the fixed point, while on vertical axis the corresponding approximated mean field time T 2 . In Fig.2, we show the initial mass distribution (left), equilibrium mass distribution (center) and the corresponding value function (right).

Example 2: A more general graph
We consider a more general graph with 17 vertexes and 22 edges, see Fig.3. The and the expected percentage of arrival players is θ = 0.7. The Algorithm 1 is run with h = 0.05, ∆t = h 2 4 and τ = 0.05. We get T = 23.99 with error E h (T ) = 2.35 · 10 −2 . The graph on the right of Figure 3 shows the convergence of the approximated mean field time T 2 computed by Algorithm 1: on the horizontal axis is the number of iterations of the fixed point algorithm, whereas on the vertical axis the corresponding mean field time. Figure 4 shows the mass evolution at different times. It can be observed that at the initial time the diffusion spreads the population in all the directions on the graph, later the cost (2.2) favors the population closer to v 0 to reach the exit before of the population farther away.  In order to prove this estimate, we introduce two families of functions {μ 0,n } n and {μ 1,n } n such that µ 0,n ,μ 1,n ∈ C 1 ([0, T max ]), |μ 0,n | ∞ + |μ 1,n − µ(v 1 , ·)| ∞ → 0 as n → +∞, By standard regularity theory for parabolic equations on domains in Euclidean spaces, the problem ∂ t µ n − ∂ 2 x µ n = 0 (x, s) ∈ (0, l 0 ) × (0, T max ) µ n (0, s) =μ 0,n (s), µ n (l 0 , s) =μ 1,n (s) s ∈ [0, T max ] µ n (x, 0) = µ 0 (x) w(x,0) x ∈ (0, l 0 ) admits a unique classical solution µ n which belongs to C (2+α,1+α/2) ((0, l 0 )×(0, T max )) for some α depending only on the features of the equation By Ascoli theorem, as n → +∞, (eventually, passing to a subsequence), the function µ n converges uniformly to some function v and the same happens for ∂ t µ n , ∂ x µ n and ∂ 2 x µ n with the corresponding derivatives of v. By the stability result we get v = µ. Moreover, passing to the limit in the last estimate, we obtain |µ| (2+α,1+α/2) (l 0 /4,3l 0 /4)×(0,Tmax) ≤ K |µ 0 /w(·, 0)| (2+α) e 0 + |µ| ∞ and, taking into account estimate (6.1) and the definition of the sub-edgeẽ, we accomplish the proof of claim (6.2).

Proof of Prop. 3.3
We shall improve some arguments of the proof of Proposition 3.2 taking advantage of the stronger compatibility condition given by (3.8).
Here, the constant K is independent of µ 0 and w and it may change from line to line.