Nash Equilibria and Bargaining Solutions of Differential Bilinear Games

This paper is devoted to a theoretical and numerical investigation of Nash equilibria and Nash bargaining problems governed by bilinear (input-affine) differential models. These systems with a bilinear state-control structure arise in many applications in, e.g., biology, economics, physics, where competition between different species, agents, and forces needs to be modelled. For this purpose, the concept of Nash equilibria (NE) appears appropriate, and the building blocks of the resulting differential Nash games are different control functions associated with different players that pursue different non-cooperative objectives. In this framework, existence of Nash equilibria is proved and computed with a semi-smooth Newton scheme combined with a relaxation method. Further, a related Nash bargaining (NB) problem is discussed. This aims at determining an improvement of all players’ objectives with respect to the Nash equilibria. Results of numerical experiments successfully demonstrate the effectiveness of the proposed NE and NB computational framework.


Introduction
This paper is devoted to a theoretical and numerical investigation of a class of differential (dynamical) Nash games and related Nash bargaining problems governed by differential models with bilinear state-control (input-affine) structures.
Since their appearance [24,25], Nash games have attracted the attention of many scientists as they provide a convenient mathematical framework to investigate problems of competition and cooperation. A competition game can be interpreted as a coupled optimization problem for which the Nash equilibrium (NE) defines the solution concept. In this framework, non-cooperative differential Nash games were introduced in [18]. In this case, differential (dynamical) models govern the state of the system that is subject to the action of different controls representing the strategies of the players in the game, and to each player is associated an objective (pay-off) functional. Differential games have received much attention in the field of economics and marketing [11,19], and are well investigated in the case of linear differential models with linear control mechanism and quadratic objectives; see, e.g., [5,13,15,28]. On the other hand, much less is known in the case of nonlinear models, especially in the case of a nonlinear control structure and, in particular, in the case of a bilinear structure, where a function of the state variable multiplies the controls. For an earlier review of works on differential Nash games, we refer to [27], and for more recent and detailed discussion and references see [13,16,19]. We also remark that nonlinear differential NE games have been investigated in the framework of Young measures; see, e.g., [3,27]. However, the numerical implementation of the latter framework is in its infancy and requires further development that is beyond the scope of our present work.
In the general framework of differential Nash games, we focus on a bilinear control structure, and we remark that bilinear control problems play a central role in many applications [26]. In particular, they are omnipresent in the field of quantum control [4]. However, the additional theoretical and numerical difficulties due to the bilinear structure hinder the further application of the NE framework to many envisioned problems. Furthermore, Nash games are closely related to multi-objective optimization with a nonlinear control structure, and therefore we believe that a study of NE bilinear game problems is beneficial for the further development and application of the differential NE framework and related fields. This is in particular true for new problems involving quantum systems [17] and biological systems [6], and for this reason, we show results with two related models that can both be put in the following general bilinear structure:

ẋ(t) = f_0(x(t)) + F(x(t)) u(t), x(0) = x_0,   (1)

where x(t) ∈ R^n represents the n-dimensional state of the system at time t, x_0 is the initial condition, and u = (u_1, ..., u_N) represents the vector control function. The function f_0 governs the free dynamics of the system, and the function F denotes the interaction coupling between the control and the system. We present a theoretical and numerical investigation of Nash games governed by (1), where the control components u_j, j = 1, ..., N, represent N players, each choosing a strategy in an admissible set; to each player we associate a different cost functional, so that a non-cooperative problem is defined. Thus, our first purpose is to prove existence of a NE in this framework, and for this purpose we consider the theoretical framework in [28], which we extend to our nonlinear case. The result is that a NE exists if a sufficiently short time horizon is considered, T < T_0. Moreover, we provide an estimate of T_0.
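To fix ideas, the following sketch integrates a system of the form (1) with an explicit Euler step; the model functions f_0, F, the controls, and all numerical values are hypothetical stand-ins chosen only to illustrate the bilinear state-control structure.

```python
import numpy as np

def simulate_bilinear(f0, F, u, x0, T, nt):
    """Explicit-Euler integration of x'(t) = f0(x) + F(x) u(t), x(0) = x0.

    f0 : R^n -> R^n        free dynamics
    F  : R^n -> R^{n x N}  control coupling (columns F_j)
    u  : [0, T] -> R^N     vector of the N players' controls
    """
    h = T / (nt - 1)
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for j in range(nt - 1):
        x = x + h * (f0(x) + F(x) @ u(j * h))
        traj.append(x.copy())
    return np.array(traj)

# Hypothetical two-state example: rotational free dynamics and a
# state-multiplicative (bilinear) coupling with N = 2 controls.
f0 = lambda x: np.array([-x[1], x[0]])
F = lambda x: np.array([[x[0], 0.0], [0.0, x[1]]])
u = lambda t: np.array([0.1, -0.1])
traj = simulate_bilinear(f0, F, u, x0=[1.0, 0.0], T=1.0, nt=200)
```

Note how the controls enter multiplied by state-dependent columns F_j(x); this multiplicative coupling is precisely what distinguishes the bilinear setting from the linear-quadratic one.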
With this knowledge, we turn our attention to the numerical realisation of the NE solution. For this purpose, we implement and analyse a relaxation scheme [20] combined with a semi-smooth Newton method [7,8,9], where the latter choice is motivated by the presence of constraints on the players' actions that make our game problem non-smooth.
The second aim of our work is to address the fact that a Nash equilibrium does not provide an efficient solution with respect to the payoffs that the players could achieve by agreeing to cooperate and negotiating the surplus of payoff that they can jointly generate. For this purpose, J. F. Nash proposed in [23] a bargaining strategy of jointly improving efficiency while keeping close to the strategy of a NE point. The idea is to find a point that is Pareto optimal and symmetric (the labelling of the players should not matter) and that maximizes the surplus of payoff for both players. In this way, we obtain controls that represent an improvement toward the task of approaching the desired targets while keeping the corresponding costs as small as possible.
In order to implement this goal, we use the Nash characterization of a bargaining solution to construct an efficient method to explore the Pareto frontier and converge to the solution sought. For this purpose, we compute a Pareto point based on its characterization as a solution of a bilinear optimal control problem for u and a cost functional resulting from a composition of the players' objectives. Further, we consider the framework in [12], see also [13], to reformulate the characterization of a Nash bargaining (NB) solution and use this characterization to construct an iterative scheme that converges to this solution on the Pareto frontier.
In the following section, we formulate a class of bilinear optimal control problems as the starting point for the formulation of our differential Nash game. Correspondingly, we define the NE concept and the related NB solution. In Sect. 3, we prove existence of NE for our game. For this purpose, preparatory results are presented addressing the Fréchet differentiability of our differential model and other functional properties of the components of the dynamic Nash game. In Sect. 4, we illustrate our numerical framework for solving a NE differential game, which requires introducing first-order optimality conditions that must be satisfied by the NE solution; we then discuss a semi-smooth Newton scheme that is applied to this system so as to obtain partial updates for the game strategies sought. Thereafter, we show how these updates are combined in a relaxation scheme in order to construct an iterative procedure that converges to the NE point. The convergence properties of this relaxation scheme are also discussed. Section 5 is devoted to the analysis of our Nash bargaining problem, where we prove a characterization of a solution to this problem and correspondingly define a solution procedure for its calculation. In Sect. 6, we present results of numerical experiments that successfully validate our NE differential game framework based on a quantum model of two spins and on a model of population dynamics with two competing species. A concluding section completes this work.

From Bilinear Control to Bilinear Games
In many applications with (1), the aim of the controls u_j is to drive the system from an initial state x(0) = x_0 to a neighbourhood of a given target state x_T at a given time T. For this purpose, the following bilinear optimal control problem is formulated, where x ∈ X is referred to as the state of the system in the set X = H^1(0, T; R^n), where H^1 is the usual Sobolev space of L^2 functions whose first-order weak derivatives have finite L^2 norm; see, e.g., [1]. Further, x_T ∈ R^n is a given target, and the functions f_0, F_j : R^n → R^n are assumed to be Lipschitz continuous and twice Fréchet differentiable. Whenever necessary, we denote with F(x) ∈ R^{n×N} the matrix with columns F_j(x), j = 1, ..., N. The functions u_j : [0, T] → R, j = 1, ..., N, represent the controls, belonging to the admissible sets

U_j(M) := { u_j ∈ L^2(0, T) : |u_j(t)| ≤ M a.e. in (0, T) },

where M ∈ (0, ∞] is a given positive constant and N is the number of controls. We denote U(M) := U_1(M) × ... × U_N(M). Now, turning to the framework of differential games, we can consider each scalar function u_j in (2) as the control strategy of a player that we label with j. In the present setting, all players have the same purpose: come as close as possible to x_T at the final time while minimizing the L^2 cost of their actions. Within the game setting, we can instead associate different control objectives with different players and thus attempt to drive the bilinear system to perform different, and even competitive, tasks.
Assuming that each player has chosen its control strategy, this choice determines a unique solution x ∈ X of problem (1). We call x(t), t ∈ [0, T], the trajectory corresponding to the controls u_1, ..., u_N. For the jth player, each choice of controls has a cost,

J_j(u) := 1/2 |x(T) − x_T^{(j)}|^2 + ν/2 ||u_j||^2_{L^2(0,T)},   (4)

where x_T^{(j)} is the target of player j and ν > 0 weights the cost of the control. We call J_j the objective (reduced cost functional) of player j. The J_j, j = 1, ..., N, represent the objectives of the game, and the purpose of each player is to minimize its own cost functional. However, since the game is non-cooperative, this criterion alone does not provide a suitable solution concept. A Nash equilibrium (NE) is instead an outcome in which every player performs the optimal strategy given the other players' choices: no player can benefit by unilaterally changing its strategy while the other players keep theirs unchanged. A NE is defined as follows.
Whenever necessary, we denote with u^NE = (u^NE_1, ..., u^NE_N) a NE strategy. It is well known that the NE solution concept is inefficient in the sense that, in many cases, the players can jointly generate and share an improvement in their objectives by choosing to cooperate. This is a so-called bargaining problem, and we pursue the Nash bargaining (NB) solution concept. In particular, the NB solution u^NB = (u^NB_1, ..., u^NB_N) requires that the strategies of the players are Pareto optimal and satisfy

u^NB = argmax_{u ∈ P} ∏_{i=1}^N ( J_i(u^NE) − J_i(u) ),   (6)

where P is the subset of the Pareto frontier such that J_i(u^NE) ≥ J_i(u), i = 1, ..., N. In this formulation, we choose u^NE as the disagreement outcome, and J_i(u^NE), i = 1, ..., N, are the values of the objectives of the game if no bargaining takes place (status quo). Based on (6), the players act in order to maximize the Nash product of the excesses (or defects) with respect to the solution corresponding to disagreement. Since the NB solution concept requires Pareto optimality, the NB solution is sought on the Pareto frontier.
In this case, under the assumption that the players cooperate in trying to minimize their costs, a set of control actions u is sought such that the resulting individual objectives cannot be improved upon by all players simultaneously. This is a so-called Pareto efficient solution. In the literature, a way to find Pareto solutions is to solve the parameterized optimal control problem

min_{u ∈ U(M)} ∑_{j=1}^N μ_j J_j(u),   (7)

where μ_j ≥ 0 and ∑_{j=1}^N μ_j = 1. Notice that in this way not all Pareto solutions can be found; see [14]. Nevertheless, we assume that u^NB can be computed with (7) for a specific choice of μ = (μ_1, ..., μ_N).
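For a toy illustration of the weighted-sum scalarization just described, consider two hypothetical quadratic reduced costs whose weighted minimizer is available in closed form; scanning the weight traces out a convex Pareto frontier. All names and values below are illustrative assumptions, not the paper's model.

```python
import numpy as np

# Hypothetical stand-ins for the players' reduced costs J_1, J_2:
# each player wants the decision y to be close to its own target.
a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
J1 = lambda y: float(np.sum((y - a) ** 2))
J2 = lambda y: float(np.sum((y - b) ** 2))

def pareto_point(mu1):
    """Minimizer of mu1*J1 + (1 - mu1)*J2 (analytic for these quadratics)."""
    return mu1 * a + (1.0 - mu1) * b

# Scan the weight to trace the Pareto frontier in the (J1, J2)-plane.
front = [(J1(pareto_point(m)), J2(pareto_point(m)))
         for m in np.linspace(0.05, 0.95, 19)]
```

As μ_1 increases, the frontier point moves in favour of Player 1: J_1 decreases while J_2 increases, reflecting the trade-off that a single weighted solve cannot escape.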

Existence of Nash Equilibria
In this section, after discussing some preliminary results, we prove a theorem stating existence of a NE strategy for our bilinear-control problem. We closely follow the approach in [28]; see [27] for a review of this approach and the discussion of an alternative method in the framework of Young measures.
We start this section by showing some properties of the solution to (1). Clearly, this solution depends on the initial condition and, since we focus on open-loop NE games, some of the constants obtained in the following estimates depend on the initial state of the system, denoted with x_0.

Lemma 3.1 For any u
Proof It is well known that x : [0, T] → R^n admits an integral (variation-of-constants) representation. For any t ∈ [0, T] one can then estimate |x(t)|, where the Lipschitz continuity of f_0 and F is used and L_{f_0} and L_F denote the corresponding Lipschitz constants. From estimate (11) we obtain a Gronwall-type bound with coefficients α_T and β. With the Gronwall theorem we get the claim. Therefore the lemma is proved with K(T) := α_T exp(∫_0^T β(s) ds), which is a monotonically increasing function of T.
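For reference, the form of the Gronwall inequality invoked in this and the following proofs can be stated as:

```latex
\varphi(t) \le \alpha + \int_0^t \beta(s)\,\varphi(s)\,\mathrm{d}s
\quad \text{for all } t \in [0,T]
\;\Longrightarrow\;
\varphi(t) \le \alpha \exp\!\left(\int_0^t \beta(s)\,\mathrm{d}s\right),
```

for φ, β nonnegative and continuous and α ≥ 0, which is consistent with the constant K(T) = α_T exp(∫_0^T β(s) ds) obtained above.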

Lemma 3.2 The solution x is a Lipschitz function of u, for all u
Proof For any u_1, u_2 ∈ U(M), consider the corresponding state equations given by (1). Subtracting them and estimating the difference of the trajectories, one obtains a bound in which K(T) monotonically increases with T because of the continuity of F and the properties of x. Finally, the Cauchy-Schwarz inequality and the Gronwall theorem imply the desired bound. Therefore x is a Lipschitz function of u.
Let us consider the linearized problem related to (1) for a general reference pair (x, u). We have

δẋ(t) = f_0'(x(t)) δx(t) + ∑_{j=1}^N u_j(t) F_j'(x(t)) δx(t) + F(x(t)) δu(t), δx(0) = 0,   (17)

where f_0' and F_j' denote the Jacobian matrices of f_0 and F_j, j = 1, ..., N, respectively. Then, the following properties hold.
Proof Consider the linearized problem (17). As in the previous lemma, applying the Cauchy-Schwarz inequality and the Gronwall theorem, we obtain a bound for δx. Next, consider the operator c(·, ·) : X × U(M) → L^2(0, T; R^n) defined by c(x, u) := ẋ − f_0(x) − F(x)u. In this way, (1) and (17) can be written, respectively, as c(x, u) = 0 and D_x c(x, u) δx + D_u c(x, u) δu = 0. The operator c is Fréchet differentiable and its Fréchet derivative D_x c(x, u) is invertible. Hence the equation c(x(u), u) = 0 and the implicit function theorem (see [10]) imply that D_u x(u)(δu) = δx is the solution to the linearized problem (17).
Moreover, the assumptions on f_0, F, together with the implicit function theorem, imply that the control-to-state map u ↦ x(u), where x(u) is the unique solution to (1) corresponding to u, is twice Fréchet differentiable. Therefore, denoting by δx(u, δu) ∈ H^1(0, T; R^n) the unique solution to (17) corresponding to δu and u, the following expansion holds

x(u + δu) = x(u) + δx(u, δu) + θ(u, δu),

where θ(u, δu) denotes the remainder. For our discussion we need to estimate θ(u, δu); therefore the following property is proved.

Lemma 3.4 In the above setting, for any h
Proof Consider the operator c(·, ·) and compute its second derivative. If x(u) is the solution of (1) corresponding to the control u, then c(x(u), u) = 0. Therefore, by replacing (22) in (20), the first claim follows. The estimate can be proved similarly to the previous lemma. In fact, integrating (20) over (0, t) and applying the Gronwall theorem, we get the bound with C(T) monotonically increasing with T.
Next, we introduce some functions used in the proof of the main result.
Here x_T^{(i)} are the desired players' targets and x(t), x_i(t), for t ∈ [0, T], are the trajectories of (1) corresponding to the controls u := (u_1, ..., u_N) and u_i := (u_1, ..., u_{i−1}, w, u_{i+1}, ..., u_N), respectively. After this preparatory work, we can state the following result, similar to [28] for the linear case, which applies to our NE game with bilinear structure.

Lemma 3.6 There exists T
Proof Since the functions that constitute σ are continuous, σ is also continuous for any 0 < T < T_0. For the proof, let u ∈ U(M), w ∈ L^2(0, T) be fixed and compute the derivatives of σ, where δx(u, w) and δx_i(u_i, w) are the solutions of (17) with controls (u_1, ..., u_N) and (u_1, ..., u_{i−1}, w_i, u_{i+1}, ..., u_N), respectively. The coercivity of σ is then guaranteed if we require (30). Therefore it is possible to choose ν and T_0 such that (30) holds and σ is convex [29, Corollary 42.8, p. 248]. Moreover, since σ is continuous, it is weakly lower semicontinuous. Let ψ_v(u) := ψ(u, u) − ψ(u, v) and U_v := {u : ψ_v(u) > 0}.

Proof We have that, in this expression, the first sum is continuous and the third sum is constant. By Lemma 3.6 there is a T_0 > 0 such that, if T < T_0, then the first sum plus the second sum is weakly lower semicontinuous and grows indefinitely as ||u||_{L^2} grows indefinitely. Therefore ψ_v is weakly lower semicontinuous, so that the set U_v^c = {u : ψ_v(u) ≤ 0} is weakly closed, and hence its complement U_v is weakly open.
Next, we prove existence of a NE for our differential game.

Proof Suppose the theorem is false. Then, using Lemma 3.5, for each u ∈ U(M) there is a v ∈ U(M) such that ψ_v(u) := ψ(u, u) − ψ(u, v) > 0, i.e. u ∈ U_v, where U_v is defined above. Therefore U(M) has the weakly open cover {U_v : v ∈ U(M)}. (32) Next, we show that there is a finite subset of vector functions v_1, ..., v_p such that the corresponding sets U_{v_j} cover U(M). (33) First suppose M < ∞. Then U(M) is a convex, bounded and closed subset of L^2(0, T; R^N), so that it is weakly compact. Then (32) must have a finite subcover (33).
Since V is finite-dimensional, the weak and strong topologies coincide on V. Notice that ψ_{v_j} is strongly continuous on U(M); hence it is continuous on V, and so γ_j is continuous on V.
Finally, since we have seen that for every v ∈ V there is at least one j ∈ {1, ..., p} such that ψ_{v_j}(v) > 0, the function η is well defined. Then η is continuous and we can apply the Brouwer fixed-point theorem to get the existence of a point v* ∈ V such that η(v*) = v*. Suppose γ_j(v*) > 0 for j = 1, ..., ℓ, and γ_j(v*) = 0 for j > ℓ. Then the fixed-point condition becomes (34). Finally, it is possible to prove the convexity of v ↦ ψ(v*, v) as in Lemma 3.6, under the condition ν ≥ k_1(T) N T^3, with k_1 defined in Lemma 3.6.
This contradicts (34), and therefore the theorem is proved.
As already remarked at the beginning of this section, our results concern open-loop NE games. Thus, in particular, the value of the time horizon T 0 specified in Theorem 3 may depend on the choice of the initial condition.

A Numerical NE Game Framework
In this section, we illustrate a numerical procedure for solving our differential NE game. For this purpose, let us consider the case of two players, and suppose that u * = (u * 1 , u * 2 ) is a Nash equilibrium for the game. Then the following holds [5].
1. The control u*_1 is optimal for Player 1, in the sense that it solves the optimal control problem min_{u_1} J_1(x, u_1, u*_2) subject to (1). 2. The control u*_2 is optimal for Player 2, that is, it is a solution of the optimal control problem min_{u_2} J_2(x, u*_1, u_2) subject to (1). Therefore u* = (u*_1, u*_2) simultaneously solves two optimal control problems, whose solutions are characterized by the corresponding first-order optimality systems, where x_T^{(n)}, n = 1, 2, are the final targets of the players and P_{U_n(M)} denotes the L^2 projection operator onto U_n(M); see [9] for all details. Now, referring to the two optimal control problems above, and as in Lemma 3.4, we can consider the control-to-state maps u_1 ↦ x(u_1, u*_2) and u_2 ↦ x(u*_1, u_2), where x(u_1, u_2) is the unique solution to our governing model, with given initial condition x(0) = x_0, corresponding to given u_1, u_2. Correspondingly, we can introduce the reduced cost functionals J_1(u_1, u_2) := J_1(x(u_1, u_2), u_1, u_2) and J_2(u_1, u_2) := J_2(x(u_1, u_2), u_1, u_2). Therefore the solution to (35) can be written as u*_1 = argmin_{u_1} J_1(u_1, u*_2), and the solution to (36) can be written as u*_2 = argmin_{u_2} J_2(u*_1, u_2). In this framework, a classical iterative method for solving our NE problem is the relaxation scheme discussed in [20] and implemented in the following algorithm.

Algorithm 4.1 (Relaxation scheme)
Initialize (u_1^0, u_2^0); set τ ∈ (0, 1), tol_u and k = 0.
repeat
1. Compute ū_1 = argmin_{u_1} J_1(u_1, u_2^k).
2. Compute ū_2 = argmin_{u_2} J_2(u_1^k, u_2).
3. Set (u_1^{k+1}, u_2^{k+1}) = (1 − τ)(u_1^k, u_2^k) + τ (ū_1, ū_2), and k := k + 1.
until ||u^{k+1} − u^k|| < tol_u.

In this algorithm, τ is a relaxation factor that we specify in our numerical experiments. In general, there is no a priori choice of τ available. However, in our numerical experiments we always observe convergence of this scheme for a moderate choice of the relaxation factor. At the end of this section, a proof of the convergence of Algorithm 4.1 is given.
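The structure of the relaxation scheme can be illustrated on a toy two-player game in which both best responses are available in closed form; the quadratic costs and coefficients below are hypothetical stand-ins for the optimality-system solves of Steps 1.-2.

```python
import numpy as np

# Hypothetical quadratic game:
#   J1(u1, u2) = (u1 - 0.5*u2 - 1)^2,  J2(u1, u2) = (u2 + 0.4*u1 - 2)^2.
# Each best response is the exact minimizer of one player's cost.
br1 = lambda u2: 0.5 * u2 + 1.0    # argmin over u1 of J1(., u2)
br2 = lambda u1: -0.4 * u1 + 2.0   # argmin over u2 of J2(u1, .)

def relaxation(tau=0.5, tol=1e-10, kmax=1000):
    u = np.zeros(2)
    for _ in range(kmax):
        ubar = np.array([br1(u[1]), br2(u[0])])   # Steps 1.-2.
        unew = (1.0 - tau) * u + tau * ubar       # relaxed update
        if np.linalg.norm(unew - u) < tol:
            return unew
        u = unew
    return u

u_ne = relaxation()   # converges to the unique NE (5/3, 4/3)
```

Here the relaxation factor τ damps the best-response iteration; for this toy game the composed best-response map is already a contraction, and the scheme converges for any τ ∈ (0, 1].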
The main advantage of the above algorithm is that we can compute ū_1 and ū_2 separately (in parallel) using an efficient optimization scheme. Specifically, given u_2^k, we compute ū_1 by solving the optimality system (38). In a similar way, given u_1^k, we can compute ū_2 by solving the optimality system (39). Next, we give a short description of the semi-smooth Newton scheme that we implement for solving (separately) the optimality systems (38) and (39); for more details, see [8,9]. Denote with η := (x, u_1, p_1) and define the map F(η) whose components represent the residuals of the adjoint, state, and optimality-condition equations. Therefore, the solution to our optimality systems corresponds to the root of F(η) = 0, which can be determined by a Newton procedure. However, for this purpose, we need the Jacobian of F, and F is not differentiable in a classical sense with respect to u_n, because of the projection function.
On the other hand, by sub-differential calculus, it is possible to construct a generalized Jacobian, such that the Newton equation of [9] is obtained. This equation can be solved by a Krylov method that requires implementing only the action of the Jacobian on an input vector, and thus avoids assembling ∇_η F; this makes a matrix-free Krylov and semi-smooth Newton procedure possible. This is the method that we use in our calculations in Steps 1. and 2. of Algorithm 4.1, and for computing the Pareto points discussed in the following section.
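A minimal scalar sketch of the semi-smooth Newton idea, assuming a hypothetical one-dimensional problem min J(u) = u^2 − 3u subject to |u| ≤ 1: the projected optimality condition F(u) = u − P_{[−M,M]}(u − J'(u)) = 0 is nonsmooth, but the clip function admits a generalized (Clarke) derivative equal to 0 or 1, yielding a well-defined Newton step.

```python
import numpy as np

# Hypothetical scalar problem: J(u) = u^2 - 3*u, box constraint |u| <= 1.
a, b, M = 2.0, 3.0, 1.0
grad = lambda u: a * u - b   # J'(u)

def ssnewton(u=0.0, tol=1e-12, kmax=50):
    """Semi-smooth Newton on F(u) = u - clip(u - J'(u), -M, M)."""
    for _ in range(kmax):
        inner = u - grad(u)
        F = u - np.clip(inner, -M, M)
        if abs(F) < tol:
            break
        # Generalized derivative of clip: 1 inside the box, 0 outside.
        dclip = 1.0 if -M < inner < M else 0.0
        DF = 1.0 - dclip * (1.0 - a)   # generalized Jacobian of F
        u = u - F / DF                 # semi-smooth Newton step
    return u

u_star = ssnewton()
```

Here the active bound is identified in one step and the iteration terminates at the constrained minimizer u* = 1; in the paper's setting the analogous construction is applied to the control u_n, with the Jacobian action implemented matrix-free.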
Notice that we have discussed our solution methodology at the functional level. However, its numerical realisation requires approximating the optimality systems by appropriate numerical schemes. In particular, we approximate our model using the so-called modified Crank-Nicolson (MCN) scheme. For this purpose, the time domain [0, T] is subdivided into uniform intervals of size h with N_t grid points, such that t_j = (j − 1)h and 0 = t_1 < · · · < t_{N_t} = T.
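The details of the modified Crank-Nicolson scheme are given in the references; as a baseline, the standard Crank-Nicolson step for a linear system ẋ = Ax on the grid t_j = (j − 1)h reads as follows (a sketch under the assumption of a time-invariant A; the "modified" variant additionally treats the bilinear control term).

```python
import numpy as np

def crank_nicolson(A, x0, T, nt):
    """Standard Crank-Nicolson for x' = A x on nt uniform grid points."""
    h = T / (nt - 1)
    I = np.eye(A.shape[0])
    Mm = I - 0.5 * h * A   # implicit part
    Mp = I + 0.5 * h * A   # explicit part
    x = np.array(x0, dtype=float)
    for _ in range(nt - 1):
        x = np.linalg.solve(Mm, Mp @ x)
    return x

# Check on a rotation: x' = [[0,-1],[1,0]] x turns (1,0) by the angle T.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
xT = crank_nicolson(A, [1.0, 0.0], T=np.pi / 2, nt=2001)   # close to (0, 1)
```

The scheme is second-order accurate and, for skew-symmetric A as above, norm-preserving, which is one reason Crank-Nicolson-type schemes are popular for spin and quantum dynamics.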
To conclude this section we prove that Algorithm 4.1 is convergent. For this purpose, we need the following result.
The proof is similar to those given in Lemmas 3.1 and 3.2, hence it is omitted. We want to show that the map A : v ↦ (1 − τ)v + τ ū(v) is a contraction, where ū(v) is the solution of (38)-(39). First, we show the Lipschitz continuity of the map v ↦ ū(v). For this purpose, and recalling (43), we need to estimate (44) and (45). We estimate (44); the same bound holds for (45). The aim is to use the boundedness and Lipschitz continuity of the functions x and p_1, proved in Lemmas 3.1, 3.2, 4.1, and of F_1. Adding and subtracting suitable quantities in (44), and using that x, p_1 are Lipschitz continuous in u and bounded, and that F_1 is Lipschitz in x and bounded, it follows that the corresponding differences are bounded in terms of Ĉ := 2C L_{F,x,p}, which is a monotonically increasing function of T. Hence, the map v ↦ ū(v) is Lipschitz continuous, and therefore A is a contraction if (48) holds. Since B(v*) ⊂ L^2(0, T) is a complete space, choosing the parameters τ, ν and T such that (48) holds, the map A admits a unique fixed point. Hence Algorithm 4.1 converges.

Bargaining Solution
Consider the case N = 2, and assume that the players agree to follow Nash's bargaining scheme. We can define the set S of all feasible values of the functionals, which can be achieved with an admissible set of controls, as follows

S := { (J_1(u), J_2(u)) : u ∈ U(M) }.

Hence, we consider the following Nash bargaining (NB) problem:

max_{u ∈ U(M)} (d_1 − J_1(u)) (d_2 − J_2(u)), subject to J_i(u) ≤ d_i, i = 1, 2,

where d_i = J_i(u^NE) is the disagreement outcome for Player i. We shall assume that there exists at least one point s ∈ S such that d_i > s_i for all i, and suppose that the NB problem has a solution u* ∈ U(M). Now, we recall that the best achievable outcomes of cooperation are given by the Pareto points [14], which form the so-called Pareto frontier. Assuming that this curve is convex, it is characterized by all solutions to (7). Therefore the solution to the NB problem is sought on the Pareto frontier, as discussed by J. Nash in [23].
Next, we present a theorem that provides a characterisation of the solution to the NB problem on the Pareto frontier. This result is essential to formulate the algorithm for the computation of the NB point; see also [12].

Theorem 5.1 Let u * ∈ U(M) be such that
for all u ∈ U(M), where and . (53) Proof Let us consider the following equation Adding term by term with (51), it follows Replacing the values of μ 1 , μ 2 , we get .

Multiplying both sides by
Then, dividing by (d_2 − J_2(u*))(d_1 − J_1(u*)) > 0, the claim follows. Notice that in (52) and (53), μ_1, μ_2 ∈ (0, 1) and μ_1 + μ_2 = 1. Theorem 5.1 states that it is possible to find a point on the Pareto frontier that maximizes the product ∏_{i=1}^2 (d_i − J_i(u)). Based on this result, we can introduce a computational method to obtain a solution for the bargaining problem; see also [12,13]. It works as follows.

Algorithm 5.1
0. Initialize μ^(0); set k = 0.
repeat
1. Given μ^(k), solve the optimality system and compute the corresponding payoffs.
2. Update the weight μ^(k+1) according to the update formula.
3. Set k := k + 1.
until (52) holds. Set μ* = μ^(k).
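To illustrate the bargaining iteration, the sketch below searches for the NB weight on a toy convex Pareto frontier. The frontier, the disagreement values, and in particular the damped fixed-point update are hypothetical stand-ins: the update is inspired by the first-order condition of the Nash product, whose stationary point satisfies μ_1 = (d_2 − J_2)/((d_1 − J_1) + (d_2 − J_2)), and it is not the paper's update formula.

```python
import numpy as np

# Toy Pareto front: weight mu in (0,1) gives costs
#   J1(mu) = 2*(1-mu)^2,  J2(mu) = 2*mu^2   (hypothetical stand-ins).
J1 = lambda mu: 2.0 * (1.0 - mu) ** 2
J2 = lambda mu: 2.0 * mu ** 2
d1, d2 = 2.0, 1.5   # disagreement outcomes (hypothetical values)

def nb_weight(mu=0.5, tau=0.5, tol=1e-12, kmax=10000):
    """Damped fixed-point iteration for the NB weight (heuristic update)."""
    for _ in range(kmax):
        g = (d2 - J2(mu)) / ((d1 - J1(mu)) + (d2 - J2(mu)))
        mu_new = (1.0 - tau) * mu + tau * g   # damped update
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

mu_star = nb_weight()
```

At convergence, μ* satisfies the stationarity condition of the Nash product (d_1 − J_1)(d_2 − J_2) on this frontier, mirroring the role of the characterization in Theorem 5.1.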
In the above algorithm, we need to solve the optimization problem for the Pareto points in order to get the bargaining solution to our model. To this end, the following optimal control problem is considered, where μ_1, μ_2 ∈ (0, 1), with μ_1 + μ_2 = 1. A solution to (57) is characterized by the corresponding first-order optimality system. Introducing the control-to-state map (u_1, u_2) ↦ x(u_1, u_2) as in the previous section and the reduced cost functional J(u_1, u_2) := J(x(u_1, u_2), u_1, u_2), the solution to (57) can be written as the minimizer of this reduced functional. This problem is solved utilizing the semi-smooth Newton method discussed in Sect. 4.

Numerical Experiments
In this section, we present results of numerical experiments to validate the theoretical discussion and the proposed algorithms. We start by considering a model of two uncoupled spin-1/2 particles, whose state configuration represents the density matrix operator and whose dynamics is governed by the Liouville-von Neumann master equation. This is a basic model of importance in nuclear magnetic resonance (NMR) spectroscopy. We remark that in the case of quantum dynamics constrained to the energy ground state, only transitions between magnetic/spin states are possible. In this case, as discussed in detail in [4], the Pauli-Schrödinger equation represented in a basis of spherical harmonics takes a form in which the complex-valued vector a(t) represents the time-dependent coefficients of the spectral discretization, and, in the Hamiltonian defined by the operator in the square brackets, H_0 and H_x are Hermitian matrices and H_y is a skew-symmetric matrix. In an experimental control setting, the z-component of the magnetic field, B_z, is fixed, and we would like to manipulate the spin orientation of the particle by acting with the transversal magnetic fields B_x and B_y, which we identify with u_1 and u_2, respectively. In this case, the controlled dynamics of each spin-1/2 particle is described by a Hamiltonian where ν is the Larmor frequency, u is the control, and I_x, I_y and I_z are the Pauli matrices. To represent this model, it is very convenient to choose a frame rotating with the Larmor frequency and use the so-called real-matrix representation [4]. We obtain the corresponding real bilinear model. As in [2], we choose T = 0.008, and based on our estimate (31), we take ν = 2 · 10^{-1}. Let u^0 := (u_1^0, u_2^0); we choose u^0(t) = (0.1, 0.1), t ∈ [0, T].
Our problem is to find u_1 ∈ U_1(M), with M = 60, such that the system performs a transition from the initial state x_0, where both spins point in the z-direction, to a target state x_T^{(1)}, where both spins point in the x-direction, and to find u_2 ∈ U_2(M) that aims to drive the system to x_T^{(2)}, where an inversion of orientation is desired. To solve this NE problem, we use the relaxation scheme in Algorithm 4.1 with τ = 0.5 and tol_u = 10^{-3}. In this method, the semi-smooth Newton scheme [7,9] is employed to solve the optimality systems corresponding to the two controls. In this implementation, the differential equations are approximated by the MCN scheme with N_t = 1000.
The tolerances for the convergence of the semi-smooth Newton scheme and of the Krylov linear solver are 10^{-7} and 10^{-8}, respectively. These tolerances are always met before the maximum number of allowed iterations, which is set equal to 100, is reached. With this procedure, we obtain the NE controls depicted in Fig. 1, which give the NE point shown in Fig. 2, with ||x(T) − x_T^{(2)}|| = 0.5996. With this point, we can consider the Nash bargaining problem, which assumes cooperation in order to obtain an improvement of the players' objectives. Solving the bargaining problem, we obtain ||x(T) − x_T^{(2)}|| = 0.4155. Furthermore, we consider the Nash bargaining problem for the second experiment. In this case, Algorithm 5.1 provides the solution μ* = 0.5, which is plotted in Fig. 4.
A comparison of the results of the two experiments is conveniently done based on Figs. 2 and 4. It is clear that, in the second experiment, the two targets are placed on the same hemisphere, and therefore the game is more balanced than in the first experiment. This fact can also be seen in the positioning of the NE and NB points with respect to the Pareto frontier.
Next, in order to provide more insight into the performance of our algorithms, we report some results concerning the relaxation procedure and the semi-smooth Newton scheme. Concerning the relaxation scheme in Algorithm 4.1, we give in Tables 1 and 2 the convergence history of this scheme towards a NE point for the first and second experiment, respectively. In both tables, we see that, initially, both functionals are minimized, and thereafter the norms of the two gradients decrease until both vanish to high accuracy. Algorithm 4.1 stops when the updates of the controls become smaller than the given tolerance tol_u. In some sense, these tables show the history of the game up to reaching the Nash equilibrium. Notice that a similar behaviour has already been recorded in [20].
To conclude this experiment, we show that our semi-smooth Newton method exhibits quadratically convergent behaviour towards the solution of the given optimality system. For this purpose, the corresponding convergence history is reported in Table 3. Next, we present an application to the so-called competitive Lotka-Volterra equations; see [21] for more details. This model describes the case of two species competing for the same limited food resources. For example, one can consider a competition for territory, which is directly related to food resources. In particular, each species has logistic growth in the absence of the other. Thus, we focus on the following system:

ẋ_1 = x_1 (b_1 − a_11 x_1 − a_12 x_2),
ẋ_2 = x_2 (b_2 − a_21 x_1 − a_22 x_2),   (60)

where b_i > 0, i = 1, 2, are the birth rates of the two species and a_ij > 0, i, j = 1, 2, are the competition efficiencies.
Suppose a_ii and a_ji can be controlled by player i, i.e. a_ij = a*_ij + u_i, with a*_ij > 0. Then, the system (60) can be written in the form (1). Note that in (61) there are no restrictions on the sign of a_ij, i.e. also phenomena of mutualism or symbiosis are admitted. In the dynamical system (61), the function f_0(x) has the steady states (0, 0), (b_1/a*_11, 0), (0, b_2/a*_22), and a coexistence steady state (the latter representing co-existence of the two species). Now, in our NE setting with the players' objectives given in (4), we choose x_T^{(1)} = (b_1/a*_11, 0)^T and x_T^{(2)} = (0, b_2/a*_22)^T, that is, each species aims at the extinction of the other one. In our numerical simulations, the parameters are chosen as in [22], i.e. b_1 = 1, b_2 = 1, a*_11 = 2, a*_22 = 2, a*_12 = 1, a*_21 = 1. Therefore x_T^{(1)} = (1/2, 0)^T and x_T^{(2)} = (0, 1/2)^T. We obtain the NE controls depicted in Fig. 5, which give the NE point shown in Fig. 6, as a '*'-point. At the Nash equilibrium solution, we evaluate the distances of x(T) from the two targets. Concerning the convergence behaviour of our algorithms, we obtain results very similar to those shown in the previous experiments; therefore they are omitted.
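As a sanity check of the model setup, the following sketch simulates the uncontrolled system (u_1 = u_2 = 0) with the experiment's parameters b_1 = b_2 = 1, a*_11 = a*_22 = 2, a*_12 = a*_21 = 1; since a*_11 a*_22 > a*_12 a*_21, the trajectories approach the stable coexistence steady state (1/3, 1/3). The integrator and its step size are illustrative choices.

```python
import numpy as np

# Competitive Lotka-Volterra with the parameters of the experiment.
b = np.array([1.0, 1.0])
A = np.array([[2.0, 1.0], [1.0, 2.0]])   # a*_ij, controls switched off

def lv_rhs(x):
    """Right-hand side x_i * (b_i - sum_j a_ij x_j)."""
    return x * (b - A @ x)

def simulate(x0, T=50.0, nt=5001):
    h = T / (nt - 1)
    x = np.array(x0, dtype=float)
    for _ in range(nt - 1):      # explicit Euler, small enough h
        x = x + h * lv_rhs(x)
    return x

x_inf = simulate([0.4, 0.2])     # approaches (1/3, 1/3)
```

The NE targets of the game sit instead at the single-species equilibria, so the players' controls must push the dynamics away from this coexistence attractor.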

Conclusion
A theoretical and numerical investigation of Nash equilibria (NE) and Nash bargaining (NB) problems governed by bilinear differential models was presented. In this setting, existence of Nash equilibria was proved, and these equilibria were computed with a semi-smooth Newton scheme combined with a relaxation method. A related Nash bargaining problem was discussed, and a computational procedure for its solution was presented. Results of numerical experiments were presented that successfully demonstrated the effectiveness of the proposed NE and NB computational framework.