Linear Quadratic Differential Games with Mixed Leadership: The Open-Loop Solution

This paper is concerned with open-loop Stackelberg equilibria of two-player linear-quadratic differential games with mixed leadership. We prove that, under some appropriate assumptions on the coefficients, there exists a unique Stackelberg solution to such a differential game. Moreover, by means of the close interrelationship between the Riccati equations and the set of equations satisfied by the optimal open-loop control, we provide sufficient conditions to guarantee the existence and uniqueness of solutions to the associated Riccati equations with mixed-boundary conditions. As a result, the players' open-loop strategies can be represented in terms of the system state.


Introduction
In 1934, H. von Stackelberg introduced the concept of a hierarchical solution for markets where some firms have power of domination over others [22]. This solution concept is now known as the Stackelberg equilibrium or the Stackelberg solution, which, in the context of two-person nonzero-sum games, involves players with asymmetric roles, one leading (accordingly called the leader) and the other following (called the follower). The game proceeds with the leader announcing his policy first (which would be his action if the information pattern is not dynamic), and the follower reacting to it by optimizing his performance index in accordance with the leader's announced policy. Of course, the leader has to anticipate this response (assuming that he knows the utility or cost function of the follower) and pick the policy that optimizes his performance index given the follower's rational response. Assuming that the follower's optimal (rational) response to each announced policy of the leader is unique (that is, the follower has a unique rational response curve), the best policy of the leader is the one that optimizes his performance index on the rational reaction curve of the follower; together with the corresponding unique policy/action of the follower, this is known as the Stackelberg solution. If the follower's response is not unique, however, the rational response curve is replaced with a rational reaction set, in which case, taking a pessimistic approach on the part of the leader, his optimization problem is to find the best policy under the worst choices by the follower (worst from the point of view of the leader) from the rational response set; such a solution is known as the generalized Stackelberg solution [17,4].
The notion of the Stackelberg solution was later extended to multi-period settings in the early 1970s by Simaan and Cruz [20,21], who also introduced the notion of a feedback Stackelberg solution, where the leader dictates his policy choices on the follower only stage-wise, and not globally. Such a solution concept requires (in a dynamic game setting) that the players know the current state of the game at every point in time, and its derivation involves a backward recursion (as in dynamic programming), where at every step of the recursion the Stackelberg solution of a static game is obtained. When the leader has dynamic information and is able to announce his policy for the entire duration of the dynamic game ahead of time (and not stage-wise), the Stackelberg solution, even though well defined as a concept, is generally very difficult to obtain, because the underlying optimization problems are then posed on the policy spaces of the two players, with the reaction sets or functions generally being infinite-dimensional. The derivation of such global Stackelberg solutions for dynamic games with dynamic information patterns also has connections to incentive design or mechanism design problems and is still an active research area; see the text by Başar and Olsder [4] as well as the recent work by Bensoussan et al. [7] for these connections and for an overview of the various types of Stackelberg solutions.
It is possible to introduce the global and the feedback Stackelberg solutions also for dynamic games defined in continuous time, the so-called differential games. The latter, that is, the feedback Stackelberg solution, is introduced in differential games as the limit of a sequence of feedback Stackelberg solutions of discretized (in time) versions of the original differential game, which becomes a collection of point-wise Stackelberg solutions of coupled Hamilton-Jacobi-Bellman systems, as first demonstrated by Başar and Haurie [2]. One can also refer to Bensoussan et al. [6] for infinite-horizon stochastic differential games. The feedback Stackelberg solution (for a class of dynamic and differential games) is indeed shown to be an equilibrium solution in the sense of Nash, where the leader and the follower enter the game symmetrically, albeit with some additional information on the actions of the follower provided to the leader as part of the information pattern (that is, even though the play is symmetric, the information pattern is not). For applications in economics and management science, see [8]-[16] and [19].
Most of the literature on Stackelberg games, with a few exceptions, has assumed that the roles of the players are fixed at the outset: the leader remains the leader for the entire duration of the game, and likewise the follower remains the follower. Perhaps the first paper to raise the issue of whether being a leader is always advantageous to a player is [1]. It turns out that leadership is not always the preferred option for the players: for some classes of games, the leadership of one specific player is preferred by both (in the sense that both players collect their highest utilities compared with what they would collect under other role assignments), which is a stable situation, whereas there are other classes of games where either both players prefer their own leadership or neither does, which leads to a stalemate. Sometimes the players do not have the option of leadership open to them; instead, leadership is governed by an exogenous process (say, a Markov chain) that determines who should be the leader and who the follower at each stage of the game, perhaps based on the history or the current state of the game. Başar and Haurie [2] have shown that the feedback Stackelberg solution is a viable concept for such (stochastic) dynamic games as well, and have obtained recursions for the solution.
More recently, Başar et al. [5] introduced the notion of mixed leadership in nonzero-sum differential games with open-loop information patterns, where the same player can act as leader in some decisions and as follower in others, depending on the instrument variables he is controlling. Such a game proceeds as follows. The two players first announce their policies simultaneously, both acting as leaders; then both of them respond simultaneously, in the role of followers, by optimizing their corresponding cost functions in the sense of Nash equilibrium. Given these rational responses, the two players, again acting as leaders, optimize their respective performance indices, also in the sense of Nash equilibrium. Therefore, the mixed-leadership differential game consists of two Nash games (the parallel play) and one Stackelberg game (the hierarchical play). Başar et al. [5] used the Maximum Principle to obtain a set of algebraic and differential equations with mixed-boundary conditions that characterize an optimal open-loop Stackelberg solution. In particular, a linear-quadratic differential game was discussed as a specific case, and the related coupled Riccati equations with mixed boundary conditions were derived, by which the optimal Stackelberg solution can be represented in terms of the system state. While this is not the same as the feedback Stackelberg solution, the representation in terms of the system state may perform better in the presence of small, unmodeled noise.
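The two-stage structure of the mixed-leadership play described above can be illustrated on a deliberately simplified static analogue. The cost functions below are illustrative assumptions, not taken from [5]; the follower variables are decoupled so that the follower-stage Nash play reduces to individual minimization, and the leader-stage Nash equilibrium is computed by best-response iteration.

```python
# Static analogue of mixed-leadership play (illustrative costs only):
#   J_1(u, v) = (u1 + 0.5*u2 - 1)^2 + (v1 - u2)^2
#   J_2(u, v) = (u2 + 0.5*u1 - 1)^2 + (v2 - u1)^2
# Follower stage: v1* = u2, v2* = u1 (each v_i minimizes its own cost).
# Leader stage:   anticipating v*(u), player i faces (u_i + 0.5*u_j - 1)^2,
#                 so the best response is u_i = 1 - 0.5*u_j.

def mixed_leadership_equilibrium(n_iter=100):
    u1 = u2 = 0.0
    for _ in range(n_iter):        # leader-stage best-response iteration
        # simultaneous update; contraction with factor 1/2, so it converges
        u1, u2 = 1.0 - 0.5 * u2, 1.0 - 0.5 * u1
    v1, v2 = u2, u1                # follower-stage rational responses
    return u1, u2, v1, v2

u1, u2, v1, v2 = mixed_leadership_equilibrium()
assert abs(u1 - 2/3) < 1e-12 and abs(u2 - 2/3) < 1e-12  # fixed point u_i = 2/3
```

The unique fixed point u1 = u2 = 2/3 of the leader-stage best responses, together with the follower responses v1 = v2 = 2/3, plays the role of the open-loop Stackelberg solution in this toy setting.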
Riccati equations with terminal conditions, arising from deterministic as well as stochastic linear-quadratic optimal control problems, have been extensively studied due to their important role in the feedback representation of optimal controls. However, Riccati equations with mixed boundary conditions, which seem to have appeared for the first time in [5], have not been explored in the literature. Inspired by the work of Tang [23], which settled the challenging problem of the existence and uniqueness of solutions to backward stochastic Riccati equations, we apply in this paper the interrelationship between the optimality conditions and the related Riccati equations to obtain existence and uniqueness results for the Riccati equations arising from dynamic games with mixed leadership under the open-loop information structure. The theory of the Stackelberg equilibrium under the feedback information pattern in the mixed-leadership framework is left for a future paper.
The paper is organized as follows. In Section 2 we impose some assumptions on the coefficients and prove the existence and uniqueness of the Stackelberg solution to the linear-quadratic two-player differential game with mixed leadership. Section 3 is devoted to exploring the relationship between the optimality conditions for the control laws and the related Riccati equations. Section 4 concludes the paper.

The existence and uniqueness of Stackelberg equilibrium
We consider a two-player nonzero-sum linear-quadratic differential game with mixed leadership on the fixed duration [0, T]. Denote by (u_1, v_1) and (u_2, v_2) the continuous decision variables of player 1 and player 2, respectively. Since we restrict the study to the open-loop information structure, these decision variables depend only on the time variable and the initial state x_0. Let the state system be described by the linear equation (2.1), and let the cost functionals to be minimized be given by (2.2), where Q_1, Q_2, B_1, and B_2 are positive and bounded functions, r_1, r_2, s_1, and s_2 are bounded functions, and S_1 and S_2 are positive constants. Theorem 2 in [5] then yields that the optimal Stackelberg solution, denoted by (u*_1, v*_1) and (u*_2, v*_2), satisfies the system of optimality conditions (2.3) (we omit the time variable t for the sake of simplicity). Substituting u(t) = −N^{-1}(t)E(t)y(t) into the differential equations, we obtain the two-point boundary-value system (2.4), which is also called a coupled forward-backward ordinary differential equation (FBODE for short). Then we have the following existence and uniqueness result.
where the positive constant C depends on λ, k1, k2, k3, C1, and C2.
Proof. In an approach similar to that of the stochastic case in [18], we use the contraction mapping principle to give a concise, self-contained proof. To this end, let L^2(τ, T; R^m) denote the set of all R^m-valued square-integrable functions defined on the interval [τ, T]. For any y ∈ L^2(τ, T; R^4), we can obtain the unique solution x ∈ L^2(τ, T; R^5), and thus define a mapping. In the same way, for any x ∈ L^2(τ, T; R^5), we can define another mapping, where y is the solution to the backward equation.
In view of (2.14) and (2.16), we can deduce from (2.13) the desired estimate, where the positive constant C depends on λ, k1, k2, k3, C1, and C2, and decreases as k1 decreases.
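The two-point boundary-value character of the FBODE (2.4) can be illustrated numerically on a hypothetical scalar analogue (the coefficients a, b, q, s below and the shooting approach are illustrative assumptions, not the paper's actual matrix system). Since the system is linear, two trial forward integrations determine the missing initial value y(0) exactly.

```python
# Shooting method for a toy scalar forward-backward ODE system
# (hypothetical analogue of the coupled FBODE; illustrative only):
#   x'(t) =  a*x(t) + b*y(t),   x(0) = x0
#   y'(t) = -q*x(t) - a*y(t),   y(T) = s*x(T)

def rk4(f, z0, t0, t1, n):
    """Integrate z' = f(t, z) from t0 to t1 with n RK4 steps."""
    h = (t1 - t0) / n
    t, z = t0, list(z0)
    for _ in range(n):
        k1 = f(t, z)
        k2 = f(t + h/2, [zi + h/2*ki for zi, ki in zip(z, k1)])
        k3 = f(t + h/2, [zi + h/2*ki for zi, ki in zip(z, k2)])
        k4 = f(t + h, [zi + h*ki for zi, ki in zip(z, k3)])
        z = [zi + h/6*(a1 + 2*a2 + 2*a3 + a4)
             for zi, a1, a2, a3, a4 in zip(z, k1, k2, k3, k4)]
        t += h
    return z

def solve_fbode(a, b, q, s, x0, T, n=2000):
    f = lambda t, z: [a*z[0] + b*z[1], -q*z[0] - a*z[1]]
    # Two trial integrations: y(0) = 0 and y(0) = 1.
    xA, yA = rk4(f, [x0, 0.0], 0.0, T, n)
    xB, yB = rk4(f, [x0, 1.0], 0.0, T, n)
    # The mismatch m(y0) := y(T) - s*x(T) is affine in y0; solve m = 0.
    mA, mB = yA - s*xA, yB - s*xB
    y0 = -mA / (mB - mA)
    return rk4(f, [x0, y0], 0.0, T, n)   # returns (x(T), y(T))

xT, yT = solve_fbode(a=0.5, b=-1.0, q=1.0, s=1.0, x0=1.0, T=1.0)
assert abs(yT - 1.0 * xT) < 1e-8   # terminal coupling y(T) = s*x(T) holds
```

The affine-mismatch trick works only because the system is linear; for the matrix-valued system (2.4), existence hinges on the contraction estimate of Theorem 1 rather than on a one-dimensional shot.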

The relationship between Riccati equations and FBODEs
In order to further represent the Stackelberg equilibrium in terms of the corresponding state trajectory, we introduce continuously differentiable functions P_j, j = 1, 2, ..., 8. With these, we can decouple the FBODE (2.4) and obtain the Riccati equations (3.2), which are exogenous in the sense that they depend only on the coefficients of the linear system (2.1) and the quadratic cost functionals (2.2). In contrast to the Riccati equations with terminal conditions arising from optimal control problems, the Riccati equations here carry mixed-boundary conditions. This feature results from the complex hierarchical structure of the differential game. In what follows, we prove the existence and uniqueness of solutions to these Riccati equations by means of an analysis of the FBODEs. In the context of optimal control theory, it is well known that Riccati equations serve as an efficient tool to decouple the Hamiltonian system (the conditions for an optimal control) so as to derive the optimal feedback control. On the other hand, the homeomorphism of the stochastic flows of the Hamiltonian system, as demonstrated in [23], also yields the existence of solutions to the related Riccati equations. In this section, we utilize the interrelationship between the Riccati equations and the Hamiltonian system to tackle the Riccati equations (3.2) derived from the linear-quadratic differential game with mixed leadership. More precisely, we impose an additional assumption on the coefficients of the game to ensure the existence and uniqueness of solutions to the Riccati equations (3.2), so that the open-loop Stackelberg equilibrium can be represented in terms of the system state.
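For contrast with the mixed-boundary system (3.2), the following sketch integrates a scalar terminal-value Riccati equation backward in time and checks it against its closed form. The scalar coefficients are illustrative assumptions, not taken from the paper; for dx/dt = u with cost ∫u² dt + s·x(T)², the Riccati ODE is dP/dt = P², P(T) = s, with solution P(t) = s / (1 + s(T − t)).

```python
# Backward RK4 integration of the scalar terminal-value Riccati equation
#   dP/dt = P(t)^2,  P(T) = s,  closed form  P(t) = s / (1 + s*(T - t)).
# (Scalar illustrative case; the paper's equations (3.2) are matrix-
# valued and carry mixed, not purely terminal, boundary conditions.)

def riccati_backward(s, T, n=4000):
    """Integrate dP/dt = P^2 backward from P(T) = s; return P(0)."""
    h = T / n
    P = s
    f = lambda P: P * P
    for _ in range(n):               # RK4 with step -h (from t down to t - h)
        k1 = f(P)
        k2 = f(P - h/2 * k1)
        k3 = f(P - h/2 * k2)
        k4 = f(P - h * k3)
        P -= h/6 * (k1 + 2*k2 + 2*k3 + k4)
    return P

P0 = riccati_backward(s=1.0, T=1.0)
assert abs(P0 - 1.0 / (1.0 + 1.0)) < 1e-10   # closed form: s/(1 + s*T) = 0.5
```

A purely terminal condition lets one march backward from t = T in a single sweep; the mixed-boundary conditions of (3.2) rule this out, which is why the paper instead constructs the solution from the FBODE flows.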
Theorem 2. Let (H1) and (H2) hold. Then, for any h ∈ R^5 and τ ∈ [0, T], (3.3) holds; in particular, so does (3.4).

Proof. From the definitions of (x_{τ,t}(h), y_{τ,t}(h)) and (X_{τ,t}, Y_{τ,t}) and the linearity of the FBODE (2.5), it is easy to verify (3.3). Moreover, owing to the arbitrariness of h, we obtain (3.4).

Next we prove that Y_{t,t} satisfies a Riccati equation. Since the FBODE (2.4) is derived from a linear-quadratic game problem rather than an optimal control problem, the related Riccati equation differs from the one in [23] in that the Riccati equation here is non-symmetric. For the study of non-symmetric matrix Riccati equations, one can refer to [24]. By analogy with the stochastic case in [23], we construct the solution to such a Riccati equation via the solution to the FBODE (2.4) rather than treating the Riccati equation directly. Since the case we consider here is deterministic, the key point in [23], namely that X_t has an inverse for every t ∈ [0, T], can be obtained easily.

Proof. According to the definition of (X_t, Y_t), it satisfies a linear equation; in view of (3.4), X_t itself satisfies a linear equation. Then the inverse X_t^{-1} of the matrix X_t exists, and it satisfies (3.8).

Theorem 4. Let P_t := Y_{t,t}. Then P_t is continuous and differentiable on the interval [0, T], and satisfies the Riccati equation (3.9).

Proof. Since X_t is invertible, P_t = Y_t X_t^{-1}, which implies that P_t is continuous and differentiable. Suppose dP_t = P̃_t dt. Then, differentiating both sides of the equality Y_t = P_t X_t and invoking the invertibility of X_t, we can identify P̃_t and obtain (3.9).

Now we go back to equation (2.3), from which u(t) can be represented in terms of x, ψ_1, ψ_2, β_1, and β_2; that is, the decision variables u_1, v_1, u_2, v_2 can be represented via x, ψ_1, ψ_2, β_1, and β_2. In order to represent the decision variables in terms of the state variable x alone, we need the solutions P_4, P_5, P_7, P_8 of the Riccati equations (3.2), i.e., the relationship between x and ψ_1, ψ_2, β_1, β_2.
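The differentiation step in the proof of Theorem 4 follows a standard pattern, which can be sketched in generic notation (the block matrices A, B, C, D below are placeholders for the actual coefficient blocks of the FBODE (2.4), which are not reproduced here):

```latex
% Generic derivation sketch; A, B, C, D are illustrative placeholders.
\begin{align*}
  \dot X_t &= A X_t + B Y_t, \qquad
  \dot Y_t = C X_t + D Y_t, \qquad Y_t = P_t X_t. \\
\intertext{Differentiating $Y_t = P_t X_t$ and substituting both flows,}
  C X_t + D P_t X_t &= \dot P_t X_t + P_t\,(A X_t + B P_t X_t). \\
\intertext{Since $X_t$ is invertible for every $t$, we may cancel $X_t$:}
  \dot P_t &= C + D P_t - P_t A - P_t B P_t.
\end{align*}
```

The quadratic term P_t B P_t appears with coefficient blocks that need not be symmetric, which is exactly why the resulting equation is a non-symmetric matrix Riccati equation of the type studied in [24].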
From the above expression, we conclude that x(t) = x^1_t(e_1)x_0 being nonzero on the interval [0, T] is equivalent to the coupled Riccati equations (3.2) with the mixed-boundary conditions having unique solutions. Indeed, on the one hand, if there exists a t_0 ∈ [0, T] such that x(t_0) = x^1_{t_0}(e_1)x_0 = 0 while the coupled Riccati equations have solutions P_i(t), i = 1, ..., 8, this contradicts the fact that X_t = (x_t(e_1), x_t(e_2), x_t(e_3), x_t(e_4), x_t(e_5)) has an inverse at each t ∈ [0, T]. On the other hand, if x(t) = x^1_t(e_1)x_0 is nonzero on [0, T], we define, in particular, P_6(t) := α(t)/x(t) = y^4_t(e_1)/x^1_t(e_1), with the other P_i(t) defined by the analogous ratios, which can also be written in terms of the solution P to the Riccati equation (3.9). It can be verified that the P_i, i = 1, 2, ..., 8, so defined solve the Riccati equations (3.2). The uniqueness of solutions to (3.2) follows from the uniqueness of the solution to the FBODE (2.4). Finally, we impose an additional condition on the coefficients to guarantee that x(t) = x^1_t(e_1)x_0 does not vanish on [0, T]. Taking into account the FBODE (2.4) satisfied by (x_t(x_0), y_t(x_0)), as well as (3.3), we obtain an expression for x(t). Noticing the estimate of Y_t in Theorem 1, we have the following result: under (H3), x(t) = x^1_t(e_1)x_0 is nonzero on the interval [0, T].
Proof. We can deduce the result from (3.13). Therefore, under the assumptions (H1), (H2), and (H3), we have proved that the coupled Riccati equations (3.2) with the mixed-boundary conditions admit unique solutions.