Time-Inconsistent Stochastic Linear-Quadratic Differential Game

We consider a general time-inconsistent stochastic linear-quadratic differential game. The time inconsistency arises from the presence of a quadratic term of the expected state as well as a state-dependent term in the objective functionals. We define an equilibrium strategy, which differs from the classical one, and derive a sufficient condition for equilibrium strategies via a system of forward-backward stochastic differential equations. When the state is one-dimensional and the coefficients are all deterministic, we find an explicit equilibrium strategy. The uniqueness of this equilibrium strategy is also established.


Introduction
Time inconsistency in dynamic decision making is often observed in social systems and daily life. Motivated by practical applications, especially in mathematical economics and finance, time-inconsistent control problems have recently attracted considerable research interest, with efforts directed at seeking equilibrium, instead of optimal, controls. At a conceptual level, the idea is that a decision made by the controller at every instant of time is considered as a game against all the decisions made by the future incarnations of the controller. An "equilibrium" control is therefore one such that deviating from it at any time instant leaves the controller worse off. The study of time inconsistency by economists dates back to Strotz [23] and Phelps ([21, 22]) in models with discrete time (see [17] and [18] for further developments), and was adapted by Karp ([15, 16]) and by Ekeland and Lazrak ([5, 6, 7, 8, 9, 10]) to the case of continuous time. For LQ control problems, Yong [24] studied a time-inconsistent deterministic model and derived equilibrium controls via some integral equations.
It is natural to study time inconsistency in stochastic models. Ekeland and Pirvu [11] studied non-exponential discounting, which leads to time inconsistency in an agent's investment-consumption policies in a Merton model. Grenadier and Wang [12] also studied the hyperbolic discounting problem in an optimal stopping model. In Markovian systems, Björk and Murgoci [3] proposed a definition of a general stochastic control problem with time-inconsistent terms, and gave a sufficient condition for a control to be a solution in terms of a system of integro-differential equations. They constructed solutions for some examples, including an LQ one, but it appears very hard to find not-too-harsh conditions on the parameters that ensure the existence of a solution. Björk, Murgoci and Zhou [4] also constructed an equilibrium for a mean-variance portfolio selection problem with state-dependent risk aversion. Basak and Chabakauri [1] studied the mean-variance portfolio selection problem and obtained more details on the constructed solution. Hu, Jin and Zhou [13, 14] studied the general LQ control problem with time-inconsistent terms and constructed a unique equilibrium in a quite general setting, including non-Markovian systems.
To the best of our knowledge, most time-inconsistent problems studied so far are control problems, even though a game formulation is used to define their equilibria. In game theory, the literature on time inconsistency is scarce [2, 19]. Moreover, the definitions of equilibrium strategies in these two papers are still based on corresponding control problems. In this paper, we formulate a general stochastic LQ differential game, where the objective functional of each player includes both a quadratic term of the expected state and a state-dependent term. Each of these non-standard terms introduces time inconsistency into the problem in a somewhat different way. We define our equilibrium via open-loop controls. Then we derive a general sufficient condition for equilibrium strategies through a system of forward-backward stochastic differential equations (FBSDEs). An intriguing feature of these FBSDEs is that a time parameter is involved, so they form a flow of FBSDEs. When the state process is scalar-valued and all the coefficients are deterministic functions of time, we are able to reduce this flow of FBSDEs to several Riccati-like ODEs. Compared with the ODEs in [13], the unknowns here are matrix-valued because there are two players, even though the state process is scalar-valued. Therefore, such ODEs are harder to solve than those of [13]. Under some stronger conditions, we obtain an explicit equilibrium strategy, which turns out to be a linear feedback. We also prove that the equilibrium strategy we obtain is unique.
The rest of the paper is organized as follows. The next section is devoted to the formulation of our problem and the definition of an equilibrium strategy. In Section 3, we apply the spike variation technique to derive a flow of FBSDEs and a sufficient condition for equilibrium strategies. Based on this general result, we solve in Section 4 the case when the state is one-dimensional and all the coefficients are deterministic. The uniqueness of this equilibrium strategy is also proved in that section.

Problem setting
Let $T > 0$ be the end of a finite time horizon, and let $(W_t)_{0\le t\le T} = (W_t^1, \dots, W_t^d)_{0\le t\le T}$ be a $d$-dimensional Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Denote by $(\mathcal{F}_t)$ the augmented filtration generated by $(W_t)$.
As in [13], let $\mathbb{S}^n$ be the set of symmetric $n\times n$ real matrices; $L^2_{\mathcal{F}}(\Omega, \mathbb{R}^l)$ be the set of square-integrable random variables; $L^2_{\mathcal{F}}(t, T; \mathbb{R}^n)$ be the set of $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted square-integrable processes; and $L^2_{\mathcal{F}}(\Omega; C(t, T; \mathbb{R}^n))$ be the set of continuous $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted square-integrable processes.
We consider a continuous-time, $n$-dimensional nonhomogeneous linear controlled system (cf. [13]) [...] Here $A$ is a bounded deterministic function on $[0, T]$ with values in $\mathbb{R}^{n\times n}$. The other parameters $B_1, B_2, C, D_1, D_2$ are all essentially bounded adapted processes on $[0, T]$ with values in $\mathbb{R}^{l\times n}$, $\mathbb{R}^{l\times n}$, $\mathbb{R}^{n\times n}$, $\mathbb{R}^{n\times l}$, $\mathbb{R}^{n\times l}$, respectively; $b$ and $\sigma^j$ are stochastic processes in $L^2_{\mathcal{F}}(0, T; \mathbb{R}^n)$. The processes $u_i \in L^2_{\mathcal{F}}(0, T; \mathbb{R}^l)$, $i = 1, 2$, are the controls, and $X$ is the state process valued in $\mathbb{R}^n$. Finally, $x_0 \in \mathbb{R}^n$ is the initial state. It is obvious that for any controls [...] As time evolves, we need to consider the controlled system starting from time $t \in [0, T]$ and state [...] For any controls $u_i \in L^2_{\mathcal{F}}(0, T; \mathbb{R}^l)$, $i = 1, 2$, there exists a unique solution $X^{t,x_t,u_1,u_2} \in L^2_{\mathcal{F}}(\Omega, C(0, T; \mathbb{R}^n))$.
We consider a two-person differential game problem. At any time $t$ with the system state $X_t = x_t$, the aim of the $i$-th player ($i = 1, 2$) is to minimize her cost (a maximization problem can be handled by multiplying the cost functional by $-1$): [...], where $X = X^{t,x_t,u_1,u_2}$, and [...] Here, for $i = 1, 2$, $Q_i$ and $R_i$ are both given essentially bounded adapted processes on $[0, T]$ with values in $\mathbb{S}^n$ and $\mathbb{S}^l$, respectively, and $G_i, h_i, \lambda_i, \mu_i$ are all constants in $\mathbb{S}^n$, $\mathbb{S}^n$, $\mathbb{R}^{n\times n}$ and $\mathbb{R}^n$, respectively. Furthermore, we assume that $Q_i$, $R_i$ are non-negative definite almost surely and that $G_i$ are non-negative definite.
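As a reading aid, the displays for the state dynamics and the cost functionals can be sketched as follows. This is an assumed form, modeled on the single-controller setting of [13] and chosen to agree with the coefficient dimensions listed above. The superscript $j$ on $C^j$, $D_i^j$, $\sigma^j$ (one copy per Brownian component) and the exact signs and normalizations in $J_i$ are assumptions, not a transcription of the original displays.

```latex
% Assumed form of the controlled state equation (cf. [13]); not a verbatim copy.
\begin{equation*}
\begin{cases}
dX_s = \bigl[A_s X_s + B_{1,s}' u_{1,s} + B_{2,s}' u_{2,s} + b_s\bigr]\,ds
     + \displaystyle\sum_{j=1}^{d}\bigl[C_s^j X_s + D_{1,s}^j u_{1,s} + D_{2,s}^j u_{2,s} + \sigma_s^j\bigr]\,dW_s^j,
     \quad s\in[0,T],\\[2pt]
X_0 = x_0 .
\end{cases}
\end{equation*}

% Assumed form of player i's cost at time t with X_t = x_t (i = 1, 2).
\begin{equation*}
\begin{aligned}
J_i(t, x_t; u_1, u_2)
 ={}& \tfrac12\,\mathbb{E}_t\!\int_t^T \bigl[\langle Q_{i,s} X_s, X_s\rangle
      + \langle R_{i,s} u_{i,s}, u_{i,s}\rangle\bigr]\,ds
      + \tfrac12\,\mathbb{E}_t\bigl[\langle G_i X_T, X_T\rangle\bigr]\\
    & - \tfrac12\,\bigl\langle h_i\,\mathbb{E}_t[X_T],\, \mathbb{E}_t[X_T]\bigr\rangle
      - \bigl\langle \lambda_i x_t + \mu_i,\, \mathbb{E}_t[X_T]\bigr\rangle .
\end{aligned}
\end{equation*}
```

In a functional of this shape, the $h_i$ term is the quadratic term of the expected state and the $\lambda_i x_t$ term is the state-dependent term; these are the two sources of time inconsistency named in the abstract.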
Given a control pair $(u_1^*, u_2^*)$, for any $t \in [0, T)$, $\varepsilon > 0$, and [...] (2.4)
[...] $L^2_{\mathcal{F}}(0, T; \mathbb{R}^l)$ be a given strategy pair, and let $X^*$ be the state process corresponding to $(u_1^*, u_2^*)$. The strategy pair [...] where $u_i^{t,\varepsilon,v_i}$, $i = 1, 2$, are defined by (2.4), for any $t \in [0, T)$ and $v_1, v_2 \in L^2_{\mathcal{F}_t}(\Omega, \mathbb{R}^l)$.
Remark. The "$\ge$" in (2.5)-(2.6) reflects the fact that each player wants to minimize his or her cost, as stated above. The above definition means that, at each time $t$, the equilibrium is a static Nash equilibrium in a corresponding game.
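The perturbed controls and the equilibrium inequalities referred to as (2.4)-(2.6) can be sketched as follows. The form below is an assumption based on the open-loop equilibrium definition used in [13]; the indicator-type spike on $[t, t+\varepsilon)$ and the use of $\liminf$ are not recovered from this copy of the text.

```latex
% Assumed form of (2.4): spike variation of player i's control on [t, t + eps).
\begin{equation*}
u^{t,\varepsilon,v_i}_{i,s} = u^*_{i,s} + v_i\,\mathbf{1}_{\{s\in[t,\,t+\varepsilon)\}},
\qquad s\in[t,T],\quad i = 1, 2 .
\end{equation*}

% Assumed form of (2.5)-(2.6): only one player's control is perturbed at a time.
\begin{align*}
\liminf_{\varepsilon\downarrow 0}
 \frac{J_1\bigl(t, X_t^*;\, u_1^{t,\varepsilon,v_1},\, u_2^*\bigr)
       - J_1\bigl(t, X_t^*;\, u_1^*,\, u_2^*\bigr)}{\varepsilon} &\ge 0,\\
\liminf_{\varepsilon\downarrow 0}
 \frac{J_2\bigl(t, X_t^*;\, u_1^*,\, u_2^{t,\varepsilon,v_2}\bigr)
       - J_2\bigl(t, X_t^*;\, u_1^*,\, u_2^*\bigr)}{\varepsilon} &\ge 0 .
\end{align*}
```

Each inequality perturbs one player's control while the other is held at $u^*$, which is why the equilibrium reads as a static Nash equilibrium at each time $t$.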

Sufficient conditions
Let $(u_1^*, u_2^*)$ be a fixed strategy pair, and let $X^*$ be the corresponding state process. For any $t \in [0, T)$, define in the time interval $[t, T]$ the processes $(p_i(\cdot; t), k_i(\cdot; t))$ [...], $i = 1, 2$, as the solutions to the following equations: [...] for $i = 1, 2$. From the assumption that $Q_i$ and $G_i$ are non-negative definite, it follows that $P_i(s; t)$ are non-negative definite for $i = 1, 2$. [...] where [...]
Proof. Let $X^{t,\varepsilon,v_1,v_2}$ be the state process corresponding to $u_i^{t,\varepsilon,v_i}$, $i = 1, 2$. Then by the standard perturbation approach (cf. [20, 13] or pp. 126-128 of [25]), we have [...] where [...] Moreover, by Theorem 4.4 in [25], we have [...]
Because $R_{i,s}$ and $P_i(s; t)$, $i = 1, 2$, are non-negative definite, $H_i(s; t)$, $i = 1, 2$, are also non-negative definite. In view of (3.9)-(3.10), a sufficient condition for an equilibrium is [...] Similar to Proposition 3.3 of [14], we have the following lemma:
Lemma 3.2 For any triple of state and control processes [...]
Therefore, we have another characterization of equilibrium strategies: [...] Denote $X^*$ as the state process, and $(p_i(\cdot; t), k_i(\cdot; t))$ [...] as the unique solution of the BSDE (3.7), with $k_i(s) = k_i(s; t)$ according to Lemma 3.2, for $i = 1, 2$, respectively. For $i = 1, 2$, letting [...] then $u^*$ is an equilibrium strategy if and only if [...]
Proof. The proof follows from Lemma 3.4 of [14] and Theorem 3.4.
The following is the main general result for the time-inconsistent stochastic LQ differential game.
Theorem 3.4 [...] is an equilibrium strategy pair if the following two conditions hold for any time $t$: (i) The system of SDEs [...]
Proof. Given a strategy pair [...] proving the first condition of Definition 2.1; the proof of the second condition is similar.
Theorem 3.4 involves the existence of solutions to a flow of FBSDEs along with other conditions. The system (3.26) is more complicated than system (3.6) in [13]. As stated in [13], "proving the general existence for this type of FBSDEs remains an outstanding open problem"; this is also true for our system (3.26).
In the rest of this paper, we will focus on the case when $n = 1$. When $n = 1$, the state process $X$ is a scalar-valued process evolving by the dynamics [...] where $A$ is a bounded deterministic scalar function on $[0, T]$. The other parameters $B, C, D$ are all essentially bounded and $\mathcal{F}_t$-adapted processes on $[0, T]$ with values in $\mathbb{R}^l$, $\mathbb{R}^d$, $\mathbb{R}^{d\times l}$, respectively. Moreover, $b \in L^2_{\mathcal{F}}(0, T; \mathbb{R})$ and $\sigma \in L^2_{\mathcal{F}}(0, T; \mathbb{R}^d)$. In this case, the adjoint equations for the equilibrium strategy become [...] for $i = 1, 2$. For convenience, we also state here the $n = 1$ version of Theorem 3.4: [...] is an equilibrium strategy pair if, for any time $t \in [0, T)$, (i) the system of SDEs [...]

Existence and uniqueness of equilibrium strategy when coefficients are deterministic
The unique solvability of (3.31) remains a challenging open problem even for the case $n = 1$. However, we are able to solve this problem when the parameters [...] Throughout this section we assume all the parameters are deterministic functions of $t$. In this case, since $G_1, G_2$ have also been assumed to be deterministic, the BSDEs (3.30) turn out to be ODEs with solutions [...]

An intuitive idea and the uniqueness of the equilibrium strategy
As in classical LQ control, we attempt to look for a linear feedback equilibrium strategy pair. For this purpose, motivated by [13], given any $t \in [0, T]$, we consider the following process:
$$p_i(s; t) = M_{i,s} X_s^* - N_{i,s}\,\mathbb{E}_t[X_s^*] - \Gamma_{i,s} X_t^* + \Phi_{i,s}, \qquad i = 1, 2, \tag{4.32}$$
where $M_i, N_i, \Gamma_i, \Phi_i$ are deterministic differentiable functions with $\dot M_i = m_i$, $\dot N_i = n_i$, $\dot\Gamma_i = \gamma_i$ and $\dot\Phi_i = \varphi_i$ for $i = 1, 2$. The advantage of this process is that it separates the variables $X_s^*$, $\mathbb{E}_t[X_s^*]$ and $X_t^*$ in the solutions $p_i(s; t)$, $i = 1, 2$, thereby reducing the complicated FBSDEs to some ODEs.
For any fixed $t$, applying Itô's formula to (4.32) in the time variable $s$, we obtain, for $i = 1, 2$, [...] Notice that $k_i(s; t)$ turns out to be independent of $t$.
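The Itô computation behind this step can be sketched as follows. The sketch takes the ansatz $p_i(s;t) = M_{i,s}X_s^* - N_{i,s}\mathbb{E}_t[X_s^*] - \Gamma_{i,s}X_t^* + \Phi_{i,s}$ and the diffusion coefficient that appear in the final verification of this section; the drift $A X + B_1'u_1 + B_2'u_2 + b$ of the scalar state equation is an assumption, since that display is not reproduced in this copy.

```latex
% Fixed t; X_t^* is F_t-measurable, hence constant in s. Argument s suppressed.
% Assumed scalar dynamics (deterministic coefficients, as in this section):
%   dX^* = [A X^* + B_1' u_1^* + B_2' u_2^* + b] ds
%        + [C X^* + D_1 u_1^* + D_2 u_2^* + \sigma]' dW .
\begin{align*}
d\,\mathbb{E}_t[X_s^*]
 &= \mathbb{E}_t\bigl[A X^* + B_1'u_1^* + B_2'u_2^* + b\bigr]\,ds,\\[2pt]
dp_i(s;t)
 &= \Bigl\{ m_i X^* + M_i\bigl[A X^* + B_1'u_1^* + B_2'u_2^* + b\bigr]
      - n_i\,\mathbb{E}_t[X^*]\\
 &\qquad - N_i\,\mathbb{E}_t\bigl[A X^* + B_1'u_1^* + B_2'u_2^* + b\bigr]
      - \gamma_i X_t^* + \varphi_i \Bigr\}\,ds\\
 &\quad + M_i\bigl[C X^* + D_1 u_1^* + D_2 u_2^* + \sigma\bigr]'\,dW_s .
\end{align*}
% Matching the dW-term with k_i(s;t)' dW_s in the adjoint equation gives
\begin{equation*}
k_i(s;t) = M_{i,s}\bigl[C_s X_s^* + D_{1,s}u_{1,s}^* + D_{2,s}u_{2,s}^* + \sigma_s\bigr],
\end{equation*}
% which contains no t: only the drift of p_i(.;t) depends on the reference time t.
```

Only the $M_i X^*$ term carries a martingale part, because $\mathbb{E}_t[X_s^*]$ is of finite variation in $s$ and $X_t^*$ is fixed; this is the reason the diffusion coefficient does not depend on $t$.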
Putting the above expressions (4.32) and (4.34) of $p_i(s; t)$ and $k_i(s; t)$, $i = 1, 2$, into (3.25), we have [...] (4.35) for $i = 1, 2$. Then we can formally deduce [...] and hence [...] Next, comparing the $ds$ term of $dp_i(s; t)$ in (3.31) and (4.33) (we suppress the argument $s$ here), we have [...] due to the omission of $s$. This leads to the following equations for $M_i, N_i, \Gamma_i, \Phi_i$: [...] Although $M_i, N_i, \Gamma_i, \Phi_i$, $i = 1, 2$, are scalars, $M, N, \Gamma, \Phi$ are now matrices because there are two players. Therefore, the above equations are more complicated than the similar equations (4.5)-(4.8) in [13]. Before we solve the equations (4.41)-(4.44), we first prove that, if it exists, the equilibrium constructed above is the unique equilibrium. Indeed, we have [...] and [...] exist, and for $i = 1, 2$, $(p_i(s; t), k_i(s; t)) \in$ L 1 × L 2 , the equilibrium strategy is unique.
Proof. Suppose there is another equilibrium $(X, u_1, u_2)$. Then the equation system (3.7), with $X^*$ replaced by $X$, admits a solution $(p_i(s; t), k_i(s), u_{i,s})$ for $i = 1, 2$, which satisfies $B_{i,s}\,p_i(s; s)$ [...] where $k_i(s) = k_i(s; t)$ by Lemma 3.2.
We define $p(s; t) = \mathrm{diag}(p_1(s; t) I_l,\, p_2(s; t) I_l)$, $\bar p(s; t) = \mathrm{diag}(\bar p_1(s; t) I_l,\, \bar p_2(s; t) I_l)$, and $u = (u_{1,s}', u_{2,s}')'$. By the equilibrium condition (3.25), we have [...] and hence [...] for $i = 1, 2$, where we suppress the subscript $s$ for the parameters, and we have used the equations (4.41)-(4.44) for $M_i, N_i, \Gamma_i, \Phi_i$ in the last equality. From (4.47) and (4.48), we have $(\bar p_i, \bar k_i) \in$ L 1 × L 2 . Therefore, by Theorem 4.2 of [14], we have $\bar p(s; t) \equiv 0$ and $\bar k(s) \equiv 0$. Finally, plugging $\bar p \equiv \bar k \equiv 0$ into $u$ of (4.50), we see that $u$ has the same feedback form as in (4.36), and hence $(X, u_1, u_2)$ coincides with $(X^*, u_1^*, u_2^*)$ obtained before.

Existence of the equilibrium strategies
The solutions to [...] Finally, once we get the solution for $(M_1, M_2, N_1)$, (4.44) is a simple ODE. Therefore, it is crucial to solve (4.54).
Formally, we define $\widetilde M = M_1 / M_2$ and $J_1 = M_1 / N_1$ and study the following equation for $(M_1, \widetilde M, J_1)$: [...] where [...] $\int_s^T A_t\,dt$ [...] By a direct calculation, we have
Proposition 4.2 If the system (4.55) admits a positive solution $(M_1, \widetilde M, J_1)$, then the system (4.54) admits a solution $(M_1, M_2, N_1)$.
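The "direct calculation" behind Proposition 4.2 can be sketched as a change of variables. The sketch assumes that the two definitions above are ratios, written here as $\widetilde M = M_1/M_2$ and $J_1 = M_1/N_1$ (the tilde and the fraction bars are an assumption about the original typesetting); the right-hand sides of (4.54) and (4.55) are not reproduced.

```latex
% Given a positive solution (M_1, \widetilde M, J_1) of (4.55), set
\begin{equation*}
M_2 := \frac{M_1}{\widetilde M}, \qquad N_1 := \frac{M_1}{J_1},
\end{equation*}
% which is well defined because \widetilde M > 0 and J_1 > 0. By the quotient rule,
\begin{equation*}
\dot M_2 = \frac{\dot M_1\,\widetilde M - M_1\,\dot{\widetilde M}}{\widetilde M^{2}},
\qquad
\dot N_1 = \frac{\dot M_1\,J_1 - M_1\,\dot J_1}{J_1^{2}} .
\end{equation*}
% Substituting the equations of (4.55) for \dot M_1, \dot{\widetilde M}, \dot J_1
% and replacing \widetilde M, J_1 by M_1/M_2, M_1/N_1 should return the equations
% of (4.54) for (M_1, M_2, N_1).
```

Positivity of the solution is what makes the inverse change of variables legitimate, which is why the proposition asks for a positive solution of (4.55).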
In the following, we will use the truncation method to study the system (4.55). For convenience, we use the following notations: [...] Moreover, for a matrix $M \in \mathbb{R}^{m\times n}$ and a real number $c$, we define [...] (4.59)
We first consider the standard case where $R - \delta I \succeq 0$ for some $\delta > 0$. We have [...]
Proof. For fixed $c > 0$ and $K > 0$, consider the following truncated system of (4.55): [...] (4.61)
Since $R - \delta I \succeq 0$, the above system (4.60) is locally Lipschitz with linear growth, and hence it admits a unique solution $(M_1^{c,K}, \widetilde M^{c,K}, J_1^{c,K})$. We will omit the superscript $(c, K)$ when there is no confusion. We are going to prove that $J_1 \ge 1$ and that $M_1, \widetilde M \in [L_1, L_2]$ for some $L_1, L_2 > 0$ independent of the constants $c$ and $K$ appearing in the truncation functions. We denote [...] Then $\lambda^{(2)}$ is bounded, and $M_1$ satisfies [...]
The equation for $\widetilde M$ is [...] (4.64) hence $\widetilde M$ admits an upper bound $L_2$ independent of $c$ and $K$. Choosing $K = L_2$ and examining (4.64) again, we deduce that there exists $L_1 > 0$ independent of $c$ and $K$ such that $\widetilde M \ge L_1$. Indeed, we can choose [...] As a result, choosing $c < L_1$, the terms $\widetilde M \vee c$ [...] can be replaced by $\widetilde M$ [...] $M = \mathrm{diag}\bigl(M_1 I_l,\, \tfrac{M_1}{\widetilde M} I_l\bigr)$, respectively, in (4.60) without changing their values.
Now we prove $J_1 \ge 1$. Denote $\widetilde J = J_1 - 1$; then $\widetilde J$ satisfies the ODE $\dot{\widetilde J} = -\lambda^{(1)} \widetilde J - [\lambda^{(1)}$ [...] $]\,\widetilde J - a^{(1)}$, (4.65) where $a^{(1)} = \lambda^{(1)}$ [...] and consequently $a^{(1)} \ge \mathrm{tr}\{(R + M D' D)^{-1} H\} \ge 0$. We then deduce that $\widetilde J \ge 0$, and hence $J_1 \ge 1$. The boundedness of $M_1$ can be proved by an argument similar to the proof of Theorem 4.2 in [13].
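The step from $a^{(1)} \ge 0$ to $\widetilde J \ge 0$ is a standard sign argument for a linear terminal-value ODE, which can be sketched as follows. The generic coefficient $c(s)$ stands for whatever bounded function multiplies $\widetilde J$ in (4.65), and the terminal condition $y(T) \ge 0$ is an assumption; neither is recovered from this copy of the text.

```latex
% Generic scalar linear ODE on [0,T] with bounded c and a >= 0 (hypothetical names):
\begin{equation*}
\dot y(s) = -c(s)\,y(s) - a(s), \qquad y(T) = y_T \ge 0 .
\end{equation*}
% Integrating factor: d/ds [ e^{-\int_s^T c} y(s) ] = -e^{-\int_s^T c} a(s).
% Integrating over [s,T] and solving for y(s):
\begin{equation*}
y(s) = e^{\int_s^T c(r)\,dr}\,y_T
     + \int_s^T e^{\int_s^r c(\tau)\,d\tau}\,a(r)\,dr \;\ge\; 0,
\qquad s\in[0,T].
\end{equation*}
% Both terms are non-negative because y_T >= 0 and a >= 0; applied with
% y = \widetilde J = J_1 - 1 this yields J_1 >= 1.
```

The sign of $c$ plays no role in the argument; only its boundedness is needed for the exponentials to be finite.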
Similarly, for the singular case $R \equiv 0$, we have [...] Combining the above two theorems, we can present the main result of this section: [...]
Proof. Define $p_i(s; t)$ and $k_i(s; t)$ by (4.32) and (4.34), respectively. It is straightforward to check that $(u_1^*, u_2^*, X^*, p_1, p_2, k_1, k_2)$ satisfies the system of SDEs (3.31). Moreover, in both cases, we can check that $\alpha_{i,s}$ and $\beta_{i,s}$ in (4.36) are all uniformly bounded, and hence $u_i^* \in L^2_{\mathcal{F}}(0, T; \mathbb{R}^l)$ and $X^* \in L^2(\Omega; C(0, T; \mathbb{R}))$. Finally, denote $\Lambda_i(s; t) = R_{i,s} u_{i,s}^* + p_i(s; t) B_{i,s} + (D_{i,s})' k_i(s; t)$, $i = 1, 2$. Plugging $p_i$, $k_i$, $u_i^*$ defined in (4.32), (4.34) and (4.36) into $\Lambda_i$, we have
$$\Lambda_i(s; t) = R_{i,s} u_{i,s}^* + \bigl(M_{i,s} X_s^* - N_{i,s}\,\mathbb{E}_t[X_s^*] - \Gamma_{i,s} X_t^* + \Phi_{i,s}\bigr) B_{i,s} + M_{i,s} D_{i,s}' \bigl[C_s X_s^* + D_{1,s} u_{1,s}^* + D_{2,s} u_{2,s}^* + \sigma_s\bigr]. \tag{4.69}$$
Setting $s = t$, suppressing the index $i$ and writing the two players' equations in stacked form, the right-hand side of (4.69) becomes
$$\begin{aligned}
&(R_t + M_t D_t' D_t)\,u_t^* + M_t (B_t + D_t' C_t) X_t^* - N_t B_t\,\mathbb{E}_t[X_t^*] - \Gamma_t B_t X_t^* + (\Phi_t B_t + M_t D_t' \sigma_t)\\
&\quad = -\bigl[(M_t - N_t - \Gamma_t) B_t + M_t D_t' C_t\bigr] X_t^* - (\Phi_t B_t + M_t D_t' \sigma_t)\\
&\qquad + M_t (B_t + D_t' C_t) X_t^* - N_t B_t X_t^* - \Gamma_t B_t X_t^* + (\Phi_t B_t + M_t D_t' \sigma_t)\\
&\quad = 0.
\end{aligned} \tag{4.70}$$
Therefore, $\Lambda_i$ satisfies the second condition in (3.25).