Switching Game of Backward Stochastic Differential Equations and Associated System of Obliquely Reflected Backward Stochastic Differential Equations

This paper is concerned with the switching game of a one-dimensional backward stochastic differential equation (BSDE). The associated Bellman-Isaacs equation is a system of matrix-valued BSDEs living in a special unbounded convex domain with reflection on the boundary along an oblique direction. In this paper, we show the existence of an adapted solution to this system of BSDEs with oblique reflection by the penalization method, the monotone convergence, and the a priori estimates.


Introduction
Let (Ω, F, P ) be a complete probability space, carrying a standard Brownian motion W = {W t } t≥0 with values in R d . {F t } t≥0 is the natural filtration of the Brownian motion W augmented by the P -null sets of F.
Consider two players I and II, who use their respective switching control processes a(·) and b(·) to control the following BSDE:
Here, ξ is an m-dimensional random variable measurable with respect to the past of W up to time T . ξ is called the terminal condition and ψ is called the coefficient (also called the generator). A a(·) (·) and B b(·) (·) are the cost processes associated with the switching control processes a(·) and b(·), respectively; they are càdlàg processes. Under suitable conditions, the above BSDE has a unique adapted solution, denoted by (U a(·),b(·) , V a(·),b(·) ).
Player I chooses the switching control a(·) from a given finite set Λ to minimize the cost
J(a(·), b(·)) = U a(·),b(·) (0), (1.2)
and each of his instantaneous switchings from one scheme i ∈ Λ to a different scheme i ′ ∈ Λ incurs a positive cost specified by the function k(i, i ′ ), (i, i ′ ) ∈ Λ × Λ. Player II, in turn, chooses the switching control b(·) from a given finite set Π to maximize the cost J(a(·), b(·)), and each of his instantaneous switchings from one scheme j ∈ Π to a different scheme j ′ ∈ Π incurs a positive cost specified by the function l(j, j ′ ), (j, j ′ ) ∈ Π × Π. We are interested in the existence and the construction of the value process as well as the saddle point. The solution of the above switching game relies on the following new type of reflected backward stochastic differential equation (RBSDE for short) with oblique reflection: for (i, j) ∈ Λ × Π and t ∈ [0, T ],
One-dimensional RBSDEs were first studied by El Karoui et al. [7] in the case of one obstacle, and then by Cvitanić and Karatzas [3] in the case of two obstacles. In both papers, it is recognized that one-dimensional reflected BSDEs, with one obstacle and with two obstacles, are generalizations of optimal stopping and Dynkin games, respectively. Nowadays, the literature on one-dimensional reflected BSDEs is very rich. The reader is referred to Hamadène and Hassani [9] and Peng and Xu [17], among others, for the one-dimensional reflected BSDEs with two obstacles.
Multidimensional RBSDEs were studied by Gegout-Petit and Pardoux [8], but their BSDE is reflected on the boundary of a convex domain along the inward normal direction, and their method depends heavily on the properties of this inward normal reflection (see (1)-(3) in [8]). We note that in a very special case (e.g., ψ is independent of z), Ramasubramanian [18] studied a BSDE in an orthant with oblique reflection. The study of multi-dimensional BSDEs reflected along an oblique direction rather than a normal direction remains open in general, even in a convex domain, let alone in a nonconvex domain. Note that there are some papers dealing with SDEs with oblique reflection (see, e.g., [14] and [5]).
Recently, the authors [13] studied the optimal switching problem for one-dimensional BSDEs, and the following associated type of obliquely reflected multi-dimensional BSDEs:
It should be added that an incomplete and less general form of RBSDE (1.4) (where the minimal condition of (1.4) is missing and the generator ψ does not depend on (y, z)) is suggested by [2]. However, the existence and uniqueness of a solution, which is considered to be difficult, is not discussed there; see Remark 3.1 in [2]. Lately, Hamadène and Zhang [11] also studied RBSDE (1.4) in a more general form. More recently, Tang and Zhong [22] discussed the mixed switching and stopping problem for one-dimensional BSDEs, and obtained the existence and uniqueness result for the following associated type of obliquely reflected multi-dimensional BSDEs:
Here, S is a previously given {F t , 0 ≤ t ≤ T }-adapted process with some suitable regularity.
RBSDE (1.3) arises from the switching game for BSDEs, and its form is more complicated than that of RBSDE (1.4), which arises from the optimal switching problem for BSDEs. For each fixed j ∈ Π, if we do not impose the following constraint: (1.6) and its related boundary condition: for any i, i ′ ∈ Λ such that i ′ ≠ i and j ∈ Π; y ij > y ij ′ − l(j, j ′ ) for any j, j ′ ∈ Π such that j ′ ≠ j and i ∈ Λ , which is convex and unbounded. The boundary ∂Q of domain Q consists of the boundaries ∂D − ij and ∂D + ij , (i, j) ∈ Λ × Π, with
In the interior of Q, each equation in (1.3) is independent of the others. On the boundary, say ∂D − ij (resp. ∂D + ij ), the (i, j)-th equation is switched to another one (i ′ , j) (resp. (i, j ′ )), and the solution is reflected along the oblique direction −e ij (resp. e ij ), which is the negative (resp. positive) direction of the (i, j)-th coordinate axis. The existence and uniqueness of a solution for RBSDE (1.3) constitutes a main contribution of this paper.
We prove the existence by a penalization method, whereas the uniqueness is obtained by a verification method: first we introduce an optimal switching problem for multi-dimensional RBSDEs of the form (1.4); then we prove that the first component Y of any adapted solution (Y, Z, K, L) of RBSDE (1.3) is the (vector) value for the optimal switching problem.
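To make the geometry of the domain Q described above more concrete, the following sketch writes Q out in full and records why it is convex and unbounded. The first family of inequalities (those involving k) is an assumption, inferred by symmetry from the second family (those involving l) and from the roles of the two players; the symbol \mathbf{1} for the matrix with all entries equal to one is notation introduced here.

```latex
% Assumed full form of Q; the k-inequalities are inferred by symmetry.
\[
Q := \Bigl\{ y=(y_{ij}) \in \mathbb{R}^{m_1\times m_2} :\;
   y_{ij} < y_{i'j} + k(i,i') \ \ \forall\, i'\neq i,\ j\in\Pi;\quad
   y_{ij} > y_{ij'} - l(j,j') \ \ \forall\, j'\neq j,\ i\in\Lambda \Bigr\}.
\]
% Convexity: every constraint is a strict linear inequality in y,
% so Q is an intersection of finitely many open half-spaces.
% Nonemptiness: y = 0 belongs to Q because k and l are positive.
% Unboundedness: the constraints involve only differences of components, hence
\[
y \in Q \ \Longrightarrow\ y + c\,\mathbf{1} \in Q \quad \text{for every } c\in\mathbb{R}.
\]
```

Under this reading, Q is a polyhedral set containing a full line, which is consistent with the statement that it is convex and unbounded.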
Solving RBSDE (1.3) presents new difficulties when one follows either our previous work [13] using the penalization method, or Hamadène and Zhang [11] using a Picard approximation. In fact, even for the proof of the existence, we have to use the representation for obliquely reflected BSDEs, an extended version of our previous representation result proved in [13]. Moreover, we have to impose, for the proof of the existence, the additional technical condition that the generator ψ is uniformly bounded.
There exist different methods in the literature for the study of switching control and game problems. For the classical method of quasi-variational inequalities, the reader is referred to the book of Bensoussan and Lions [1]. See Tang and Yong [21] and Tang and Hou [20] and the references therein for the theory of variational inequalities and dynamic programming for optimal stochastic switching control and switching games. But these works are restricted to the Markovian case. Recently, using the method of the Snell envelope (see, e.g., El Karoui [6]) combined with the theory of scalar-valued RBSDEs, Hamadène and Jeanblanc [10] studied the switching problem with two modes (i.e., m = 2) in the non-Markovian context. Djehiche, Hamadène and Popier [4] generalized their result to the above switching problem with multiple modes. The BSDE approach, first developed in Hu and Tang [13] for optimal stochastic switching, takes advantage of the modern theory and techniques of BSDEs and makes it possible to state and solve these problems in a general non-Markovian framework. This paper is devoted to the development of the BSDE approach for stochastic switching games.
The paper is organized as follows. In Section 2, we introduce some notation and formulate the switching game for one-dimensional BSDEs. We prove the existence of a solution by a penalization method in Section 3, and study the uniqueness in Section 4. The last section is devoted to the proof of the existence of the value process and the construction of the saddle point for our switching game.
Notations and Formulation of our switching game

Notations
Let us fix a positive real number T > 0. First of all, W = {W t } t≥0 is a standard Brownian motion with values in R d defined on some complete probability space (Ω, F, P ). {F t } t≥0 is the natural filtration of the Brownian motion W augmented by the P -null sets of F. All the measurability notions will refer to this filtration. In particular, the sigma-field of predictable subsets of [0, T ] × Ω is denoted by P.
We denote by S 2 (R m 1 ×m 2 ), or simply by S 2 , the set of R m 1 ×m 2 -valued, adapted and càdlàg processes
We denote by M 2 ((R m 1 ×m 2 ) d ), or simply by M 2 , the set of (equivalence classes of) predictable processes
M 2 is then a Banach space endowed with this norm.
We also define
is then a Banach space.
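For orientation, the norms usually attached to the spaces S 2 and M 2 in the BSDE literature are sketched below. They are stated here as an assumption about the intended definitions, not as a quotation; the symbols \|\cdot\|_{S^2} and \|\cdot\|_{M^2} are notation introduced for this sketch.

```latex
% Standard (assumed) norms for adapted cadlag processes Y and predictable processes Z.
\[
\|Y\|_{S^2} := \Bigl( E\Bigl[\,\sup_{0\le t\le T} |Y(t)|^2\Bigr] \Bigr)^{1/2},
\qquad
\|Z\|_{M^2} := \Bigl( E\Bigl[\int_0^T |Z(t)|^2\,dt\Bigr] \Bigr)^{1/2}.
\]
```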

Hypotheses
Consider now the RBSDE (1.3). The generator ψ is a random function ψ : for each pair (i, j) ∈ Λ × Π, and the terminal condition ξ is simply an R m 1 ×m 2 -valued F T -measurable random variable. The cost functions k and l for the two players are defined on Λ × Λ and Π × Π, respectively; their values are both positive. We assume the following Lipschitz condition on the generator.
We make the following assumption on the cost functions k and l of both players, which is standard in the literature.

Statement of our switching game
Let {θ j } ∞ j=0 be an increasing sequence of stopping times with values in [0, T ] and, for every j, let α j be an F θ j -measurable random variable with values in Λ; χ denotes the indicator function. We assume moreover that there exists an integer-valued random variable N (·) such that θ N = T P -a.s. and N ∈ L 2 (F T ). Then we define the admissible switching strategy for Player I as follows:
We denote by A the set of all these admissible switching strategies for Player I, and by A i the subset of A consisting of admissible switching strategies starting from the scheme i ∈ Λ. In the same way, we denote by A t the set of all the admissible strategies for Player I starting at time t (or equivalently θ 0 = t ), and by A i t the subset of A t consisting of admissible switching strategies starting at time t from the scheme i ∈ Λ.
For any a(·) ∈ A, we define the associated (cost) process A a(·) as follows: Obviously, A a(·) (·) is a càdlàg process.
In an identical way, we define the admissible switching strategy b(·) for Player II, and introduce the notations B, B j , B t , B j t for the scheme j ∈ Π for Player II, as well as
Now we are in a position to introduce the switched BSDEs for both players.
This is a (slightly) generalized BSDE: it is equivalent to the following standard BSDE: via the simple change of variable: Hence, BSDE (2.3) has a solution in S 2 × M 2 . We denote by U a(·),b(·) , V a(·),b(·) this solution. We note that U is only a càdlàg process.
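The change of variable mentioned above can be sketched as follows. The form of the switched BSDE written in the first display is an assumption (in particular the signs in front of the cost processes, chosen so that Player I, the minimizer, pays A and Player II, the maximizer, pays B); the symbol \widetilde U is notation introduced here.

```latex
% Assumed form of the switched BSDE (sign conventions are an assumption).
\[
U(t) = \xi + \bigl(A^{a(\cdot)}(T)-A^{a(\cdot)}(t)\bigr) - \bigl(B^{b(\cdot)}(T)-B^{b(\cdot)}(t)\bigr)
     + \int_t^T \psi\bigl(s,U(s),V(s),a(s),b(s)\bigr)\,ds - \int_t^T V(s)\,dW(s).
\]
% Change of variable: absorb the cost processes into the unknown.
\[
\widetilde U(t) := U(t) + A^{a(\cdot)}(t) - B^{b(\cdot)}(t).
\]
% Resulting standard BSDE with terminal value \xi + A(T) - B(T).
\[
\widetilde U(t) = \xi + A^{a(\cdot)}(T) - B^{b(\cdot)}(T)
     + \int_t^T \psi\bigl(s,\widetilde U(s) - A^{a(\cdot)}(s) + B^{b(\cdot)}(s),V(s),a(s),b(s)\bigr)\,ds
     - \int_t^T V(s)\,dW(s).
\]
```

Under these assumptions the transformed generator stays Lipschitz in its unknowns, and the transformed process has no jumps coming from the costs, which suggests why U itself is only càdlàg: it inherits the jumps of A and B.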
The switching game problem with the initial scheme (i, j) ∈ Λ × Π is stated as follows: Player I aims to minimize U a(·),b(·) (t) over a(·) ∈ A i t , while Player II aims to maximize U a(·),b(·) (t) over b(·) ∈ B j t .
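In symbols, the saddle-point and value notions for this game can be sketched as follows. This is the usual definition for a zero-sum game in which Player I minimizes and Player II maximizes; the starred strategies a^*(\cdot), b^*(\cdot) and the calligraphic letters for the strategy sets are notation assumed for this sketch.

```latex
% Saddle point: neither player gains by deviating unilaterally.
\[
U^{a^*(\cdot),\,b(\cdot)}(t) \;\le\; U^{a^*(\cdot),\,b^*(\cdot)}(t) \;\le\; U^{a(\cdot),\,b^*(\cdot)}(t)
\qquad \forall\, a(\cdot)\in \mathcal{A}^i_t,\ b(\cdot)\in \mathcal{B}^j_t .
\]
% Existence of a value: the lower and upper values coincide.
\[
\operatorname*{ess\,inf}_{a(\cdot)\in\mathcal{A}^i_t}\ \operatorname*{ess\,sup}_{b(\cdot)\in\mathcal{B}^j_t} U^{a(\cdot),b(\cdot)}(t)
=
\operatorname*{ess\,sup}_{b(\cdot)\in\mathcal{B}^j_t}\ \operatorname*{ess\,inf}_{a(\cdot)\in\mathcal{A}^i_t} U^{a(\cdot),b(\cdot)}(t).
\]
```

If a saddle point exists, the two sides of the second display are equal, and their common value is U^{a^*(\cdot),b^*(\cdot)}(t).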

Existence
In this section, we state and prove our existence result for RBSDE (1.3). We need the following additional technical assumption.
Hypothesis 3.1. Assume that the generator ψ is uniformly bounded with respect to all its arguments.
We shall use ‖ψ‖ ∞ to denote the least upper bound of |ψ|.
We are now in a position to state the existence result.
We shall use a penalization method to construct a solution to RBSDE (1.3). We observe (as mentioned in the introduction) that RBSDE (1.3) consists of m 2 systems of m 1 -dimensional obliquely reflected BSDEs of the form (1.4):
with the unknown processes being (the process (L 1j , · · · , L m 1 j ) is taken to be previously given) for j = 1, 2, . . . , m 2 . These m 2 systems have been well studied by Hu and Tang [13]. In RBSDE (1.3), they are coupled together by the processes (L 1j , · · · , L m 1 j ) through the constraint and the minimal boundary condition:
Therefore, it is natural to consider the following penalized system of RBSDEs (the unknown processes are (Y ij , Z ij , K ij ; i ∈ Λ, j ∈ Π)): (3.4)
Note that when j ′ = j, we have, in view of Hypothesis 2.2 (ii),
A striking difference between RBSDE (3.4) and the RBSDE studied by Hu and Tang [13] lies in the fact that the i-th set of unknown variables in the latter is replaced in the former with
By slightly adapting the relevant arguments in our previous work [13], we can show the following theorem.
Here, for any a(·) ∈ A i t , (U a(·),n , V a(·),n ) is the unique solution to the following BSDE: (3.7)
Intuitively, as n tends to +∞, we expect that the sequence of solutions together with the penalty term will have a limit (Y, Z, K, L), which solves RBSDE (1.3). For this purpose, it is crucial to prove that the penalty term is bounded in some suitable sense. Then we are naturally led to compute using Itô-Tanaka's formula, as done in Hu and Tang [13]. However, in our present situation, the additional term K n appears in RBSDE (3.4), which makes it seriously difficult to derive the bound of L n by the preceding procedure. In what follows, we shall use the representation result for Y n in the preceding theorem to get around this difficulty.
We have the following lemma.
Proof. We suppress the superscripts (a(·), n) of U a(·),n j and U a(·),n j ′ for simplicity. The process Ū jj ′ (s) := U j (s) − U j ′ (s) + l(j, j ′ ), s ∈ [t, T ], satisfies the following BSDE: (3.9)
Applying Tanaka's formula (see, e.g., [19]), we have and L jj ′ is the local time of the process Ū jj ′ at 0. We have (3.12)
We claim that the integrands of the integrals in the last two terms of (3.12) are all less than or equal to zero. In fact, since
Secondly, for j, j ′ , j ′′ ∈ Π, taking into consideration both Hypothesis 2.2 (ii), i.e., l(j, j ′ ) + l(j ′ , j ′′ ) > l(j, j ′′ ), and the elementary inequality x_1^- − x_2^- ≤ (x_1 − x_2)^- for any two real numbers x_1 and x_2 , we have (3.14)
The last equality holds in the last relations, since
Concluding the above, we have (3.15)
Using Itô's formula, we have (3.16)
Taking the conditional expectation with respect to F s on both sides of the last equality, in view of Hypothesis 3.1, we have
This ends the proof.
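The elementary inequality on negative parts used in the proof above can be checked in one line. The sketch below assumes the usual convention x^- := \max(-x, 0) for the negative part of a real number.

```latex
% Negative part: x^- := \max(-x, 0). It is subadditive: (u+v)^- \le u^- + v^-.
\[
x_1^- = \bigl((x_1 - x_2) + x_2\bigr)^- \;\le\; (x_1 - x_2)^- + x_2^-
\quad\Longrightarrow\quad
x_1^- - x_2^- \;\le\; (x_1 - x_2)^- .
\]
% Numerical check with x_1 = -2, x_2 = -1:  2 - 1 = 1 \le (-1)^- = 1.
```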
We have the following estimates for the L ∞ bound of Y n ij .
Proof. In view of the comparison result for multi-dimensional RBSDEs of Tang and Zhong [22] (which is a natural generalization to RBSDEs of the comparison theorem for multi-dimensional BSDEs), we see that the sequence {Y n ij (t)} ∞ n=1 is decreasing. We have the following two facts: In view of the representation formula in Theorem 3.2, we conclude the proof.
Therefore, from the previous representation result, we have We have
Proof. From the RBSDE for Y n ij , using Itô's formula and Lemmas 3.2 and 3.3, we have (3.23) and (3.24) Together they conclude the proof.
Define We have
Lemma 3.5. For (i, j) ∈ Λ × Π and integer n, there is a uniformly bounded process α n ij such that K n ij has the following form: The two matrix-valued processes
Proof. Fix the integer n. Consider the following penalized BSDEs: Therefore, {α n,m ij } ∞ m=1 has a weak limit in L 2 F (0, T ), denoted by α n ij . Then α n ij is also uniformly bounded by the same constant c. Define
Proof. Note that Y n ij is decreasing in n. In view of Lemma 3.2, using Lebesgue's dominated convergence theorem, we can show the strong convergence of {Y n } in the space M 2 . Note that (Y n , Z n ) solves the following BSDE: with {α n } and {β n } being bounded in M 2 . We now prove the strong convergence of Z n . Using Itô's formula, we have
First, fixing t ∈ [0, T ] and letting n → ∞ in BSDE (3.32), we take the weak limit in L 2 (F T ). Then, we see that (Y, Z, K, L) solves the following BSDE:
Now we check the boundary conditions. From Hu and Tang [13], we have Letting n → ∞, we have That is, On the other hand, from the construction, we have Letting n → ∞, we have That is, The proof is then complete.

Uniqueness
In this section, we prove the uniqueness by a verification method. Let (Y, Z, K, L) be a solution in the space S 2 × M 2 × N 2 × N 2 to RBSDE (1.3). We shall prove that Y is in fact the (vector) value for an optimal switching problem of RBSDEs. For this purpose, we introduce the following optimal switching problem for RBSDEs. For a(·) ∈ A i t , we denote by (U a(·) j , V a(·) j , K a(·) j ; j ∈ Π) the unique solution of the following RBSDEs:
Proof. Assume t = 0 without loss of generality. Otherwise, it suffices to consider the admissible switching strategies starting at time t.
For the following a(·) ∈ A i 0 : we define for s ∈ [0, T ] and j ∈ Π, Noting that Y a(·) j (·) is a càdlàg process with jump Y αp,j (θ p ) − Y α p−1 ,j (θ p ) at θ p , p = 1, · · · , N − 1, we deduce that (4.8) and it is an increasing process due to the fact that Consequently, we conclude that ( Y a(·) j , Z a(·) j , L a(·) j ; j ∈ Π) is a solution of the following RBSDE: with the following boundary condition: (4.10)
Since both K a(·) and A a(·) are increasing càdlàg processes, from the comparison theorem of Tang and Zhong [22] for multi-dimensional RBSDEs, we conclude that
On the other hand, we set θ * 0 = 0, α * 0 = i. We define the sequence {θ * p , α * p } ∞ p=1 in an inductive way as follows: (4.11) and α * p is an F θ * p -measurable random variable such that (4.14) and it satisfies the following RBSDE:
Now, it is standard to show from RBSDE (4.15) that A a * (·) (T ) ∈ L 2 (F T ). Then, following the same arguments as in Hu and Tang [13], we see that there exists an integer-valued random variable N ∈ L 2 (F T ) such that θ * N = T P -a.s. The switching strategy a * (·) is admissible, i.e., a * (·) ∈ A i 0 . Moreover, The proof is complete.
Define for i, p ∈ Λ, j, q ∈ Π, and either i = p or j = q, the following set: or y ij = y iq − l(j, q) if i = p .

In this section, we need the following additional assumption, which is standard in the literature of switching games (see, e.g., Tang and Hou [20, Hypothesis 4, page 924]).
The assumption of no loop of zero cost implies that inf y(ipjp)∈F for any loop {i p , j p } N p=0 and a.s. ω. This infimum will be called the length of the loop, relative to Y (ω).
Since there are only a finite number of primary loops, the smallest of the lengths of these primary loops relative to Y (ω) is strictly positive a.s.; it will be denoted by c(ω).
On the other hand, as Y satisfies (1.3), it is easy to check that As a consequence, there exists N (ω) such that θ N ∧ τ N = T . Otherwise, there would be an infinite number of loops, each of length almost surely not less than c(ω), which contradicts the last inequality.
is the value process for our switching game, and the switching strategy a * (·) := (θ * p ∧ τ * p , α * p ) for Player I together with the switching strategy b * (·) := (θ * p ∧ τ * p , β * p ) for Player II forms a saddle point of the switching game.

Proof.
We assume t = 0 without loss of generality. Otherwise, it suffices to consider the admissible switching strategies starting at time t.