Ergodicity conditions for zero-sum games

A basic question for zero-sum repeated games consists in determining whether the mean payoff per time unit is independent of the initial state. In the special case of "zero-player" games, i.e., of Markov chains equipped with additive functionals, the answer is provided by the mean ergodic theorem. We generalize this result to repeated games. We show that the mean payoff is independent of the initial state for all state-dependent perturbations of the rewards if and only if an ergodicity condition is verified. The latter is characterized by the uniqueness modulo constants of nonlinear harmonic functions (fixed points of the recession function associated with the Shapley operator), or, in the special case of stochastic games with finite action spaces and perfect information, by a reachability condition involving conjugate subsets of states in directed hypergraphs. We show that the ergodicity condition for games only depends on the support of the transition probability, and that it can be checked in polynomial time when the number of states is fixed.

1. Introduction

1.1. Motivation and related work. The ergodicity of dynamical systems or of stochastic processes can be considered in several guises. In the elementary case of a discrete time Markov chain $(\xi_k)_{k \geq 0}$ with finite state space $S = [n] := \{1, \dots, n\}$, ergodicity can be classically defined by any of the equivalent properties listed in the following theorem. Note that these properties only involve the transition probability matrix $P = (\mathbb{P}(\xi_{k+1} = j \mid \xi_k = i))_{i,j=1,\dots,n} \in \mathbb{R}^{n \times n}$.

Theorem 1.1. Let $P \in \mathbb{R}^{n \times n}$ be a stochastic matrix. The following properties are equivalent.
(i) Every vector $\eta \in \mathbb{R}^n$ such that $P\eta = \eta$ is constant;
(ii) For every vector $g \in \mathbb{R}^n$, the Cesaro limit
$$\lim_{k \to \infty} k^{-1}\left(g + Pg + \cdots + P^{k-1}g\right) \tag{1}$$
is a constant vector;
(iii) For every vector $g \in \mathbb{R}^n$, the ergodic equation
$$g + Pu = \lambda \mathbf{1} + u, \tag{2}$$
where $\mathbf{1}$ denotes the unit vector of $\mathbb{R}^n$, admits a solution $(\lambda, u) \in \mathbb{R} \times \mathbb{R}^n$;
(iv) The directed graph associated with the matrix $P$ has only one final class;
(v) The matrix $P$ has only one invariant measure, that is, a stochastic row vector $m \in \mathbb{R}^{1 \times n}$ such that $mP = m$.
Recall that a matrix $P = (P_{ij}) \in \mathbb{R}^{n \times n}$ (resp. a row vector $m = (m_j) \in \mathbb{R}^{1 \times n}$) is said to be stochastic when all its entries are nonnegative and each of its rows sums to one, meaning that $P_{ij} \geq 0$ and $\sum_{\ell=1}^{n} P_{i\ell} = 1$ for all $i, j \in [n]$ (resp. $m_j \geq 0$ for all $j \in [n]$ and $\sum_{j=1}^{n} m_j = 1$). The directed graph associated with $P$ is composed of the nodes $1, \dots, n$ and of the arcs $(i, j)$, $i, j \in [n]$, with $P_{ij} > 0$. A class of the matrix $P$ is a maximal set of nodes such that every two nodes of the set are connected by a directed path. A class is said to be final if every path starting from a node of this class remains in this class. We refer the reader to [7, Chap. 8] for details. The previous properties are well known: in particular, the equivalence between (iv) and (v) follows from Theorem 3.23 in the latter reference, whereas the remaining equivalences follow from Theorem 6.1 in [30].
The scalar λ in the ergodic equation (2), known as the ergodic constant, gives the coordinates of the constant vector (1).
The term ergodicity is generally used to refer to the uniqueness of the invariant measure, and so, following Kemeny and Snell [17], we call ergodic a Markov chain with the above properties of its transition probability matrix. We warn the reader that some authors use the word "ergodic" in a stronger sense, requiring, for a finite Markov chain, the matrix P to be irreducible and aperiodic.
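Property (iv) of Theorem 1.1 lends itself to a direct computation. The following is a minimal sketch, not taken from the paper, which lists the final classes of the directed graph of a stochastic matrix by a plain reachability search; the function names `final_classes` and `is_ergodic_chain` are illustrative assumptions.

```python
def final_classes(P):
    """Return the final classes of the digraph of a stochastic matrix P.

    P is a list of rows; there is an arc (i, j) whenever P[i][j] > 0.
    """
    n = len(P)
    reach = []  # reach[i]: nodes reachable from i by a (possibly empty) path
    for i in range(n):
        seen, stack = {i}, [i]
        while stack:
            u = stack.pop()
            for v in range(n):
                if P[u][v] > 0 and v not in seen:
                    seen.add(v)
                    stack.append(v)
        reach.append(seen)
    finals = set()
    for i in range(n):
        # Class of i: nodes that i reaches and that reach i back.
        cls = frozenset(j for j in reach[i] if i in reach[j])
        if reach[i] == cls:  # no path leaves the class, so it is final
            finals.add(cls)
    return finals


def is_ergodic_chain(P):
    # Property (iv) of Theorem 1.1: exactly one final class.
    return len(final_classes(P)) == 1


absorbing = [[0.5, 0.5], [0.0, 1.0]]   # state 1 is the only final class
identity = [[1.0, 0.0], [0.0, 1.0]]    # two final classes, not ergodic
```

The search is quadratic in the number of states for each starting node, which is enough for small illustrations; a strongly connected components algorithm would be the natural choice for larger chains.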
In this paper, we extend the notion of ergodicity to zero-sum two-player repeated games with finite state space $S = [n]$. We refer the reader to Section 2 for the detailed definition of these games. For the moment, we shall only need to know that the game in horizon $k$ with initial state $i$ has a value, denoted by $v_i^k \in \mathbb{R}$, and that the value vector $v^k = (v_i^k)_{1 \leq i \leq n}$ is determined from the Shapley operator $T = T(r, P)$. The latter is the map $\mathbb{R}^n \to \mathbb{R}^n$ given by
$$[T(r, P)(x)]_i = \inf_{a \in A_i} \sup_{b \in B_i} \left( r_i^{ab} + P_i^{ab} x \right), \tag{3}$$
for all $x = (x_i)_{i \in S}$. Here, $A_i$ denotes the set of actions of player MIN in state $i \in S$, $B_i$ denotes the set of actions of player MAX in the same state, $r_i^{ab}$ denotes a running payment made by player MIN to player MAX in state $i$ when the actions $a, b$ are chosen, and $P_i^{ab}$ is a row vector such that $(P_i^{ab})_j$ represents the probability of transition from state $i$ to state $j$ when the actions $a, b$ are chosen. It is known that the value vector $v^k = (v_i^k)_{i \in S}$ can be computed recursively by $v^k = T(v^{k-1})$, $v^0 = 0$.
Here, we will be interested in the mean payoff vector
$$\chi(T) := \lim_{k \to \infty} \frac{T^k(0)}{k},$$
where $T^k := T \circ \cdots \circ T$ denotes the $k$th iterate of $T$, so that $[\chi(T)]_i$ represents the mean payoff per time unit of the game starting from state $i$, as the horizon tends to infinity.
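The recursion $v^k = T(v^{k-1})$, $v^0 = 0$ can be run directly when the action sets are finite. The sketch below is an illustration under assumptions: the nested-list encoding `r[i][a][b]`, `P[i][a][b]`, the helper names, and the two-state toy game are all hypothetical and not part of the paper. It approximates the mean payoff vector by $T^k(0)/k$.

```python
def shapley(r, P, x):
    """One application of the Shapley operator (3) for finite action sets.

    r[i][a][b] is the payment and P[i][a][b] the transition row in state i
    when MIN plays a and MAX plays b.
    """
    return [
        min(
            max(r[i][a][b] + sum(p * xj for p, xj in zip(P[i][a][b], x))
                for b in range(len(P[i][a])))
            for a in range(len(P[i]))
        )
        for i in range(len(P))
    ]


def mean_payoff_estimate(r, P, k):
    v = [0.0] * len(P)      # v^0 = 0
    for _ in range(k):      # v^k = T(v^{k-1})
        v = shapley(r, P, v)
    return [vi / k for vi in v]


# Hypothetical toy game. In state 0 only MIN acts: stay and pay 2, or move
# to state 1 for free. In state 1 only MAX acts: stay and earn 1, or move
# to state 0 for free.
P = [[[[1.0, 0.0]], [[0.0, 1.0]]],     # state 0: two MIN actions, one MAX action
     [[[0.0, 1.0], [1.0, 0.0]]]]       # state 1: one MIN action, two MAX actions
r = [[[2.0], [0.0]],
     [[1.0, 0.0]]]

chi = mean_payoff_estimate(r, P, 1000)  # both entries are close to 1
```

For this game $T(x) = (\min(2 + x_1, x_2), \max(1 + x_2, x_1))$ in the paper's 1-based indexing, and one checks by induction that $T^k(0) = (k-1, k)$, so the estimate tends to the constant vector $(1, 1)$.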
A basic analytic tool to establish the existence of the limit is the so-called nonlinear ergodic equation
$$T(u) = \lambda \mathbf{1} + u. \tag{4}$$
If a solution $(\lambda, u) \in \mathbb{R} \times \mathbb{R}^n$ exists, then it is easily seen that $\chi(T) = \lambda \mathbf{1}$. In particular, the mean payoff is independent of the initial state, and it is given by the ergodic constant $\lambda$, as in the case of Markov chains. The ergodic equation has been much studied in the one-player stochastic case, i.e., in "ergodic control", where it is also known as the "average case optimality equation", see [16] for background.
The ergodic equation (4) is equivalent to a nonlinear spectral problem which has also received attention in nonlinear Perron-Frobenius theory, see especially the work of Nussbaum [22,23], and also [15,19]. Indeed, the map $T$ is conjugate to the self-map $G = \exp \circ\, T \circ \log$ of the interior of the standard positive cone of $\mathbb{R}^n$, $C := \{x \in \mathbb{R}^n \mid x \geq 0\}$, where $\exp$ is the map from $\mathbb{R}^n$ to the interior of $C$ which applies the exponential entrywise, and $\log := \exp^{-1}$. The ergodic problem is equivalent to the nonlinear spectral problem
$$G(v) = \mu v, \qquad v \in \operatorname{int} C, \quad \mu > 0. \tag{5}$$
Since the map $G$ is order-preserving and positively homogeneous of degree one, conditions for the existence of an eigenpair $(v, \mu)$ may be thought of as nonlinear extensions of the Perron-Frobenius theorem. It is useful to keep this equivalence in mind as several results relevant to Problem (4) have appeared in the context of the nonlinear eigenproblem (5), see for instance [15,10].
The problem of characterizing the set of solutions u of the ergodic equation T (u) = λ1+u has also appeared in the setting of max-plus spectral theory [6,3], and in weak KAM theory [12,13]. These theories concern the one-player deterministic case. It is known that the above set is sup-norm isometric to a set of Lipschitz functions on a certain set (critical classes in the max-plus setting, or projected Aubry set in the weak KAM setting). Some of these results have been extended to one-player stochastic games with finite state space in [1]. The extension of such results to the two player case appears to be an open question, which is among the motivations leading to the present study.
A useful tool to address the issue of the solvability of the ergodic equation (4), or of the corresponding nonlinear eigenproblem (5), is the recession function associated with the Shapley operator,
$$\hat{T} : x \in \mathbb{R}^n \mapsto \hat{T}(x) = \lim_{\rho \to +\infty} \frac{T(\rho x)}{\rho}, \tag{6}$$
which has already been used in several ways [27,28,15]. In particular, Rosenberg and Sorin [27] gave conditions for the existence of the mean payoff vector of a two-person zero-sum stochastic game. In their framework, the recession function appears as the Shapley operator of the "projective" game, which corresponds to the game with no running payments.
If the transition payment $r$ is bounded, the recession function $\hat{T}$ does exist, and it is given by
$$[\hat{T}(x)]_i = \inf_{a \in A_i} \sup_{b \in B_i} P_i^{ab} x. \tag{7}$$
Hence, $\hat{T} = T(0, P)$, with $T$ as in (3), so that the recession function of the Shapley operator associated with the game with payment function $r$ is merely the Shapley operator of the game in which $r$ is replaced by $0$. For this reason, we shall refer to the maps of the form (7) as payment-free Shapley operators.
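As a numerical illustration of the identity between the recession function and $T(0, P)$, the sketch below compares $T(\rho x)/\rho$ for a large $\rho$ with the payment-free operator. The game data, the list encoding and the helper name `shapley` are hypothetical choices made for this example only; the payments are bounded by $C = 2$, so the gap should be at most $C/\rho$.

```python
def shapley(r, P, x):
    # Shapley operator (3) for finite action sets; r[i][a][b] is a payment
    # and P[i][a][b] a transition row.
    return [
        min(
            max(r[i][a][b] + sum(p * xj for p, xj in zip(P[i][a][b], x))
                for b in range(len(P[i][a])))
            for a in range(len(P[i]))
        )
        for i in range(len(P))
    ]


# Hypothetical two-state game with payments bounded by C = 2.
P = [[[[1.0, 0.0]], [[0.0, 1.0]]],
     [[[0.0, 1.0], [1.0, 0.0]]]]
r = [[[2.0], [0.0]],
     [[1.0, 0.0]]]
zero = [[[0.0 for _ in row] for row in state] for state in r]  # payments set to 0

rho = 1e6
x = [0.3, -1.2]
scaled = [t / rho for t in shapley(r, P, [rho * xi for xi in x])]  # T(rho x)/rho
free = shapley(zero, P, x)                                         # T(0, P)(x)
gap = max(abs(s - f) for s, f in zip(scaled, free))
```

The bound on `gap` reflects that adding a payment bounded by $C$ moves each entry of $T(\rho x)$ by at most $C$, which becomes negligible after division by $\rho$.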
Observe that every constant vector is a fixed point of a payment-free Shapley operator. We shall refer to such a fixed point as trivial. In [15], Gaubert and Gunawardena show that the ergodic equation is solvable if $\hat{T}$ has only trivial fixed points. A sufficient explicit condition for this to hold, involving a sequence of aggregated directed graphs, generalizing the classical directed graph of Perron-Frobenius theory, was given there.
Then, in [10], Cavazos-Cadena and Hernández-Hernández introduced a weak convexity property, and showed that when the conjugate map $G = \exp \circ\, T \circ \log$ is weakly convex, the recession function $\hat{T}$ has only trivial fixed points if and only if the first of the directed graphs of [15] consists of a single final class and of trivial classes (reduced to one node, and loop free). They deduced that when $G$ is weakly convex, the ergodic equation for all maps $g + T$ with $g \in \mathbb{R}^n$ is solvable if and only if $\hat{T}$ has only trivial fixed points. We shall consider the same additive perturbations $g + T$ of the Shapley operator, but without any assumption on $T$ except that the payment $r$ be bounded. Indeed, this weak convexity property is rarely satisfied for games, although it captures an interesting class of risk sensitive problems.
1.2. Description of the main results. Our main results, summarized in Theorem 8.1 at the end of the paper, show that most of the classical characterizations of ergodicity for finite state Markov chains, seen as zero-player games, carry over to the two-player case. More precisely, given a zero-sum game with finite state space and bounded transition payment $r$, we show in Section 3 that the following conditions are equivalent:
(i) all the fixed points of the recession function $\hat{T}$ of the Shapley operator $T$ are trivial (i.e., constant);
(ii) all the games obtained by adding to the transition payment $r$ a perturbation depending only on the state have a constant mean payoff vector;
(iii) the ergodic equation (4) is solvable for all maps $g + T$ with $g \in \mathbb{R}^n$.
In the zero-player special case, the above conditions correspond to Points (i)-(iii) of Theorem 1.1. Hence, a zero-sum game will be said to be ergodic if it satisfies one of these properties. An ingredient of this equivalence is the result of Gaubert and Gunawardena in [15] described above.
In Section 4, we give a characterization of ergodicity in terms of a Galois connection acting on faces of the hypercube $[0, 1]^n$. Then, in Section 5, we show that under a compactness assumption on the action spaces and a continuity assumption on the transition probability, the latter characterization of ergodicity of a game (involving a Galois connection) is equivalent to a reachability condition involving a pair of directed hypergraphs. These two characterizations constitute a fundamental discrepancy with Point (iv) of the zero-player case. However, the characterization of ergodicity involving the hypergraphs still keeps the same flavor. Indeed, the condition that a directed graph has only one final class can be thought of as an accessibility condition in this directed graph (see in particular Remark 5.1). Under the compactness and continuity assumption on the action spaces and the transition probability, the Galois connection as well as the hypergraphs are shown to depend only on the support of the transition probability $P$, which we define to be the set of points at which the function $(i, a, b, j) \mapsto (P_i^{ab})_j$ takes nonzero values. As a result, we get that the ergodicity of a game is a structural property depending only on the support of the transition probability.
We then consider (in Section 6) several algorithmic problems concerning games with finite action spaces. The first one is to check ergodicity. The restricted version of this problem concerning deterministic games was addressed by Yang and Zhao [31], in the context of discrete event systems. They showed that this problem is coNP-hard. However, we show, as a corollary of the hypergraph characterization, that checking the ergodicity of a stochastic game is fixed-parameter tractable: if the dimension is fixed, we can solve it in polynomial time. Note also that ergodicity can be checked in polynomial time for one-player stochastic games [1]. We finally characterize the situation in which there exists a fixed point having its minimal and maximal entries in prescribed positions. As a by-product, we get a polynomial time algorithm to check the latter property, from which it follows that checking ergodicity is a coNP-complete problem. In Section 7, we illustrate our results on some examples.
The present results have been announced in the conference article [4].
2. Zero-sum games with perfect information and mean-payoff

2.1. Basic definitions and results. In this subsection, we formally describe the zero-sum game with perfect information mentioned above, and state preliminary results.
Recall that $S = [n]$ is the state space, $A_i$ is the set of actions of player MIN, $B_i$ is the set of actions of player MAX, $(i, a, b) \mapsto r_i^{ab}$ from $\bigcup_{i \in S} (\{i\} \times A_i \times B_i)$ to $\mathbb{R}$ is the transition payment, and $(i, a, b) \mapsto P_i^{ab}$ from the same set to $\Delta(S) \subset \mathbb{R}^{1 \times n}$, the set of nonnegative row vectors of sum one, is the transition probability. This game, which we denote by $\Gamma(r, P)$, is played as follows. Starting from a given state $i_0$ at time $k = 0$, known by the players, MIN chooses an action $a_0 \in A_{i_0}$. Then, knowing this choice, player MAX chooses an action $b_0 \in B_{i_0}$. Player MIN has to pay $r_{i_0}^{a_0 b_0}$ to player MAX and the next state, $i_1$, is chosen according to the probability $P_{i_0}^{a_0 b_0}$. The same procedure is repeated at each time step, giving an infinite sequence $(i_\ell, a_\ell, b_\ell)_{\ell \geq 0}$.
A strategy $\sigma$ (resp. $\tau$) of player MIN (resp. MAX) is a map which assigns an action of player MIN (resp. MAX) to every finite history known by the player. A triple $(i_0, \sigma, \tau)$ defines a probability measure on the set of plays (or histories), that is, the set of sequences $(i_\ell, a_\ell, b_\ell)_{\ell \geq 0}$ for which $a_\ell \in A_{i_\ell}$ and $b_\ell \in B_{i_\ell}$. We denote by $\mathbb{E}_{i_0,\sigma,\tau}$ the corresponding expectation. The total payoff of the game with finite horizon $k$ (consisting of $k$ time steps, that is, $k$ successive alternated moves of players MIN and MAX) is given by
$$\mathbb{E}_{i_0,\sigma,\tau}\left[ \sum_{\ell=0}^{k-1} r_{i_\ell}^{a_\ell b_\ell} \right].$$
Player MIN wishes to minimize this quantity, while player MAX wishes to maximize it. The value of the $k$-stage game (the game played in finite horizon $k$) starting at state $i$ is thus defined as
$$v_i^k = \inf_{\sigma} \sup_{\tau} \mathbb{E}_{i,\sigma,\tau}\left[ \sum_{\ell=0}^{k-1} r_{i_\ell}^{a_\ell b_\ell} \right],$$
the infimum and the supremum being taken over the set of strategies of players MIN and MAX, respectively. Here, the infimum and supremum commute. It is known (see e.g. [21]) that the value vector $v^k = (v_i^k)$ satisfies $v^k = T(v^{k-1})$ and $v^0 = 0$, where $T = T(r, P)$ is the Shapley operator defined by (3).
Let $\mathcal{A}$ denote the set of (feedback) policies of player MIN, which are the maps $\sigma$ from $S$ to $\bigcup_{i \in S} A_i$ such that $\sigma(i) \in A_i$ for all $i \in S$, and let $\mathcal{B}$ denote the set of policies of player MAX, which are the maps $\tau$ from $\bigcup_{i \in S} (\{i\} \times A_i)$ to $\bigcup_{i \in S} B_i$ such that $\tau(i, a) \in B_i$ for all $i \in S$ and $a \in A_i$. Recall that a strategy of player MIN (resp. MAX) is Markovian if it only depends on the information of the current stage $k \geq 0$, that is, $a_k = \sigma_k(i_k)$ for some $\sigma_k \in \mathcal{A}$ (resp. $b_k = \tau_k(i_k, a_k)$ for some $\tau_k \in \mathcal{B}$). Moreover, such a strategy is stationary if it is independent of $k$ ($\sigma_k = \sigma \in \mathcal{A}$ and $\tau_k = \tau \in \mathcal{B}$ for all $k \geq 0$), in which case it can be identified with the corresponding policy. Then it is known that the above (dynamic programming) equation provides optimal or $\epsilon$-optimal strategies of the two players that are Markovian. Indeed, $T$ can be rewritten as follows:
$$T(x) = \inf_{\sigma \in \mathcal{A}} \sup_{\tau \in \mathcal{B}} \left( r^{\sigma\tau} + P^{\sigma\tau} x \right),$$
where $P^{\sigma\tau}$ denotes the matrix whose $i$th row is $P_i^{\sigma(i)\tau(i,\sigma(i))}$, and similarly for $r^{\sigma\tau}$, and the infimum and supremum are taken for the usual partial order of $\mathbb{R}^n$ (the product partial order of the usual order on $\mathbb{R}$). Moreover, the infimum and supremum can be approached arbitrarily by the value of $r^{\sigma\tau} + P^{\sigma\tau} x$ for some policies $\sigma$ and $\tau$, and they are equal to such a value when the action spaces $A_i$ and $B_i$ are compact and the transition payment and probability functions are continuous. In the latter case, we say that $\sigma$ and $\tau$ are optimal for $T(x)$. Optimal strategies for the game in horizon $k \geq 0$ are then obtained by taking, for all $0 \leq \ell < k$, $a_\ell = \sigma_\ell(i_\ell)$ and $b_\ell = \tau_\ell(i_\ell, a_\ell)$ for some $\sigma_\ell \in \mathcal{A}$ and $\tau_\ell \in \mathcal{B}$ optimal for $T(v^{k-\ell-1})$.
The Shapley operator $T$ satisfies the following properties:
- monotonicity, a.k.a. order preservation: $x \leq y \Rightarrow T(x) \leq T(y)$, where $\mathbb{R}^n$ is endowed with its usual partial order;
- additive homogeneity: $T(x + \alpha \mathbf{1}) = T(x) + \alpha \mathbf{1}$, $x \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, recalling that $\mathbf{1}$ denotes the unit vector of $\mathbb{R}^n$;
- nonexpansiveness in the sup-norm: $\|T(x) - T(y)\| \leq \|x - y\|$, $x, y \in \mathbb{R}^n$, where $\|x\| := \max_{1 \leq i \leq n} |x_i|$.
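The three properties above can be spot-checked numerically on a finite game. The following sketch samples random vectors and tests each property up to a small floating-point tolerance; the game data, the encoding `r[i][a][b]`, `P[i][a][b]` and the function `check_properties` are assumptions introduced for illustration, and a passing check is of course no proof.

```python
import random


def shapley(r, P, x):
    # Shapley operator (3) for finite action sets.
    return [
        min(
            max(r[i][a][b] + sum(p * xj for p, xj in zip(P[i][a][b], x))
                for b in range(len(P[i][a])))
            for a in range(len(P[i]))
        )
        for i in range(len(P))
    ]


def check_properties(r, P, trials=200, seed=0, tol=1e-9):
    rng = random.Random(seed)
    n = len(P)
    norm = lambda z: max(abs(zi) for zi in z)   # sup-norm
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(n)]
        y = [rng.uniform(-5, 5) for _ in range(n)]
        up = [xi + abs(yi) for xi, yi in zip(x, y)]   # up >= x entrywise
        alpha = rng.uniform(-5, 5)
        Tx, Ty, Tup = shapley(r, P, x), shapley(r, P, y), shapley(r, P, up)
        Tshift = shapley(r, P, [xi + alpha for xi in x])
        monotone = all(a <= b + tol for a, b in zip(Tx, Tup))
        additive = all(abs(s - (t + alpha)) <= tol for s, t in zip(Tshift, Tx))
        nonexpansive = (norm([a - b for a, b in zip(Tx, Ty)])
                        <= norm([a - b for a, b in zip(x, y)]) + tol)
        if not (monotone and additive and nonexpansive):
            return False
    return True


# Hypothetical stochastic game with two states.
P = [[[[0.5, 0.5]], [[1.0, 0.0]]],
     [[[0.2, 0.8], [1.0, 0.0]]]]
r = [[[1.0], [3.0]],
     [[0.0, -2.0]]]
ok = check_properties(r, P)
```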

2.2. Games with mean payoff. The mean payoff vector is defined as the limit
$$\chi(T) := \lim_{k \to \infty} \frac{T^k(x)}{k}$$
for any $x \in \mathbb{R}^n$. Since $T$ is nonexpansive, the existence and the value of the latter limit are independent of the choice of $x$. In particular, we have the following standard result:

Proposition 2.1. If the following ergodic equation is solvable:
$$T(u) = \lambda \mathbf{1} + u, \qquad (\lambda, u) \in \mathbb{R} \times \mathbb{R}^n, \tag{8}$$
then $\chi(T)$ exists and is equal to $\lambda \mathbf{1}$. In particular, the average payment (per time unit) of $\Gamma(r, P)$ is asymptotically independent of the initial state.
Moreover, if u is a solution of the above ergodic equation, optimal policies σ and τ of players MIN and MAX for T (u), if they exist, provide optimal strategies of the two players that are Markovian and stationary.
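To make Proposition 2.1 concrete, the sketch below verifies on a hypothetical two-state game that a candidate pair $(\lambda, u)$ solves the ergodic equation, and that iterating $T$ from $u$ gives $T^k(u) = k\lambda\mathbf{1} + u$, which is the mechanism behind $\chi(T) = \lambda\mathbf{1}$. The game, the candidate pair and the helper names are assumptions chosen for this example.

```python
def shapley(r, P, x):
    # Shapley operator (3) for finite action sets.
    return [
        min(
            max(r[i][a][b] + sum(p * xj for p, xj in zip(P[i][a][b], x))
                for b in range(len(P[i][a])))
            for a in range(len(P[i]))
        )
        for i in range(len(P))
    ]


# Hypothetical game: T(x) = (min(2 + x_1, x_2), max(1 + x_2, x_1)).
P = [[[[1.0, 0.0]], [[0.0, 1.0]]],
     [[[0.0, 1.0], [1.0, 0.0]]]]
r = [[[2.0], [0.0]],
     [[1.0, 0.0]]]

lam, u = 1.0, [-1.0, 0.0]      # candidate solution of T(u) = lam * 1 + u
Tu = shapley(r, P, u)
solves = all(abs(t - (lam + ui)) < 1e-12 for t, ui in zip(Tu, u))

# Iterating from u: additive homogeneity gives T^k(u) = k * lam * 1 + u.
k, v = 10, list(u)
for _ in range(k):
    v = shapley(r, P, v)
drift_ok = all(abs(vi - (k * lam + ui)) < 1e-12 for vi, ui in zip(v, u))
```

Since $T$ is nonexpansive, $T^k(0)$ stays within $\|u\|$ of $T^k(u)$, so dividing by $k$ yields the same limit $\lambda\mathbf{1}$.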
The ergodic equation (8) can be studied by means of the recession function $\hat{T}$ of $T$, defined by (6). The recession function of $T$ is well defined as soon as the transition payment is bounded. Then, $\hat{T}$ is given by (7), so that $\hat{T} = T(0, P)$.

Definition 2.1 (Payment-free Shapley operators). A Shapley operator is said to be payment-free if it is of the form $F = T(0, P)$, where $P$ is a transition probability and $T$ is as in (3).
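Like any Shapley operator, a payment-free Shapley operator $F$ is monotone and additively homogeneous. It is also positively homogeneous, that is, $F(\lambda x) = \lambda F(x)$ for all $x \in \mathbb{R}^n$, $\lambda > 0$. As a consequence, it satisfies $F(\lambda \mathbf{1}) = \lambda \mathbf{1}$ for every $\lambda \in \mathbb{R}$. We call such fixed points the trivial fixed points of $F$. We shall use the following sufficient condition for the solvability of the ergodic equation, which is the result of Gaubert and Gunawardena [15] recalled in the introduction and is referred to below as Theorem 2.2: if the recession function $\hat{T}$ of a Shapley operator $T$ has only trivial fixed points, then the ergodic equation (8) is solvable.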

Realizable mean payoffs
We now show that the recession function of the Shapley operator T of the game Γ(r, P ) can be used to characterize the realizable mean payoff vectors of the games Γ(r + g, P ), where g is a bounded additive perturbation of the transition payment r.
Observe first that such a bounded additive perturbation $g$ of the transition payment $r$ does not change the recession function $\hat{T} = T(0, P)$. Moreover, combining Theorem 2.2 and Proposition 2.1, we get that if $\hat{T}$ has only trivial fixed points, then $\chi(T)$ exists and is a constant vector. When the mean payoff vector is already known to exist, the following result, noted by several authors, extends this assertion, since it also concerns the case where $\hat{T}$ has nontrivial fixed points.

Proposition 3.1 ([27,28,15]). Consider a game $\Gamma(r, P)$ such that the recession function $\hat{T}$ and the mean payoff vector $\chi = \chi(T)$ exist. Then $\hat{T}(\chi) = \chi$.
We give the short proof for the convenience of the reader.
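Proof. Since $T$ is nonexpansive in the sup-norm $\|\cdot\|$, we have, for all vectors $x, y$ and every integer $n \geq 1$,
$$\|T(nx) - T(ny)\| \leq n \|x - y\|.$$
Hence, taking $x = \chi$ and $y = T^n(0)/n$, we get
$$\left\| \frac{T(n\chi)}{n} - \frac{T^{n+1}(0)}{n} \right\| \leq \left\| \chi - \frac{T^n(0)}{n} \right\|.$$
All the terms in the above inequality converge. Taking their limit, we obtain $\|\hat{T}(\chi) - \chi\| \leq 0$.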
We can also show a converse statement, leading to the following equivalences.
Proposition 3.2 (Realizable mean payoffs). Let us fix a state space $S = \{1, \dots, n\}$, and the action spaces $A_i$ and $B_i$ of the two players. Consider a payment-free Shapley operator $F = T(0, P)$ with transition probability $P$, and $T$ as in (3). Then, the following assertions are equivalent:
(i) $\nu \in \mathbb{R}^n$ is a fixed point of $F$;
(ii) there exists a bounded transition payment $r$ such that the mean payoff vector of the game $\Gamma(r, P)$ exists and is equal to $\nu$;
(iii) there exists a transition payment $r$ such that the recession function $\hat{T}$ and the mean payoff vector of the game $\Gamma(r, P)$ exist and are equal to $F$ and $\nu$ respectively.
Proof. The implication (ii)⇒(iii) is easy, and the implication (iii)⇒(i) comes from Proposition 3.1. Let us show (i)⇒(ii). Consider the transition payment $r$ such that $r_i^{ab} = \nu_i$ for every $i \in S$ and every $(a, b) \in A_i \times B_i$. The Shapley operator $T$ of the game $\Gamma(r, P)$ satisfies, by construction,
$$T(x) = \nu + F(x), \qquad x \in \mathbb{R}^n.$$
For every integer $k$ we have $T(k\nu) = kF(\nu) + \nu = (k + 1)\nu$, so that, by induction, $T^k(0) = k\nu$. This proves that the mean payoff vector of $\Gamma(r, P)$ exists and is equal to $\nu$.
Hence, for parametric games Γ(·, P ) with fixed state space, action spaces and transition probability, the fixed points of the corresponding payment-free Shapley operator give exactly all the realizable mean payoff vectors.
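The construction used in the proof of (i)⇒(ii) is easy to replay numerically. In the sketch below, the transition structure is a hypothetical two-state game whose payment-free operator is $F(x) = (\min(x_1, x_2), \max(x_1, x_2))$, and $\nu = (0, 1)$ is one of its nontrivial fixed points; setting $r_i^{ab} = \nu_i$ then gives $T^k(0) = k\nu$. The encoding and names are assumptions made for illustration.

```python
def shapley(r, P, x):
    # Shapley operator (3) for finite action sets.
    return [
        min(
            max(r[i][a][b] + sum(p * xj for p, xj in zip(P[i][a][b], x))
                for b in range(len(P[i][a])))
            for a in range(len(P[i]))
        )
        for i in range(len(P))
    ]


# Hypothetical transitions: MIN picks the next state in state 0,
# MAX picks the next state in state 1.
P = [[[[1.0, 0.0]], [[0.0, 1.0]]],
     [[[0.0, 1.0], [1.0, 0.0]]]]
zero = [[[0.0 for _ in acts_b] for acts_b in P[i]] for i in range(len(P))]

nu = [0.0, 1.0]
is_fixed_point = shapley(zero, P, nu) == nu       # F(nu) = nu

# Payment of the proof: r_i^{ab} = nu_i for all actions a, b.
r_nu = [[[nu[i] for _ in acts_b] for acts_b in P[i]] for i in range(len(P))]

k, v = 50, [0.0, 0.0]
for _ in range(k):
    v = shapley(r_nu, P, v)                       # v = T^k(0)
mean_payoff = [vi / k for vi in v]
```

Here the mean payoff vector equals $\nu$ exactly and is not constant, which is possible only because $F$ has a nontrivial fixed point.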
We shall say that the game Γ(r, P ) is ergodic if it satisfies the conditions of the following theorem.
Theorem 3.1 (Ergodicity of zero-sum games). Let us fix a state space $S = \{1, \dots, n\}$, and the action spaces $A_i$ and $B_i$ of the two players. Let $r$ be a bounded transition payment, $P$ be a transition probability, and let $T = T(r, P)$ be the Shapley operator of the game $\Gamma(r, P)$. Then, the following properties are equivalent:
(i) the recession function $\hat{T} = T(0, P)$ has only trivial fixed points;
(ii) the mean payoff vector of the game $\Gamma(r + g, P)$ does exist and is constant for all additive perturbations $g$ of the transition payment depending only on the state (so $g_i^{a,b} = g_i$ for all $i \in S$, $a \in A_i$ and $b \in B_i$);
(iii) the ergodic equation $g + T(u) = \lambda \mathbf{1} + u$ is solvable for all vectors $g \in \mathbb{R}^n$;
(iv) the mean payoff vector of the game $\Gamma(r + g, P)$ does exist and is constant for all bounded additive perturbations $g$ of the transition payment $r$;
(v) the ergodic equation $T'(u) = \lambda \mathbf{1} + u$ is solvable for all Shapley operators $T' = T(r + g, P)$ associated with bounded additive perturbations $g$ of the transition payment.
Proof. As said above, a bounded additive perturbation $g$ of the transition payment $r$ does not change the recession function: $\hat{T} = T(0, P) = \hat{T}'$ when $T' = T(r + g, P)$. Hence the implication (i)⇒(v) follows from Theorem 2.2. Moreover, the implication (v)⇒(iv) follows from Proposition 2.1. Similarly, we have (iii)⇒(ii), since if $g$ is an additive perturbation of the transition payment depending only on the state (that is, $g_i^{a,b} = g_i$ for all $i \in S$, $a \in A_i$ and $b \in B_i$), then $T(r + g, P) = g + T$. The same identity gives (v)⇒(iii). Hence, all the equivalences will follow from the implication (ii)⇒(i), that we now prove.

Assume that (ii) holds. This means that the mean payoff vector of the game $\Gamma(r + g, P)$ does exist and is constant for all additive perturbations $g$ of the transition payment depending only on the state. Let $\eta$ be a fixed point of the recession function $\hat{T} = T(0, P)$. Denote $T = T(r, P)$, and let $C > 0$ be a bound of the transition payment $r$. We have
$$-C + \hat{T}(x) \leq T(x) \leq C + \hat{T}(x), \qquad x \in \mathbb{R}^n, \tag{9}$$
where, here and below, a scalar added to a vector stands for the corresponding constant vector. Let $s$ be a positive integer, consider the additive perturbation $g_s = s\eta$ of the transition payment, and denote $T_s = T(r + g_s, P) = g_s + T$. Let us show by induction:
$$k(s\eta - C) \leq (T_s)^k(0) \leq k(s\eta + C), \qquad k \geq 1. \tag{10}$$
Indeed, $T_s(0) = s\eta + T(0)$ and by (9), we get that $-C \leq T(0) \leq C$, which shows (10) for $k = 1$. Assume that (10) holds for some $k \geq 1$. Then, by the monotonicity of $T_s$, we get that $(T_s)^{k+1}(0) \leq T_s(k(s\eta + C))$. Then, using the definition of $T_s$, the additive homogeneity of $T$ and (9), we deduce:
$$T_s(k(s\eta + C)) = s\eta + T(ks\eta) + kC \leq s\eta + \hat{T}(ks\eta) + (k + 1)C.$$
Since $\hat{T}$ is positively homogeneous and $\eta$ is a fixed point of $\hat{T}$, we obtain that $\hat{T}(ks\eta) = ks\eta$, hence $T_s(k(s\eta + C)) \leq s\eta + (k + 1)C + ks\eta = (k + 1)(s\eta + C)$, which shows the second inequality of (10) for $k + 1$. The first inequality is obtained with the same arguments.

Now, by (ii), the mean payoff vector $\chi_s = \lim_{k \to \infty} (T_s)^k(0)/k$ of the game $\Gamma(r + g_s, P)$ exists and is constant. From (10), we deduce that
$$s\eta - C \leq \chi_s \leq s\eta + C.$$
Since $\chi_s$ is a constant vector, we get that $s(\max_{i \in S} \eta_i) - C \leq (\chi_s)_j \leq s(\min_{i \in S} \eta_i) + C$ for all $j \in S$. Hence, $s(\max_{i \in S} \eta_i - \min_{i \in S} \eta_i) \leq 2C$, and since this inequality holds for all $s > 0$, we deduce that $\max_{i \in S} \eta_i - \min_{i \in S} \eta_i = 0$.
This implies that $\eta$ is a constant vector, hence any fixed point of the recession function $\hat{T}$ is a constant vector, which shows Assertion (i).
Note that we could have shown the direct implication (iv)⇒(i), by using the implication (i)⇒(ii) in Proposition 3.2.
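The role of state-dependent perturbations in Theorem 3.1 can be seen on a small example. In the hypothetical game below (the data and helper names are assumptions for illustration), the payment-free operator has the nontrivial fixed point $(0, 1)$, so the game is not ergodic; its mean payoff happens to be constant for the payment $r$, but it stops being constant after the perturbation $g = (0, 5)$.

```python
def shapley(r, P, x):
    # Shapley operator (3) for finite action sets.
    return [
        min(
            max(r[i][a][b] + sum(p * xj for p, xj in zip(P[i][a][b], x))
                for b in range(len(P[i][a])))
            for a in range(len(P[i]))
        )
        for i in range(len(P))
    ]


def mean_payoff_estimate(r, P, k):
    v = [0.0] * len(P)
    for _ in range(k):
        v = shapley(r, P, v)
    return [vi / k for vi in v]


# Hypothetical game: T(x) = (min(2 + x_1, x_2), max(1 + x_2, x_1)).
P = [[[[1.0, 0.0]], [[0.0, 1.0]]],
     [[[0.0, 1.0], [1.0, 0.0]]]]
r = [[[2.0], [0.0]],
     [[1.0, 0.0]]]

g = [0.0, 5.0]   # perturbation depending only on the state
r_g = [[[r[i][a][b] + g[i] for b in range(len(r[i][a]))]
        for a in range(len(r[i]))]
       for i in range(len(r))]

chi = mean_payoff_estimate(r, P, 1000)      # close to (1, 1)
chi_g = mean_payoff_estimate(r_g, P, 1000)  # close to (2, 6)
```

With the perturbation, one checks by induction that $(T_g)^k(0) = (2(k-1), 6k)$, so the limit is $(2, 6)$: MAX stays in the second state forever, while MIN prefers paying 2 per step in the first state to entering the second one.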

Characterization of ergodicity in terms of Galois connections
In this section, we shall fix a state space $S = [n]$, and consider any payment-free Shapley operator $F$ defined over $S$, without specifying the action spaces $A_i$ and $B_i$ of the two players nor the transition probability $P$. Indeed, the results of this section only use the fact that $F : \mathbb{R}^n \to \mathbb{R}^n$ is order-preserving, additively homogeneous, and positively homogeneous.

4.1. Invariant faces of the hypercube. We begin with an observation about the fixed points of a payment-free Shapley operator. But first, let us fix some notation. If $K$ is a subset of $S$, denote by $\mathbf{1}_K$ the vector with entries $1$ on $K$ and $0$ on $S \setminus K$.

Lemma 4.1. Let $F$ be a payment-free Shapley operator. If $u$ is a nontrivial fixed point of $F$ then, denoting $I = \arg\min u$ and $J = \arg\max u$, we have
$$F(\mathbf{1}_{S \setminus I}) \leq \mathbf{1}_{S \setminus I}, \tag{H1}$$
$$\mathbf{1}_J \leq F(\mathbf{1}_J). \tag{H2}$$

Proof. By the additive and positive homogeneity of $F$, we may assume that $\mathbf{1}_{S \setminus I} \leq u$ and that $\min_{s \in S} u_s = 0$. Hence, by the monotonicity of $F$, we get that $F(\mathbf{1}_{S \setminus I}) \leq u$.
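In particular, we have $[F(\mathbf{1}_{S \setminus I})]_i \leq 0$ for every $i \in I$. Since $\mathbf{1}_{S \setminus I} \leq \mathbf{1}$, we also have $F(\mathbf{1}_{S \setminus I}) \leq \mathbf{1}$ (recall that any trivial vector is a fixed point of $F$). It follows that $F(\mathbf{1}_{S \setminus I}) \leq \mathbf{1}_{S \setminus I}$. We show the second inequality using the same arguments (but this time assuming that $u \leq \mathbf{1}_J$ and $\max_{s \in S} u_s = 1$).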
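Remark 4.2. Conditions (H1) and (H2) are dual. Indeed, introduce $\tilde{F}$, the conjugate operator of $F$, defined by $\tilde{F}(x) := -F(-x)$. Then, $\tilde{F}$ is a payment-free Shapley operator (obtained from $F$ by changing min to max and vice versa). Moreover, Conditions (H1) and (H2) can be stated in geometric terms. Given two subsets $I$ and $J$ of $S$, denote by $C_I^- := \{x \in [0, 1]^n \mid \forall i \in I,\ x_i = 0\}$ and by $C_J^+ := \{x \in [0, 1]^n \mid \forall j \in J,\ x_j = 1\}$ two faces of the hypercube. We shall call them lower and upper faces, respectively. Note that they can alternatively be defined by
$$C_I^- = \{x \in [0, 1]^n \mid x \leq \mathbf{1}_{S \setminus I}\}, \qquad C_J^+ = \{x \in [0, 1]^n \mid x \geq \mathbf{1}_J\}.$$
Hence, by the monotonicity of $F$, we easily get that (H1) holds if and only if $F(C_I^-) \subset C_I^-$, and that (H2) holds if and only if $F(C_J^+) \subset C_J^+$.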
Conditions (H1) and (H2) are thus equivalent to the invariance of faces of the hypercube.
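For a small state space, the subsets satisfying the two invariance conditions can be enumerated by brute force. The sketch below assumes that (H1) reads $F(\mathbf{1}_{S \setminus I}) \leq \mathbf{1}_{S \setminus I}$ and that (H2) reads $\mathbf{1}_J \leq F(\mathbf{1}_J)$, and uses a hypothetical two-state operator $F(x) = (\min x, \max x)$; states are numbered from 0 and all names are illustrative.

```python
from itertools import combinations


def F(x):
    # Payment-free operator of a hypothetical two-state game: MIN chooses
    # the next state in state 0, MAX chooses it in state 1.
    return [min(x), max(x)]


def indicator(K, n):
    return [1.0 if i in K else 0.0 for i in range(n)]


def leq(x, y):
    return all(xi <= yi for xi, yi in zip(x, y))


def invariant_families(F, n):
    S = frozenset(range(n))
    subsets = [frozenset(c) for k in range(n + 1)
               for c in combinations(range(n), k)]
    # Lower invariant faces: I with F(1_{S \ I}) <= 1_{S \ I}   (H1)
    lower = [I for I in subsets
             if leq(F(indicator(S - I, n)), indicator(S - I, n))]
    # Upper invariant faces: J with 1_J <= F(1_J)               (H2)
    upper = [J for J in subsets
             if leq(indicator(J, n), F(indicator(J, n)))]
    return lower, upper


lower, upper = invariant_families(F, 2)
```

The enumeration is exponential in the number of states, so it is meant only as a way to experiment with the definitions.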

Galois connection.
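We first recall the definition of a Galois connection between lattices, as introduced by Birkhoff [8] for lattices of subsets and then generalized by Ore [24]. Let $(A, \prec_A)$ and $(B, \prec_B)$ be two partially ordered sets and let $\varphi : A \to B$ and $\gamma : B \to A$. The pair $(\varphi, \gamma)$ is a Galois connection between $A$ and $B$ if one of the following equivalent assertions is verified:
(11a) $\varphi$ and $\gamma$ are antitone maps, $\mathrm{id}_A \prec_A \gamma \circ \varphi$ and $\mathrm{id}_B \prec_B \varphi \circ \gamma$;
(11b) for all $a \in A$ and $b \in B$, $b \prec_B \varphi(a)$ if and only if $a \prec_A \gamma(b)$;
(11c) $\varphi$ is antitone and $\gamma(b) = \max_A \{a \in A \mid b \prec_B \varphi(a)\}$ for all $b \in B$;
(11d) $\gamma$ is antitone and $\varphi(a) = \max_B \{b \in B \mid a \prec_A \gamma(b)\}$ for all $a \in A$;
where, given a partially ordered set $(E, \prec_E)$, $\mathrm{id}_E$ is the identity map over $E$ and $\max_E X$ stands for the maximum of the subset $X \subset E$ with respect to the partial order $\prec_E$.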
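If $(\varphi, \gamma)$ is a Galois connection between $A$ and $B$, then $(\gamma, \varphi)$ is a Galois connection between $B$ and $A$, and according to (11c) (resp. to (11d)), $\gamma$ (resp. $\varphi$) is uniquely determined by $\varphi$ (resp. $\gamma$). Denote $\varphi^\star := \gamma$ and likewise $\gamma^\star := \varphi$. These maps have the following properties:
$$\varphi \circ \varphi^\star \circ \varphi = \varphi, \qquad \varphi^\star \circ \varphi \circ \varphi^\star = \varphi^\star.$$
We say that an element $a \in A$ (resp. $b \in B$) is closed with respect to the Galois connection $(\varphi, \varphi^\star)$ if $a = \varphi^\star \circ \varphi(a)$ (resp. $b = \varphi \circ \varphi^\star(b)$). We can show that the set of closed elements in $A$ with respect to $(\varphi, \varphi^\star)$ is $\bar{A} := \varphi^\star(B)$ and that the set of closed elements in $B$ with respect to $(\varphi^\star, \varphi)$ is $\bar{B} := \varphi(A)$. Then, $\varphi$ is an isomorphism from $\bar{A}$ to $\bar{B}$, and its inverse is $\varphi^\star$.

Given a payment-free Shapley operator $F$, we denote by $\mathcal{F}^-$ (resp. $\mathcal{F}^+$) the family of subsets of $S$ verifying (H1) (resp. (H2)):
$$\mathcal{F}^- := \{I \subset S \mid F(\mathbf{1}_{S \setminus I}) \leq \mathbf{1}_{S \setminus I}\}, \qquad \mathcal{F}^+ := \{J \subset S \mid \mathbf{1}_J \leq F(\mathbf{1}_J)\}.$$
These families $\mathcal{F}^-$ and $\mathcal{F}^+$ are lattices of subsets with respect to the inclusion partial order. Indeed, since $F$ is order-preserving, for all $I_1, I_2 \in \mathcal{F}^-$, we have
$$F(\mathbf{1}_{S \setminus (I_1 \cup I_2)}) \leq \inf\big(F(\mathbf{1}_{S \setminus I_1}), F(\mathbf{1}_{S \setminus I_2})\big) \leq \inf\big(\mathbf{1}_{S \setminus I_1}, \mathbf{1}_{S \setminus I_2}\big) = \mathbf{1}_{S \setminus (I_1 \cup I_2)},$$
so that $I_1 \cup I_2 \in \mathcal{F}^-$. This implies that the supremum of two sets in $\mathcal{F}^-$ coincides with their supremum in the powerset lattice $\mathcal{P}(S)$ of $S$, i.e., the union $I_1 \cup I_2$. Hence, $\mathcal{F}^-$ is a sub-supsemilattice of $\mathcal{P}(S)$. Then, since $\mathcal{F}^-$ has a bottom element (the empty set) and since it is a finite ordered set, it is automatically an inf-semilattice: the infimum of two sets $I_1, I_2 \in \mathcal{F}^-$ is given by $\bigcup_{I_3 \in \mathcal{F}^-,\, I_3 \subset I_1,\, I_3 \subset I_2} I_3$. Note that the latter infimum may differ from the infimum in $\mathcal{P}(S)$ (the intersection). The lattice $\mathcal{F}^+$ has dual properties. According to the geometric interpretation, the two lattices $\mathcal{F}^-$ and $\mathcal{F}^+$ can be identified with the families of lower and upper invariant faces of the hypercube, respectively. Note that $\mathcal{F}^-$ and $\mathcal{F}^+$ both contain $\emptyset$ and $S$.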
Given $I \in \mathcal{F}^-$, we are interested in the subsets $J \in \mathcal{F}^+$ satisfying $I \cap J = \emptyset$ (see Lemma 4.1). We shall consider in particular the greatest subset $J$ with the latter property. Vice versa, starting from a subset $J$, we may consider the greatest subset $I$ with the same property. In geometric terms, to each lower invariant face $C_I^-$ of $[0, 1]^n$ we associate the smallest upper invariant face $C_J^+$ with nonempty intersection with $C_I^-$. This defines a Galois connection between the lattices $\mathcal{F}^-$ and $\mathcal{F}^+$. Let $(\Phi, \Phi^\star)$ be the pair of functions from $\mathcal{F}^-$ (resp. $\mathcal{F}^+$) to $\mathcal{F}^+$ (resp. $\mathcal{F}^-$) that have just been introduced. Formally, they are defined for every $I \in \mathcal{F}^-$ and $J \in \mathcal{F}^+$ by:
$$\Phi(I) := \bigcup_{J' \in \mathcal{F}^+,\ I \cap J' = \emptyset} J', \qquad \Phi^\star(J) := \bigcup_{I' \in \mathcal{F}^-,\ I' \cap J = \emptyset} I'. \tag{12}$$
It follows from this definition that $\Phi$ and $\Phi^\star$ are antitone, and that $I \subset \Phi^\star \circ \Phi(I)$ and $J \subset \Phi \circ \Phi^\star(J)$. Hence condition (11a) is satisfied for the pair $(\Phi, \Phi^\star)$, which proves that it is a Galois connection between the lattices of subsets $\mathcal{F}^-$ and $\mathcal{F}^+$. We now explore some properties of this Galois connection. By a simple application of the definitions, we can first complete Lemma 4.1.
Lemma 4.2. Let $F$ be a payment-free Shapley operator. If $u$ is a nontrivial fixed point of $F$, then $\arg\min u \in \mathcal{F}^-$ and $\arg\max u \in \mathcal{F}^+$. Furthermore, we have $\arg\max u \subset \Phi(\arg\min u)$ and $\arg\min u \subset \Phi^\star(\arg\max u)$.
For $x \in \mathbb{R}^n$, we shall use the notation
$$F^\omega(x) := \lim_{k \to \infty} F^k(x)$$
as soon as the latter limit exists. This is the case in particular when $F(x) \leq x$ or $x \leq F(x)$. Indeed, since $F$ is order-preserving, the former (resp. latter) inequality implies that the sequence $(F^k(x))_{k \geq 0}$ is nonincreasing (resp. nondecreasing). Moreover, since $F$ is nonexpansive and has a fixed point (namely, $0$), the sequence $(F^k(x))_{k \geq 0}$ is bounded, so that it converges as $k$ tends to infinity. Finally, $F^\omega(x)$ is necessarily a fixed point of $F$.
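Proof. Firstly, note that since $F(\mathbf{1}_{S \setminus I}) \leq \mathbf{1}_{S \setminus I}$, the sequence $(F^k(\mathbf{1}_{S \setminus I}))_{k \geq 0}$ is nonincreasing and so the limit $u := F^\omega(\mathbf{1}_{S \setminus I})$ exists and is a fixed point of $F$. Secondly, by definition of the Galois connection, we have $\mathbf{1}_{\Phi(I)} \leq \mathbf{1}_{S \setminus I}$. Using the monotonicity of $F$ and the characterization of $\mathcal{F}^-$ and $\mathcal{F}^+$, we get that
$$\mathbf{1}_{\Phi(I)} \leq u \leq \mathbf{1}_{S \setminus I},$$
so that $I \subset \arg\min u$ and $\Phi(I) \subset \arg\max u$. The vector $u$ being a fixed point of $F$, we know from Lemma 4.2 that $\arg\min u \in \mathcal{F}^-$, $\arg\max u \in \mathcal{F}^+$, and $\arg\max u \subset \Phi(\arg\min u)$. Hence, since $\Phi$ is antitone, we have by the previous inclusions $\Phi(\arg\min u) = \Phi(I) = \arg\max u$.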
Suppose now that I is closed with respect to the Galois connection. This means that Φ ⋆ (Φ(I)) = I. Then, from the previous equalities, we get that Φ ⋆ • Φ(arg min u) = I. This implies that arg min u ⊂ I and since we already know that I ⊂ arg min u, we can conclude that I = arg min u.
The analogous results for J ∈ F + follow by duality.
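The Galois connection can be computed explicitly on small examples. The sketch below assumes that $\Phi(I)$ is the union of the members of the upper family that do not meet $I$ (the greatest such subset), and symmetrically for $\Phi^\star$; it then lists the nonempty pairs $(I, J)$ with $J = \Phi(I)$ and $I = \Phi^\star(J)$. The operator $F$, the reading of (H1) and (H2) as in the proof of Lemma 4.1, and all names are assumptions made for illustration, with states numbered from 0.

```python
from itertools import combinations


def F(x):
    # Hypothetical payment-free operator on two states: F(x) = (min x, max x).
    return [min(x), max(x)]


def indicator(K, n):
    return [1.0 if i in K else 0.0 for i in range(n)]


def families(F, n):
    S = frozenset(range(n))
    subsets = [frozenset(c) for k in range(n + 1)
               for c in combinations(range(n), k)]
    leq = lambda x, y: all(xi <= yi for xi, yi in zip(x, y))
    lower = [I for I in subsets
             if leq(F(indicator(S - I, n)), indicator(S - I, n))]   # (H1)
    upper = [J for J in subsets
             if leq(indicator(J, n), F(indicator(J, n)))]           # (H2)
    return lower, upper


def greatest_disjoint(K, family):
    # Union of the members of `family` that do not meet K.
    return frozenset().union(*[M for M in family if not (K & M)])


lower, upper = families(F, 2)
phi = {I: greatest_disjoint(I, upper) for I in lower}
phi_star = {J: greatest_disjoint(J, lower) for J in upper}

# Nonempty pairs (I, J) with J = Phi(I) and I = Phi*(J).
conjugate = [(I, phi[I]) for I in lower
             if I and phi[I] and phi_star[phi[I]] == I]
```

In this toy example the only such pair is $(\{0\}, \{1\})$, which matches the arg min and arg max of the nontrivial fixed point $(0, 1)$ of $F$, in line with Lemma 4.2.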
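We say that a subset of states is proper if it differs from the empty set and from the whole set of states. We say that $I, J \subset S$ are conjugate subsets of states with respect to the Galois connection $(\Phi, \Phi^\star)$ if $I \in \mathcal{F}^- \setminus \{\emptyset\}$, $J \in \mathcal{F}^+ \setminus \{\emptyset\}$ and if $J = \Phi(I)$ and $I = \Phi^\star(J)$.

Consider the operators $F^+$ and $F^-$ defined on $\{0, 1\}^n$ by
$$[F^+(x)]_i = \min_{a \in A_i} \max_{b \in B_i} \max_{j : (P_i^{ab})_j > 0} x_j, \qquad i \in S, \tag{13}$$
$$[F^-(x)]_i = \min_{a \in A_i} \max_{b \in B_i} \min_{j : (P_i^{ab})_j > 0} x_j, \qquad i \in S. \tag{14}$$
These Boolean operators can be extended to $\mathbb{R}^n$. Then, we have $F^- \leq F \leq F^+$.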
We now make some observations. Firstly, the expressions of $F^+$ and $F^-$ involve the operators min and max (instead of inf and sup). This owes to the fact that the action spaces are nonempty and the state space is finite; hence, given $x \in \mathbb{R}^n$, the min and max operations are applied to nonempty subsets of the finite set $\{x_i\}_{i \in S}$. Secondly, $F^+$ and $F^-$ are only determined by the support of the transition probability, that is, the set of $(i, a, b, j)$ such that $(P_i^{ab})_j > 0$. Finally, recalling that $\tilde{F}$ is the conjugate operator of $F$ defined by $\tilde{F}(x) = -F(-x)$. These Boolean operators are helpful to characterize the families $\mathcal{F}^-$ and $\mathcal{F}^+$ as well as the Galois connection $(\Phi, \Phi^\star)$. However, we need to make the following assumption.
Assumption A.
(i) For every state i ∈ S, the action spaces A i and B i are nonempty compact sets; (ii) The transition probability P is separately continuous, meaning that given i ∈ S and a ∈ A i the function b ∈ B i → P ab i ∈ ∆(S) is continuous, and given i ∈ S and b ∈ B i the function a ∈ A i → P ab i ∈ ∆(S) is also continuous.
This assumption implies in particular the existence of optimal policies for both players, a property which is used implicitly in the proof of the following result.
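Lemma 5.1. Let $F$ be the payment-free Shapley operator associated with the action spaces $A_i$ and $B_i$ of the two players and the transition probability $P$, and let $F^+$ and $F^-$ be defined by (13) and (14) respectively. For all subsets $I$ and $J$ of $S$, consider the conditions:
(H1) $F(1_{S\setminus I}) \le 1_{S\setminus I}$; (H1') $F^+(1_{S\setminus I}) \le 1_{S\setminus I}$;
(H2) $F(1_J) \ge 1_J$; (H2') $F^-(1_J) \ge 1_J$.
If Assumption A holds, then (H1) ⇔ (H1') and (H2) ⇔ (H2').

Proof. The implications (H1') ⇒ (H1) and (H2') ⇒ (H2) follow from $F^- \le F \le F^+$. Let us first make some observations. First, for all $x \in [0,1]^n$, $i \in S$, $a \in A_i$ and $b \in B_i$, we have
(15) $P^{ab}_i x \le 0 \Leftrightarrow \big((P^{ab}_i)_j = 0 \text{ or } x_j = 0,\ \forall j \in S\big) \Leftrightarrow \max_{j : (P^{ab}_i)_j > 0} x_j = 0$,
since all entries of $P^{ab}_i$ and $x$ are nonnegative. Similarly,
(16) $P^{ab}_i x \ge 1 \Leftrightarrow \big((P^{ab}_i)_j = 0 \text{ or } x_j = 1,\ \forall j \in S\big) \Leftrightarrow \min_{j : (P^{ab}_i)_j > 0} x_j = 1$,
since all entries of $x$ are less than or equal to 1, and $P^{ab}_i x = 1 - P^{ab}_i(1 - x)$.

Next, for all $x \in \mathbb{R}^n$, by Assumption A, for all $i \in S$ and $a \in A_i$ there exists $b \in B_i$, depending on $i$ and $a$, such that $P^{ab}_i x = \max_{b' \in B_i} P^{ab'}_i x$ (the supremum is thus a maximum). Assumption A also implies that, for all $i \in S$ and $b \in B_i$, the map $A_i \to \mathbb{R}$, $a \mapsto P^{ab}_i x$ is continuous. Since the supremum of continuous maps is lower semicontinuous, we get that for all $i \in S$, the map $A_i \to \mathbb{R}$, $a \mapsto \max_{b \in B_i} P^{ab}_i x$ is lower semicontinuous. Since $A_i$ is compact, this implies that, for all $i \in S$, there exists $a \in A_i$ such that $\max_{b \in B_i} P^{ab}_i x = [F(x)]_i$.

Suppose now that (H1) holds. Then $[F(1_{S\setminus I})]_i \le 0$ for all $i \in I$, so by the above there exists $a \in A_i$ such that, for all $b \in B_i$, $P^{ab}_i 1_{S\setminus I} \le 0$, which implies, by (15), that $\max_{j : (P^{ab}_i)_j > 0} [1_{S\setminus I}]_j = 0$. Since this last equality holds for some $a \in A_i$ and all $b \in B_i$, we deduce that $[F^+(1_{S\setminus I})]_i = 0$ for all $i \in I$, that is (H1'). With similar arguments, we show that (H2) ⇒ (H2').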
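In the sequel, we shall also consider the sets $\mathcal{F}'^-$ and $\mathcal{F}'^+$ defined like $\mathcal{F}^-$ and $\mathcal{F}^+$, but with the conditions (H1') and (H2') instead of the conditions (H1) and (H2) respectively:
$\mathcal{F}'^- := \{I \subset S \mid F^+(1_{S\setminus I}) \le 1_{S\setminus I}\}$, $\mathcal{F}'^+ := \{J \subset S \mid F^-(1_J) \ge 1_J\}$.
Moreover, we shall denote by $\Phi'$ and $\Phi'^\star$ the maps defined by (12) with $\mathcal{F}'^-$ and $\mathcal{F}'^+$ instead of $\mathcal{F}^-$ and $\mathcal{F}^+$ respectively. Then, $(\Phi', \Phi'^\star)$ is also a Galois connection between the lattices of subsets $\mathcal{F}'^-$ and $\mathcal{F}'^+$.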
Remark that from the definitions, F ′− and F ′+ and then (Φ ′ , Φ ′⋆ ) only depend on the support of the transition probability P, i.e. the set of elements (i, a, b, j) such that i, j ∈ S, a ∈ A i , b ∈ B i , and (P ab i ) j > 0. Furthermore, Lemma 5.1 shows that under Assumption A, the former new sets and maps coincide with the corresponding old ones.
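As an illustration of the fact that these families depend only on the support, the following minimal Python sketch evaluates the Boolean operators $F^+$ and $F^-$ and tests the conditions (H1') and (H2'). The encoding is an assumption made for this sketch only: states are numbered from 0, and `support[i][a][b]` is the set of states `j` with $(P^{ab}_i)_j > 0$; the function names are hypothetical.

```python
def f_plus(x, support):
    """[F+(x)]_i = min over a, max over b, max over j in the support, of x_j."""
    return [min(max(max(x[j] for j in supp) for supp in by_b) for by_b in by_a)
            for by_a in support]


def f_minus(x, support):
    """[F-(x)]_i = min over a, max over b, min over j in the support, of x_j."""
    return [min(max(min(x[j] for j in supp) for supp in by_b) for by_b in by_a)
            for by_a in support]


def indicator(subset, n):
    """Boolean indicator vector of a subset of {0, ..., n-1}."""
    return [1 if i in subset else 0 for i in range(n)]


def satisfies_h1_prime(I, support):
    """(H1'): F+(1_{S\\I}) <= 1_{S\\I}, i.e. I belongs to the family F'-."""
    n = len(support)
    comp = indicator(set(range(n)) - set(I), n)
    return all(fx <= cx for fx, cx in zip(f_plus(comp, support), comp))


def satisfies_h2_prime(J, support):
    """(H2'): F-(1_J) >= 1_J, i.e. J belongs to the family F'+."""
    n = len(support)
    ind = indicator(J, n)
    return all(fx >= jx for fx, jx in zip(f_minus(ind, support), ind))


# Hypothetical two-state game: in state 0, MIN either stays or moves to
# state 1 (MAX has a single action); state 1 is absorbing.
support = [[[{0}], [{1}]], [[{1}]]]
```

In this toy game player MIN can keep the state in {0} or in {1}, and player MAX can keep it in {1} but not in {0}, which is what the two predicates report.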
Corollary 5.1. Given the payment-free Shapley operator F associated with actions spaces A i and B i of the two players, and a transition probability P , the families F + and F − , as well as the Galois connection (Φ, Φ ⋆ ), depend only on the support of the transition probability P, when Assumption A holds.
Using Theorem 4.4 together with the previous result, we deduce the following one.
Corollary 5.2. Given the payment-free Shapley operator F associated with actions spaces A i and B i of the two players, and a transition probability P , such that Assumption A holds, the property "F has only trivial fixed points" depends only on the support of the transition probability P.
The theorem below gives a way to compute the images of subsets of states by the Galois connection.
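Theorem 5.2. Let $F$ be the payment-free Shapley operator associated with action spaces $A_i$ and $B_i$ of the two players and a transition probability $P$, and let $F^+$ and $F^-$ be defined by (13) and (14) respectively. For $I \in \mathcal{F}'^-$ and $J \in \mathcal{F}'^+$ we have
$1_{\Phi'(I)} = \lim_{k\to\infty} (F^-)^k(1_{S\setminus I})$ and $1_{S\setminus\Phi'^\star(J)} = \lim_{k\to\infty} (F^+)^k(1_J)$.

Proof. We show only the first assertion; the second follows by duality. Since $I \in \mathcal{F}'^-$ and $F^- \le F^+$, the sequence of Boolean vectors $((F^-)^k(1_{S\setminus I}))_{k\ge 0}$ is nonincreasing, hence stationary; let $L \subset S$ be such that $1_L$ is its limit.
Since $1_L$ is a fixed point of $F^-$, $L$ belongs to $\mathcal{F}'^+$. Furthermore, it satisfies $1_L \le 1_{S\setminus I}$, that is, $I \cap L = \emptyset$. Then $L \subset \Phi'(I)$.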
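Let $K \in \mathcal{F}'^+$ be such that $I \cap K = \emptyset$, that is, $1_K \le 1_{S\setminus I}$. By induction, we get that $(F^-)^k(1_K) \le (F^-)^k(1_{S\setminus I})$ for every integer $k$. By definition, we also have that $1_K \le F^-(1_K)$, hence $1_K \le (F^-)^k(1_K)$ for every $k$. Letting $k$ tend to infinity, we obtain $1_K \le 1_L$, that is, $K \subset L$. Taking $K = \Phi'(I)$ gives $\Phi'(I) \subset L$.

We conclude this subsection by giving an interpretation, in terms of zero-sum games, of the conditions (H1') and (H2') of Lemma 5.1.

Proposition 5.1. Let $T$ be the Shapley operator of the game $\Gamma(r, P)$, let $F = \hat T$ be its recession function, and let $I, J \subset S$.
(i) The set $I$ satisfies (H1') if, and only if, there exists a policy $\sigma$ of player MIN such that, for any initial state in $I$, if player MIN chooses the Markovian stationary strategy corresponding to $\sigma$, then, for any strategy of player MAX, the state of the game stays in $I$ almost surely.
(ii) The set $J$ satisfies (H2') if, and only if, there exists a policy $\tau$ of player MAX such that, for any initial state in $J$, if player MAX chooses the Markovian stationary strategy corresponding to $\tau$, then, for any strategy of player MIN, the state of the game stays in $J$ almost surely.

Proof. (i): Suppose that (H1') holds for $F = \hat T$ and $I \subset S$. Then, for all $i \in I$, we have $[F^+(1_{S\setminus I})]_i = 0$. Since $F^+$ involves min and max operators (see (13)), we obtain that, for every $i \in I$, there is an action $a \in A_i$ of player MIN such that $\max_{j : (P^{ab}_i)_j > 0} [1_{S\setminus I}]_j = 0$ for every action $b \in B_i$ of player MAX, which is equivalent, by (15), to $P^{ab}_i 1_{S\setminus I} = 0$. Since $S$ is finite, there exists a policy $\sigma$ of player MIN, $\sigma : i \in S \mapsto \sigma(i) \in A_i$, such that $P^{\sigma(i)b}_i 1_{S\setminus I} = 0$ for all $i \in I$ and $b \in B_i$. Denote, as in Section 2.1, by $(i_k)_{k\ge 0}$ the (random) sequence of states of the game $\Gamma(r, P)$. If the current state $i_k = i$ is in $I$, then the probability that the state $i_{k+1}$ at the following stage is in $S \setminus I$ is equal to $P^{ab}_i 1_{S\setminus I}$ if actions $a$ and $b$ are chosen. In particular, if player MIN selects the action $\sigma(i)$, then this probability is 0, whatever player MAX chooses. Hence, if player MIN chooses the Markovian stationary strategy corresponding to $\sigma$ ($a_k = \sigma(i_k)$ for all $k \ge 0$), and if the initial state $i_0$ is in $I$, then for any strategy (Markovian or not) of player MAX, the probability that the sequence of states $(i_k)_{k\ge 0}$ leaves $I$ is 0. This shows the "only if" part of (i).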
Conversely, suppose that there exists a policy $\sigma : i \in S \mapsto \sigma(i) \in A_i$ of player MIN such that for any initial state $i_0$ in $I$, if player MIN chooses the Markovian stationary strategy corresponding to $\sigma$, then (for any strategy of player MAX) the state of the game $\Gamma(r, P)$ stays in $I$ almost surely. In particular, for any $i \in I$ and any $b \in B_i$, taking $i_0 = i$, the strategy $a_k = \sigma(i_k)$, $k \ge 0$, for player MIN, and any strategy of player MAX such that $b_0 = b$, we get that the probability that $i_1$ is outside $I$ is equal to 0. Since this probability coincides with $P^{\sigma(i)b}_i 1_{S\setminus I}$, we deduce, using (15), that $\max_{j : (P^{\sigma(i)b}_i)_j > 0} [1_{S\setminus I}]_j = 0$. This holds for all $b \in B_i$ and $i \in I$, hence $[F^+(1_{S\setminus I})]_i \le \max_{b \in B_i} \max_{j : (P^{\sigma(i)b}_i)_j > 0} [1_{S\setminus I}]_j = 0$ for all $i \in I$. It follows that $F^+(1_{S\setminus I}) \le 1_{S\setminus I}$, that is (H1').
(ii): Suppose that (H2') holds for $F = \hat T$ and $J \subset S$. Then, for all $i \in J$, we have $[F^-(1_J)]_i = 1$. Since $F^-$ involves min and max operators (see (14)), we obtain that, for every $i \in J$ and $a \in A_i$, there is an action $b \in B_i$ of player MAX such that $\min_{j : (P^{ab}_i)_j > 0} [1_J]_j = 1$, which is equivalent, by (16), to $P^{ab}_i 1_J = 1$. By the axiom of choice, there exists a map $\tau : (i, a) \in \bigcup_{i \in S} (\{i\} \times A_i) \mapsto \tau(i, a) \in B_i$, that is, a policy of player MAX, such that $P^{a\tau(i,a)}_i 1_J = 1$ for all $i \in J$ and $a \in A_i$. By the same arguments as above, we get that for the game $\Gamma(r, P)$, if player MAX chooses the Markovian stationary strategy corresponding to $\tau$ ($b_k = \tau(i_k, a_k)$ for all $k \ge 0$), and if the initial state $i_0$ is in $J$, then for any strategy (Markovian or not) of player MIN, the probability that the sequence of states $(i_k)_{k\ge 0}$ leaves $J$ is 0. This shows the "only if" part of (ii). The "if" part is obtained by the same arguments as for (i).

Hypergraph characterization.
In this subsection, we introduce directed hypergraphs which will allow us to represent the Boolean operators F + and F − . In particular, we shall see that finding Φ ′ (I) (resp. Φ ′⋆ (J)) for a given I ∈ F ′− (resp. J ∈ F ′+ ), is equivalent to solving a reachability problem in a directed hypergraph. We refer the reader to [14,5] for more background on reachability problems in hypergraphs.
A directed hypergraph is a pair (N, E), where N is a set of nodes and E is a set of (directed) hyperarcs. A hyperarc e is an ordered pair (t(e), h(e)) of disjoint nonempty subsets of nodes; t(e) is the tail of e and h(e) is its head. We shall often write t and h instead of t(e) and h(e), respectively, for brevity. When t and h are both of cardinality one, the hyperarc is said to be an arc, and when every hyperarc is an arc, the directed hypergraph becomes a directed graph.
In the following, the term hypergraph will always refer to a directed hypergraph. The size of a hypergraph $G = (N, E)$ is defined as $\mathrm{size}(G) = |N| + \sum_{e \in E} (|t(e)| + |h(e)|)$, where $|X|$ denotes the cardinality of any set $X$. Note that we shall consider in the sequel hypergraphs with an infinite number of nodes or hyperarcs, leading to $\mathrm{size}(G) = \infty$ (we set $|X| = \infty$ when $X$ is infinite).
Let $G = (N, E)$ be a hypergraph. A hyperpath of length $p$ from a set of nodes $I \subset N$ to a node $j \in N$ is a sequence of $p$ hyperarcs $(t_1, h_1), \dots, (t_p, h_p)$ such that $t_i \subset \bigcup_{k=0}^{i-1} h_k$ for all $i = 1, \dots, p+1$, with the convention $h_0 = I$ and $t_{p+1} = \{j\}$. Then, we say that a node $j \in N$ is reachable from a set $I \subset N$ if, and only if, there exists a hyperpath from $I$ to $j$. Alternatively, the relation of reachability can be defined in a recursive way: a node $j$ is reachable from the set $I$ if either $j \in I$ or there exists a hyperarc $(t, h)$ such that $j \in h$ and every node of $t$ is reachable from the set $I$. A set $J$ is said to be reachable from a set $I$ if every node of $J$ is reachable from $I$. We denote by $\mathrm{Reach}(I, G)$ the set of nodes reachable from $I$.
A subset I of N is invariant in the hypergraph G if it contains every node that is reachable from itself, that is Reach(I, G) ⊂ I. If N ′ ⊂ N , we shall also say that a subset I of N ′ is invariant in the hypergraph G relatively to N ′ , if it contains every node of N ′ that is reachable from itself, that is Reach(I, G) ∩ N ′ ⊂ I. One readily checks that the set of nodes of N ′ that are reachable from a given set I ⊂ N ′ is the smallest invariant set in the hypergraph G relatively to N ′ , containing I. The reachability notion will be illustrated in Example 5.1.
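The recursive definition of reachability translates directly into a forward-chaining procedure. The sketch below is a naive fixed-point iteration written for a finite hypergraph; it is not the linear-time algorithm of the literature, and the representation of hyperarcs as `(tail, head)` pairs of Python sets, as well as the function names, are assumptions made for illustration.

```python
def reach(start, hyperarcs):
    """Return the set of nodes reachable from `start`.

    `hyperarcs` is an iterable of (tail, head) pairs of sets. A head is
    added as soon as every node of the corresponding tail is reached.
    """
    reached = set(start)
    changed = True
    while changed:
        changed = False
        for tail, head in hyperarcs:
            if set(tail) <= reached and not set(head) <= reached:
                reached |= set(head)
                changed = True
    return reached


def is_invariant(subset, hyperarcs, relative_to=None):
    """Check Reach(I, G) ⊂ I, optionally relatively to a node subset N'."""
    reached = reach(subset, hyperarcs)
    if relative_to is not None:
        reached &= set(relative_to)
    return reached <= set(subset)


# Hypothetical hypergraph on nodes 1..6.
arcs = [({1, 2}, {3}), ({3}, {4}), ({5}, {6})]
```

With this example, node 3 is reached from {1, 2} but not from {1} alone, since the tail {1, 2} must be entirely reached before the hyperarc can be used.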
We now make the connection with our problem. Let F be the payment-free Shapley operator associated with the actions spaces A i and B i of the two players, and the transition probability P , and let F + and F − be its Boolean abstractions defined by (13) and (14) respectively. We construct two hypergraphs G + = (N + , E + ) (Figure 1) and G − = (N − , E − ) (Figure 3) as follows. We first need to introduce a copy of S, denoted by S ′ . It is a set disjoint from S and given by a bijection π : S → S ′ .
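For the purposes of the following constructions, we also need to assume $S$ and $S'$ disjoint from the two sets $\{(i, a) \mid i \in S, a \in A_i\}$ and $\{(i, a, b) \mid i \in S, a \in A_i, b \in B_i\}$.

The node set of $G^+$ is $N^+ = \{(i, a) \mid i \in S, a \in A_i\} \cup S \cup S'$. The hyperarcs of $G^+$ are of the form:
- $(\{i\} \times A_i, \{\pi(i)\})$, for all $i \in S$;
- $(\{j\}, \{(i, a)\})$, for all $j, i \in S$ and $a \in A_i$ such that there exists $b \in B_i$ with $(P^{ab}_i)_j > 0$.
As shown in Figure 1, this hypergraph is structured in two layers; the first layer consists of the arcs $(\{j\}, \{(i, a)\})$ whereas the second layer consists of the hyperarcs $(\{i\} \times A_i, \{\pi(i)\})$. Recall that
$[F^+(x)]_i = \min_{a \in A_i} \max_{b \in B_i} \max_{j : (P^{ab}_i)_j > 0} x_j$, $i \in S$.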
Denoting by y i,a := max b∈Bi max j:(P ab i )j >0 x j , we also have [F + (x)] i = min a∈Ai y i,a . If x = 1 J for some J ⊂ S, then y i,a = 1 if, and only if, there is b ∈ B i and j ∈ J such that (P ab i ) j > 0. This is also equivalent to the node (i, a) being reachable from J in G + . Then, [F + (x)] i = 1 if, and only if, y i,a = 1 for every a ∈ A i , which is equivalent to all the nodes in the tail of the hyperarc ({i} × A i , {π(i)}) being reachable from J in G + . According to the recursive definition of reachability, this is equivalent to π(i) being reachable from J in G + . Hence, we have the following result.
Proposition 5.2. Let F be the payment-free Shapley operator associated with the actions spaces A i and B i of the two players, and the transition probability P , and let F + be defined by (13). Then, the node π(i) ∈ S ′ is reachable from J ⊂ S in G + if, and only if, [F + (1 J )] i = 1.
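The equivalence stated in Proposition 5.2 can be checked numerically on a small instance. The following sketch builds the two layers of $G^+$ from the support and compares reachability of $\pi(i)$ with the value of $[F^+(1_J)]_i$. The encoding `support[i][a][b]` (set of successor states), the node labels `("s", j)`, `("sa", i, a)`, `("s'", i)` and the example game are all assumptions of this sketch.

```python
from itertools import combinations


def reach(start, hyperarcs):
    """Naive forward chaining: nodes reachable from `start`."""
    reached, changed = set(start), True
    while changed:
        changed = False
        for tail, head in hyperarcs:
            if tail <= reached and not head <= reached:
                reached |= head
                changed = True
    return reached


def build_g_plus(support):
    """Hyperarcs of G+: arcs j -> (i, a), then {i} x A_i -> pi(i)."""
    arcs = []
    for i, by_a in enumerate(support):
        for a, by_b in enumerate(by_a):
            # j -> (i, a) whenever some b gives positive probability to j
            for j in set().union(*by_b):
                arcs.append((frozenset({("s", j)}), frozenset({("sa", i, a)})))
        tail = frozenset(("sa", i, a) for a in range(len(by_a)))
        arcs.append((tail, frozenset({("s'", i)})))
    return arcs


def f_plus(x, support):
    """[F+(x)]_i = min_a max_b max_{j in support} x_j."""
    return [min(max(max(x[j] for j in supp) for supp in by_b) for by_b in by_a)
            for by_a in support]


def check_proposition(support):
    """True if reachability in G+ agrees with F+ on every subset J."""
    n = len(support)
    arcs = build_g_plus(support)
    for k in range(n + 1):
        for J in combinations(range(n), k):
            x = [1 if i in J else 0 for i in range(n)]
            reached = reach({("s", j) for j in J}, arcs)
            for i in range(n):
                if (("s'", i) in reached) != (f_plus(x, support)[i] == 1):
                    return False
    return True


# Hypothetical 3-state game: MAX chooses in state 0, MIN in state 1,
# and state 2 moves at random to state 0 or stays in 2.
support = [[[{1}, {2}]], [[{0}], [{1}]], [[{0, 2}]]]
```

For instance, from $J = \{2\}$ the copies of states 0 and 2 are reached but not the copy of state 1, in agreement with $F^+(1_{\{2\}}) = (1, 0, 1)$ for this toy game.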
Example 5.1. Let us consider the following payment-free Shapley operator defined on $\mathbb{R}^3$: where ∧ stands for min and ∨ for max. The Boolean operator $F^+$ associated with $F$ is defined by Figure 2 shows the hypergraph $G^+$ associated with $F$, where the element $\pi(i)$ of $S'$ is denoted $i'$. It can be checked that the nodes $1'$ and $3'$ are reachable from $\{2, 3\}$, whereas the node $2'$ is not. According to Proposition 5.2, this is equivalent to the fact that $F^+(1_{\{2,3\}}) = (1, 0, 1)^T$.

Figure 2. The hypergraph $G^+$ associated with $F$.

The node set of $G^-$ is $N^- = \{(i, a, b) \mid i \in S, a \in A_i, b \in B_i\} \cup S \cup S'$, and its hyperarcs are:
- $(\{(i, a, b) \mid b \in B_i\}, \{\pi(i)\})$, for all $i \in S$ and $a \in A_i$;
- $(\{j\}, \{(i, a, b)\})$, for all $j, i \in S$, $a \in A_i$ and $b \in B_i$ such that $(P^{ab}_i)_j > 0$.
Again, $G^-$ consists of two layers (see Figure 3).
Like G + , the motivation for the construction of G − is the following result.
Proposition 5.3. Let $F$ be the payment-free Shapley operator associated with the action spaces $A_i$ and $B_i$ of the two players and the transition probability $P$, and let $F^-$ be defined by (14). Then, the node $\pi(i) \in S'$ is reachable from $I \subset S$ in $G^-$ if, and only if, $[F^-(1_{S\setminus I})]_i = 0$.

So far we did not make any assumption about the action spaces, which may be infinite, leading to infinite hypergraphs $G^+$ and $G^-$. The absence of symmetry between $G^+$ and $G^-$ reflects the lack of symmetry between $F^+$ and $F^-$.
Denote by $\bar G^+$ and $\bar G^-$ the hypergraphs obtained from $G^+$ and $G^-$, respectively, by identifying every node $i \in S$ with the node $\pi(i) \in S'$. The following proposition is immediate.
Proposition 5.4. A subset $I \subset S$ belongs to $\mathcal{F}'^-$ if, and only if, its complement in $S$ is an invariant set in the hypergraph $\bar G^+$ relatively to $S$: $\mathrm{Reach}(S \setminus I, \bar G^+) \cap S = S \setminus I$. A subset $J \subset S$ belongs to $\mathcal{F}'^+$ if, and only if, its complement in $S$ is an invariant set in the hypergraph $\bar G^-$ relatively to $S$: $\mathrm{Reach}(S \setminus J, \bar G^-) \cap S = S \setminus J$.

Figure 3. Hypergraph $G^-$ associated with $F^-$.

Corollary 5.3. Let $F$ be the payment-free Shapley operator associated with the action spaces $A_i$ and $B_i$ of the two players and the transition probability $P$, and let $F^+$ and $F^-$ be defined by (13) and (14) respectively. Let $I \in \mathcal{F}'^-$ and $J \in \mathcal{F}'^+$. Then $\Phi'(I)$ is given by the complement in $S$ of all the nodes of $S$ that are reachable from $I$ in $\bar G^-$. Moreover, $\Phi'^\star(J)$ is given by the complement in $S$ of all the nodes of $S$ that are reachable from $J$ in $\bar G^+$.
Proof. It follows readily from the definition of Φ ′ (by (12) with F ′− and F ′+ instead of F − and F + ), that S \ Φ ′ (I) is the smallest set I ′ containing I such that S \ I ′ ∈ F ′+ . By Proposition 5.4, the latter condition holds if, and only if, I ′ satisfies Reach(I ′ ,Ḡ − ) ∩ S = I ′ . Hence, Φ ′ (I) is the complement in S of the set of nodes of S that are reachable from I inḠ − . The argument for Φ ′⋆ is dual.
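The characterization used in this proof, namely that $S \setminus \Phi'(I)$ is the smallest set $I'$ containing $I$ with $S \setminus I' \in \mathcal{F}'^+$, suggests a simple closure iteration with the Boolean operators. The sketch below is one possible rendering under the assumed encoding `support[i][a][b]` (set of successor states, states numbered from 0); the function names are hypothetical, and the inputs are expected to satisfy $I \in \mathcal{F}'^-$ and $J \in \mathcal{F}'^+$.

```python
def f_plus(x, support):
    """[F+(x)]_i = min_a max_b max_{j in support} x_j."""
    return [min(max(max(x[j] for j in supp) for supp in by_b) for by_b in by_a)
            for by_a in support]


def f_minus(x, support):
    """[F-(x)]_i = min_a max_b min_{j in support} x_j."""
    return [min(max(min(x[j] for j in supp) for supp in by_b) for by_b in by_a)
            for by_a in support]


def phi_prime(I, support):
    """Complement of the smallest I' ⊇ I such that F-(1_{S\\I'}) >= 1_{S\\I'}."""
    n = len(support)
    closed = set(I)
    while True:
        x = [0 if i in closed else 1 for i in range(n)]
        # states that MIN can force into `closed` in one step
        new = closed | {i for i, v in enumerate(f_minus(x, support)) if v == 0}
        if new == closed:
            return set(range(n)) - closed
        closed = new


def phi_star_prime(J, support):
    """Complement of the smallest J' ⊇ J such that F+(1_{J'}) <= 1_{J'}."""
    n = len(support)
    grown = set(J)
    while True:
        x = [1 if i in grown else 0 for i in range(n)]
        new = grown | {i for i, v in enumerate(f_plus(x, support)) if v == 1}
        if new == grown:
            return set(range(n)) - grown
        grown = new


# Hypothetical Markov chain (one action per player): states 0 and 2 are
# absorbing, state 1 moves to 0 or 2 with positive probability.
chain = [[[{0}]], [[{0, 2}]], [[{2}]]]
```

On this chain, `phi_prime({0}, chain)` returns `{2}` and `phi_star_prime({2}, chain)` returns `{0}`, so the two absorbing states form a pair of conjugate subsets.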
We shall say that $I, J \subset S$ are conjugate subsets of states with respect to the hypergraphs $\bar G^+, \bar G^-$ if $I, J$ are nonempty and if
$I = S \setminus \mathrm{Reach}(J, \bar G^+)$ and $J = S \setminus \mathrm{Reach}(I, \bar G^-)$.

Theorem 5.3. Let $F$ be the payment-free Shapley operator associated with action spaces $A_i$ and $B_i$ of the two players and a transition probability $P$, such that Assumption A holds. Then the following assertions are equivalent:
(i) $F$ has a nontrivial fixed point;
(ii) there exist nonempty disjoint subsets $I, J \subset S$ such that $S \setminus I$ is invariant in $\bar G^+$ relatively to $S$, and $S \setminus J$ is invariant in $\bar G^-$ relatively to $S$;
(iii) there exist conjugate subsets of states $I, J \subset S$ with respect to the hypergraphs $\bar G^+, \bar G^-$.
Furthermore, any sets $I, J \subset S$ are conjugate with respect to the Galois connection $(\Phi, \Phi^\star)$ if, and only if, they are conjugate with respect to the hypergraphs $\bar G^+, \bar G^-$.
Remark 5.1. It is instructive to specialize the latter result to the case in which each player has only one possible action in each state. Then, we can write $F(x) = Px$, where $P$ is a stochastic matrix. The two hypergraphs $\bar G^+$ and $\bar G^-$ are isomorphic (up to the identification of $(i, a)$ and $(i, a, b)$ with $\pi(i)$) to the transpose of the digraph $G$ associated to $P$ (the arcs of $\bar G^+$ and $\bar G^-$ are in the opposite direction). In particular, $S \setminus I$ is invariant in $\bar G^+$ (or $\bar G^-$) relatively to $S$ if and only if there are no arcs or paths from $I$ to $S \setminus I$ in $G$. Similarly, $(I, J)$ is a pair of conjugate subsets of states with respect to $(\bar G^+, \bar G^-)$ if and only if $J$ is the greatest set of nodes with no paths in $G$ to a node of $I$, and vice versa. So Assertion (ii) of Theorem 5.3 implies the existence of two distinct final classes in $G$: $I$ and $J$ are disjoint and there are no arcs from $I$ to $S \setminus I$, and similarly there are no arcs from $J$ to $S \setminus J$, so there exists a final class of $G$ included in $I$ and also a final class of $G$ included in $J$, and since $I$ and $J$ are disjoint, these are two distinct final classes of $G$. Conversely, if $I$ and $J$ are two distinct final classes, then there are no arcs or paths from $I$ to $S \setminus I$ in $G$. The same is true for $J$, so that Assertion (ii) of Theorem 5.3 holds. Hence, in the present case, Assertion (ii) of Theorem 5.3 corresponds to the condition that the digraph associated to $P$ has two distinct final classes, that is, the negation of Assertion (iv) in Theorem 1.1.

Algorithmic issues

6.1. Checking ergodicity. By Theorem 3.1, the negation of the first of the following two problems is equivalent to the second one.
Problem (NonTrivialFP). Does a given payment-free Shapley operator F : R n → R n with finite action spaces have a non trivial fixed point, that is, does there exist u ∈ R n \ R 1 such that u = F (u)?
Problem (Ergodicity). Is a given game Γ(r, P ) with finite action spaces and bounded payment r ergodic?
It is known that in a directed hypergraph $G$, the set of nodes reachable from a set $I$ can be computed in $O(\mathrm{size}(G))$ time [14]. Hence, the following result follows from Proposition 5.4 and Corollary 5.3 and the property that, when the action spaces are finite, $\mathcal{F}'^-$, $\mathcal{F}'^+$, $\Phi'$ and $\Phi'^\star$ coincide with $\mathcal{F}^-$, $\mathcal{F}^+$, $\Phi$ and $\Phi^\star$ respectively.

Using Proposition 6.1 and Theorem 5.3, we obtain the following result, which shows that checking the ergodicity of a game is fixed-parameter tractable: if the dimension is fixed, the problem can be solved in time polynomial in the input size. Thus, for instances of moderate dimension, but with large action spaces, the ergodicity condition can be checked efficiently.
Theorem 6.1. Let us fix a state space S = [n], and the nonempty finite actions spaces A i and B i of the two players. Let r be a bounded transition payment and P be a transition probability. Then, the ergodicity of Γ(r, P ), that is the property "T has only trivial fixed points", can be checked in O(2 n nm 2 ) O(2 n n 2 m 2 ) time.
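To make the exponential dependence on the number of states concrete, here is a brute-force sketch of the ergodicity test for finite action spaces. It relies on Assertion (ii) of Theorem 5.3 together with Proposition 5.4: the operator has a nontrivial fixed point exactly when some nonempty disjoint $I$, $J$ satisfy $F^+(1_{S\setminus I}) \le 1_{S\setminus I}$ and $F^-(1_J) \ge 1_J$. This is a plain enumeration of pairs of subsets, not the procedure whose complexity is stated in the theorem; the encoding `support[i][a][b]` and all function names are assumptions.

```python
from itertools import combinations


def f_plus(x, support):
    """[F+(x)]_i = min_a max_b max_{j in support} x_j."""
    return [min(max(max(x[j] for j in supp) for supp in by_b) for by_b in by_a)
            for by_a in support]


def f_minus(x, support):
    """[F-(x)]_i = min_a max_b min_{j in support} x_j."""
    return [min(max(min(x[j] for j in supp) for supp in by_b) for by_b in by_a)
            for by_a in support]


def nonempty_subsets(states):
    """Yield every nonempty subset of `states` as a Python set."""
    states = sorted(states)
    for k in range(1, len(states) + 1):
        for combo in combinations(states, k):
            yield set(combo)


def has_nontrivial_fixed_point(support):
    n = len(support)
    S = set(range(n))
    for I in nonempty_subsets(S):
        comp = [0 if i in I else 1 for i in range(n)]
        if any(f > c for f, c in zip(f_plus(comp, support), comp)):
            continue  # I does not satisfy (H1')
        for J in nonempty_subsets(S - I):
            ind = [1 if i in J else 0 for i in range(n)]
            if all(f >= v for f, v in zip(f_minus(ind, support), ind)):
                return True  # J satisfies (H2') and is disjoint from I
    return False


def is_ergodic(support):
    """Ergodicity as 'only trivial fixed points', via the support alone."""
    return not has_nontrivial_fixed_point(support)


two_cycle = [[[{1}]], [[{0}]]]            # irreducible chain 0 <-> 1
two_sinks = [[[{0}]], [[{1}]]]            # two absorbing states
min_choice = [[[{0}], [{1}]], [[{1}]]]    # MIN stays in 0 or moves to 1
```

The first example is ergodic, while the other two are not: with two absorbing states there are two final classes, and in the last game the vector $(0, 1)$ is a nontrivial fixed point of $x \mapsto (\min(x_0, x_1), x_1)$.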
Problem NonTrivialFP has already been addressed in the deterministic case with finite action spaces by Yang and Zhao [31]. Suppose indeed that in the expression (7), the support of each transition probability is concentrated on just one state and consider the restriction of such an operator to the Boolean vectors $\{0, 1\}^n$. We obtain a monotone Boolean operator.
Recall that a Boolean operator, defined on Boolean vectors $\{0, 1\}^n$, is expressed using the logical operators AND, OR and NOT. Monotone Boolean operators are those whose expression involves only AND and OR operators. These can be interpreted as min and max operators, respectively. So, deterministic payment-free Shapley operators are equivalent to monotone Boolean operators and Problem NonTrivialFP can be expressed in a simpler form.
Problem (MonBool). Does a given monotone Boolean operator have a nontrivial fixed point, that is, different from the zero vector and the unit vector?

Theorem 6.2 (Yang, Zhao [31]). Problem MonBool is NP-complete.
Using this result and the characterizations of the previous section, we obtain that Problem NonTrivialFP is NP-complete.

Proof. As a direct consequence of Theorem 6.2, we get that Problem NonTrivialFP is NP-hard. We now show that it is in NP. Suppose that a payment-free Shapley operator $F$ has a nontrivial fixed point $u$. Then $\arg\min u$ and $\arg\max u$ are proper subsets of states, and by Lemma 4.2, $\arg\min u \in \mathcal{F}^-$ and $\Phi(\arg\min u) \supset \arg\max u \ne \emptyset$. Hence, $\arg\min u \in \mathcal{F}^-$ is a proper subset of states such that $\Phi(\arg\min u)$ is nonempty, and we know by Theorem 4.4 that these conditions are sufficient to guarantee the existence of a nontrivial fixed point. Furthermore, these conditions can be checked in polynomial time (this is a consequence of Proposition 6.1). Hence, $\arg\min u$ is a short certificate for Problem NonTrivialFP.

Problem I=Min.
A way to analyze Problem NonTrivialFP would be to characterize the fixed point set W := {w ∈ R n | F (w) = w} of a payment-free Shapley operator F . This problem also arises in several other situations. First in Proposition 3.2, we have shown that W is exactly the set of possible mean payoff vectors of the game Γ(r, P ) when the transition payment r varies. Next, in [11], Everett introduced the notion of recursive games which are modified versions of the game Γ(0, P ) in which payments occur in some absorbing states. These games are well posed if there exists a unique element of W with prescribed values in the absorbing states. Finally, W allows one to determine the set E of solutions u of the ergodic equation T (u) = λ1 + u. Indeed, it is shown in [2] that if the Shapley operator T is piecewise affine, if u is any point in E, and if V is a neighborhood of 0, then, E ∩ (u + V) = u + {w ∈ V | F (w) = w} = u + (V ∩ W), where F is a payment-free Shapley operator (the semidifferential of T at point u). Hence, the local study of the ergodic equation reduces to the characterization of the fixed point set W.
In an attempt to understand the structure of the set of fixed points of a payment-free Shapley operator, we shall consider the following simpler problem.
Problem (I=Min). Let I be a subset of S. Does a given payment-free Shapley operator with finite action spaces have a fixed point u satisfying I = arg min u?
We know from Lemma 4.2 that a necessary condition is $I \in \mathcal{F}^-$. Under Assumption A (which holds if the action spaces are finite), this is equivalent to $F^+(1_{S\setminus I}) \le 1_{S\setminus I}$. In fact, there is a stronger necessary condition.

Lemma 6.3. Let $F$ be the payment-free Shapley operator associated with action spaces $A_i$ and $B_i$ of the two players and a transition probability $P$, such that Assumption A holds, and let $I \subset S$. Suppose that $F$ has a fixed point $u$ verifying $\arg\min u = I$. Then, $F^+(1_{S\setminus I}) = 1_{S\setminus I}$.
Proof. If $I = S$, the conclusion of the lemma is trivial. Assume $I \ne S$ and let $u$ be a fixed point of $F$ verifying $\arg\min u = I$. We may suppose that $\min_{i\in S} u_i = 0$ and $\max_{i\in S} u_i = 1$, so that $u \le 1_{S\setminus I}$. Since $F \le F^+$, we get $u = F(u) \le F^+(u) \le F^+(1_{S\setminus I})$. The last vector is Boolean and $u_i > 0$ for every $i \in S \setminus I$, so this inequality implies $1_{S\setminus I} \le F^+(1_{S\setminus I})$. Moreover, according to Lemma 4.2 and Lemma 5.1, we already know that $F^+(1_{S\setminus I}) \le 1_{S\setminus I}$. Hence the result.
We continue with another necessary condition. Lemma 6.4. Let F be the payment-free Shapley operator associated with actions spaces A i and B i of the two players, and a transition probability P , and let I ∈ F − . If Φ(I) = ∅, then F has no nontrivial fixed point u satisfying I ⊂ arg min u.
Proof. Suppose on the contrary that there is a nontrivial fixed point $u$ such that $I \subset \arg\min u$. Let $I' := \arg\min u$ and $J := \arg\max u$. We know from Lemma 4.2 that $I' \in \mathcal{F}^-$, $J \in \mathcal{F}^+$ and that $J \subset \Phi(I')$. Since $I \subset I'$, we have $\Phi(I') \subset \Phi(I)$. Hence $J \subset \Phi(I)$, and since $J \ne \emptyset$, we get a contradiction.
If $I = \emptyset$, the answer to Problem I=Min is trivially negative, and if $I = S$ it is trivially positive. Assume now that $I$ is a proper subset of $S$. The above results show that a necessary condition to have a positive answer to Problem I=Min is that $I \in \mathcal{F}^-$ and $\Phi(I) \ne \emptyset$. Moreover, by Lemma 4.3, a sufficient condition to have a positive answer to Problem I=Min is that $I$ is closed with respect to the Galois connection $(\Phi, \Phi^\star)$.
It remains to examine the case in which $I \in \mathcal{F}^-$ is proper, with $\Phi(I) \ne \emptyset$ and $I \ne \bar I$, where for $I \in \mathcal{F}^-$, $\bar I := \Phi^\star(\Phi(I))$ denotes the closure of $I$ with respect to the Galois connection $(\Phi, \Phi^\star)$ (likewise, for $J \in \mathcal{F}^+$, $\bar J$ is the closure of $J$ with respect to the Galois connection $(\Phi^\star, \Phi)$). This implies in particular that $\bar I \ne S$ (otherwise we would have $\Phi(I) = \Phi(\bar I) = \emptyset$).
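Assume that Assumption A holds. We define a reduced operator $F^\triangle : \mathbb{R}^{\bar I} \to \mathbb{R}^{\bar I}$ as follows. According to the game-theoretic interpretation (Proposition 5.1), we know that player MIN can force the state of the game $\Gamma(0, P)$ (which has $F$ as Shapley operator) to stay in $\bar I$. Hence, we consider the actions of player MIN that achieve this goal: for every $i \in \bar I$, let
$A^\triangle_i := \{a \in A_i \mid (P^{ab}_i)_j = 0,\ \forall j \in S \setminus \bar I,\ \forall b \in B_i\}$.
These sets are nonempty, since $\bar I \in \mathcal{F}^- = \mathcal{F}'^-$. Another formulation of $A^\triangle_i$ is the following:
$A^\triangle_i = \{a \in A_i \mid \max_{b \in B_i} P^{ab}_i 1_{S\setminus\bar I} = 0\}$.
For $x \in \mathbb{R}^n$ and $K \subset S$, we denote by $x_K$ the restriction of $x$ to $\mathbb{R}^K$. We apply the same notation to elements of $\Delta(S)$. Then, for every $i \in \bar I$, let
$[F^\triangle(x)]_i := \min_{a \in A^\triangle_i} \max_{b \in B_i} (P^{ab}_i)_{\bar I}\, x$, $x \in \mathbb{R}^{\bar I}$.
From the definition of $A^\triangle_i$, we have that $(P^{ab}_i)_{\bar I} \in \Delta(\bar I)$ for all $i \in \bar I$, $a \in A^\triangle_i$ and $b \in B_i$. Hence, $F^\triangle$ is a payment-free Shapley operator over $\bar I$, with action spaces $A^\triangle_i$ and $B_i$ and transition probability $P^\triangle : (i, a, b) \mapsto (P^{ab}_i)_{\bar I}$. Moreover, we have
(19) $[F^\triangle(x_{\bar I})]_i = \min_{a \in A^\triangle_i} \max_{b \in B_i} P^{ab}_i x$, $\forall x \in \mathbb{R}^n$, $\forall i \in \bar I$.

Theorem 6.5. Let $F$ be the payment-free Shapley operator associated with finite action spaces $A_i$ and $B_i$ of the two players and a transition probability $P$. Let $I \in \mathcal{F}^-$ be proper, such that $\Phi(I) \ne \emptyset$ and $I \ne \bar I$. Then $F$ has a fixed point whose arg min is $I$ if, and only if, the same holds for the reduced operator $F^\triangle$.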
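Proof. We first show the "only if" part of the theorem. Let $u$ be a fixed point of $F$ such that $I = \arg\min u$. Recall that $I \ne S$ by hypothesis. So we may suppose that $\max_{i\in S} u_i = 1$ and $\min_{i\in S} u_i = 0$. It follows from (19) and the inclusion $A^\triangle_i \subset A_i$ that $F^\triangle(u_{\bar I}) \ge [F(u)]_{\bar I} = u_{\bar I}$, so that $(F^\triangle)^\omega(u_{\bar I})$ exists. Let us denote it by $v$. It is a fixed point of $F^\triangle$ and it satisfies $u_{\bar I} \le v$. As a consequence, $v_i > 0$ for every $i \in \bar I \setminus I$.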
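Furthermore, Lemma 4.1 implies that $I \in \mathcal{F}^-$, meaning that $F(1_{S\setminus I}) \le 1_{S\setminus I}$. Then, for all $i \in I$, there exists $a \in A_i$ such that for all $b \in B_i$, $P^{ab}_i 1_{S\setminus I} = 0$. Since $I \subset \bar I$, this implies that $(P^{ab}_i)_j = 0$ for all $j \in S \setminus \bar I$, and since this holds for all $b \in B_i$, we deduce that $a \in A^\triangle_i$, by definition. Hence, $\min_{a \in A^\triangle_i} \max_{b \in B_i} P^{ab}_i 1_{S\setminus I} = 0$, and using (19), we deduce that $[F^\triangle(1_{\bar I\setminus I})]_i = 0$ for all $i \in I$. Therefore $F^\triangle(1_{\bar I\setminus I}) \le 1_{\bar I\setminus I}$, which means that $I$ still satisfies condition (H1) with the operator $F^\triangle$. Since $u_{\bar I} \le 1_{\bar I\setminus I}$, it follows that $v = (F^\triangle)^\omega(u_{\bar I}) \le (F^\triangle)^\omega(1_{\bar I\setminus I}) \le 1_{\bar I\setminus I}$. Hence $v_i = 0$ for every $i \in I$, which shows that $\arg\min v = I$.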
We now prove the "if" part of the theorem. Assume that F △ has a fixed point v such that arg min v = I. We may suppose that max i∈S v i = 1 and min i∈S v i = 0.
Let $w = F^\omega(1_{S\setminus\bar I})$. We know from Lemma 4.3 that $w$ is a fixed point of $F$ such that $\arg\min w = \bar I$. Thus, it satisfies $w_{\bar I} = 0$ and $w_s > 0$ for every $s \in S \setminus \bar I$, hence $w \ge \alpha 1_{S\setminus\bar I}$ for some $\alpha > 0$.
We next use the notions of semidifferentiability and semiderivative, referring the reader to [26,2] for the definition of these notions and for their basic properties. Since the action spaces are finite, $F$ is piecewise affine and so it is semidifferentiable at point $w$. Furthermore, denoting by $F'_w$ its semiderivative at $w$, there is a neighborhood $\mathcal{V}$ of 0 such that
(20) $F(w + x) = F(w) + F'_w(x)$, $\forall x \in \mathcal{V}$.
We next give a formula for $F'_w$. For every $i \in S$, let
$A_i(w) := \{a \in A_i \mid \max_{b \in B_i} P^{ab}_i w = [F(w)]_i\}$,
and for $a \in A_i(w)$, let
$B^a_i(w) := \{b \in B_i \mid P^{ab}_i w = [F(w)]_i\}$.
Then we have, for every $x \in \mathbb{R}^n$ and every $i \in S$,
$[F'_w(x)]_i = \min_{a \in A_i(w)} \max_{b \in B^a_i(w)} P^{ab}_i x$.
Observe that for $i \in \bar I$, we have $A_i(w) = A^\triangle_i$ and $B^a_i(w) = B_i$, for every $a \in A_i(w)$. Indeed, since $[F(w)]_i = w_i = 0$ and $\alpha 1_{S\setminus\bar I} \le w \le 1_{S\setminus\bar I}$, we have $a \in A_i(w)$ if and only if $\max_{b \in B_i} P^{ab}_i 1_{S\setminus\bar I} = 0$, and $b \in B^a_i(w)$ if and only if $P^{ab}_i 1_{S\setminus\bar I} = 0$. Then, using (19), we obtain $[F'_w(x)]_{\bar I} = F^\triangle(x_{\bar I})$ for every $x \in \mathbb{R}^n$.

We now introduce the vector $z \in [0, 1]^n$ given by $z_{\bar I} = v$ and $z_{S\setminus\bar I} = 0$. By the above property of $F'_w$, we get that $[F'_w(z)]_{\bar I} = F^\triangle(v) = v = z_{\bar I}$. Since $F'_w$ is a payment-free operator and $z \ge 0$, we get that $F'_w(z) \ge 0$, so $F'_w(z) \ge z$. Hence, $\bar z := (F'_w)^\omega(z)$ exists and is a fixed point of $F'_w$, belonging to $[0, 1]^n$. Again by the above property of $F'_w$, we get that $\bar z_{\bar I} = v$.

Choose $\varepsilon > 0$ small enough so that $\varepsilon \bar z$ is in $\mathcal{V}$ and let $u = w + \varepsilon \bar z$. Then, from (20), we get that $F(u) = F(w) + \varepsilon F'_w(\bar z) = w + \varepsilon \bar z = u$, where we used the fact that $F'_w$ is positively homogeneous. Then $u$ is a fixed point of $F$. Moreover, by construction $u = w + \varepsilon \bar z \ge w$ and $u \ge \varepsilon \bar z$, and since $\arg\min w = \bar I$ and $\arg\min \bar z \cap \bar I = I$, we deduce that $u_I = 0$ and $u_s > 0$ for every $s \in S \setminus I$, that is $\arg\min u = I$.
The previous result together with the observations made before lead to Algorithm 1 below, which solves Problem I=Min, as detailed in Theorem 6.6. There, we are still assuming that for each state i ∈ S the action spaces A i and B i are finite. Moreover, if F is a payment-free Shapley operator, we write (Φ F , Φ ⋆ F ) the Galois connection associated to that operator.  Proof. The fact that Algorithm 1 provides the right answer is a direct consequence of Lemma 6.3, Lemma 6.4, Lemma 4.3 and Theorem 6.5. We next show that it stops after at most n iterations of the loop. Suppose that during the execution of a loop, the first two conditions (which are stopping criteria) are not satisfied. Then the closure of I with respect to the Galois connection (Φ F , Φ ⋆ F ) associated with F is a proper subset of states. Hence, the cardinality of the state space for the reduced operator F △ is strictly less than the one of F .
Moreover, each operation in the loop requires at most O(nm 2 ) O(n 2 m 2 ) time (see Proposition 6.1).
6.3. Mixed problem. So far, we have only considered the problem with a single constraint on the fixed point, concerning the indices of the minimal entries. The dual problem, concerning the maximal entries of fixed points, is equivalent. We address now a mixed-condition problem.
Problem (IMinJMax). Let I and J be nonempty disjoint subsets of S. Does a given payment-free Shapley operator with finite action spaces have a fixed point u satisfying I = arg min u and J = arg max u?
Let F be a payment-free Shapley operator with finite action spaces and let I, J be two nonempty disjoint subsets of S. We already know from Lemma 6.3 and its dual formulation that F + (1 S\I ) = 1 S\I and F − (1 J ) = 1 J are necessary conditions to have a positive answer to problem IMinJMax. The following theorem shows that the two constraints can be treated separately.
Theorem 6.7. Let F be the payment-free Shapley operator associated with actions spaces A i and B i of the two players, and a transition probability P . Let I ∈ F − and J ∈ F + be two nonempty disjoint subsets. Then F has a fixed point u satisfying I = arg min u and J = arg max u if and only if F has fixed points v, w satisfying arg min v = I and arg max w = J.
Proof. We only need to prove the "if" part of the theorem. Suppose that F has fixed points v, w satisfying arg min v = I and arg max w = J. Then, we may impose min i∈S v i = 0, max i∈S v i = min i∈S w i = 1/2 and max i∈S w i = 1.
Let $L = \{z \in \mathbb{R}^n \mid v \vee 1_J \le z \le w \wedge 1_{S\setminus I}\}$. Put in words, $L$ is the set of all elements in $[0, 1]^n$ whose entries are 0 on $I$, 1 on $J$, and lie between those of $v$ and $w$ elsewhere. In particular, the entries outside $I$ or $J$ of the elements in $L$ are in $(0, 1)$.
The set L is a complete lattice.
which shows that L is invariant by F . As F is order-preserving, Tarski's fixed point theorem guarantees the existence of a fixed point of F in L. Proof. According to Theorem 6.7, Problem IMinJMax can be solved by two instances of Problem I=Min, one with inputs F and I, one with inputs F and J. We consider the game with perfect information defined by the graph represented in Figure 4. There are four states represented by gray nodes. A token is initially placed in one of these nodes. At each stage, the token is moved along the edges of the graph until it reaches another state, according to the following rule: player MIN moves the token at circle nodes, player MAX at square ones and at the diamond nodes, an edge is selected at random according to the probabilities indicated on the edges starting from the node. A payment occurs only for the edges starting from a MAX node (its value is given by the label attached to such edges). The Shapley operator of this game is It can be shown that T verifies the ergodic equation (8) with ergodic constant λ = 1/3 and u = (4/3, 0, 2/3, 4) T . Let us check whether this game is ergodic, or equivalently, whether the recession function of T , denoted by F and given by has only trivial fixed points.
To answer these questions, we need to construct the Galois connection induced by the game. Firstly, we check that This can be seen on the graph represented in Figure 4. Indeed, following the gametheoretic interpretation (Proposition 5.1), we observe that player MIN can always make sure that the state remains in {1} or in {1, 2}, and that player MAX can always make sure that it stays in {4}. Alternatively, we can construct the Boolean abstractions of F , namely and check that  The set {1, 2} is closed with respect to the Galois connection. Thus, according to Lemma 4.3, F has a fixed point whose arg min is {1, 2}. Moreover, its arg max can only be {4}. We can check that the vector (0, 0, 1/2, 1) T is a fixed point with these properties.
As for the set {1}, we cannot conclude directly from Lemma 6.4 or Lemma 4.3. According to Theorem 6.5, we need to construct a reduced operator, $F^{\triangle}$, defined on $\mathbb{R}^{\{1,2\}}$ ({1, 2} being the closure of {1}): […].
The directed graph associated with this operator is represented in Figure 5. We check that for this reduced operator we have […]. Hence, Φ({1}) = ∅ and by Lemma 6.4, we know that $F^{\triangle}$ has no fixed point whose arg min is {1}. According to Theorem 6.5, the same holds for F. We conclude that any nontrivial fixed point u of F verifies $u_1 = u_2 < u_3 < u_4$. Furthermore, from (21) we readily get that $u_3 = \frac{1}{2}(u_2 + u_4)$. These conditions are also sufficient for a point to be a nontrivial fixed point of F. As a consequence, assuming that in Figure 4 the value of the payments can change, all the realizable mean payoff vectors χ are characterized by $\chi_1 = \chi_2 \leq \chi_4$, $\chi_3 = \frac{1}{2}(\chi_1 + \chi_4)$.
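The characterization obtained at the end of this example can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, states are listed in the order 1, 2, 3, 4, and the relation between χ_2 and χ_4 is assumed to be a non-strict inequality, which is what the strict chain for nontrivial fixed points together with the constant (trivial) vectors suggests. Exact rational arithmetic is used so that the equality tests are reliable.

```python
from fractions import Fraction


def is_realizable_mean_payoff(chi):
    """Check chi_1 = chi_2 <= chi_4 and chi_3 = (chi_1 + chi_4) / 2.

    `chi` lists the four components in the order of states 1, 2, 3, 4.
    The non-strict inequality is an assumption (see the lead-in).
    """
    c1, c2, c3, c4 = (Fraction(x) for x in chi)
    return c1 == c2 and c2 <= c4 and 2 * c3 == c1 + c4


def is_nontrivial_fixed_point(u):
    """Check u_1 = u_2 < u_3 < u_4 and u_3 = (u_2 + u_4) / 2."""
    u1, u2, u3, u4 = (Fraction(x) for x in u)
    return u1 == u2 and u2 < u3 < u4 and 2 * u3 == u2 + u4


# The fixed point (0, 0, 1/2, 1) given in the example satisfies both tests.
example = [0, 0, Fraction(1, 2), 1]
# A constant vector, such as the mean payoff 1/3 of the original game,
# is realizable but is a trivial fixed point.
constant = [Fraction(1, 3)] * 4
```

With these definitions, `is_realizable_mean_payoff(example)` and `is_realizable_mean_payoff(constant)` both hold, while `is_nontrivial_fixed_point(constant)` does not.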

Summary and discussion of the main results
It is convenient to give here a synthetic description of our results. We use the notations and definitions of the previous sections, in particular the definitions of the Galois connection and of the hypergraphs given in Sections 4 and 5 respectively, and the definition of Assumption A given in Section 5. Combining Theorems 3.1, 4.4 and 5.3, and Corollary 5.2, we obtain the following result, which shows that most of the classical characterizations of ergodicity for finite state Markov chains, listed in Theorem 1.1, carry over to the two-player case, up to the essential discrepancy that the directed graph of the transition probability matrix is now replaced by a pair of directed hypergraphs depending on the transition probability.
Theorem 8.1 (Ergodicity of zero-sum games). Let us fix a state space S = [n], and the nonempty action spaces $A_i$ and $B_i$ of the two players. Let r be a bounded transition payment, let P be a transition probability, and let T = T(r, P) be the Shapley operator of the game Γ(r, P). Then, the following properties are equivalent:
(i) the recession function $\hat{T} = T(0, P)$ has only trivial fixed points;
(ii) the mean payoff vector of the game Γ(r + g, P) exists and is constant for all additive perturbations g of the transition payment depending only on the state (so $g_i^{a,b} = g_i$ for all $i \in S$, $a \in A_i$ and $b \in B_i$);
(iii) the ergodic equation $g + T(u) = \lambda \mathbf{1} + u$ is solvable for all vectors $g \in \mathbb{R}^n$;
(iv) there does not exist a pair of conjugate subsets of states with respect to the Galois connection $(\Phi, \Phi^{\star})$ associated with the recession function $\hat{T} = T(0, P)$.
Assume in addition that Assumption A holds. Then, the preceding conditions are equivalent to the following one:
(v) there does not exist a pair of conjugate subsets of states with respect to the hypergraphs $(\bar{G}^{+}, \bar{G}^{-})$ associated with the transition probability P.
In particular (still making Assumption A), the ergodicity property of a game Γ(r, P ) only depends on the support of P .
The classical theory of additive functionals of the trajectory of Markov chains corresponds to the zero-player case of zero-sum game theory, or equivalently to the case where each player has only one possible action in each state. By applying the above theorem to the degenerate Shapley operator $T(x) = g + Px$ with recession function $\hat{T}(x) = Px$, the characterizations (i)-(iv) of the ergodicity of a Markov chain with transition probability matrix P, listed in Theorem 1.1, are readily recovered; see in particular Remark 5.1 for the characterization (iv) (we exclude the characterization of Point (v) of Theorem 1.1, in terms of the uniqueness of the invariant measure, which has no nonlinear analogue).

When the action spaces are finite, an algorithmic issue is to check ergodicity. We noted that a result of Yang and Zhao [31] implies that checking non-ergodicity is NP-hard, and proved that this problem is NP-complete (Corollary 6.1) but fixed-parameter tractable (Theorem 6.1).
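For the zero-player case discussed above, the graph criterion of Theorem 1.1 (iv), namely that the directed graph of P has only one final class, is easy to test. The following is a minimal sketch under stated assumptions: P is given as a list of rows of a stochastic matrix, states are indexed from 0, and the function names `final_classes` and `is_ergodic_chain` are illustrative and do not come from the paper. It covers Markov chains only, not the hypergraph condition for games. Only the positions of the positive entries of P are used, which mirrors the remark that ergodicity depends only on the support of P.

```python
def final_classes(P):
    """Return the final classes of the directed graph of P.

    A final class is a strongly connected set of states that cannot be
    left. Only the support of P (entries > 0) is used.
    """
    n = len(P)
    # reach[i][j] is True when j is reachable from i (in zero or more steps).
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    # Boolean transitive closure (Warshall's algorithm).
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    classes = set()
    for i in range(n):
        reachable = [j for j in range(n) if reach[i][j]]
        # i belongs to a final class iff every reachable state leads back to i;
        # the class is then exactly the set of states reachable from i.
        if all(reach[j][i] for j in reachable):
            classes.add(frozenset(reachable))
    return classes


def is_ergodic_chain(P):
    """Criterion (iv) of Theorem 1.1: exactly one final class."""
    return len(final_classes(P)) == 1


# State 1 is transient and leads to the absorbing state 0: one final class.
P_one = [[1.0, 0.0], [0.3, 0.7]]
# Same support, different probabilities: the answer does not change.
P_same_support = [[1.0, 0.0], [0.9, 0.1]]
# Two absorbing states: two final classes, hence not ergodic.
P_two = [[1.0, 0.0, 0.0], [0.5, 0.0, 0.5], [0.0, 0.0, 1.0]]
```

Here `is_ergodic_chain(P_one)` and `is_ergodic_chain(P_same_support)` hold, whereas `is_ergodic_chain(P_two)` fails because {0} and {2} are both final classes. The cubic-time closure is chosen for brevity; a strongly connected components algorithm would serve equally well.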
As a refinement of the present ergodicity results, we have considered the problem of characterizing the fixed point set $W := \{w \in \mathbb{R}^n \mid F(w) = w\}$ of a payment-free Shapley operator F. This problem has been well studied in the one-player case. In particular, when the action spaces are finite, W is known to be sup-norm isometric to a polyhedral cone with a well-characterized dimension [1]. In the two-player case, the properties of the fixed point set W are less understood. In order to get information on this set, we have considered in particular the problem of the existence of a fixed point w of F such that $w_i$ is minimal precisely when i belongs to a prescribed subset $I \subset S$. We showed in Theorem 6.6 that this problem can be solved in polynomial time, by Algorithm 1. We also showed that we can check in polynomial time whether F has a fixed point with prescribed argmin and argmax.
Such results deal with the "order abstraction" of the fixed point set of F. A natural refinement would be to ask whether, for a given partition $I_1 \cup \dots \cup I_k$ of the state space S, there is a fixed point w of F such that $w_i = w_j$ for all $i, j \in I_m$ and all $m \in [k]$, and $w_{i_1} < w_{i_2} < \dots < w_{i_k}$ for all $i_1 \in I_1, \dots, i_k \in I_k$.
We do not know whether this can be checked in polynomial time for any $k \geq 3$.
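To make the notion of "order abstraction" concrete, the following minimal sketch computes the ordered partition of the states induced by a given vector w and tests it against a prescribed ordered partition. It only checks the order pattern of a vector that is already known; it does not decide whether a fixed point with that pattern exists, which is the open question above. The function names are hypothetical, and states are numbered from 1 as in the text.

```python
from fractions import Fraction


def level_sets(w):
    """Ordered partition of the states 1..n by increasing value of w.

    The first block is the arg min of w and the last one its arg max.
    """
    values = sorted(set(w))
    return [[i for i, x in enumerate(w, start=1) if x == v] for v in values]


def matches_ordered_partition(w, blocks):
    """True iff w is constant on each block and strictly increasing
    from one block to the next, for the ordered partition `blocks`."""
    return level_sets(w) == [sorted(block) for block in blocks]


# The fixed point (0, 0, 1/2, 1) of the example of Figure 4.
w = [0, 0, Fraction(1, 2), 1]
```

For this vector, `level_sets(w)` is `[[1, 2], [3], [4]]`, so its arg min is {1, 2} and its arg max is {4}, and `matches_ordered_partition(w, [{1, 2}, {3}, {4}])` holds with k = 3 blocks.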