Necessary and Sufficient Condition for the Existence of Zero-Determinant Strategies in Repeated Games

Zero-determinant strategies are a class of memory-one strategies in repeated games which unilaterally enforce linear relationships between payoffs. It has long been unclear for what stage games zero-determinant strategies exist. We provide a necessary and sufficient condition for the existence of zero-determinant strategies. This condition can be interpreted as the existence of two different actions which unilaterally adjust the total value of a linear combination of payoffs. A relation between the class of stage games where zero-determinant strategies exist and other classes of stage games is also provided.


Introduction
Zero-determinant (ZD) strategies are a class of memory-one strategies (strategies which recall only one previous period) in repeated games which unilaterally enforce linear relationships between payoffs of players. ZD strategies were first discovered by two physicists, Press and Dyson, in the repeated prisoner's dilemma game. 1) ZD strategies contain several counterintuitive examples, such as the equalizer strategy, which unilaterally sets the payoff of the opponent, and the extortionate strategy, which always obtains a payoff greater than or equal to that of the opponent. ZD strategies also contain the generous ZD strategy, which achieves a cooperative Nash equilibrium. 2) After their discovery, many extensions have been made, mainly in two directions. The first direction is extension of the range of application of ZD strategies.
Concretely, ZD strategies were extended to multi-player multi-action games, 3-6) games with a discount factor, 5,7,8) games with imperfect monitoring, 9-12) and games with asynchronous update. 13) The second direction is extension of the ability of payoff control. The concept of ZD strategies was extended so as to control moments of payoffs, 14) time correlation functions of payoffs, 15) and conditional expectations of payoffs. 16) A mathematical framework of ZD strategies has been used to classify memory-one strategies into classes such as partner strategies and rival strategies in social dilemma situations. 2,7,17) Furthermore, the relation between unbeatable imitation 18,19) and ZD strategies has gradually been clarified in two-player symmetric games. 20) Although ZD strategies have been found in several stage games, such as the prisoner's dilemma game, 1) the public goods game, 3,4) the continuous donation game, 5) a two-player two-action asymmetric game, 21) and two-player symmetric potential games, 20) a condition for the existence of ZD strategies has not been clear. For example, it has been known that ZD strategies do not exist in the rock-paper-scissors game. 11) It has been believed that the existence of ZD strategies is highly dependent on the structure of the stage game.
In this paper, we provide a necessary and sufficient condition for the existence of ZD strategies. This condition implies that, for ZD strategies to exist, the stage game must be easy to handle in some sense for players who want to use ZD strategies. From another perspective, we can introduce a class of stage games in which ZD strategies exist. Such a classification of stage games may be useful in the same way as the classes of symmetric games, 22) potential games, 23) and generalized rock-paper-scissors games. 18) We provide a relation between the class of stage games where ZD strategies exist and other classes of games, for the case of two-player symmetric games.
This paper is organized as follows. In Section 2, we introduce repeated games and ZD strategies. In Section 3, we provide our main theorem about the necessary and sufficient condition for the existence of ZD strategies. A relation between the class of stage games where ZD strategies exist and other classes of stage games is also provided in that section. Section 4 is devoted to concluding remarks.

Preliminaries
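We consider a repeated game. 24) The set of players is described as $\mathcal{N} := \{1, \cdots, N\}$, where $N > 1$ is the number of players. The action of player $j \in \mathcal{N}$ in the stage game is written as $\sigma_j \in A_j := \{1, \cdots, M_j\}$, where $M_j$ is a natural number describing the number of actions of player $j$. We collectively write $\mathcal{A} := \prod_{j=1}^{N} A_j$ and $\sigma := (\sigma_1, \cdots, \sigma_N) \in \mathcal{A}$. We call $\sigma$ an action profile. The payoff of player $j$ when the action profile is $\sigma$ is described as $s_j(\sigma)$. Therefore, the stage game is $G := (\mathcal{N}, \{A_j\}_{j \in \mathcal{N}}, \{s_j\}_{j \in \mathcal{N}})$. We write a probability $M$-simplex by $\Delta_M$. We also introduce the notation $\sigma_{-j} := (\sigma_1, \cdots, \sigma_{j-1}, \sigma_{j+1}, \cdots, \sigma_N)$ for the actions of all players other than $j$.
We repeat the stage game $G$ infinitely. We write an action of player $j$ at round $t \geq 1$ as $\sigma_j^{(t)}$. The behavior strategy of player $j$ is described as $\mathcal{T}_j := \{T_j^{(t)}\}_{t=1}^{\infty}$, where $T_j^{(t)}$ is the conditional probability at the $t$-th round. We write the expectation of the quantity $B$ with respect to strategies of all players by $\mathbb{E}[B]$. We introduce a discounting factor $\delta$ satisfying $0 \leq \delta \leq 1$ in order to discount future payoffs. The payoff of player $j$ in the infinitely repeated game is defined by
$$\mathcal{S}_j := \begin{cases} (1-\delta)\, \mathbb{E}\left[ \sum_{t=1}^{\infty} \delta^{t-1} s_j(\sigma^{(t)}) \right] & (0 \leq \delta < 1) \\ \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}\left[ \sum_{t=1}^{T} s_j(\sigma^{(t)}) \right] & (\delta = 1). \end{cases} \tag{1}$$
In this paper, we consider only the case $\delta = 1$.
A time-independent memory-one strategy of player $j$ is defined as a strategy in which the conditional probability at every round depends only on the action profile of the previous round, $T_j^{(t)}(\sigma_j^{(t)} \mid \sigma^{(t-1)}, \cdots, \sigma^{(1)}) = T_j(\sigma_j^{(t)} \mid \sigma^{(t-1)})$, with a time-independent conditional probability $T_j$. For time-independent memory-one strategies $T_j$ of player $j$, we introduce the Press-Dyson vectors 2,11)
$$\hat{T}_j(\sigma_j \mid \sigma') := T_j(\sigma_j \mid \sigma') - \delta_{\sigma_j, \sigma'_j} \quad (\forall \sigma_j, \forall \sigma'), \tag{2}$$
where $\delta_{\sigma, \sigma'}$ is the Kronecker delta. The second term in the right-hand side of Eq. (2) can be regarded as a memory-one strategy (called "Repeat") which repeats his/her own previous action, and therefore the Press-Dyson vectors are interpreted as the difference between his/her own strategy and "Repeat". It should be noted that, due to the properties of the conditional probability $T_j$, the Press-Dyson vectors satisfy several relations. First, they satisfy
$$\sum_{\sigma_j} \hat{T}_j(\sigma_j \mid \sigma') = 0 \quad (\forall \sigma') \tag{3}$$
due to the normalization condition of $T_j$. Second, they satisfy
$$\hat{T}_j(\sigma_j \mid \sigma') \begin{cases} \leq 0 & (\sigma'_j = \sigma_j) \\ \geq 0 & (\sigma'_j \neq \sigma_j) \end{cases} \tag{4}$$
for all $\sigma_j$ and all $\sigma'$. Third, they satisfy
$$\left| \hat{T}_j(\sigma_j \mid \sigma') \right| \leq 1 \quad (\forall \sigma_j, \forall \sigma'). \tag{5}$$
The last two come from the fact that $T_j$ takes values in $[0, 1]$.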
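The definition of the Press-Dyson vectors (the strategy minus "Repeat") and their three properties (normalization, sign structure, boundedness) can be checked numerically. The following is a minimal Python sketch under stated assumptions: the function name `press_dyson`, the dictionary layout, and the example strategy (a noisy rule that tends to copy the opponent's previous action) are illustrative choices and do not come from the text.

```python
from itertools import product


def press_dyson(T, j, action_sets):
    """Press-Dyson vectors of player j: T(a | prev) minus the 'Repeat' strategy.

    T maps (a, prev) to the probability that player j plays a after the
    previous action profile prev.
    """
    profiles = list(product(*action_sets))
    return {(a, prev): T[(a, prev)] - (1.0 if prev[j] == a else 0.0)
            for a in action_sets[j] for prev in profiles}


# Hypothetical example: two players with actions {1, 2}. Player 0 plays
# action 1 with probability 0.9 if the opponent played 1, and 0.2 otherwise.
action_sets = [(1, 2), (1, 2)]
profiles = list(product(*action_sets))
T = {}
for prev in profiles:
    p1 = 0.9 if prev[1] == 1 else 0.2
    T[(1, prev)] = p1
    T[(2, prev)] = 1.0 - p1

That = press_dyson(T, 0, action_sets)

# Normalization: components sum to zero over the actions of player 0.
sums = [sum(That[(a, prev)] for a in action_sets[0]) for prev in profiles]
# Sign structure: nonpositive if a equals the previous own action, else nonnegative.
signs_ok = all((v <= 0) if prev[0] == a else (v >= 0)
               for (a, prev), v in That.items())
# Boundedness: absolute value at most one.
bounded = all(abs(v) <= 1 for v in That.values())
print(sums, signs_ok, bounded)
```

Any valid conditional probability passes the three checks, which is why the existence question reduces to whether vectors with this sign structure can reproduce a given linear combination of payoffs.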
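Definition 1 (Refs. 1, 5) A time-independent memory-one strategy of player $j$ is a zero-determinant (ZD) strategy when its Press-Dyson vectors can be written in the form
$$\sum_{\sigma_j} c_{\sigma_j} \hat{T}_j(\sigma_j \mid \sigma') = \sum_{k=0}^{N} \alpha_k s_k(\sigma') \quad (\forall \sigma') \tag{6}$$
with some nontrivial coefficients $\{\alpha_k\}$ and $\{c_{\sigma_j}\}$ (that is, neither $c_1 = \cdots = c_{M_j}$ nor $\alpha_0 = \alpha_1 = \cdots = \alpha_N = 0$), where $s_0(\sigma) := 1$, and Eq. (6) is not zero for some $\sigma'$.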
In other words, in ZD strategies, a linear combination of the Press-Dyson vectors is described as a linear combination of payoff vectors and a vector of all ones. It has been known that a ZD strategy (6) unilaterally enforces a linear relation between expected payoffs: 1,2,20)
$$0 = \sum_{k=0}^{N} \alpha_k \langle s_k \rangle^{*}, \tag{7}$$
where $\langle \cdots \rangle^{*}$ is the expectation with respect to the limit-of-means distribution
$$P^{*}(\sigma) := \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} P_t(\sigma)$$
and $P_t$ is the marginal distribution obtained from the joint distribution of action profiles. Because $\mathcal{S}_k = \langle s_k \rangle^{*}$ $(\forall k)$, the linear relation (7) can be interpreted as a linear relation between payoffs in the repeated game.

Necessary and sufficient condition for the existence of ZD strategies
Although ZD strategies have been found in several stage games, such as the prisoner's dilemma game, 1) the public goods game, 3,4) the continuous donation game, 5) a two-player two-action asymmetric game, 21) and two-player symmetric potential games, 20) a condition for the existence of ZD strategies has not been clear. In this section, we provide a necessary and sufficient condition for the existence of ZD strategies.
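Theorem 1 A ZD strategy of player $j$ exists if and only if there exist some nontrivial coefficients $\{\alpha_k\}_{k=0}^{N}$ and two different actions $\overline{\sigma}_j, \underline{\sigma}_j \in A_j$ of player $j$ such that
$$\sum_{k=0}^{N} \alpha_k s_k(\overline{\sigma}_j, \sigma_{-j}) \geq 0 \quad (\forall \sigma_{-j}), \qquad \sum_{k=0}^{N} \alpha_k s_k(\underline{\sigma}_j, \sigma_{-j}) \leq 0 \quad (\forall \sigma_{-j}) \tag{10}$$
and $\sum_{k=0}^{N} \alpha_k s_k$ is not identically zero, for the stage game $G$. Here $\sigma_{-j}$ denotes the actions of all players other than $j$, and $s_0(\sigma) := 1$.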
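For fixed coefficients, the condition of Theorem 1 is a finite check: player j needs one action on which the linear combination of payoffs is nonnegative whatever the other players do, and a different action on which it is nonpositive. The Python sketch below performs this check. The function names, the dictionary layout of the payoffs, and the numerical payoffs (3, 0, 5, 1) for the prisoner's dilemma are assumptions made for illustration. A negative answer for one choice of coefficients does not by itself show that no ZD strategy exists, since the theorem quantifies over all coefficients.

```python
from itertools import product


def theorem1_actions(payoffs, alpha, j, action_sets, tol=1e-12):
    """Actions of player j on which B >= 0 (first list) and B <= 0 (second list)
    for all actions of the other players, where
    B = alpha[0] + sum_k alpha[k] * s_k.  Returns None if B is identically zero."""
    profiles = list(product(*action_sets))
    B = {p: alpha[0] + sum(a * s[p] for a, s in zip(alpha[1:], payoffs))
         for p in profiles}
    if all(abs(v) <= tol for v in B.values()):
        return None
    nonneg = [a for a in action_sets[j]
              if all(B[p] >= -tol for p in profiles if p[j] == a)]
    nonpos = [a for a in action_sets[j]
              if all(B[p] <= tol for p in profiles if p[j] == a)]
    return nonneg, nonpos


def condition_holds(result):
    """True if two different actions with the required signs exist."""
    if result is None:
        return False
    nonneg, nonpos = result
    return any(a != b for a in nonneg for b in nonpos)


# Prisoner's dilemma with assumed payoffs (R, S, T, P) = (3, 0, 5, 1);
# action 1 is cooperation and action 2 is defection.
pd_s1 = {(1, 1): 3, (1, 2): 0, (2, 1): 5, (2, 2): 1}
pd_s2 = {(a, b): pd_s1[(b, a)] for (a, b) in pd_s1}
pd_result = theorem1_actions([pd_s1, pd_s2], (-2, 0, 1), 0, [(1, 2), (1, 2)])

# Rock-paper-scissors (1 = rock, 2 = paper, 3 = scissors) with B = s_1 - s_2.
M = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
rps_s1 = {(a, b): M[a - 1][b - 1] for a in (1, 2, 3) for b in (1, 2, 3)}
rps_s2 = {(a, b): rps_s1[(b, a)] for (a, b) in rps_s1}
rps_result = theorem1_actions([rps_s1, rps_s2], (0, 1, -1), 0,
                              [(1, 2, 3), (1, 2, 3)])

print(pd_result, condition_holds(pd_result))
print(rps_result, condition_holds(rps_result))
```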
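Proof. (Necessity) If a ZD strategy of player $j$ exists, then the Press-Dyson vectors satisfy
$$\sum_{\sigma_j} c_{\sigma_j} \hat{T}_j(\sigma_j \mid \sigma') = \sum_{k=0}^{N} \alpha_k s_k(\sigma') \quad (\forall \sigma') \tag{11}$$
with some nontrivial coefficients $\{\alpha_k\}$ and $\{c_{\sigma_j}\}$, and Eq. (11) is not identically zero. Below we write $B(\sigma) := \sum_{k=0}^{N} \alpha_k s_k(\sigma)$ $(\forall \sigma)$. By using Eq. (3), this can be written as
$$\sum_{\sigma_j} \left( c_{\sigma_j} - c_{\max} \right) \hat{T}_j(\sigma_j \mid \sigma') = B(\sigma') \quad (\forall \sigma')$$
and
$$\sum_{\sigma_j} \left( c_{\sigma_j} - c_{\min} \right) \hat{T}_j(\sigma_j \mid \sigma') = B(\sigma') \quad (\forall \sigma'),$$
where we have defined $c_{\max} := \max_{\sigma_j} c_{\sigma_j}$ and $c_{\min} := \min_{\sigma_j} c_{\sigma_j}$. We also introduce $\sigma_{\max} := \arg\max_{\sigma_j} c_{\sigma_j}$ and $\sigma_{\min} := \arg\min_{\sigma_j} c_{\sigma_j}$, where ties may be broken arbitrarily. It should also be noted that $\sigma_{\max} \neq \sigma_{\min}$, because the left-hand side of Eq. (11) becomes 0 if $\sigma_{\max} = \sigma_{\min}$ and therefore $c_{\max} = c_{\min}$, which contradicts the definition of ZD strategies. For $\sigma'_j = \sigma_{\max}$, the term with $\sigma_j = \sigma_{\max}$ in the first relation vanishes, and every remaining term is the product of a nonpositive coefficient and a nonnegative Press-Dyson component; the second relation is treated similarly for $\sigma'_j = \sigma_{\min}$. Then, by using the property (4), we obtain
$$B(\sigma_{\max}, \sigma_{-j}) \leq 0 \quad (\forall \sigma_{-j})$$
and
$$B(\sigma_{\min}, \sigma_{-j}) \geq 0 \quad (\forall \sigma_{-j}).$$
Therefore, the ZD strategy satisfies the condition (10) with $\underline{\sigma}_j = \sigma_{\max}$ and $\overline{\sigma}_j = \sigma_{\min}$.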
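(Sufficiency) If there exist some nontrivial coefficients $\{\alpha_k\}_{k=0}^{N}$ and two different actions $\overline{\sigma}_j$ and $\underline{\sigma}_j$ of player $j$ satisfying the condition (10), we can construct a ZD strategy as follows.
We first introduce $M := \prod_{k=1}^{N} M_k$ and a vector notation of an $M$-component quantity $D(\sigma) \in \mathbb{R}$ by $\boldsymbol{D} := (D(\sigma))_{\sigma \in \mathcal{A}} \in \mathbb{R}^{M}$. We also introduce vectors obtained from $\boldsymbol{D}$ by using an indicator function $I(\cdots)$, which returns 1 if $\cdots$ holds, and 0 otherwise. By definition, any $M$-component vector $\boldsymbol{D}$ can be decomposed into these linearly independent vectors. For the quantity $B = \sum_{k=0}^{N} \alpha_k s_k$, our assumption (10) leads to the sign conditions of Eq. (22) on the components associated with $\overline{\sigma}_j$ and $\underline{\sigma}_j$. We also collectively write the Press-Dyson vectors of player $j$ by $\hat{\boldsymbol{T}}_j(\sigma_j) := (\hat{T}_j(\sigma_j \mid \sigma'))_{\sigma' \in \mathcal{A}}$.
Below we construct ZD strategies for the case $M_j > 2$ and the case $M_j = 2$ separately. (Because of the existence of two different actions $\overline{\sigma}_j$ and $\underline{\sigma}_j$, $M_j$ must be greater than 1.)
(i) $M_j > 2$ In this case, we set a strategy of player $j$ as in Eq. (23), where $W$ is defined by Eq. (24). We can easily check that these vectors indeed satisfy the conditions (4) and (5) of strategies. In addition, the condition (3) is also satisfied. Furthermore, by Eq. (22), the strategy (23) satisfies a relation of the form (6). Therefore, the strategy (23) is a ZD strategy.
(ii) $M_j = 2$ In this case, we remark that the two actions of player $j$ are $\overline{\sigma}_j$ and $\underline{\sigma}_j$. We set a strategy of player $j$ as in Eq. (27), where $W$ is defined by Eq. (24). We can easily check that these vectors indeed satisfy the conditions (3), (4), and (5) of strategies. In addition, by Eq. (22), the strategy (27) satisfies a relation of the form (6). Therefore, the strategy (27) is a ZD strategy.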
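The sufficiency argument can be made concrete with a small numerical sketch. The construction below is one possible choice that meets the requirements of the proof (valid conditional probabilities whose Press-Dyson vectors reproduce B); it is not claimed to coincide with the strategies of Eqs. (23) and (27). It is restricted to two-player games with player 0 as the ZD player, and the names `build_zd`, `long_run_average`, `up`, `down`, the normalization W = max|B|, the prisoner's dilemma payoffs (3, 0, 5, 1), and the opponent strategies are all assumptions made for illustration.

```python
def build_zd(B, actions, up, down):
    """Memory-one strategy T[(a, prev)] of player 0 whose Press-Dyson vectors
    satisfy W * (That(down|prev) - That(up|prev)) = B(prev) for every prev.

    Assumes B >= 0 whenever prev[0] == up and B <= 0 whenever prev[0] == down.
    """
    W = max(abs(v) for v in B.values())      # positive because B is not zero
    T = {}
    for prev, b in B.items():
        That = {a: 0.0 for a in actions}     # Press-Dyson components
        if prev[0] in (up, down):
            That[down] += b / (2 * W)
            That[up] -= b / (2 * W)
        else:
            # Shift weight |b|/W from the previous own action to `down`
            # (if b >= 0) or to `up` (if b < 0).
            That[down if b >= 0 else up] += abs(b) / W
            That[prev[0]] -= abs(b) / W
        for a in actions:
            T[(a, prev)] = That[a] + (1.0 if a == prev[0] else 0.0)
    return T


def long_run_average(T0, T1, profiles, f, rounds=2000):
    """Average of f under the long-run distribution of action profiles,
    obtained by iterating the Markov chain (all transitions are positive in
    the examples below, so the iteration converges)."""
    dist = {p: 1.0 / len(profiles) for p in profiles}
    for _ in range(rounds):
        new = {p: 0.0 for p in profiles}
        for prev, w in dist.items():
            for p in profiles:
                new[p] += w * T0[(p[0], prev)] * T1[(p[1], prev)]
        dist = new
    return sum(dist[p] * f[p] for p in profiles)


# Prisoner's dilemma, B = s_2 + alpha_0 with alpha_0 = -2 (assumed numbers).
actions = (1, 2)
profiles = [(a, b) for a in actions for b in actions]
s2 = {(1, 1): 3, (1, 2): 5, (2, 1): 0, (2, 2): 1}
B = {p: s2[p] - 2.0 for p in profiles}
T0 = build_zd(B, actions, up=1, down=2)


def opponent(q):
    """Opponent memory-one strategy; q[prev] = probability of action 1."""
    return {(a, p): (q[p] if a == 1 else 1.0 - q[p])
            for a in actions for p in profiles}


opp_a = opponent({(1, 1): 0.7, (1, 2): 0.3, (2, 1): 0.6, (2, 2): 0.2})
opp_b = opponent({(1, 1): 0.1, (1, 2): 0.9, (2, 1): 0.5, (2, 2): 0.8})
avg_a = long_run_average(T0, opp_a, profiles, B)
avg_b = long_run_average(T0, opp_b, profiles, B)
print(avg_a, avg_b)
```

In this example the long-run average of B vanishes for both opponents, so the opponent's average payoff is pinned to 2 regardless of the opponent's strategy, which is the equalizer behavior described in the Introduction.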

Example
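In this subsection, we construct a ZD strategy for the case of the repeated prisoner's dilemma game. The prisoner's dilemma game is a two-player two-action symmetric game with the following payoffs:
$$\boldsymbol{s}_1 := (s_1(1,1), s_1(1,2), s_1(2,1), s_1(2,2))^{\mathsf{T}} = (R, S, T, P)^{\mathsf{T}}, \qquad \boldsymbol{s}_2 = (R, T, S, P)^{\mathsf{T}},$$
where $T > R > P > S$ and $2R > T + S$. (The actions 1 and 2 correspond to cooperation and defection, respectively.) If we consider the quantity $B = \sum_{k=0}^{2} \alpha_k s_k$ with $\alpha_1 = 0$ and $\alpha_2 = 1$, we obtain
$$\boldsymbol{B} = (R + \alpha_0, T + \alpha_0, S + \alpha_0, P + \alpha_0)^{\mathsf{T}}.$$
Then, if we choose $\alpha_0$ as $\alpha_0 \in [-R, -P]$, $B$ is nonnegative whenever player 1 takes action 1 and nonpositive whenever player 1 takes action 2, and we find that the actions 1 and 2 of player 1 satisfy the condition of Theorem 1 as $\underline{\sigma}_1 = 2$ and $\overline{\sigma}_1 = 1$. Therefore, we conclude that the repeated prisoner's dilemma game contains at least one ZD strategy, which is a well-known result. By using the construction method in the proof of Theorem 1 (case (ii), because $M_1 = 2$), $\boldsymbol{B}$ is decomposed and the ZD strategy is obtained from Eq. (27). By the linear relation (7), this strategy unilaterally enforces $\langle s_2 \rangle^{*} = -\alpha_0$.
Next, we prove that $\underline{\sigma}_1 = \sigma^{*(L)}$ satisfies Eq. (36). Assume to the contrary that there exists $\sigma_2$ such that
$$s_1(\sigma^{*(L)}, \sigma_2) - s_2(\sigma^{*(L)}, \sigma_2) > 0.$$
(Because $s_1 - s_2$ is an anti-symmetric part, $\sigma_2 \neq \sigma^{*(L)}$.) This is rewritten as
$$s_1(\sigma_2, \sigma^{*(L)}) - s_2(\sigma_2, \sigma^{*(L)}) < 0.$$
However, since $\sigma_2 \in \{\sigma^{*(1)}, \cdots, \sigma^{*(L-1)}\}$ and $\sigma^{*(L)} \in A^{(l)}$ for $1 \leq l \leq L$, this contradicts Eq. (34). Therefore, we conclude that Eq. (36) indeed holds. We now find that Eqs. (35) and (36) correspond to the condition for the existence of ZD strategies in Theorem 1. We remark that the linear relation enforced by the ZD strategy is $\langle s_1 \rangle^{*} = \langle s_2 \rangle^{*}$.
It should be noted that an unbeatable imitation strategy exists if and only if a stage game is not a gRPS game. 18) Theorem 2 also constructs an unbeatable ZD strategy, which unilaterally enforces $\langle s_1 \rangle^{*} = \langle s_2 \rangle^{*}$, for non-gRPS games. Both results imply that, in non-gRPS games, it is not easy for players to exploit the opponent. We also note that two-player symmetric potential games are a subset of non-gRPS games. 18) Therefore, our result directly leads to the existence of ZD strategies in two-player symmetric potential games. 20)
We finally remark that the converse of Theorem 2 is not true. That is, ZD strategies can exist for some gRPS games. An example is the game in Table I, which is a modified version of the RPS game. Although this game contains a gRPS cycle when $A' = \{R, P, S\}$, this game is also a ZD game, since $\overline{\sigma}$ and $\underline{\sigma}$ are regarded as the two actions in Theorem 1.
Table I. A gRPS game with a ZD strategy. Each entry lists the payoff of the row player followed by the payoff of the column player.
                        R       P       S       $\overline{\sigma}$   $\underline{\sigma}$
R                       0,0     -1,1    1,-1    0,0     0,0
P                       1,-1    0,0     -1,1    0,0     0,0
S                       -1,1    1,-1    0,0     0,0     0,0
$\overline{\sigma}$     0,0     0,0     0,0     0,0     0,0
$\underline{\sigma}$    0,0     0,0     0,0     0,0     0,0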
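The claim about Table I can be checked directly. The Python sketch below encodes the payoffs of Table I, takes B = s_1 - s_2 (an assumed choice of coefficients, α_0 = 0, α_1 = 1, α_2 = -1, matching the relation between the two players' payoffs discussed above), and lists the actions of the row player on which B is nonnegative or nonpositive for every action of the column player. The labels `X` and `Y` are placeholders for the two added actions, and the variable names are illustrative.

```python
acts = ["R", "P", "S", "X", "Y"]   # X, Y: placeholder labels for the added actions

# Row player's payoffs in the rock-paper-scissors block of Table I;
# every entry involving X or Y is zero.
rps = {("R", "R"): 0, ("R", "P"): -1, ("R", "S"): 1,
       ("P", "R"): 1, ("P", "P"): 0, ("P", "S"): -1,
       ("S", "R"): -1, ("S", "P"): 1, ("S", "S"): 0}
s1 = {(a, b): rps.get((a, b), 0) for a in acts for b in acts}
s2 = {(a, b): s1[(b, a)] for a in acts for b in acts}   # symmetric game

B = {p: s1[p] - s2[p] for p in s1}

# Actions of the row player with B >= 0 (resp. B <= 0) against every column action.
nonneg = [a for a in acts if all(B[(a, b)] >= 0 for b in acts)]
nonpos = [a for a in acts if all(B[(a, b)] <= 0 for b in acts)]
not_identically_zero = any(v != 0 for v in B.values())

# Cycle inside {R, P, S}: every action is beaten by another action of the subset.
sub = ["R", "P", "S"]
cycle = all(any(B[(b, a)] > 0 for b in sub) for a in sub)

print(nonneg, nonpos, not_identically_zero, cycle)
```

Because B vanishes identically on both added actions, either one can serve as the nonnegative action and the other as the nonpositive action, while the rock-paper-scissors block keeps B from being identically zero.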

Concluding Remarks
In this paper, we have provided the necessary and sufficient condition for the existence of ZD strategies in repeated games (Theorem 1). This condition exactly means the existence of two actions which unilaterally increase and decrease the total value of $\sum_{k=0}^{N} \alpha_k s_k$, respectively. We have now found that such a property is necessary for unilateral control of payoffs by ZD strategies. In fact, we can easily check that the rock-paper-scissors game does not contain the two actions required in Theorem 1, which leads to the absence of ZD strategies. 11) From another point of view, stage games satisfying the condition of Theorem 1 can be regarded as a class allowing the existence of ZD strategies. We also provided the relation between this class of stage games (ZD games) and non-gRPS games for the case of two-player symmetric games.
Further investigation on the relation between ZD games and other classes of stage games is needed.
We have investigated only the situation where the discounting factor is δ = 1 and monitoring is perfect. In general, the set of possible ZD strategies shrinks as δ decreases and as monitoring becomes imperfect. 8,9,11,26) Particularly, the existence of ZD strategies in games with imperfect monitoring will be highly dependent on the set of signals of each player. Investigation of the necessary and sufficient condition for the existence of ZD strategies in games with discounting and imperfect monitoring is an important subject of future work. It would also be interesting if our result can be applied to the existence of memory-m ZD strategies with m ≥ 2. 15)