Optimal Strategy Synthesis for Stochastic Boolean Games

In this study, we consider a classification of winning strategies and payoff functions over infinite plays. The central goal is to examine the values of the game and then to determine whether optimal (ε-optimal) strategies exist. Furthermore, we ask what kind of optimal (ε-optimal) strategies exist. We first review generalised reachability games, then introduce a new class of games, called Boolean games, and concentrate on games with a Boolean combination of reachability objectives. The primary contribution is the existence of an ε-optimal finite memory strategy for each player, for any ε > 0. We also prove that, for each player, there is a Boolean game in which that player has no ε-optimal memoryless strategy for some ε > 0.


Introduction
In this paper, we consider optimal strategies for stochastic Boolean games. We study the existence of optimal (ε-optimal) strategies of Players I and II for Boolean games of reachability objectives. The determinacy of turn-based stochastic games with reachability objectives was first shown in [13] and [12]. In such a game, we assign payoff 1 to every play that reaches the target states, and 0 otherwise. We then strengthened this result by determining the optimal and ε-optimal strategies for each player in concurrent games [5]. Martin's theorem implies that all games with Borel payoffs are determined, but our results offer a different way of proving determinacy for our games. We have also studied games with more complicated objectives, namely Büchi games, where determinacy is described by the values of a certain generalization of reachability games [3] (determinacy results can be found in [4] and [6]).
Note that the games considered here are zero-sum games. There are also many results on non-zero-sum concurrent games, where the objectives are not adversarial. There, the concern is the efficient computation of Nash equilibria for diverse types of players' objectives. For instance, we formulated a network game and examined mixed Nash equilibria and their generalization [7]. Concurrent stochastic games are more challenging to analyse than turn-based games. In concurrent games, optimal strategies may not exist, but for every real ε > 0 there always exists an ε-optimal strategy. There are still numerous challenging open problems in the field of concurrent games. Many fundamental results remain to be discovered for these games, notably regarding their payoff functions, as we stated at the outset.

Related Works and Motivation
Martin and Sergei [9] give a formal probabilistic analysis of Boolean games. Payoff functions given by random Boolean formulas satisfy certain properties of the class; for example, the probability that a winning strategy exists exhibits an asymptotic behaviour. In [17], Thomas and Paul formally introduced epistemic Boolean games, an extension of the Boolean games model in which the players' objectives are described by formulas of modal logic. They also examined the complexity of problems associated with Nash equilibria and presented interesting results for such games. In addition, an application of Boolean games has been studied through the design of taxation schemes. In [1], the authors establish necessary and sufficient conditions for the existence of a Nash equilibrium for any taxation query and develop an algorithm that uses taxation queries to learn agents' goals. Egor [8] concentrates on the complexity of algorithmic problems in Boolean games. The main contribution is that the problem of deciding whether a two-player game has an equilibrium satisfying a given payoff constraint is NEXP-complete. He provides several interesting results and techniques concerning the computational complexity of decision problems.
Then, Andreas et al. [2] considered a dynamic epistemic logic of visibility and control. Reasoning about the strategic abilities of coalitions of agents is applied to Boolean games, including the existence of a Nash equilibrium, and can be extended to Boolean games in a straightforward way. Sofie et al. [16] proposed the first technique for studying Boolean games that does not need an exponential translation to normal-form games. Their approach is based upon the disjunctive form of Boolean games and gives heuristic methods for smaller Boolean games. Julian et al. [11] extended standard Boolean games to iterated Boolean games, in which players repeatedly choose truth values for Boolean variables. Their iterated Boolean model captures strategies and winning objectives given by Linear Temporal Logic formulas. In [10], the authors show how Boolean games can be enriched by dependency graphs that explicitly represent the informational dependencies between variables. Consequently, they capture a more realistic type of concurrency than the simultaneous-action model implicit in standard Boolean games. Xueying et al. [18] investigated the regulation of n-person random evolutionary Boolean games (REBGs) by applying the semi-tensor product of matrices. Based on the resulting formulation, a necessary and sufficient condition is given for the solvability of the regulation problem for n-person REBGs by constructing a set of stochastic reachable states. Then, in [19], they studied the optimal control of n-person REBGs. The main result concerns dynamic programming: an algorithm is designed to solve the finite horizon optimal control problem, and it is validated using MATLAB.

The Ingredients of Games
This section provides the major aspects of this study, especially the strategies and values. We begin with the definition of the game.
Definition 1 (Ab Ghani [5]) We fix a quadruple $G = (S, A_I, A_{II}, \delta)$ as a two-player concurrent game, where $S$ is a nonempty finite set of states, $A_I$ and $A_{II}$ are nonempty finite sets of actions of Players I and II respectively, and $\delta : S \times A_I \times A_{II} \to S$ is a transition function.
For a game $G$, we call a sequence $s_0 s_1 s_2 \ldots$ of states in $S$ a path or a play if for every $n \in \mathbb{N}$ there are $a_n \in A_I$ and $b_n \in A_{II}$ such that $s_{n+1} \in \delta(s_n, a_n, b_n)$. We write $\Omega(G)$ ($\Omega_{fin}(G)$) for the set of infinite (finite) plays. A detailed description of stochastic games can be found in [5].
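As an informal illustration of Definition 1 and of the notion of a play, the following Python sketch encodes a small game and checks whether a finite sequence of states is a valid prefix of a play. It is only a sketch under stated assumptions: the names `Game`, `is_play_prefix` and `toy` are hypothetical, and $\delta$ is taken to return the set of possible successor states, in line with the condition $s_{n+1} \in \delta(s_n, a_n, b_n)$.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, FrozenSet, Sequence, Tuple


@dataclass
class Game:
    states: FrozenSet[str]
    actions_I: FrozenSet[str]
    actions_II: FrozenSet[str]
    # Assumption: delta maps (s, a, b) to the set of possible successors.
    delta: Dict[Tuple[str, str, str], FrozenSet[str]]


def is_play_prefix(game: Game, prefix: Sequence[str]) -> bool:
    """Check that each step s_n -> s_{n+1} is allowed by some action pair."""
    if any(s not in game.states for s in prefix):
        return False
    for s, t in zip(prefix, prefix[1:]):
        allowed = any(
            t in game.delta[(s, a, b)]
            for a, b in product(game.actions_I, game.actions_II)
        )
        if not allowed:
            return False
    return True


# Hypothetical toy game: s1 is absorbing; from s0 the joint choice of the
# two players decides whether the play stays in s0 or moves to s1.
toy = Game(
    states=frozenset({"s0", "s1"}),
    actions_I=frozenset({"a", "b"}),
    actions_II=frozenset({"c", "d"}),
    delta={
        ("s0", "a", "c"): frozenset({"s0"}),
        ("s0", "a", "d"): frozenset({"s1"}),
        ("s0", "b", "c"): frozenset({"s1"}),
        ("s0", "b", "d"): frozenset({"s0"}),
        ("s1", "a", "c"): frozenset({"s1"}),
        ("s1", "a", "d"): frozenset({"s1"}),
        ("s1", "b", "c"): frozenset({"s1"}),
        ("s1", "b", "d"): frozenset({"s1"}),
    },
)

print(is_play_prefix(toy, ["s0", "s0", "s1", "s1"]))  # True
print(is_play_prefix(toy, ["s1", "s0"]))  # False: s1 is absorbing
```

The check only looks at which successor states are possible, not at their probabilities, so it corresponds to membership in $\Omega_{fin}(G)$ rather than to the probability of a play.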

Strategies and measures
We shall now state the definitions of strategies, optimality and values of the game.
Definition 2 (Ab Ghani [5]) Let $G$ be a game. A randomized strategy of Player I in $G$ is a function $\sigma : \Omega_{fin}(G) \to P(A_I)$. We let $\Sigma_I$ be the set of all strategies of Player I. A randomized strategy for Player II is defined similarly, that is, as a function $\tau : \Omega_{fin}(G) \to P(A_{II})$. Note that $P(A_i)$ denotes the collection of all probability distributions on $A_i$.
Probability measure and expected values
A pair $(\sigma, \tau) \in \Sigma_I \times \Sigma_{II}$ and a state $s \in S$ determine a probability measure $P^{\sigma,\tau}_s$ on $\Omega_s = \{w \in \Omega : w(0) = s\}$ as follows.
Definition 3 Let $G$ be a game. For a pair $(\sigma, \tau) \in \Sigma^G_I \times \Sigma^G_{II}$ of strategies and a state $s \in S$, the probability measure $P^{\sigma,\tau}_s$ is determined by its values on the sets $\{w \in \Omega : p \subset w\}$ of infinite plays extending a finite play $p$, where $p_n$ denotes the play $p$ restricted to its first $n$ steps.

Values and optimal strategies
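Let $G$ be a game. Consider $F : \Omega(G) \to [0, 1]$ such that the expected value $P^{\sigma,\tau}_s(F)$ exists for all $\sigma \in \Sigma^G_I$ and $\tau \in \Sigma^G_{II}$. We call such a function a payoff function. Formally, Player I's value in the game satisfies $\mathrm{val}_s(F) = \sup_{\sigma \in \Sigma_I} \inf_{\tau \in \Sigma_{II}} P^{\sigma,\tau}_s(F)$ for all $s \in S$. For $\varepsilon \geq 0$, a strategy $\sigma$ of Player I is $\varepsilon$-optimal if $\inf_{\tau \in \Sigma_{II}} P^{\sigma,\tau}_s(F) \geq \mathrm{val}_s(F) - \varepsilon$ for all $s \in S$, and a strategy $\tau$ of Player II is $\varepsilon$-optimal if $\sup_{\sigma \in \Sigma_I} P^{\sigma,\tau}_s(F) \leq \mathrm{val}_s(F) + \varepsilon$ for all $s \in S$. Optimal strategies are 0-optimal strategies.
By the definitions, $\sigma \in \Sigma_I$ is optimal for Player I if and only if $\inf_{\tau \in \Sigma_{II}} P^{\sigma,\tau}_s(F) = \mathrm{val}_s(F)$ holds for every state $s \in S$, and $\tau \in \Sigma_{II}$ is optimal for Player II if and only if $\sup_{\sigma \in \Sigma_I} P^{\sigma,\tau}_s(F) = \mathrm{val}_s(F)$ holds for all $s \in S$.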

Reachability Games
Before discussing Boolean games, it is useful to recall certain notions of reachability games. We therefore briefly sketch the concept of a reachability game [5] and some related results that carry over to Boolean games.
Definition 5 Let $T \subset S$ be a set of target states. We define $R^{G,T} : \Omega(G) \to \{0, 1\}$ by $R^{G,T}(w) = 1$ if there is a natural number $m$ with $w(m) \in T$, and $R^{G,T}(w) = 0$ otherwise. A game $G(R^{G,T})$ is said to be a reachability game.
Definition 6 For every natural number $n \in \mathbb{N}$ and state $s \in S$, we define $V^{G,T}_n(s)$ inductively. For every $s \in S$, we define $V^{G,T}(s) = \lim_{n \to \infty} V^{G,T}_n(s)$ to be the limit value.
For any $n \in \mathbb{N}$, we define $R^{G,T}_n : \Omega(G) \to \{0, 1\}$ by $R^{G,T}_n(w) = 1$ if there is a natural number $m \leq n$ with $w(m) \in T$, and $R^{G,T}_n(w) = 0$ otherwise. Games of the form $G(R^{G,T}_n)$ are called $n$-step reachability games. In this setting, Player I is required to reach the target set $T$ within the first $n$ steps.
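The $n$-step payoff can be evaluated on a finite prefix of a play, since it depends only on the first $n + 1$ states. The following Python sketch illustrates this; the function name `reach_n` and the example prefix are hypothetical and serve only as an illustration.

```python
from typing import Collection, Sequence


def reach_n(play: Sequence[str], target: Collection[str], n: int) -> int:
    """n-step reachability payoff: 1 if w(m) is in target for some m <= n."""
    if len(play) <= n:
        raise ValueError("need at least n + 1 states of the play")
    return 1 if any(s in target for s in play[: n + 1]) else 0


prefix = ["s0", "s1", "s0", "t"]  # hypothetical first four states of a play
print([reach_n(prefix, {"t"}, n) for n in range(4)])  # [0, 0, 0, 1]
```

The payoff is non-decreasing in $n$ along a fixed play, and it becomes 1 for some $n$ exactly when the play visits the target set.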

Boolean Games
The results of this section show that, for each player, there is a game with a Boolean combination of reachability games in which that player has no $\varepsilon$-optimal memoryless strategy for some $\varepsilon > 0$. In contrast, we demonstrate that in every game with a Boolean combination of reachability games and for any $\varepsilon > 0$, there is an $\varepsilon$-optimal finite memory strategy for each player.
Fix a game $G$. Let $\mathrm{BOOL} \subset \mathrm{Pow}(\Omega)$ be the least class such that $R^{G,T} \in \mathrm{BOOL}$ for any $T \subset S$, and such that $\mathrm{BOOL}$ is closed under the set-theoretic operations of union, intersection and complementation.
Definition 7 Any game of the form $G(A)$, where $A \in \mathrm{BOOL}$, is called a Boolean game.
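To make the class of Boolean combinations concrete, here is a minimal Python sketch. It assumes that a winning set is represented by a predicate on the set of states visited by a play, which is possible because each reachability payoff depends only on which states occur. The names `reach`, `union`, `intersection`, `complement` and the example objective are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable

# Assumption: an objective is a predicate on the set of visited states.
Objective = Callable[[FrozenSet[str]], bool]


def reach(target: Iterable[str]) -> Objective:
    """Reachability objective: some state of the target set is visited."""
    t = frozenset(target)
    return lambda visited: not visited.isdisjoint(t)


def union(a: Objective, b: Objective) -> Objective:
    return lambda visited: a(visited) or b(visited)


def intersection(a: Objective, b: Objective) -> Objective:
    return lambda visited: a(visited) and b(visited)


def complement(a: Objective) -> Objective:
    return lambda visited: not a(visited)


# Hypothetical objective: reach T1 = {"t1"} while never visiting T2 = {"t2"}.
objective = intersection(reach({"t1"}), complement(reach({"t2"})))

print(objective(frozenset({"s0", "t1"})))  # True
print(objective(frozenset({"s0", "t1", "t2"})))  # False
```

Because the predicate only inspects the visited set, the outcome of a play is determined by the set of states it ever visits, which is the information a finite memory strategy keeps track of.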
A strategy must be defined so that it incorporates all relevant information about the course of the game. For Boolean games, we therefore need a specific notion of finite memory strategy. The definition can be viewed as a more substantial version of the definition of finite memory strategy for Büchi games [6].
Definition 8 Let $G(A)$ be a Boolean game. A strategy $\sigma$ of Player I is called finite memory if there is a function $f : \mathrm{Pow}(S) \to \Sigma^M_I$ such that for any play $p \in \Omega_{fin}$, $\sigma(p) = f(\mathrm{Set}(p))(p(|p| - 1))$, where $\mathrm{Set}(p) = \{s \in S : \exists k < |p|, p(k) = s\}$. A finite memory strategy of Player II is defined accordingly. We use $\Sigma_I$ and $\Sigma_{II}$ to denote the sets of all finite memory strategies of Player I and Player II, respectively.
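The shape of a finite memory strategy in Definition 8 can be sketched in Python as follows. The sketch assumes that a memoryless strategy is a function from the current state to a probability distribution over actions, and that the map from visited sets to memoryless strategies is called `f`; this name, the states and the actions are all hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Sequence

Distribution = Dict[str, float]  # action -> probability
Memoryless = Callable[[str], Distribution]  # current state -> distribution


def finite_memory_strategy(
    f: Callable[[FrozenSet[str]], Memoryless]
) -> Callable[[Sequence[str]], Distribution]:
    """Build sigma with sigma(p) = f(Set(p))(last state of p)."""

    def sigma(p: Sequence[str]) -> Distribution:
        visited = frozenset(p)  # Set(p): the states occurring in p
        return f(visited)(p[-1])

    return sigma


# Hypothetical example: once the state "key" has been visited, always play
# "open"; before that, randomize uniformly between "search" and "wait".
def f(visited: FrozenSet[str]) -> Memoryless:
    if "key" in visited:
        return lambda state: {"open": 1.0}
    return lambda state: {"search": 0.5, "wait": 0.5}


sigma = finite_memory_strategy(f)
print(sigma(["s0", "key", "s0"]))  # {'open': 1.0}
print(sigma(["s0"]))  # {'search': 0.5, 'wait': 0.5}
```

Two histories with the same set of visited states and the same last state receive the same distribution, so the strategy needs at most one memory value per subset of $S$.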

Determination of games and optimal strategies
In this part, we first fix the set of winning plays for Player I and then analyze the strategies for games with Boolean combinations of the $R^{G,T}$'s.
Proof. The proof proceeds by verifying that the equations are preserved under the operations of union, intersection and negation.
Theorem 3 There exists a set $B \in \mathrm{BOOL}$ such that for some $\varepsilon > 0$, there is no $\varepsilon$-optimal memoryless strategy of Player I in $G(B)$.
Theorem 4 For some $\varepsilon > 0$, there is no $\varepsilon$-optimal memoryless strategy of Player II in $G(B)$.
Proof. By considering the complement of the game given in the preceding proof, we can argue similarly for Player II, and consequently the theorem holds.
It is clear that the equality $\mathrm{val}^{G}_s(B) = \mathrm{val}^{G'}_{(s,\{s\})}(B')$ holds for any $s \in S$. Moreover, there is a bijective translation from the finite memory strategies of each player in the game $G$ to the memoryless strategies of the corresponding player in the game $G'$ that preserves the value of the strategy. Thus, to show the existence of a finite memory $\varepsilon$-optimal strategy of each player in the game $G(B)$ for any positive real $\varepsilon$, it suffices to show the existence of a memoryless $\varepsilon$-optimal strategy of each player in the game $G'(B')$ for any $\varepsilon \in \mathbb{R}^+$.
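The translation between finite memory strategies in one game and memoryless strategies in an expanded game can be illustrated by a product construction. The Python sketch below is an assumption-laden illustration, not the construction of the paper: it assumes that the expanded game has states $(s, U)$ with $U$ the set of states visited so far, that a play from $s$ starts in $(s, \{s\})$, and that $\delta$ returns a set of possible successors. The names `expand` and `delta` and the toy transitions are hypothetical.

```python
from itertools import product
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Node = Tuple[str, FrozenSet[str]]  # (current state, set of visited states)


def expand(
    delta: Dict[Tuple[str, str, str], FrozenSet[str]],
    actions_I: Iterable[str],
    actions_II: Iterable[str],
    start: str,
) -> Tuple[Set[Node], Dict[Tuple[Node, str, str], FrozenSet[Node]]]:
    """Build the part of the expanded game reachable from (start, {start})."""
    actions_I, actions_II = list(actions_I), list(actions_II)
    init: Node = (start, frozenset({start}))
    nodes: Set[Node] = {init}
    delta_prime: Dict[Tuple[Node, str, str], FrozenSet[Node]] = {}
    frontier = [init]
    while frontier:
        s, visited = frontier.pop()
        for a, b in product(actions_I, actions_II):
            # Moving to t records t in the visited set.
            succ = frozenset((t, visited | {t}) for t in delta[(s, a, b)])
            delta_prime[((s, visited), a, b)] = succ
            for node in succ - nodes:
                nodes.add(node)
                frontier.append(node)
    return nodes, delta_prime


# Hypothetical toy transitions: Player I chooses a or b, Player II only has c.
delta = {
    ("s0", "a", "c"): frozenset({"s1"}),
    ("s0", "b", "c"): frozenset({"s2"}),
    ("s1", "a", "c"): frozenset({"s2"}),
    ("s1", "b", "c"): frozenset({"s2"}),
    ("s2", "a", "c"): frozenset({"s1"}),
    ("s2", "b", "c"): frozenset({"s1"}),
}

nodes, delta_prime = expand(delta, ["a", "b"], ["c"], "s0")
print(len(nodes))  # 5
```

In this toy example the state s1 occurs with two different memories, namely {s0, s1} and {s0, s1, s2}. A memoryless strategy in the expanded game may act differently in these two nodes, which corresponds to a finite memory strategy in the original game. The visited set can only grow along a play, which is what makes an induction on its cardinality possible.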
We show this by backward induction on the cardinality of $T$ for a state $(s, T) \in S'$. Fix $(s, T) \in S'$. As an inductive hypothesis, we suppose that for any positive real $\varepsilon$, there exists a memoryless $\varepsilon$-optimal strategy of any player in the game $G'(B')$ restricted to $\{(t, U) \in S' : \#U > \#T\}$ as its set of states. We have two cases.
Case 1. [...] Given a positive real number $\varepsilon$, a memoryless $\varepsilon$-optimal strategy of a player in the game $G'(B')$ restricted to $\{(t, U) \in S' : \#U \geq \#T\}$ is obtained as the combination of a memoryless $\beta$-optimal strategy of the player in the generalized reachability game $G'(R^{G',(s,T)})$ and a memoryless $\beta'$-optimal strategy of the player in the Boolean game $G'(B')$ restricted to $\{(t, U) \in S' : \#U \geq \#T\}$, for some sufficiently small positive real numbers $\beta, \beta'$.
Case 2. For some $i \leq n$, $T$ intersects $T_{i,j}$ for all $j < n$, and $T \cap T_{i,n}$ is empty. Consider a play with the starting point $(s, T)$ in the game $G'(B')$. If the play stays in $\{(t', T) : t' \in T\}$, then Player II loses. If the play reaches some state of the form $(t, T \cup \{t\})$ with $t \in S \setminus T$, then from that state, Player II can win with the probability

Concluding Remarks
The main result of this study is the introduction of a new class of games together with an analysis of its winning strategies. In particular, we define a special type of strategy, namely the finite memory strategy. We proved that the class of Boolean games is closed under union, intersection and negation. We then showed that Player II has no ε-optimal memoryless strategy in a certain Boolean game for some ε > 0, but that both players have ε-optimal finite memory strategies in every Boolean game for any ε > 0.