Asymmetric games on networks: mapping to Ising models and bounded rationality

We investigate the dynamics of coordination and consensus in an agent population. Considering agents endowed with bounded rationality, we study asymmetric coordination games using a mapping to random field Ising models. In doing so, we investigate the relationship between group coordination and agent rationality. Analytical calculations and numerical simulations of the proposed model lead to novel insight into opinion dynamics. For instance, we find that bounded rationality and preference intensity can determine a series of possible scenarios with different levels of opinion polarization. To conclude, we deem our investigation opens a new avenue for studying game dynamics through methods of statistical physics.

I. INTRODUCTION
Group coordination is a collective phenomenon of particular interest in various contexts [1] and can be observed in human communities and other animal groups. For instance, flocks, schools of fish, and ant colonies often show hallmarks of group coordination. The latter typically emerges to address specific functions, such as improving the quality of flights and defending from attacks [2]. In human societies, coordination underlies several activities, such as sports, business development, start-up growth, and company organisation, to cite a few. Therefore, understanding group coordination and its mechanisms can clarify relevant aspects of our society and of nature. To this end, game theory allows mapping group coordination to a specific equilibrium in a competition among strategies. For instance, let us consider a group of friends conversing about music genres, sports teams, or political candidates. Here, mapping opinions to strategies and exchanges of ideas to game interactions, the convergence of the group to a common opinion corresponds to the success of a strategy. These friends, mapped to agents playing a game, may have personal preferences. So, in the presence of a majority opinion, we wonder whether agents biased towards a different opinion can suffer from social pressure. On the one hand, falsifying their preference [3], i.e. avoiding expressing an opposite opinion, may reduce conflicts and lead to group coordination. On the other hand, in some cases, declaring an honest view can be more profitable. In summary, mapping the described real-world scenario, i.e. the conversing friends, to an agent population allows us to exploit game theory. For instance, in game theory, agents are defined as rational when they act to maximise their payoff and can undergo a strategy revision phase [4,5], so the resulting evolutionary dynamics can lead the population towards strategy equilibria corresponding to opinion consensus or dissent. Notice that combining game theory with evolutionary mechanisms is at the core of the evolutionary game theory framework [6-12], which is fundamental for studying the strategy equilibria in various scenarios.
We consider a population whose agents interact by playing asymmetric games. Thus, conflicting preferences motivate the emergence of equilibria where group coordination is not reached. We aim to quantify these mechanisms and measure the polarisation under different conditions, such as various levels of rationality. Previous studies focused on similar aspects. To cite a few, [13-16] study asymmetric games theoretically and through simulations assuming best-response dynamics, while [17,18] perform experiments with human subjects on different social networks. [19,20] study a version of the random field Ising model at temperature T = 0, while [21] focuses on the mapping between a general 2-player game and the Ising model. In [22], the authors attempt to connect asymmetric games on networks to an Ising model, a direction encouraged in the review [23]. Our main contributions include identifying the conditions under which the studied asymmetric games are potential games [24] by mapping them to a random field Ising model and, through the latter, studying analytically the effects of bounded rationality in asymmetric games on networks. We remark that random fields have already been investigated within the context of social dynamics [25] to describe the dynamics of opinion consensus in open and closed (i.e. finite size) populations. Our results, supported by numerical simulations, show a rich spectrum of outcomes leading to various interpretations, such as opinion polarization. The latter becomes particularly interesting as it relates to the bounded rationality assumption considered in the proposed model.
The remainder of the manuscript is organised as follows. In Section II, we introduce asymmetric games on networks in more detail, give a condition for them to be potential games, and map their dynamics to an Ising model. Then, in Section III, we study the model on a complete network and on a k-regular network, with Section III C focusing on numerical simulations. Finally, we discuss the main findings in Section IV and conclude in Section V. Two additional appendices report, respectively, the calculations to map the game dynamics to the Ising model (Appendix A) and the study of asymmetric games with infinite rationality (Appendix B).

II. MODEL
Let us now introduce the asymmetric game we use on networks to study the dynamics of group coordination, namely the 2-player Battle of the Sexes (BoS) [26]. This game has two strategies, $x_i \in \{0, 1\}$, and one fixed identity $\theta_i \in \{0, 1\}$. The latter identifies the strategy preference of each player, e.g. $\theta_i = 1$ indicates that player $i$ prefers strategy 1 (and vice versa). So, the possible combinations of players' identities lead to payoff matrices of the following kind, shown here for $\theta_i = 1$ and $\theta_j = 0$:

$$\begin{array}{c|cc} & 1 & 0 \\ \hline 1 & (1, S) & (0, 0) \\ 0 & (0, 0) & (S, 1) \end{array} \qquad (1)$$

where the elements $(\cdot, \cdot)$ indicate respectively the reward of players $i$ and $j$ for the corresponding combination of strategies. $S \in [0, 1]$ denotes a parameter related to the preference strength: the higher $S$, the lower the difference in reward between coordination at the preferred and at the unpreferred strategy, and so the lower the preference intensity. In a network, each agent plays the 2-player BoS with each of its neighbours simultaneously, i.e. the agent's chosen strategy is the same for all its interactions [13,14]. We can write each player's utility function (total payoff) as

$$\pi_i(x_i; \theta_i) = \left[ I_{\{x_i = \theta_i\}} + S\, I_{\{x_i \neq \theta_i\}} \right] \sum_{j \in \partial i} I_{\{x_j = x_i\}},$$

where $\partial i$ is the set of neighbours of $i$ and $I_{\{\cdot\}}$ is the indicator function, equal to 1 if the condition $\cdot$ is verified and zero otherwise. We refer to this class of games as Broere's model [13] and study the evolutionary dynamics of the agent population, considering various initial conditions. At each time step, an agent $i$ is randomly selected and chooses its strategy as a function of the current configuration of its ego network (i.e. its neighbours) through the so-called Logit rule [27,28]. Specifically, the selected agent plays $x_i = 1$ with probability

$$P_i(1) = \frac{e^{R_i \pi_i(1;\theta_i)}}{e^{R_i \pi_i(1;\theta_i)} + e^{R_i \pi_i(0;\theta_i)}} = \frac{e^{R_i[\pi_i(1;\theta_i) - \pi_i(0;\theta_i)]}}{1 + e^{R_i[\pi_i(1;\theta_i) - \pi_i(0;\theta_i)]}}$$

and $x_i = 0$ with probability $P_i(0) = 1 - P_i(1)$. Notice that the probabilities depend only on the payoff difference between the two strategies. Also, we assume the agents have complete information about their ego networks. The term $R_i \in [0, +\infty)$ represents the rationality of the i-th agent and reflects the inclination to pursue its personal interest (i.e. maximizing its utility), although it admits alternative interpretations, discussed later. Accordingly, $R_i \to \infty$ entails the i-th agent playing the best response, whereas $R_i = 0$ entails an irrational attitude, as the strategy is randomly selected. In general, we consider $R_i = R$ for all $i = 1, ..., N$, and we rescale the rationality parameter $R$ by a factor equal to the average degree of the network, so that $R = r / \langle k \rangle$.
Following the above prescriptions, let us consider, for example, the i-th agent with degree $k_i$ (i.e. number of neighbours) and identity $\theta_i = 1$. Indicating with $w_i$ the number of the i-th agent's neighbours currently playing 1, we have $\pi_i(1; 1) = w_i$ and $\pi_i(0; 1) = S(k_i - w_i)$, so our agent chooses strategy 1 with probability

$$P_i(1) = \frac{e^{R[(1+S) w_i - S k_i]}}{1 + e^{R[(1+S) w_i - S k_i]}}$$

and, accordingly, strategy 0 with probability $1 - P_i(1)$.
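The choice probability above can be evaluated directly from the two payoffs. The following is a minimal sketch, not the authors' code: the function names `payoff_difference` and `prob_play_one` are illustrative, and the payoff for an agent with identity 0 is assumed to mirror the one stated for identity 1 (coordination on 0 pays 1, coordination on 1 pays S).

```python
import math


def payoff_difference(w: int, k: int, theta: int, S: float) -> float:
    """Return pi_i(1; theta) - pi_i(0; theta) for an agent with k neighbours,
    w of which currently play strategy 1."""
    if theta == 1:
        # Prefers 1: each neighbour playing 1 pays 1, each playing 0 pays S.
        return w - S * (k - w)
    # Prefers 0 (assumed mirror case): neighbours on 1 pay S, neighbours on 0 pay 1.
    return S * w - (k - w)


def prob_play_one(w: int, k: int, theta: int, S: float, R: float) -> float:
    """Logit probability of choosing strategy 1 with rationality R."""
    z = R * payoff_difference(w, k, theta, S)
    # Logistic function written via tanh, which does not overflow for large |z|.
    return 0.5 * (1.0 + math.tanh(0.5 * z))


if __name__ == "__main__":
    # Agent preferring 1, with 7 of 10 neighbours playing 1, S = 0.5, r = 2 (R = r / k).
    print(prob_play_one(w=7, k=10, theta=1, S=0.5, R=2 / 10))
```

With R = 0 the function returns 0.5 regardless of the neighbourhood, and for large R it approaches the best response, in line with the two limits described in the text.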

A. Mapping to Ising models
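In Appendix A, we show that a game on a network $G = (V, E)$ with payoff matrices of the type

$$\begin{pmatrix} a_i^{(11)} & a_i^{(10)} \\ a_i^{(01)} & a_i^{(00)} \end{pmatrix},$$

where $a_i^{(x_i x_j)}$ is the reward of agent $i$ playing $x_i$ against a neighbour playing $x_j$, has a potential if and only if

$$\left(a_i^{(11)} + a_i^{(00)}\right) - \left(a_i^{(10)} + a_i^{(01)}\right) = C \quad \forall i \in V, \qquad (5)$$

where $C$ is a constant. Thus, assuming a homogeneous rationality $R$ and the Logit rule underlying the system evolution, the game is equivalent to an Ising model evolving by Glauber dynamics with Hamiltonian $H = -\sum_{i \in V} h_i \sigma_i - J \sum_{ij \in E} \sigma_i \sigma_j$, where the strategies $x \in \{0, 1\}^N$ are mapped to the spin random variables $\sigma \in \{-1, +1\}^N$. The mapping between the two models is realised via correspondences, derived in Appendix A, between the game parameters and the Ising parameters, in which the field $h_i$ depends on the degree $k_i$ of the i-th node in the network. Now, applying these results to the payoff matrices (1) associated with Broere's model, we see that the payoff matrices can be of two types, depending on the individual's preference. Identifying the agents with a personal preference for strategy 1 (resp. 0) as belonging to class A (resp. B), we have $\vec{a}_A = (1, 0, 0, S)$ and $\vec{a}_B = (S, 0, 0, 1)$ in the ordering $(a^{(11)}, a^{(10)}, a^{(01)}, a^{(00)})$. Since the left-hand side of (5) equals $1 + S$ for both classes, the condition (5) is satisfied and the game has a well-defined potential. Moreover, the corresponding Ising model has a homogeneous coupling $J$ and local fields $h_i$ that are proportional to the degree $k_i$ and take opposite signs for the two classes.

III. RESULTS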

A. Complete networks
Connectivity plays a role of paramount relevance in a number of phenomena, including the dynamics of evolutionary games [29,30]. Understanding the dynamics of a model whose entities are fully connected can be highly beneficial for assessing the effect of more complex interaction topologies. Therefore, before analysing the outcomes of the proposed model in k-regular networks, we observe those achieved on a fully-connected structure with a large number of nodes $N$. In this setting, each agent has degree $k = N - 1 \simeq N$. Also, a fraction $\alpha$ of agents prefer the +1 strategy (group A, $N_A = \alpha N$ in number), while the others prefer the strategy 0 (group B, $N_B = N - N_A$). Once the mapping (10) is applied, we are left with a bi-populated mean-field Ising model [31-33], for which we derive the free energy in the large $N$ limit (12), calculate the stationary points through the mean-field equations (13) and predict the equilibrium states.
The mapping for the complete network yields a generalization of the Random Field Ising Model (RFIM) [34,35] with different numbers of sites with positive and negative fields. We consider the average magnetizations of groups A and B, $m_{A/B} = \frac{1}{N_{A/B}} \sum_{i \in V_{A/B}} \sigma_i$ (notice that $m_{A/B} = 2\rho_{A/B} - 1$, where $\rho_{A/B}$ is the fraction of individuals of group A/B playing +1), and the corresponding effective Hamiltonian. The free energy (12) (see [33] for the derivation) is a functional of $m_A$ and $m_B$ that involves the fields $\tilde{h}_{A/B}$ of the two groups and the binary entropy function. The values of the magnetizations $m_A$, $m_B$ at equilibrium, indicated with $m^*_A$, $m^*_B$, correspond to the ones at the global minimum of the free energy functional. The stationary points of the latter functional are the solution(s) of the set of mean-field equations (13), obtained by setting to zero the derivatives of (12) with respect to $m_A$ and $m_B$ and then substituting the Ising parameters with the ones of the game. Figure 1 reports the free energy and its stationary points for four values of the rationality $r$, for $\alpha = 0.5$. In this particular case, exploiting the symmetry, we can take $m^*_A + m^*_B$ as an order parameter and see that it undergoes a transition: it is unique and zero up to a certain value of the rationality (inverse temperature), and takes a positive and a negative value (two stable fixed points) beyond that point. Moreover, for very low rationality ($r = 0.1$), the only fixed point is localized around $m^*_A \simeq m^*_B \simeq 0$, meaning that the agents of both classes play 1 or 0 with approximately the same probability. For low rationality ($r = 2$ in the figure), the fixed point is still unique but localized in the fourth quadrant ($m^*_A > 0$, $m^*_B < 0$), and a distance in average strategies between the classes emerges (polarization). For higher rationality ($r = 4$ in the figure), two fixed points corresponding to local minima appear, as well as one saddle point: a phase transition of ferromagnetic type has occurred, and the system ends up in one or the other minimum with equal probability by spontaneous symmetry breaking. In the latter states, one class has managed to induce the majority of the other class to play its own preferred strategy.
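The qualitative picture described above (a single fixed point at low r, two symmetric ones at higher r for α = 0.5) can be reproduced with a short fixed-point iteration. The sketch below is an illustration under stated assumptions, not the paper's eq. (13): the self-consistency equations are written directly from the stationarity of the Logit rule with R = r/k, x = (1 + σ)/2 and every agent seeing the population-average magnetization m = α m_A + (1 − α) m_B, which gives m_A = tanh{(r/4)[(1+S)m + (1−S)]} and m_B = tanh{(r/4)[(1+S)m − (1−S)]}. The function name and its arguments are hypothetical.

```python
import math


def mean_field_fixed_point(r, S, alpha, m_a0, m_b0, tol=1e-12, max_iter=100_000):
    """Iterate the (assumed) mean-field self-consistency equations from the
    initial magnetizations (m_a0, m_b0) and return the point reached."""
    m_a, m_b = m_a0, m_b0
    for _ in range(max_iter):
        m = alpha * m_a + (1.0 - alpha) * m_b  # population-average magnetization
        new_a = math.tanh(0.25 * r * ((1.0 + S) * m + (1.0 - S)))
        new_b = math.tanh(0.25 * r * ((1.0 + S) * m - (1.0 - S)))
        if abs(new_a - m_a) + abs(new_b - m_b) < tol:
            return new_a, new_b
        m_a, m_b = new_a, new_b
    return m_a, m_b


if __name__ == "__main__":
    # Same game parameters as quoted for Figure 1: alpha = 0.5, S = 0.6.
    for r in (0.1, 2.0, 4.0):
        up = mean_field_fixed_point(r, 0.6, 0.5, 0.9, 0.9)
        down = mean_field_fixed_point(r, 0.6, 0.5, -0.9, -0.9)
        print(r, up, down)
```

Plain iteration only converges to attracting solutions, so saddle points of the free energy are not found this way; locating them would require a root finder started near the symmetric point.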
In [36], it is found that, at least for $h_A = h_B = 0$, the equilibria of the Glauber (Logit) dynamics and their associated stability correspond to the stationary points of the free energy functional, i.e. the solutions of the mean-field system (13). We use the stationary states of the mean-field free energy to predict the magnetizations at equilibrium (global minimum) and, as a numerically tested approximation (see Figure 2), to identify the relaxation points of the game dynamics (local minima), also for $\alpha \neq 0.5$.
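A direct simulation of the Logit dynamics on a complete graph gives a quick way to check where the dynamics relaxes. The sketch below is a minimal illustration, not the simulation code used for the figures: the function name, the number of sweeps, the seed handling and the way initial conditions are drawn are all assumptions. On a complete graph only the total number of agents playing 1 is needed, so no explicit network is stored.

```python
import math
import random


def simulate_complete_graph(N, alpha, S, r, rho_a0, rho_b0, sweeps=20, seed=0):
    """Asynchronous Logit dynamics on a complete graph of N agents.

    A fraction alpha of agents prefers strategy 1 (group A), the rest prefers 0
    (group B). Returns the final fractions (rho_A, rho_B) of agents playing 1.
    """
    rng = random.Random(seed)
    n_a = round(alpha * N)
    theta = [1] * n_a + [0] * (N - n_a)
    # Each agent starts on 1 with probability rho_a0 (group A) or rho_b0 (group B).
    x = [1 if rng.random() < (rho_a0 if t == 1 else rho_b0) else 0 for t in theta]
    ones = sum(x)
    k = N - 1          # degree of every node
    R = r / k          # rationality rescaled by the degree
    for _ in range(sweeps * N):
        i = rng.randrange(N)
        w = ones - x[i]  # neighbours of i currently playing 1
        if theta[i] == 1:
            diff = w - S * (k - w)
        else:
            diff = S * w - (k - w)
        p_one = 0.5 * (1.0 + math.tanh(0.5 * R * diff))  # Logit rule
        new = 1 if rng.random() < p_one else 0
        ones += new - x[i]
        x[i] = new
    return sum(x[:n_a]) / n_a, sum(x[n_a:]) / (N - n_a)


if __name__ == "__main__":
    # Start from the fully polarized state with alpha = 0.4 and high rationality.
    print(simulate_complete_graph(500, 0.4, 0.2, 50.0, 1.0, 0.0))
    print(simulate_complete_graph(500, 0.4, 0.8, 50.0, 1.0, 0.0))
```

Running several seeds and initial conditions, and comparing the final densities with the mean-field stationary points, mirrors the kind of comparison described for Figure 2, here on a complete graph rather than a k-regular one.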
Last, we mention that the best-response regime is recovered for $r \to \infty$: in this regime (see [19] and, using a dynamical approach, Appendix B), the fully polarized state ($m_A = 1$, $m_B = -1$) is a stable fixed point of the dynamics only for $S < \frac{\alpha}{1-\alpha}$ (assuming without loss of generality that $\alpha \le 0.5$, see (B6)); otherwise, the only possible equilibria are the two full consensus states $(1, 1)$ and $(-1, -1)$.
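The threshold quoted above can be recovered with a short best-response check. The sketch below assumes the mean-field picture in which, in the fully polarized state, every agent of degree $k$ sees exactly a fraction $\alpha$ of its neighbours playing 1; it uses the payoffs $\pi_i$ defined in Section II, and the intermediate steps are a reconstruction rather than the calculation of Appendix B.

```latex
% Fully polarized state: all A agents play 1, all B agents play 0,
% so w_i = \alpha k for every agent (mean-field assumption).
\begin{align}
\text{A keeps playing 1:}\quad & \pi_i(1;1) > \pi_i(0;1)
  \iff \alpha k > S(1-\alpha)k
  \iff S < \frac{\alpha}{1-\alpha}, \\
\text{B keeps playing 0:}\quad & \pi_i(0;0) > \pi_i(1;0)
  \iff (1-\alpha)k > S\alpha k
  \iff S < \frac{1-\alpha}{\alpha}.
\end{align}
% For \alpha \le 0.5 the second bound is at least 1, so for S in [0,1)
% only the first bound can be violated; e.g. \alpha = 0.4 gives
% \alpha/(1-\alpha) = 2/3 \approx 0.67.
```

Both conditions are independent of $k$ because the payoffs scale linearly with the degree, which is why the threshold depends only on the composition $\alpha$ within this approximation.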

B. Random k-regular networks
Now, let us consider k-regular random networks [37], i.e. networks whose nodes all have the same finite degree $k$, with connections drawn uniformly at random. In this setting, by parametrizing $R = r/k$, the mean-field equation system coincides with (13). That is due to the linear dependence of the field $h_i$ on the degree $k_i$ (eq. (9)) and the homogeneity of degrees. Nevertheless, the mean-field approach is exact in a complete network for $N \to \infty$, while the same does not apply to regular networks. We expect the approximation to work better in denser networks, i.e., networks having a higher $k$. In Figure 2, we show the mean-field predictions for the relaxation state (local minima of the mean-field free energy) and compare them with the simulation outcomes of multiple games: we set $\alpha = 0.4$, thus $S^* \approx 0.67$ (eq. (B6)), and vary the rationality through $r$ for $S = 0.8 > S^*$ (upper plots) and for $S = 0.2 < S^*$ (lower plots). The effects of finite degrees are discussed, at least in the regime of infinite rationality, in Appendix B.
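The linear dependence of the field on the degree, and the reason why the rescaling $R = r/k$ removes $k$ from the mean-field equations, can be checked by matching the Logit rule to the Glauber (heat-bath) rule. The following is a sketch under explicit assumptions: the inverse temperature is identified with the rationality ($\beta = R$) and strategies are mapped as $x_j = (1+\sigma_j)/2$; the paper's own correspondences may differ by constant factors from the ones obtained here.

```latex
% Logit rule:    P(x_i = 1)       = 1 / (1 + e^{-R \Delta_i}),
%                \Delta_i = \pi_i(1;\theta_i) - \pi_i(0;\theta_i).
% Glauber rule:  P(\sigma_i = +1) = 1 / (1 + e^{-2\beta (h_i + J \sum_{j \in \partial i} \sigma_j)}).
% With w_i = \tfrac{1}{2}\big(k_i + \sum_{j \in \partial i} \sigma_j\big):
\begin{align}
\Delta_i^{A} &= (1+S)\,w_i - S\,k_i
  = \frac{1+S}{2}\sum_{j \in \partial i}\sigma_j + \frac{1-S}{2}\,k_i, \\
\Delta_i^{B} &= (1+S)\,w_i - k_i
  = \frac{1+S}{2}\sum_{j \in \partial i}\sigma_j - \frac{1-S}{2}\,k_i .
\end{align}
% Matching R \Delta_i = 2\beta (h_i + J \sum_j \sigma_j) with \beta = R (assumption):
\begin{equation}
J = \frac{1+S}{4}, \qquad
h_i = \pm\frac{1-S}{4}\,k_i \quad (+\ \text{for class A},\ -\ \text{for class B}).
\end{equation}
% On a k-regular network with R = r/k:
%   \beta J k = r(1+S)/4  and  \beta |h_i| = r(1-S)/4,
% both independent of k.
```

Under these assumptions the coupling is the same for every pair, the field has opposite signs for the two classes, and the degree enters only through the product with $R$, consistent with the statement that the mean-field system is the same as on the complete network.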

C. Polarization and Rationality
The mapping between the BoS game (on networks) and the Ising model allowed us to gain a preliminary overview of the effects of bounded rationality in these dynamics by an analytical approach. Now, we study the relationship between group polarisation and bounded rationality by numerical simulations. For clarity, indicating with $\rho_{A/B}(t)$ the evolution of the density of agents with +1 opinion in the two groups, the polarization is defined as the distance between the average opinions of the two groups, i.e. the distance between the averages of the densities of agents playing 1 in the two groups at the state reached after relaxation. We see (Fig. 3) that the behaviour of the polarization as a function of rationality is highly non-trivial. For all the values of S, the polarization grows monotonically at low r. Then, polarization can either keep increasing (high S) until it becomes maximum at infinite rationality (full polarization), or it can decrease after reaching a peak (middle and low S). For middle S, at a certain point it suddenly jumps to almost full polarization while, for low S, it stays very close to zero (almost consensus) even for infinite rationality. This behaviour roughly follows the mean-field predictions, but with some deviations that, in our interpretation, are due to the finiteness of the degrees and to the fluctuations induced by limited rationality. The former may generate a cascade effect leading the system to the state corresponding to consensus at the majority's preferred opinion (see Appendix B for an extended analysis), while limited rationality increases the fluctuations and thus the possibility of falling into the basin of attraction of the consensus points. Both effects favour the approach to (almost-)consensus states even for preference intensities stronger (i.e. S lower) than the mean-field approximation predicts (see Fig. 4d).

D. Asymmetric games of Hernandez et al.
Both the models of Broere et al. [13] and Hernandez et al. [14] consider two classes of agents, A and B, with conflicting preferences of equal intensity. The difference is that Hernandez et al. consider a further reward for the agent's choice that is independent of the neighbours' choices, i.e. the single term $\chi^{(1)}$ in (A1), which is greater if the chosen strategy corresponds to the preferred one. Thus, for the model of Hernandez et al., $\chi^{(1)}$ is nonzero and depends on the agent's preference. By following the mapping towards the Ising model reported in the appendix, one notes that $\chi^{(1)}$ just adds a term to the magnetic field (eq. (A27)), which is subleading in $k$.

IV. DISCUSSION
In this work, we studied asymmetric games with bounded rationality, mapping their dynamics to that of an Ising model. We consider agents endowed with a fixed preference in a binary opinion system, leading to the formation of two communities (or groups). Namely, each group is biased towards one of the two opinions, say 0 and 1, respectively. Also, while preferences are fixed, agents can change opinions. The agents' rationality, corresponding to the system temperature [38], affects the population dynamics [39]. Then, mapping the agent population to the Ising model, the convergence of the agents towards stable opinions resembles a phase transition. For instance, low rationality entails the existence of a single equilibrium. Conversely, higher values of rationality may lead the system towards configurations with two or more stable states. In this context, an equilibrium corresponds to a state in which the two groups have an average density that is stable over time. While the mathematical formulation of the model is independent of the agent population structure, we performed the analysis considering a complete and a k-regular network. The analytical calculations rely on the mean-field approximation that, as is known, is not exact for k-regular networks. Therefore, for the second structure (i.e. the k-regular network), we investigated the effects of finite degrees using numerical simulations. Our analyses include two conditions, i.e. bounded and infinite rationality. It turns out that bounded rationality and preference intensity determine a series of possible scenarios characterized by different levels of polarization (i.e. the distance between the groups in terms of average opinion). Within the mean-field theory, we find the following cases:
• low rationality: there exists a unique, weakly polarized stable state (whose polarization depends on the preference intensity);
• high rationality and low preference intensity: there exist two fixed points corresponding to almost-consensus states at one and the other opinion (so with very low polarization);
• high rationality and high preference intensity: there exists instead a single fixed point corresponding to a strongly polarized state;
• very high rationality and, in the limit of infinite rationality, high preference intensities: two consensus fixed points appear and coexist with the polarized one.
In the presence of multiple fixed points, the one reached depends on the initial conditions and on the single realization. The finiteness of the network and of the network's degrees, breaking the mean-field assumptions, in general favours (almost-)consensus states, especially for low rationality.
Interpreting the rationality as the agents' attention to minimizing their personal and social dissonances (as in [40,41]), increasing the latter from a weakly polarized state may lead the system to an almost-complete consensus state or to a very polarized one, depending on the preference intensity. Accordingly, a policy-maker aiming to curb conflicts in a community should be aware of groups of individuals with fixed conflicting preferences towards specific issues (see also [42] on this topic). For instance, raising attention to one of these issues can divide public opinion and severely increase polarization. On the other hand, if the preference intensities are sufficiently low, raising attention to an issue makes public opinion reach a low-conflict, almost-consensus state, although some individuals have falsified their preferences. Beyond that, our work sheds light on the dynamics of asymmetric games, typically studied by numerical simulations and under infinite rationality [19]. Let us remark that the mapping to the Ising model is performed by assuming opposite preferences of equal intensity (Broere's model), the Logit rule as the dynamical rule for the opinion update, and homogeneous rationality in the population.

V. CONCLUSION
In summary, our work clarifies some aspects of the dynamics of asymmetric games through random field models. Interestingly, the achieved results find interpretations within the context of opinion dynamics and the formation of coordination. While previous investigations, oriented towards social systems, attempted to describe similar scenarios, in this investigation we focus on asymmetric games played by agents endowed with bounded rationality and map them to random field Ising models. As reported above, the results show the emergence of opinion polarization and consensus, depending on the agents' rationality. In light of that, we deem our study sheds light on relevant aspects of asymmetric games, and the proposed formalisation in terms of random fields can support further works in this direction. Nevertheless, several aspects still deserve attention. To cite a few, future investigations may study the role of heterogeneous networks (e.g. small-world structures and scale-free networks [43], multi-layer networks [44], and networks with higher-order interactions [45]), the effect of homophily reflecting the preferences' assignment, as in a previous work [46], and that of larger opinion systems. Finally, further developments can relate to the framework of evolutionary game theory. For instance, treating opinions as strategies, the preferences could describe the agents' attitude towards cooperation or defection in dilemma games. To conclude, we remark that our results rely on the mapping to the Ising model. Thus, further developments in applying statistical mechanics to opinion dynamics and (evolutionary) game theory [47] can exploit the formalism we proposed in this work.

Appendix A: Asymmetric network games and Ising models
A general coordination or anti-coordination game on a network $G = (V, E)$ with interactions of arbitrary order, $i = 1, ..., N$ agents, two strategies $x_i \in \{0, 1\}$ and a response (dynamical rule) depending only on the agent's possible payoffs, can be expressed through the payoff function of each agent (A1). In it, $\chi^{(1)}_i(x_i)$ is the reward for agent $i$ choosing strategy $x_i$, $\chi^{(2C)}_i(x_i)$ the reward for each successful coordination at $x_i$, $\chi^{(2AC)}_i(x_i)$ the reward for each pairwise anti-coordination, $\chi^{(3C)}_i(x_i)$ the reward for each coordination in a hyperedge with two neighbours of $i$, and so on. We consider here only single and pairwise interactions, so the payoff reduces to the corresponding two kinds of terms. We can then use the matrix representation of the payoff for each pairwise interaction of agent $i$, setting for the moment $r^{(1)}_i = r^{(0)}_i = 0$. Using the vector representation $\vec{a} = (a^{(11)}, a^{(10)}, a^{(01)}, a^{(00)})$, we decompose this vector into four orthogonal components, so that a generic payoff matrix can be written in the so-called normal form. From the latter, we interpret $\alpha_i$ as the reward for choosing the preferred action, $\gamma_i$ as the reward for achieving coordination, $\beta_i$ as the external payoff depending only on the other agent's action, and $\eta_i$ as a free independent reward, all for agent $i$. The coefficients are obtained by inverting the linear system that relates them to $\vec{a}$.

It is easy to verify, by solving the fixed-point equations and performing a simple linear stability analysis, that each of the zones potentially has a stable fixed point: in the $(\rho_A, \rho_B)$ plane these are, respectively, $(0, 0)$ for the first, $(\alpha, 0)$ for the second and $(\alpha, 1 - \alpha)$ for the third zone. While both consensus states are always located within the corresponding zones and thus always exist, the same cannot be said for the fully polarized one: $(\alpha, 0)$ is a (stable) fixed point if and only if it falls in the zone defined by the second condition on $\rho$. If, without loss of generality, we take $\alpha < 0.5$, the latter condition corresponds to a threshold on the preference intensity, $S < S^* = \frac{\alpha}{1-\alpha}$ (B6). For values of $S$ above $S^*$, starting from everyone choosing their preferred action, the system will move to consensus towards one of the two opinions (typically the one of the majority), while for $S < S^*$ the preference intensity is sufficient to make the system remain in the fully polarized state. The mean-field dynamics is represented by the vector field and is tested on a large-degree network ($k = 300$) of $N = 1000$ agents for multiple initial conditions, in Figure 4a for $S = 0.5$ and in Figure 4b for $S = 0.8$. When testing the mean-field predictions on a sparser graph, we see that their accuracy, and specifically that of the mean-field threshold (B6), decreases considerably even for random graphs with large average degrees, as shown in Figure 4c for $k = 30$. In Figure 4d we report the empirical threshold as a function of the average degree of a random regular graph of $N = 1000$ agents, for various compositions $\alpha$ corresponding to the different colours. The empirical threshold is defined as the largest value of $S$ for which, starting from the fully polarized state of the system, the majority of an ensemble of the system's trajectories do not approach full consensus. The correction to the mean-field prediction for the threshold originates in the mean-field assumption that each agent's ego network is a perfect sample of the whole network, both topologically (the neighbours' classes) and dynamically (the distribution of current opinions): in a random graph, some nodes have an ego network that deviates from the average one, and the deviation grows as the average degree decreases. Thus, having e.g. a higher number of connections with one class than the average, such agents would adopt that class's preference more easily and may induce other agents of the same population to follow them, generating a cascade effect that provokes substantial fluctuations, i.e. deviations from the mean-field predicted behaviour, possibly making the system fall into the basin of attraction of the consensus points. By analysing the numerical simulations, we speculate that the effect is amplified by the noise induced by bounded rationality, which generally increases the fluctuations and thus the possibility of falling into the basin of attraction of the equilibrium states corresponding to (almost-)consensus.

FIG. 1. Mean-field free energy for different rationalities r. The parameters of the game are set to α = 0.5, S = 0.6. Local minima of the free energy are shown in black, saddle points half-white and half-black. In the last panel, as r ≫ 1, the minima lie almost at the corners (1, 1) and (−1, −1), corresponding to full consensus.
FIG. 2. Mean-field predictions and simulations. The network is k-regular with N = 5000 agents and k = 30. The fraction of agents with +1 preference (group A) is α = 0.4. The figures show both the mean-field predictions for the stationary density of +1 spins with preference +1 (ρ*A) and of +1 spins with preference 0 (ρ*B), and the behaviour of ρA(t), ρB(t) as a function of time for 10 simulations of the game with different initial conditions (ρA(0), ρB(0)), differentiated according to the pairs of colours: light green/light blue correspond to (0.9, 0.1), green/red to (0.1, 0.1), dark green/dark red to (0.9, 0.9). The dashed red/green horizontal lines are the mean-field predictions, i.e., the solutions of the mean-field system corresponding to the local minima of the mean-field free energy. In the upper row, S = 0.8, while in the lower one S = 0.2. The rationalities r are specified in the titles of the plots.

FIG. 3. Polarization at the stationary state as a function of the rationality. We consider simulated games on a regular graph of N = 1000 agents with degree k = 30 and α = 0.4, for various intensities of the preference S as in the legend. The initial condition corresponds to everyone choosing their preferred strategy. The dots correspond to the polarization value at the state after relaxation and are averaged over 10 independent realizations of the model.

FIG. 4. Best-response: mean-field predictions and low-degree effects. (a), (b) and (c) show the vector field related to the mean-field predictions of the dynamics (eq. (B2)) and the corresponding stable fixed points (black dots at the corners). Moreover, the trajectories of the system evolution from various initial conditions, corresponding to different colours, are reported. The initial states are indicated with the coloured dots. In all the simulations and predictions, N = 1000 and α = 0.4. In (a) and (c), as S < S*, the polarized state is a stable fixed point. Nevertheless, the polarized state is actually reached only in (a) (k = 300), while for a lower degree, in (c) (k = 30), it is never reached, as the mean-field assumptions break down. Panel (d) reports the empirical thresholds (triangles) as a function of the degree of the regular graph, for various values of α as in the legend and N = 1000. For each α, the mean-field prediction S*α (B6) is also reported (thin horizontal lines).