Evolutionary information dynamics over social networks: a review

Purpose – The purpose of this paper is to review the analysis of information diffusion based on evolutionary game theory. People are now used to interacting over social networks, and one of the most important functions of social networks is information sharing. Understanding the mechanisms of information diffusion over social networks is critical to various applications, including online advertisement and rumor control. Design/methodology/approach – It has been shown that graphical evolutionary game theory (EGT) is a very efficient method to study this problem. Findings – By applying EGT to information diffusion, the authors could predict every small change in the process, get the detailed dynamics and finally foretell the stable states. Originality/value – In this paper, the authors provide a general review of the evolutionary game-theoretic framework for information diffusion over social networks by summarizing the results and conclusions of works using graphical EGT.


Introduction
Nowadays, people cannot live in isolation. They depend, to a greater or lesser extent, on their social networks and on interaction with others. A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors. Typical examples include Facebook/Twitter networks, hyperlink networks of websites, scientific collaboration/citation networks and the Internet of Things (IoT) (Wikipedia). With the rapid development of the internet and mobile technologies, social networks have reached an extremely large scale, e.g. there were over 2.41 billion monthly active users and 1.59 billion daily active users worldwide on Facebook as of June 2019 (Noyes, 2019). Meanwhile, the volume of information on social networks is becoming even more tremendous. For instance, on average 277,777 stories are posted on Instagram, 4,500,000 videos are watched on YouTube, 511,200 tweets are sent on Twitter and 510,000 comments are posted on Facebook every minute (Noyes, 2019; DOMO, 2019; Martin, 2019).
The information disseminated over social networks is of various kinds, e.g. a decisive victory in a sports match, political opinions declared by a party or politician, advertisements for deals, or titbits and rumors about a superstar. All such information goes through the process of generation, dissemination and disappearance, and the most important part is the diffusion. A piece of information may disappear quickly after it appears, or last for a long time and inspire a heated discussion because of its value. Figuring out how information spreads over social networks, simulating the diffusion process or even predicting the final fate of a new piece of information is vital in many applications.
From users' perspective, the diffusion dynamics or the popularity of the information is determined by complicated interactions and decision-making of other users. Based on different people's preferences, the surrounding neighbors' actions, the reliability of the information and many other factors, users would choose to spread information accordingly. For example, when a consumer comes across an advertisement of a new product, he or she would decide whether to trust or further diffuse it according to the comments of friends, the reputation of information publisher and manufacturer, and possibly his or her initial impression. Of course if the consumer is a fan of the product, he or she tends to share this message over social networks, which tightens the connection between the consumer and the information sender. However, if the consumer isn't interested in it at all, he or she may regard the information spreader as worthless, and thus choose to cut the connection. In practice, the mechanisms in the process of information diffusion should be explored in depth.
The study of information diffusion originates from research on computer virus/epidemic spreading over networks (Pastor-Satorras and Vespignani, 2001). One of the earliest and most prominent works on information diffusion is Gruhl et al. (2004), which studied the dynamics of information propagation through blogspace from both macroscopic and microscopic points of view. Since then, numerous works on information diffusion have appeared, exploring the problem from different aspects and adopting multiple methods to solve it. In terms of the object of study, the existing works can be divided into three categories: (1) diffusion characteristics analysis; (2) diffusion dynamics analysis; and (3) diffusion stability analysis.
Among the first category, Masahiro et al. (2009) discussed how to extract the most influential nodes on a large-scale social network (Kimura et al., 2007). Later, many methods were proposed to mine top-k influential nodes in mobile social networks, e.g. a community-based greedy algorithm in Yu et al. (2010), the Shapley value-based Influential Nodes algorithm in Narayanam and Narahari (2011) and a content-based improved greedy algorithm in Shahsavari and Golpayegani (2017), which decreased the total amount of computation. The authors of Usui et al. (2013) proposed a network growth model that can produce networks with the features necessary for analysis, and also analyzed how each feature affects information diffusion.
The second category focuses on analyzing the dynamic diffusion process over different kinds of networks using different mathematical models (Damon, 2010; Yagan et al., 2013). In Damon (2010), the authors studied how a social network affected the spread of behavior and investigated the effects of network structure on the diffusion of users' behavior. Rather than focusing on behavior diffusion, Bakshy et al. (2012) studied the role of social networks in general information diffusion through an experimental approach. As online social networks, e.g. Facebook and Twitter, became more and more popular, some empirical analyses were conducted using large-scale datasets, including predicting the speed and range of information diffusion on Twitter (Jiang and Scott, 2010), modeling the global influence of a node on the rate of diffusion on Memetracker (Yang and Leskovec, 2010) and illustrating the statistical mechanics of rumor spreading on Facebook (Ostilli et al., 2010). Moreover, information diffusion on overlaying social-physical networks was analyzed in Yagan et al. (2013).
The third category of information diffusion analysis focuses on the stability and consequence of information diffusion (Peng et al., 2013; Kuhnle et al., 2018). In Peng et al. (2013), the conditions under which information diffusion vanishes or persists in social networks were studied. Peng et al. used a mathematical model to predict the information diffusion process of multi-source news and validated its accuracy (Yuan and Ji, 2019). How to restrain the diffusion of private or contaminated information was studied in Masahiro et al. (2009) and Ilyas et al. (2011) by identifying the important information links and hubs, respectively. How to maximize information diffusion through a network was discussed in Kim and Yoneki (2012) by designing effective neighbor selection strategies, whereas in Kuhnle et al. (2018), the authors proposed approximation algorithms for influence maximization.
In terms of the adopted method, works can also be classified into two categories. The first category focuses on macro exploration, usually adopting machine learning (ML) or data mining techniques to predict the dynamics or properties of the network. In this category, Pinto et al. used early diffusion data to predict future diffusion (Pinto et al., 2013), while the community structure was further exploited to improve the prediction of viral memes in Weng et al. (2014). Given the information diffusion data, efficient algorithms were developed to infer the underlying information diffusion network in Rodriguez et al. (2014), Rodriguez et al. (2013) and Gomez Rodriguez et al. (2013). Hao et al. (2014) proposed a matrix factorization based predictive model and used gradient descent to optimize the objective function, whereas in Alsuwaidan and Ykhlef (2017), a novel model based on a physical radiation energy transfer mechanism was proposed to predict the diffusion graph of a certain contagion. The authors of Tsai et al. (2014) studied the diffusion of preference on social networks with a rank-learning based data-driven approach. Jiang et al. (2015) proposed a K-center method for multi-source identification of information diffusion and the corresponding infection regions in general networks. In Chejara and WilfredGodfrey (2018), the authors presented an analysis of various heuristic based influence maximization techniques and proposed a machine learning based approach to find the spread of information in the network. A common limitation of these ML or data mining based approaches is the lack of understanding of the underlying microscopic mechanisms of individuals' decision-making that dominate the information diffusion process, which is the focus of the papers in the second category.
The second category models information diffusion from the microscopic aspect, placing more emphasis on the decisions and motivations of individuals. Assuming each user played the best response to the population's strategies, Morris studied the conditions for global contagion of behaviors (Morris, 2000). The authors of Lin et al. (2013) studied the problem of predicting dynamic trends according to each user's activeness under a dynamic activeness model. Based on the correlation, Lee and Chung (2014) proposed a probabilistic model to estimate the probability of a user's adoption of the naive Bayes classifier. A game-theoretic framework for the study of competition between firms that aim to maximize adoption of their products by consumers located in a social network was proposed in Goyal and Kearns (2012) and Fazeli and Jadbabaie (2012). Ding et al. (2017), Jin-lou et al. (2011) and Wang et al. (2017) proposed information diffusion models that study the spreading by defining different objective functions for each user and then solving the corresponding minimization or maximization problem.
To fully understand the details of information diffusion and simulate the whole process, including the diffusion dynamics as well as the consequence of the information, in recent years some researchers have put forward a novel model based on graphical evolutionary game theory (EGT). Earlier works proposed an evolutionary game-theoretic framework to model the dynamic information diffusion process among nodes in social networks, with one line of work paying more attention to the final stable state and another placing the emphasis on the evolutionary dynamics. Cao et al. (2016) then extended the analysis of the information diffusion process to heterogeneous social networks, where nodes can have different types. In these works, the model was shown to be rather accurate while requiring less calculation than ML or data mining approaches. Based on the conclusions of the evolutionary game-theoretic framework, the dynamics and stable states in the process of information diffusion can be predicted quickly, and the results can be applied to many areas such as online advertisement, rumor control and network security. Therefore, in this paper, we review the works that focus on information diffusion analysis based on the evolutionary game-theoretic model. We summarize the system framework and key conclusions to illustrate how graphical EGT can be used to model information diffusion.
The rest of the paper is organized as follows. Section 2 introduces evolutionary game theory, describes graphical EGT in detail and gives its basic concepts. The evolutionary game-theoretic system framework and the analysis results of several cases are shown in Section 3. Conclusions are drawn in Section 4.

Graphical evolutionary game theory and the correspondence of information diffusion
EGT was initially a biological concept. It started with the problem of how to explain ritualized animal behaviour in a conflict situation (Wikipedia), and was developed to make up for the limitations of traditional game theory. In classical game theory, players are required to make rational choices, which means they should carefully reason about what they want, what their opponents want and what their opponents know, and then determine the optimal strategies in competitions. Evolutionary game theory, on the other hand, does not require players to act rationally and assumes very little about the reasoning processes of the players. It focuses on the process of natural selection, i.e. evolution. It defines a framework of contests, strategies and the mathematical criteria that can be used to predict the performance of competing strategies. The results of a game include the dynamics of changes in the population, the success of strategies and any equilibrium states reached. These basic elements of a game correspond directly to elements of the information diffusion process. We can regard the whole information diffusion process as a game: users are the players, their adopted strategies are the strategies in the game, the process of information spreading is the evolution, and the consequence of the information (survive or vanish, and if it survives, how many users accept this piece of information) is the equilibrium state. Unlike methods that rely on a large amount of data, by applying EGT to information diffusion we can predict every small change in the process, get the detailed dynamics and finally foretell the stable states. We are able to interpret the mechanisms of how users interact with others from the view of the individuals themselves instead of the whole network, which helps us understand the diffusion more deeply.
Because a social network has intricate connections between users, a graphical representation of the network is combined with EGT to analyze information dissemination, which makes the problem easier to visualize. The social network graph is shown in Figure 1. Nodes with different colors represent different kinds of users in a heterogeneous social network, and edges represent the connections between users. If users are treated as homogeneous individuals, nodes with different colors are identical in the analysis. In the model, users have only two strategies: forwarding the information and not forwarding it. For a center user with a given number of neighbors, the numbers of neighbors adopting the forwarding strategy and the not-forwarding strategy are assumed to be available. When two connected users interact, each with his/her own strategy, both sides get the same instant payoff, which equals the benefit of the interaction and is obtained from a predefined payoff matrix. Based on the payoff, we can calculate the fitness of every user, which involves baseline fitness, payoffs and selection intensity. Baseline fitness represents the player's inherent property, e.g. a user's own interest in the released news in a social network. In this framework, the baseline fitness is normalized to one. Payoffs are determined by the payoff matrix and the graph structure. Selection intensity is the relative contribution of the game to fitness. A selection intensity approaching zero corresponds to the limit of weak selection (Ohtsuki et al., 2007), while a selection intensity approaching one denotes strong selection, where fitness equals payoffs. Here weak selection is assumed, so the selection intensity is a small value.
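As an illustration of how baseline fitness, payoffs and selection intensity combine, the following Python sketch computes a user's fitness. The linear form (1 − α)·B + α·U is an assumption chosen to be consistent with the description above (fitness reduces to the baseline when the selection intensity α is zero and to the payoff when α is one). The function names, the per-neighbor payoff accounting and the numeric values are hypothetical and are not taken from the reviewed works.

```python
def payoff(strategy, k_f, k_n, u_ff, u_fn, u_nn):
    """Total payoff of a user with k_f forwarding and k_n non-forwarding neighbors.

    Assumption: the payoff matrix is symmetric, so a forwarder meeting a
    non-forwarder earns u_fn, and so does the non-forwarder.
    """
    if strategy == "f":
        return k_f * u_ff + k_n * u_fn
    return k_f * u_fn + k_n * u_nn


def fitness(strategy, k_f, k_n, u_ff, u_fn, u_nn, alpha=0.01, baseline=1.0):
    """Assumed fitness: (1 - alpha) * baseline + alpha * payoff.

    alpha is the selection intensity; a small alpha means weak selection.
    """
    total = payoff(strategy, k_f, k_n, u_ff, u_fn, u_nn)
    return (1.0 - alpha) * baseline + alpha * total


# Hypothetical example: a forwarder with 3 forwarding and 1 non-forwarding neighbor.
print(fitness("f", 3, 1, u_ff=2.0, u_fn=1.0, u_nn=0.5, alpha=0.01))
```

Under weak selection the result stays close to the baseline of one, which is why small payoff differences only bias, rather than dictate, the strategy updates.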
After introducing fitness, how to update the strategy should be defined. There are many strategy updating rules from the evolutionary biology field, where they are used to model the resident/mutant evolution process. According to Ohtsuki and Nowak (2006), there are three typical and prevalent strategy updating rules: birth-death (BD), death-birth (DB) and imitation (IM) (shown in Figure 2).
(1) BD updating rule: a player is chosen for reproduction with the probability being proportional to fitness (Birth process). Then, the chosen player's strategy replaces one neighbor's strategy with uniform probability (Death process).
(2) DB updating rule: a random player is chosen to abandon his/her current strategy (Death process). Then, the chosen player adopts one of his/her neighbors' strategies with the probability being proportional to their fitness (Birth process).
(3) IM updating rule: a random player is chosen to update his/her strategy. The chosen player either keeps the current strategy or imitates one of his/her neighbors' strategies, with the probability being proportional to fitness.
Once a new piece of information is released by a user, other users are expected to receive the information and retransmit it to more users. However, whether to forward the information depends on each user's own choice, i.e. his/her strategy. Therefore, by analyzing the dynamics of users' strategies on information forwarding, we can infer how the information propagates to other users, how popular the information is, and finally what the outcome of the information is. To get the dynamics, the global and local network states should first be defined: the global population state $p_i$, representing the proportion of the population using strategy $i$; the global edge state $p_{ij}$, representing the proportion of edges with the specific strategies $i$ and $j$; and the local network state $p_{i|j}$, representing the proportion of a user's neighbors adopting strategy $i$ when the user adopts strategy $j$.
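To make the strategy update concrete, here is a minimal Python sketch of a single DB update on a small graph. It assumes the linear fitness form (1 − α)·1 + α·payoff with baseline fitness one; the adjacency-dictionary representation, the function names and the payoff values are hypothetical choices for illustration, not the implementation used in the reviewed works.

```python
import random


def fitness_of(node, adj, strategy, payoffs, alpha):
    """Assumed fitness (1 - alpha) * 1 + alpha * payoff, with baseline fitness 1."""
    u_ff, u_fn, u_nn = payoffs
    k_f = sum(1 for nb in adj[node] if strategy[nb] == "f")
    k_n = len(adj[node]) - k_f
    if strategy[node] == "f":
        total = k_f * u_ff + k_n * u_fn
    else:
        total = k_f * u_fn + k_n * u_nn
    return (1.0 - alpha) + alpha * total


def death_birth_step(adj, strategy, payoffs, alpha, rng):
    """One DB update: a uniformly random user drops its strategy (death) and
    copies a neighbor chosen with probability proportional to fitness (birth)."""
    focal = rng.choice(sorted(adj))
    neighbors = list(adj[focal])
    weights = [fitness_of(nb, adj, strategy, payoffs, alpha) for nb in neighbors]
    model = rng.choices(neighbors, weights=weights, k=1)[0]
    strategy[focal] = strategy[model]
    return focal


# Hypothetical 4-user ring with alternating strategies.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
state = {0: "f", 1: "n", 2: "f", 3: "n"}
death_birth_step(ring, state, payoffs=(0.8, 0.6, 0.4), alpha=0.05, rng=random.Random(0))
print(state)
```

The BD and IM rules would differ only in who is selected first and in which candidates compete for the vacated strategy.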

Among these notations, i and j indicate different strategies, and in the model they can only be f for forwarding and n for not forwarding. To analyze how the states change over time, we discretize the information diffusion process into time slots. In each time slot, users are able to observe the strategies of adjacent users in the population. Based on the observed information, in the next time slot each user decides whether to forward the information by finding out which strategy gives higher fitness. Thus, as the users' strategies update slot by slot, the network state also changes slot by slot, and the evolutionary dynamics are defined as the variation between two consecutive time slots. To simplify the problem, only the network states and dynamics related to the forwarding strategy are considered, so the corresponding dynamics of states are as follows: population dynamics $\dot{p}_f$: dynamics of the global population state, illustrating the dynamics of the whole network; relationship dynamics $\dot{p}_{ff}$: dynamics of the global edge state, illustrating the dynamics of the relationship among users; and influence dynamics $\dot{p}_{f|f}$: dynamics of the local network state, illustrating the influence of one user on his/her neighbors.
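The three forwarding-related states can be measured directly from a snapshot of the network. The Python sketch below is one possible reading of the definitions above: the population state is the share of forwarding users, the edge state is the share of edges joining two forwarders, and the local state is the share of forwarding neighbors seen from forwarding users. The edge-list representation and the function name are hypothetical.

```python
def network_states(edges, strategy):
    """Return (p_f, p_ff, p_f_given_f) for an undirected graph.

    edges    : list of (u, v) pairs, each undirected edge listed once
    strategy : dict mapping node -> "f" (forward) or "n" (not forward)
    """
    p_f = sum(1 for s in strategy.values() if s == "f") / len(strategy)

    both_forward = sum(
        1 for u, v in edges if strategy[u] == "f" and strategy[v] == "f"
    )
    p_ff = both_forward / len(edges)

    # Walk each edge in both directions, starting only from forwarding users.
    from_f = 0
    f_to_f = 0
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if strategy[a] == "f":
                from_f += 1
                if strategy[b] == "f":
                    f_to_f += 1
    p_f_given_f = f_to_f / from_f if from_f else 0.0
    return p_f, p_ff, p_f_given_f


# Hypothetical 4-user ring in which two adjacent users forward.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
snapshot = {0: "f", 1: "f", 2: "n", 3: "n"}
print(network_states(ring_edges, snapshot))
```

Evaluating the function on two consecutive time slots and taking the difference gives an empirical counterpart of the population, relationship and influence dynamics.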
As the information spreading process proceeds, the whole network gradually reaches an evolutionarily stable state (ESS), i.e. all players tend to adopt their optimal stable strategies (Weibull, 1997). Intuitively, when the whole population is adopting the optimal strategy, a small group of invaders using any alternative strategy should have strictly lower fitness than the users in the majority, and eventually die off with high probability. The final ESSs can be found at the stable points of the different kinds of dynamics. In summary, graph structure, players, strategy, fitness (payoff) and ESS are the five basic elements of a graphical evolutionary game, and each of these elements has a direct counterpart in information diffusion analysis as stated above (briefly shown in Table I), which indicates that EGT is a practical and effective method to analyze information diffusion.

Analysis framework and results of several cases
Based on the definitions of graphical EGT and their correspondence with information diffusion over a social network, information propagation can be studied through the characteristics stated before, especially the evolutionary dynamics and the ESSs, which are the two most important quantities in the evolutionary game-theoretic framework. No matter which strategy updating rule is adopted, the ESS is derived from the analysis of the evolutionary dynamics.
The basic framework of analysis can be summarized as follows. First of all, to calculate each user's fitness, we need the numbers of neighbors adopting the forwarding strategy and the not-forwarding strategy. However, these numbers are not known in every time slot when predicting the detailed process, so we have to work with their distribution. Under the assumption that the social network is large enough, it is reasonable to treat the global population state $p_i$ or the local network state $p_{i|j}$ as the probability that the center user encounters a neighbor adopting strategy i. Thus, when the total number of neighbor nodes is given, the distribution of the number of neighbor nodes adopting each strategy is available. Based on this number, together with the predefined baseline fitness, payoffs and selection intensity, each user's fitness can be obtained. Then, according to the strategy updating rule, the probability of a user being chosen for reproduction, or of a neighbor's strategy being adopted, is proportional to fitness. With the probability of users' strategy changes, we know the probability of an increase or decrease of the evolutionary states, which is exactly the evolutionary dynamics that we look for. One common method to find the ESS is to set the differential of the state to zero, i.e. set the evolutionary dynamics to zero. Solving the resulting equation then gives the ESS under different conditions. Using the derived formulas for the evolutionary dynamics and the final ESS, we are able to predict the evolutionary process in every time slot and foretell the stable states of the information diffusion, e.g. how many users are still forwarding a message in the social network. Below we consider several different network structures to illustrate the framework.
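The steps of this framework can be traced in a short Python sketch. Note that it deliberately swaps in a simplified mean-field (replicator-type) dynamic instead of the graphical EGT formulas derived in the reviewed works, which this review does not reproduce. The assumptions are: the number of forwarding neighbors is binomial with success probability equal to the population state, fitness is (1 − α)·1 + α·payoff, and the state grows in proportion to the fitness gap between forwarding and not forwarding. All function names and payoff values are hypothetical.

```python
from math import comb


def neighbor_count_pmf(k, p_f):
    """Binomial pmf of the number of forwarding neighbors among k neighbors."""
    return [comb(k, j) * p_f**j * (1.0 - p_f) ** (k - j) for j in range(k + 1)]


def expected_fitness(strategy, k, p_f, u_ff, u_fn, u_nn, alpha=0.01):
    """Fitness averaged over the distribution of forwarding neighbors."""
    total = 0.0
    for k_f, prob in enumerate(neighbor_count_pmf(k, p_f)):
        k_n = k - k_f
        if strategy == "f":
            pay = k_f * u_ff + k_n * u_fn
        else:
            pay = k_f * u_fn + k_n * u_nn
        total += prob * ((1.0 - alpha) + alpha * pay)  # baseline fitness = 1
    return total


def mean_field_dynamics(p_f, k, u_ff, u_fn, u_nn, alpha=0.01):
    """Simplified stand-in dynamic: p_f * (1 - p_f) * (fitness gap)."""
    gap = expected_fitness("f", k, p_f, u_ff, u_fn, u_nn, alpha) - expected_fitness(
        "n", k, p_f, u_ff, u_fn, u_nn, alpha
    )
    return p_f * (1.0 - p_f) * gap


def interior_rest_point(u_ff, u_fn, u_nn):
    """Interior zero of the simplified dynamic, or None if there is none.

    This only locates where the dynamic vanishes; it does not check stability.
    """
    denom = 2.0 * u_fn - u_ff - u_nn
    if denom == 0.0:
        return None
    p = (u_fn - u_nn) / denom
    return p if 0.0 < p < 1.0 else None


p_star = interior_rest_point(u_ff=0.4, u_fn=0.6, u_nn=0.3)
print(p_star, mean_field_dynamics(p_star, 4, 0.4, 0.6, 0.3))
```

In this simplified setting the rest point does not depend on the degree k; a dependence on the degree is one of the things the graphical analysis adds.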
3.1 Analysis of homogeneous network
3.1.1 Results over uniform degree network
In the uniform scenario, the social network is based on a homogeneous graph with the same degree for all users. In other words, in this kind of social network there is no difference between users and they can be regarded as a single type. Meanwhile, every user has the same number of neighbor nodes. The main reason to discuss the uniform degree network is to provide more insight into the complicated problem of information diffusion, and the derivation and results (e.g. the fitness calculation and the derivation of the dynamics) of the uniform degree network analysis are reused in the non-uniform degree case. Under weak selection, the dynamics and ESSs can be derived in closed form. According to the formula, the population dynamics rely only on the initial population state, the values of the payoff matrix and the degree of the network, regardless of the scale of the network. Therefore, the population dynamics of information diffusion over uniform degree networks show the scale-free property. Moreover, the formula shows that the dynamic is an increasing function of the payoff obtained when both sides adopt the forwarding strategy. The corresponding physical meaning is that when a higher payoff can be obtained by forwarding the information, the rate of increase is also higher. On the other hand, if not forwarding the information gains a higher payoff, the rate of increase is lower, which is why the population dynamic is a decreasing function of the payoff obtained when both users do not forward the information. The formulas of the relationship dynamics and the influence dynamics are more sophisticated: they are functions of themselves and also of the population dynamics.
Based on the evolutionary dynamics analysis, the evolutionarily stable network state can be obtained by solving equations. There are three scenarios of ESSs: one when $u_{ff} > u_{fn} > u_{nn}$, zero when $u_{nn} > u_{fn} > u_{ff}$, and a value between zero and one under other conditions, where $u_{ff}$ represents the payoff when both sides forward the information, and $u_{fn}$ and $u_{nn}$ are defined in the same way. Zero and one are two extreme stable states, representing that no user forwards the information and that all users forward the information, respectively. A stable population state of one corresponds to the case where both users forwarding the information gains the most payoff, while not forwarding gains the least payoff. In a social network, this corresponds to the scenario where the released information is an extremely hot topic, and forwarding it can attract more attention. On the contrary, things are just the opposite when the stable population state is zero, which corresponds to the scenario where the released information is useless or a negative advertisement, and forwarding it only incurs unnecessary cost. As for the third ESS, which lies between zero and one, there are two cases. The first is $u_{fn} > u_{nn}$, $u_{fn} > u_{ff}$. In this case, unilateral forwarding brings more payoff than neither or both users forwarding. In a social network, this corresponds to the scenario where both users forwarding the information gains only limited reward but incurs more cost to both of them. An example is information that is not a mainstream topic, e.g. news about a punk musician, which is supposed to be diffused among people with similar interests. In the other case, $u_{nn} > u_{fn}$, $u_{ff} > u_{fn}$, the payoff configuration is equivalent to that of the coordination game, where two players taking the same action gain more payoff than two players taking opposite actions. An example of this case is information that is politically sensitive and whose authenticity is not guaranteed, so that forwarding it may attract attention but also incurs a potential cost of misleading others.
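The case analysis above can be summarized as a small lookup. The following Python sketch simply encodes the payoff orderings stated in the text; the function name and the returned labels are hypothetical, and ties between payoffs, which the text does not discuss, are reported separately.

```python
def ess_regime(u_ff, u_fn, u_nn):
    """Map a payoff ordering to the ESS scenario described in the text."""
    if u_ff > u_fn > u_nn:
        return "all users forward (stable state 1)"
    if u_nn > u_fn > u_ff:
        return "no user forwards (stable state 0)"
    if u_fn > u_nn and u_fn > u_ff:
        return "between 0 and 1: unilateral forwarding pays most"
    if u_nn > u_fn and u_ff > u_fn:
        return "between 0 and 1: coordination-game payoffs"
    return "tie between payoffs: not covered by the text"


# Hypothetical payoff triples, one per scenario.
for triple in [(0.8, 0.6, 0.4), (0.4, 0.6, 0.8), (0.4, 0.6, 0.3), (0.5, 0.2, 0.4)]:
    print(triple, "->", ess_regime(*triple))
```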
From Figure 3, we can see that all the simulation results are consistent with the theoretical results, which supports the correctness of the conclusions.
Figure 3. Simulation results for synthetic network which is the homogeneous uniform degree network
3.1.2 Results over non-uniform degree network
In the non-uniform scenario, the social network is based on a graph whose degree follows a specific distribution. That is, when a user is chosen at random from the network, the number of his/her neighbors follows this distribution. Note that degree correlation is not taken into account, i.e. the degrees of all users are independent of each other. So when some new information is released, all users update their information forwarding strategies in a spontaneous manner. The procedure and conclusions of the analysis are similar to those for the uniform degree network; the biggest difference is that the probability distribution of the degree is introduced, so its expectation and variance appear in the formulas. Results are shown in Figure 4, in which an Erdős-Rényi random network is adopted as the non-uniform degree network in the simulation. The four cases are the same as before, and the simulation results agree well with the theoretical results. In Figure 5, the proposed model is compared with one of the existing data mining methods, from Leskovec et al. (2009). Payoff matrices are estimated from the Twitter hashtag data set. The vertical axis is the dynamics, and the numbers of mentions of different hashtags per hour in the Twitter dataset are normalized within an interval and denoted by solid black squares. From the figure, we can see that the graphical EGT model fits the real-world information diffusion dynamics better than the data mining method in Leskovec et al. (2009), as users' interactions and decision-making behaviors are taken into account.
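Since the non-uniform analysis enters through the expectation and variance of the degree, it can help to see these two quantities for an Erdős-Rényi graph. The Python sketch below generates such a graph with the standard library only and compares the empirical degree statistics with the well-known binomial values (n − 1)p and (n − 1)p(1 − p). The graph size, connection probability and function names are hypothetical and unrelated to the simulation settings of the reviewed works.

```python
import random


def erdos_renyi_degrees(n, p, seed=0):
    """Degrees of a G(n, p) graph: each pair is linked independently with prob. p."""
    rng = random.Random(seed)
    degree = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                degree[u] += 1
                degree[v] += 1
    return degree


def mean_and_variance(values):
    """Population mean and variance of a list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, variance


n, p = 200, 0.05
mean_k, var_k = mean_and_variance(erdos_renyi_degrees(n, p))
print("empirical mean / variance:", mean_k, var_k)
print("binomial  mean / variance:", (n - 1) * p, (n - 1) * p * (1 - p))
```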
3.2 Analysis of heterogeneous network
3.2.1 Results for unknown user type model
In a social network, as people have different habits and interests, there may be many types of users, which can be modeled as a heterogeneous network. For example, if a group of people are all sports fans, they belong to the same type with respect to forwarding information related to sports, while some other people are music lovers. Therefore, the payoff matrices of a piece of information are different for different types. When users first get in touch with each other, they are not familiar with each other, so they may not know which type their neighbors belong to; this corresponds to the unknown user type model.
Figure 4. Simulation results for Erdős-Rényi random network which is a homogeneous non-uniform degree network
Analyzing every type of user separately, the dynamics of the population state of each type can be deduced separately. By proportional combination, we are able to get the global evolutionary dynamics. It can be observed that the dynamic of each type depends not only on itself but also on the dynamics of the global state, which means that nodes are affected not only by those of the same type, but also by all other nodes. Similarly, by setting the dynamics to zero, three ESSs with different payoff matrices are derived in closed form. Figure 6 shows the simulation results of the unknown user type model when the payoff matrices are set as: $u_{ff}(1) = 0.4$, $u_{fn}(1) = 0.6$, $u_{nn}(1) = 0.3$, $u_{ff}(2) = 0.2$, $u_{fn}(2) = 0.4$, $u_{nn}(2) = 0.5$. The theoretical dynamics fit the simulation dynamics well and the ESSs are predicted accurately. The average relative ESS error of the heterogeneous model is 3.54 per cent.
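The proportional combination mentioned above can be read as a weighted average of the per-type population states, with the weights being the shares of each user type in the network. The Python sketch below illustrates this reading; the type shares and per-type states are hypothetical numbers, and the function name is not taken from the reviewed works.

```python
def global_population_state(type_share, p_f_by_type):
    """Weighted average of per-type forwarding proportions.

    type_share  : dict type -> share of users of that type (must sum to 1)
    p_f_by_type : dict type -> proportion of that type currently forwarding
    """
    if abs(sum(type_share.values()) - 1.0) > 1e-9:
        raise ValueError("type shares must sum to 1")
    return sum(type_share[t] * p_f_by_type[t] for t in type_share)


# Hypothetical example with two equally sized user types.
shares = {1: 0.5, 2: 0.5}
states = {1: 0.6, 2: 0.2}
print(global_population_state(shares, states))
```

The same weighting applied to the per-type dynamics gives the global dynamics, because the type shares are constant over time in this reading.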
3.2.2 Results for known user type model
Sometimes, through repeated interactions, users may manage to learn their neighbors' types. For instance, when a user observes that one of his friends frequently posts news about football matches, he may gradually realize that this friend is a football fan. This condition can be modeled as the known user type model. In other words, when the center user is deciding whether to change strategy, he/she treats neighbors differently because their types are known. As a user's type and strategy affect the neighbors' payoffs, they may also influence the neighbors' strategies. Thus the edge information, i.e. the relationship state (global edge state) and the influence state (local network state), is required to fully characterize the network state. In the final formulation of the dynamics, it can be proved that the relationship dynamics and influence dynamics change at a much faster speed than the population dynamics do. This implies that we can select a time window of appropriate length such that the population dynamics basically remain unchanged while the relationship dynamics and influence dynamics vary a lot. So we focus on a small time period where the population dynamics do not vary with time while the influence dynamics vary. Next, it is shown that in this small time period, the influence dynamics converge to the corresponding population dynamics. Simulation results in Figure 7 demonstrate that the theoretical dynamics based on the known user type model and the simulated dynamics match well. In Figure 7, the evolutionary dynamics given by the theory of the unknown user type model are also plotted. These do not match the simulated evolutionary dynamics under the known user type model, indicating the necessity of the theory of the known user type model.

Conclusions
In this paper, we summarize the results of works that analyze the information diffusion process over social networks based on graphical evolutionary game theory. To characterize the process, the evolutionary dynamics and the evolutionarily stable states are discussed. It is found that the analysis of information diffusion based on graphical EGT matches well with reality. In the future, social networks with irrational or malicious users should also be considered, which would be more complicated than networks that consist only of rational users. How to model the behaviours of irrational users is the focus of future research.