Evolutionary dynamics and fixation probabilities in directed networks

We investigate the evolutionary dynamics in directed and/or weighted networks. We study the fixation probability of a mutant in finite populations in stochastic voter-type dynamics for several update rules. The fixation probability is defined as the probability of a newly introduced mutant in a wild-type population taking over the entire population. In contrast to the case of undirected and unweighted networks, the fixation probability of a mutant in directed networks is characterized not only by the degree of the node that the mutant initially invades but by the global structure of networks. Consequently, the gross connectivity of networks such as small-world property or modularity has a major impact on the fixation probability.


Introduction
Evolutionary dynamics describe the competition among different types of individuals in ecological and social systems. Traits, either genetic or cultural, are transmitted to others through inheritance or imitation. The fitness of an individual determines her/his ability to pass on her/his traits to the next generation. An individual with a larger fitness value is more likely to replace one with a smaller fitness value. Such a dynamical process can be modeled using the well-known voter model and its variants [1][2][3][4][5]. According to these models, individuals adopt the trait (i.e., hereafter we call it type) of others. Selection and random drift are two major driving forces in evolutionary dynamics. Selection results from the different fitness levels of different types. Random drift results from the finite size of populations.
In the voter-type dynamics, one's type is replaced with the type of another individual.
Therefore, no new types are introduced into the population unless an explicit mutation (or innovation) is considered. Once a single type dominates the entire population, this unanimity state remains the same forever. In other words, the unanimity states are the absorbing states of these dynamics. Consequently, in the case of finite populations, the stochasticity of voter-type models leads to the fixation or extinction of a newly introduced type after some time. The probability that a single mutant introduced in a population of wild-type individuals eventually takes over the entire population is called fixation probability [3][4][5][6][7][8]. Fixation probability quantifies the likelihood of the propagation of a single mutant in the population. When different types of individuals have the same fitness value, the resulting evolutionary dynamics are called neutral evolutionary dynamics. In this case, it is well-known that the fixation probability of a single mutant on the complete graph is the reciprocal of the population size [8].
In reality, individuals do not necessarily interact with everyone in the population. They have relationships with some individuals, but not with others. This fact leads to the idea of complex contact networks of individuals. Neutral evolutionary dynamics such as voter-type models in complex networks have been extensively studied (e.g., [9][10][11][12]). It has been shown that the fixation probability of a single mutant in undirected and unweighted networks depends on the degree of the initially invaded node as well as update rules [3][4][5]. Edges in many real networks, however, have directionality. Examples of real networks include social networks in which a directed edge is drawn from the actor to the recipient of grooming behavior of rhesus monkeys [14]. Other examples include email social networks [13,15], and ecological networks, in which the heterogeneity of parameters such as habitat size [16] and geographical biases such as wind direction [17] and reverine streams [18] are exemplary sources of directionality. Moreover, the edges in real networks are generally weighted [19]. This concept has been nicely introduced in a seminal paper on evolutionary dynamics on graphs [3].
In this study, we investigate the dependence of the fixation probability of a single neutral mutant in general directed (or weighted) networks on its initial location and on update rules.
We study three major update rules that were introduced in [4,5]. Our results are remarkably different from those obtained in the case of undirected networks in which the fixation probability of a mutant is determined by the degree of the node where the mutant is initially placed. In the case of directed networks, the fixation probability crucially depends on global structure of networks. The difference between directed and undirected networks is striking, especially in the case of modular, spatial, or degree-correlated networks.

Model
Consider a population of N individuals comprising two types of individuals -type A and type B. Let the fitness of type A and type B individuals be r and 1, respectively. In this study, we mainly focus on the case r = 1, which corresponds to neutral competition between A and B. The structure of the population is described by a directed graph G = {V, E}, where V = {v 1 , . . . , v N } is a set of nodes, and E is a set of edges, i.e., v i sends a directed edge to v j if and only if (v i , v j ) ∈ E. Each node is occupied by an individual of either type. The fitness of the individual at node v i is denoted by f i ∈ {r, 1}. Each directed edge (v i , v j ) ∈ E is endowed with its weight w ij , which represents the likelihood with which the type of individual at v i is transferred to v j in an update step. We set w ij = 0 when (i, j) / ∈ E.
Consider the introduction of a single mutant of type A at node v i in a population of N − 1 residents of type B. Then, type A either eventually fixates, i.e., takes over the entire population, or becomes extinct (i.e., fixation of B), as schematically shown in Fig. 1. We are concerned with the fixation probability of type A, which is denoted by F i . When needed, we refer to any one of the three update rules introduced below by the superscript of F i , such as F LD i and F IP i . Throughout the paper, we assume that G is strongly connected. A network is strongly connected if there is at least one directed path between any ordered pair of nodes. If G is not strongly connected, we can find two nodes v i and v j such that there is no direct path from v i to v j . In this case, the fixation probability F i is always zero because the individual at v j is never replaced by the mutant initially located at v i [3]. Therefore, it is sufficient to investigate fixation probabilities in the most upstream strongly connected component of G. Therefore, we assume without loss of generality that G is strongly connected.

Results
In this section, we analytically obtain a system of linear equations that gives the fixation probabilities of mutants at individual nodes for the three update rules (link dynamics (LD), invasion process (IP), and voter model (VM)) [4,5]. An update event in the three rules is schematically shown in Fig. 2. Then, we compare the analytical results with numerical results obtained for various directed networks.

Link dynamics (LD)
Firstly, we consider the LD [4,5]. In this case, one directed edge is selected for reproduction in each time step; the edge (v i , v j ) ∈ E is chosen with probability f i w ij / k,l f k w kl for the type of the individual at v i to replace the type of the individual at v j . Individuals with larger fitness values and larger outgoing edge weights are more likely to reproduce than those with smaller fitness values and outgoing edge weights. Thus, the selection process acts on birth events.
Alternatively, the selection process can be assumed to act on death events, and the edge (v i , v j ) is chosen with probability (w ij /f j )/( k,l w kl /f l ) for the type at v i to replace that at v j . This implies that individuals with smaller fitness values and larger incoming edge weights are more likely to die than those with larger fitness values and smaller incoming edge weights. In fact, the fixation probability F LD i (1 ≤ i ≤ N) is the same under these two interpretations.
We consider the case r = 1 (hence, f i = 1, 1 ≤ i ≤ N) analytically. Suppose that a single mutant of type A invades v i . The fixation probability is given by F LD i . By considering the next update event, we can recalculate F LD i as follows. With probability w ij / k,l w kl , the edge (v i , v j ) ∈ E is selected for reproduction. Then, type A individuals occupy v i and v j .
We denote by F LD {i,j} the fixation probability when type A individuals are initially located at v i and v j but nowhere else. With probability w ji / k,l w kl , the edge (v j , v i ) ∈ E is selected.
Then, type A becomes extinct, and type A will not fixate. With the remaining probability k =i,l =i w kl / k,l w kl , the configuration of type A and type B individuals does not change. Therefore, we obtain To prove F LD {i,j} = F LD i + F LD j , consider for now N neutral types labeled 1, 2, . . ., N that are initially placed at v 1 , v 2 , . . . v N , respectively. On a finite graph G, one of the N types fixates eventually. The probability that type i or j fixates is given in two ways: This is a system of linear equations giving F LD i . We note that F LD i can be interpreted as the reproductive value of the individual at node v i [20,21].
Equation (2) can be derived more rigorously via the dual process [1,2,7,22]. Intuitively speaking, the dual process of a stochastic process is another stochastic process in which the time of the original process is reversed. The direction of edges in the dual process is the opposite to that in the original process. By considering the dual process, we can understand the tree of family lines in the original process, which is called genealogy. When we go backward in time, two individuals sometimes 'collide' in the dual process. Such an event is called coalescence. In terms of the original process, a coalescence corresponds to two individuals sharing the common ancestor. After two individuals coalesce in the dual process, they behave as a single individual, representing a single family line. As far as the fixation probability is concerned, LD with r = 1 is equivalent to the continuous-time stochastic process in which each edge for reproduction at the Poisson rate w ij . Then, the dual process of LD is the continuous-time coalescing random walk on the network with all edges reversed, with a random walker initially located on every node [1,2]. Coalescing random walk is defined as follows. Consider a walker at usually sparse, carrying out the Jacobi iteration may take much less time. The convergence of this iteration to F LD is guaranteed by the Perron-Frobenius theorem [23].
In undirected graphs, w ij = w ji holds. Therefore, , giving a result previously reported in [4,5]. In the case of weighted or directed networks, the complexity of Eq.
(2) implies that F LD i is not always determined by the local characteristics of node v i but is affected by the global structure of the networks.
Next, we argue that the mean-field (MF) approximation are not useful in most cases. Con-sider unweighted, but possibly directed, networks such that The Equation (4) indicates that a large k out i aids in the dissemination of the type at v i and a small k in i inhibits the replacement of the type at v i by the type at other nodes. However, the MF approximation deviates from the correct F LD i in many cases. As an example, consider the largest strongly connected component of a directed and unweighted email social network [13] with N = 9079 nodes and k = k in = k out = 2.62, where · denotes the average over the nodes. In this case, F LD i (indicated by the circles in Fig. 3

(c))
does not agree with the normalized k out i /k in i (indicated by the line). This is mainly because the indegree and outdegree of the same node in this network are positively correlated and because this network presumably has a nontrivial global structure. Actually, the Pearson correlation is equal to 0.40. Networks in which degrees of adjacent nodes are correlated also show considerable discrepancies between the MF approximation and the numerical results. On the other hand, in the case of undirected networks, our result F LD i = 1/N holds true in the presence of degree correlation of any kind, which is consistent with previously obtained numerical results [5].
Next, we examine the fixation probability in an asymmetric small-world network constructed from a ring of N = 5000 nodes. This network is a directed version of the Watts-Strogatz smallworld network [24]. Each node of this network tentatively sends directed edges to 5 nearest nodes along both sides. Then, 2500 out of 10N = 50000 directed edges are rewired so that their two ends are randomly and independently selected from the N nodes, excluding self-loops and preexisting edges. The correlation between the in-and out-degrees of the same node is negligible, with the PCC for the pairs (k in i , k out i ), defined by Eq. (5), being equal to −0.021. The degrees of adjacent nodes v i and v j conditioned by the existence of the edge (v i , v j ) [25,26] are also uncorrelated by the definition of the model. Nevertheless, the MF approximation is not effective in predicting the actual F LD i , as shown in Fig. 3(d). This discrepancy persists even for large N, because the directed small-world network does not render F LD i of adjacent nodes independent of each other. In contrast, F LD i in undirected small-world networks is completely determined by the MF ansatz indicated by the line in Fig. 3(d).
In contrast to these networks, Figure 3 whereF LD is defined by Eq. (3). The value of the PCC turns out to be slightly but significantly positive (mean ± standard deviation based on 100 network realizations is equal to 0.0834 ± 0.0057). The discrepancy is also significant for a large N, unless the mean degree k is large.
The results obtained for random networks extend to the case of scale-free networks without degree correlation. We generate a directed scale-free network by setting the degree distribution to be p(k) ∝ k −3 for k ≥ k /2 and p(k) = 0 for k < k /2, thereby generating k in i and k out i (1 ≤ i ≤ N) independently according to p(k), and randomly connecting the nodes using the Molloy-Reed algorithm [27]. Figure 3(b) indicates that the MF approximation roughly explains the numerically obtained fixation probability.
It is noted that the PCC for the pairs (F LD i , F LD j ) for (v i , v j ) ∈ E is small for the asymmetric scale-free network (= 0.0395 ± 0.0056), is large for the asymmetric small-world network (= 0.7888 ± 0.0165), and is small for the email social network (= 0.0420).

Invasion process (IP)
Next, consider the IP [4,5]. In the IP, selection acts on birth. In each time step, v i is first selected for reproduction with probability f i / k f k , where f i ∈ {r, 1} is the fitness of the type at node v i . Then, with probability w ij / l w il , the type at v i replaces that at v j . Consequently, the probability that the edge (v i , v j ) ∈ E is used for reproduction in an update step is equal to f i w ij /( k f k l w il ). On the complete graph, IP is the same as the standard Moran process [6].
For an arbitrary r the IP is mapped to the LD with the rescaled edge weight w ′ ij = w ij / l w il . Therefore, from Eq. (2), the fixation probability for r = 1 is the solution to In the case of undirected unweighted networks, F IP i ∝ 1/k i solves Eq. (7), giving a previously obtained result [4,5,22]. In the case of directed unweighted networks, applying the MF approximation to Eq. (7) yields The numerical results for the asymmetric random graph that is used for obtaining the results shown in Fig. 3(a) are shown in Fig. 4(a). The relation F IP i ∝ 1/k in i (the line in Fig. 4(a)) is roughly satisfied. Similar to the case of LD, some deviation persists in the case of random networks even with a large N. In contrast, Fig. 4(b) indicates that the actual F IP i in the scalefree network deviates considerably from Eq. (8), mainly because of the discreteness of k in i for a small integer k in i . Similar to the case of LD, the discrepancy between Eq. (8) and the exact F IP i is also large in the case of the email social network (Fig. 4(c)) and the asymmetric small-world network ( Fig. 4(d)). In addition to the degree correlation or global structure of networks, the discreteness of 1/k in i for a small k in i causes further deviation, as shown in Figs. 4(c) and 4(d).

Voter model (VM)
We examine a third update rule, the so-called voter model (VM) [4,5]. In the VM, we first eliminate the type at one node v j with probability f −1 j / k f −1 k . Then, with probability w ij / l w lj , the type at v i replaces that at v j . The probability that the edge (v i , v j ) ∈ E is used for reproduction in an update step is equal to f −1 j w ij /( k f −1 k l w lj ). For general r, the VM is mapped to the LD with the rescaled edge weight w ′ ij = w ij / l w lj . Consequently, from Eq. (2), F V M i for r = 1 is given by In the case of undirected networks, F V M i ∝ k i solves Eq. (9), recovering a previously obtained result [4,5,22]. The MF approximation yields In the case of the random network and the uncorrelated scale-free network, the numerical results shown in Fig. 5(a) and Fig. 5(b), respectively, support the rough validity of Eq. (10). However, this naive ansatz deviates from the actual F V M i for the email social network (Fig. 5(c)) and the asymmetric small-world network (Fig. 5(d)). This situation is similar to that observed in the case of LD.
The fixation probability F V M i on a graph G is equivalent to the PageRank of node v i of the graph G ′ , where G ′ is constructed by reversing all edges of G. The PageRank measures the number of directed edges a node, such as a webpage, receives from other important nodes as exclusively as possible [28][29][30]. If we neglect some minor technical treatments that are necessary for the practical implementation, the PageRank where λ is the largest eigenvalue of the eigenequation (11). If many edges are directed to v i , there are many positive terms (i.e., w ji > 0) on the LHS of Eq. (11), and they contribute to the PageRank of v i on the RHS. If v i receives an edge from v j whose outdegree is small or whose PageRank is large, w ji / l w jl or F P R j is large. Each of these factors also increases . A strongly connected network yields λ = 1, so that F P R i is the stationary density of the discrete-time simple random walk on the original graph G [23,[28][29][30]. Equation (11) with λ = 1 and with w ij replaced by w ji is identical to Eq. (9). In PageRank, nodes that receive many edges tend to be important, whereas the opposite is true in the case of the VM. F P R i is locally approximated by using k in i [31], which corresponds to the MF relation F V M i ∝ k out i . However, F P R i in real web graphs often deviates from the relation F P R i ∝ k in i [32]. This implies that F V M i in real networks can also deviate from the MF approximation, which is consistent with our main claim.

Constant selection (r = 1)
When r = 1, the fitness value of type A and that of type B are different, so one type has a unilateral advantage over the other type. We call this situation 'constant selection'. In this case, the dual process of the evolutionary dynamics is the coalescing and branching random walk [2], which is difficult to handle analytically. Therefore, we carry out Monte Carlo simulations for r = 4 on a fixed asymmetric random graph with N = 200 and k = 10. We calculate F LD i as a fraction of runs from 2 × 10 6 runs in which the single mutant with fitness r initially located at v i (i.e., f i = r and f j = 1, j = i) eventually occupies all nodes of the network. In Fig. 6(a), the numerically obtained F LD i for r = 4 is plotted against the exact solution of F LD i for r = 1 (Eq. (2)). Roughly speaking, F LD i for r = 4 monotonically increases with the exactly obtained F LD i for r = 1. We obtain similar results in the case of the IP (Fig. 6(b)) and VM (Fig. 6(c)). Therefore, the node from which a mutant is more likely to propagate throughout the population under the neutral dynamics (r = 1) also serves as a better invading node for mutants under the constant selection (r ∈ 1). Thus, our results derived in the case of the neutral selection is useful in predicting the order of the maginitude of fixation probabilities under the constant selection.

Modular networks
Real networks are often more complex than degree-uncorrelated random, scale-free, or smallworld networks. In particular, many networks are modular, i.e., they consist of several densely connected subgraphs termed modules, and each subgraph is connected to each other by a relatively few edges [33]. This is also the case for directed [34,35] and weighted [36] networks.
To intuitively understand the importance of the global structure of networks such as community structure in evolutionary dynamics, we generate a modular network [35], as schematically shown in Fig. 7(a), and study the fixation probability under neutrality, i.e., r = 1. We generate fits the data well (the dashed lines in Fig. 7(b)). Similar approximations in which k out i /k in To explain this result analytically, we presume that all nodes in a module are equivalent and have an identical fixation probability,F 1 orF 2 . In this manner, a network with two modules is reduced to a network with two nodes and self-loops. Equation (2) with N = 2 yieldŝ Because which agrees with the numerical results. On the other hand, k out i /k in i is equal to (1+w 1→2 )/(1+ w 2→1 ) and (1 + w 2→1 )/(1 + w 1→2 ) for M 1 and M 2 , respectively. Both of these values are close to unity when w 1→2 , w 2→1 ≪ 1, i.e., when the network is modular. Therefore, the MF approximation givesF LD 1 /F LD 2 ≈ 1, which is different from our simulated results.
Similar calculations in the case of the IP yield where when w 1→2 , w 2→1 ≪ 1. However, the naive MF ansatz (Eq. (8)) yields 1/k in would be approximately unity, which does not well explain the simulation results shown in Fig. 7(c).
In the case of the VM, we obtainF where When w 1→2 , w 2→1 ≪ 1, we obtainF However, the naive MF ansatz (Eq. (10)) yields k out would be approximately unity, which again does not explain the simulation results shown in Fig. 7(d).
In sum, for each update rule the community structure of networks has a strong impact on the fixation probability.

Conclusions
In summary, we obtained general formulae for the fixation probability in directed and weighted networks. For each of the three different update rules, fixation probability is a solution to a system of linear equations. Fixation probability in undirected networks is completely determined by the local connectivity [4,5] under neutrality. In contrast, in the case of directed degreecorrelated, small-world, or modular networks, fixation probability is not determined only by the degree of the node that a mutant initially invades, and it deviates from the MF approximation to a large extent. Our results indicate that the global connectivity of networks has a significant effect on the fixation probability.
type B fixation of type A fixation of type B type A . The lines represent the MF ansatz