Extreme robustness of scaling in sample space reducing processes explains Zipf's law in diffusion on directed networks

It has been shown recently that a specific class of path-dependent stochastic processes, which reduce their sample space as they unfold, lead to exact scaling laws in frequency and rank distributions. Such Sample Space Reducing processes (SSRP) offer an alternative new mechanism to understand the emergence of scaling in countless processes. The corresponding power law exponents were shown to be related to noise levels in the process. Here we show that the emergence of scaling is not limited to the simplest SSRPs, but holds for a huge domain of stochastic processes that are characterized by non-uniform prior distributions. We demonstrate mathematically that in the absence of noise the scaling exponents converge to $-1$ (Zipf's law) for almost all prior distributions. As a consequence it becomes possible to fully understand targeted diffusion on weighted directed networks and its associated scaling laws in node visit distributions. The presence of cycles can be properly interpreted as playing the same role as noise in SSRPs and, accordingly, determines the scaling exponents. The result that Zipf's law emerges as a generic feature of diffusion on networks, regardless of its details, and that the exponent of visiting times is related to the amount of cycles in a network could be relevant for a series of applications in traffic-, transport- and supply chain management.

I. INTRODUCTION
Many stochastic processes, natural or man-made, are explicitly path-dependent. Famous examples include biological evolution [1][2][3] or technological innovation [4,5]. Formally, path-dependence means that the probabilities to reach certain states of the system (or the transition rates from one state to another) at a given time depend on the history of the process up to this time. This statistical time-dependence can induce dramatic deformations of phase-space, in the sense that certain regions will hardly be revisited again, while others will be visited much more frequently. This makes a large number of path-dependent complex systems, and processes that are associated with them, non-ergodic. They are typically mathematically intractable with a few famous exceptions, including the Pitman-Yor or 'Chinese Restaurant' process [6,7], recurrent random sequences proposed by S. Ulam and M. Kac [8][9][10], Pólya urns [7,11,12], and the recently introduced sample space reducing processes (SSRPs) [13].
SSRPs are processes that reduce their sample space as they progress over time. In their simplest form they can be depicted by the following process. Imagine a staircase like the one shown in figure 1a. Each state i of the system corresponds to one particular stair. A ball is initially (t = 0) placed at the topmost stair N, and can jump randomly to any of the N − 1 lower stairs in the next timestep with a probability 1/(N − 1). Assume that at time t = 1 the ball landed at stair i. Since it can only jump to stairs i′ that are below i, the probability to jump to stair i′ < i is 1/(i − 1). The process continues until eventually stair 1 is reached; it then halts.
Remarkably, the statistics over a large number of repetitions of SSRPs yields an exact Zipf's law in the rank-frequency distribution of the visits of states [13], a fact that links path-dependence with scaling phenomena in an intuitive way. SSRPs add an alternative and independent route to understand the origin of scaling (Zipf's law in particular) to the well known classical ways [14,15], criticality [16], self-organised criticality [17,18], multiplicative processes with constraints [19][20][21], and preferential attachment models [22,23]. Beyond their transparent mathematical tractability, SSRPs seem to have a wide applicability, including diffusion on complete directed acyclic graphs [13], quantitative linguistics [24], record statistics [25,26], and fragmentation processes [27].
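As an illustration, the staircase process with uniform priors can be simulated in a few lines of Python. This is a minimal sketch and not the code used in [13]; the function names `run_ssrp` and `visit_frequencies` and all parameter values are hypothetical. Averaged over many runs, the number of visits to stair i < N should approach 1/i, while the starting stair N is visited exactly once per run.

```python
import random
from collections import Counter

def run_ssrp(n_states, rng):
    """One run of the staircase process: start at stair n_states, jump
    uniformly to any lower stair, stop at stair 1. Returns the visited stairs."""
    path = [n_states]
    state = n_states
    while state > 1:
        state = rng.randint(1, state - 1)  # uniform over the lower stairs
        path.append(state)
    return path

def visit_frequencies(n_states, n_runs, seed=0):
    """Average number of visits per run for every stair 1..n_states."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        counts.update(run_ssrp(n_states, rng))
    return {i: counts[i] / n_runs for i in range(1, n_states + 1)}

if __name__ == "__main__":
    freq = visit_frequencies(n_states=50, n_runs=100_000)
    for i in (1, 2, 5, 10, 25):
        print(i, round(freq[i], 3), "compare with 1/i =", round(1 / i, 3))
```

Sorting the stairs by their visit frequency gives the rank-frequency distribution; for uniform priors the rank of a stair below N coincides with its label up to sampling fluctuations.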
SSRPs can be seen as very specific non-standard sampling processes, with a directional bias or a symmetry breaking mechanism. In the same pictorial view as above, a standard sampling process can be depicted as a ball bouncing randomly left and right (without a directional bias as in the SSRP) over a set of states, see figure 1b. The ball samples the states with a uniform prior probability, meaning that all states are sampled with equal probability. A situation with non-uniform priors is shown in figure 1c where the different widths of boxes represent the probability to hit a particular state. In a standard sampling process exactly this non-uniform prior distribution will be recovered.
Figure 1 (caption, panels c to e): (c) Random process as in (b) but with non-uniform prior probabilities of states (width of boxes). The visiting probabilities follow the prior probabilities. (d) SSRP with non-uniform prior probabilities. Visiting distributions converge to the attractor, a Zipf distribution. This is true for a wide class of prior probabilities. (e) SSRP realized by a diffusion process on a directed acyclic network towards a target node (orange). The visiting probability of nodes follows a Zipf distribution, independent of the network topology.
So far, SSRPs have been studied for the simplest case only, where the potential outcomes or states are sampled from an underlying uniform prior distribution [13]. In this paper we demonstrate that a much wider class of SSRPs leads to exact scaling laws. In particular we will show that SSRPs lead to Zipf's law irrespective of the underlying prior distributions. This is schematically shown in figure 1d, where the prior distribution is non-uniform, and states are sampled with a SSRP. The resulting distribution function will no longer follow the prior distribution as in figure 1c, but produces Zipf's law. We show in detail how SSRPs depend on their prior distributions. Zipf's law turns out to be an attractor distribution that holds for practically any SSRP, irrespective of the details of the stochastic system at hand, i.e. irrespective of their prior distributions. This extreme robustness with respect to details of transition rates between states within a system offers a simple understanding of the ubiquity of Zipf's law. Phenomena that show a high robustness of Zipf's law with respect to changes in the detailed properties of the system have been reported before [25,26,28].
As an important example we demonstrate these mathematical facts in the context of diffusion processes on Directed Acyclic Graphs (DAG). Here Zipf's distributions of node visiting frequencies appear generically, regardless of the weight or degree distribution of the network. We call diffusion processes on DAG structures targeted diffusion, since, in this type of network, diffusion is targeted towards a set of target or sink nodes, see figure 1e. The targeted diffusion results we present here are in line with recent findings reported in [29].

II. SSRPS WITH ARBITRARY PRIORS
We start the formal study of the statistics of SSRPs with the noiseless case which implies, in the staircase picture, that upward jumps are not allowed (sampling with a bias). We then study how the statistics of SSRPs behaves when noise is introduced. In this case the probability of upward jumps is no longer zero.

A. Noiseless SSRPs
Think of the N possible states of a given system as stairs with different widths and imagine a ball bouncing downstairs with random step sizes. The probability of the downward bouncing ball to hit stair i is proportional to its width q(i), see figure 1d. Given these prior probabilities q(i), the transition probability from stair j to stair i is
$$p(i|j) = \begin{cases} \dfrac{q(i)}{g(j-1)} & \text{if } i < j, \\ 0 & \text{otherwise,} \end{cases} \qquad (1)$$
with $g(j-1) = \sum_{\ell < j} q(\ell)$. Prior probabilities are normalised, $\sum_i q(i) = 1$. We denote such a SSRP by ψ. One can safely assume the existence of a stationary visiting distribution, p, arising from many repetitions of process ψ and satisfying the following relation:
$$p(i) = \sum_{j > i} p(i|j)\, p(j) = q(i) \sum_{j > i} \frac{p(j)}{g(j-1)}. \qquad (2)$$
Using equation (1), and forming the difference
$$\frac{p(i+1)}{q(i+1)} - \frac{p(i)}{q(i)} = -\frac{p(i+1)}{g(i)}, \qquad (3)$$
and by re-arranging terms we find that
$$\frac{p(i+1)}{q(i+1)}\, g(i+1) = \frac{p(i)}{q(i)}\, g(i), \qquad (4)$$
where we use the fact that g(i) + q(i + 1) = g(i + 1). Note that this is true for all values of i, and in particular
$$\frac{p(i)}{q(i)}\, g(i) = \frac{p(1)}{q(1)}\, g(1) = p(1), \qquad (5)$$
since g(1) = q(1). We arrive at the final result
$$p(i) = p(1)\, \frac{q(i)}{g(i)}. \qquad (6)$$
p(i) is the probability that we observe the ball bouncing downwards at stair i. Equation (6) shows that the path-dependence of the SSRP ψ deforms the prior probabilities of the states of a given system, $q(i) \to p(i) \propto q(i)/g(i)$. We can now discuss various concrete prior distributions. Note that equation (6) is exact and does not depend on system size.
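The stationarity relation can be checked numerically for any prior. The following sketch (an illustration under stated assumptions, not part of the original derivation) computes the expected number of visits per run by iterating the relation backwards from the top stair with exact rational arithmetic, and compares the result with q(i)/g(i) from equation (6). The names `cumulative` and `expected_visits` and the example prior are hypothetical; the run is assumed to start at stair N, so the comparison applies to stairs below N.

```python
from fractions import Fraction

def cumulative(q):
    """g[i] = q[1] + ... + q[i], with 1-based indexing (index 0 is unused)."""
    g = [Fraction(0)] * len(q)
    for i in range(1, len(q)):
        g[i] = g[i - 1] + q[i]
    return g

def expected_visits(q):
    """Expected visits per run from v(i) = sum_{j>i} v(j) q(i) / g(j-1),
    with every run started at the top stair N = len(q) - 1."""
    n = len(q) - 1
    g = cumulative(q)
    v = [Fraction(0)] * (n + 1)
    v[n] = Fraction(1)             # the start stair is visited once per run
    tail = v[n] / g[n - 1]         # running sum of v(j) / g(j-1) over j > i
    for i in range(n - 1, 0, -1):
        v[i] = q[i] * tail
        if i > 1:
            tail += v[i] / g[i - 1]
    return v

# Hypothetical example prior: stair widths 1, 2, ..., N (left unnormalised,
# which is harmless because only ratios of q and g enter).
N = 8
q = [Fraction(0)] + [Fraction(i) for i in range(1, N + 1)]
g = cumulative(q)
v = expected_visits(q)
for i in range(1, N):
    assert v[i] == q[i] / g[i]     # agrees with q(i)/g(i) below the start stair
```

Exact fractions are used so that the agreement is an identity rather than a floating-point coincidence.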
Polynomial priors and the ubiquity of Zipf's law: Given power law priors, $q(i) \sim i^{\alpha}$ with α > −1, one can compute g up to a normalisation constant
$$g(i) = \sum_{\ell \le i} \ell^{\alpha} \sim \frac{i^{\alpha+1}}{\alpha+1},$$
which, when used in equation (6), asymptotically gives
$$p(i) = p(1)\, \frac{q(i)}{g(i)} \sim \frac{i^{\alpha}}{i^{\alpha+1}} = i^{-1},$$
i.e., Zipf's law. More generally, this result is true for polynomial priors, $q(j) \sim \sum_{i \le m} a_i\, j^{\alpha(i)}$, where the degree of the polynomial α(m) = max{α(i)} is larger than −1, in the limit of large systems. Numerical simulations show perfect agreement with the theoretical prediction for various values of α, see figure 2a (circles, triangles, red squares).
Fast decaying priors: The situation changes drastically for exponents α < −1. For sufficiently fast decaying priors we have
$$g(i) = \sum_{\ell \le i} q(\ell) \approx \mathrm{const.}$$
The fast decay makes the contribution to g from large i's negligible. Under these circumstances equation (6) can be approximated, for sufficiently large i's, as p(i) ∼ q(i).
We encounter the remarkable situation that for fast decaying priors the SSRP, even though it is history dependent, follows the prior distribution. In this case the SSRP resembles a standard sampling process.
Exponential priors: For exponential priors, $q(i) \sim e^{\beta i}$, with β > 0, we find according to equation (6) that p(i) = 1/N, i.e., a uniform distribution. To see this note that, up to a normalisation constant, g(i) is a geometric series,
$$g(i) = \sum_{\ell \le i} e^{\beta \ell} = e^{\beta}\, \frac{e^{\beta i} - 1}{e^{\beta} - 1}.$$
Substituting it into equation (6), one finds the exact relation
$$p(i) = p(1)\, \frac{1 - e^{-\beta}}{1 - e^{-\beta i}},$$
which can be safely approximated, for i ≫ 1, by
$$p(i) \approx p(1) \left(1 - e^{-\beta}\right).$$
We observe that this is a constant independent of i. Accordingly, after normalisation, we will have p(i) ∼ 1/N. Note that exponential priors describe a somewhat pathological situation. Given that a state i is occupied at time t, the probability to visit state i − 1 is huge compared to all the other remaining states, so that practically all states will be sampled in a descending sequence: N → N − 1 → N − 2 → ... → 1, which obviously leads to a uniform p. Again, numerical simulations show perfect agreement with the prediction, as shown in figure 2a (grey squares). Switching from polynomial to exponential priors, we switch the attractor from the Zipf's regime to the uniform distribution.
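The three regimes discussed above can be compared by evaluating equation (6) directly. The sketch below is illustrative only: the helper names `ssrp_visits` and `local_slope` and the parameter choices (α = 2, α = −3, β = 0.5) are assumptions made for this example, and the visiting probabilities are left unnormalised since only their shape matters.

```python
import math

def ssrp_visits(q):
    """Equation (6) up to normalisation: q(i) / g(i), with g the cumulative sum."""
    out, g = [], 0.0
    for qi in q:
        g += qi
        out.append(qi / g)
    return out

def local_slope(p, i, j):
    """Log-log slope of p between stairs i and j (1-based labels)."""
    return (math.log(p[j - 1]) - math.log(p[i - 1])) / (math.log(j) - math.log(i))

N = 10_000
poly = ssrp_visits([i ** 2.0 for i in range(1, N + 1)])          # alpha = 2 > -1
fast = ssrp_visits([i ** -3.0 for i in range(1, N + 1)])         # alpha = -3 < -1
expo = ssrp_visits([math.exp(0.5 * i) for i in range(1, 201)])   # beta = 0.5

print("polynomial prior, slope:", local_slope(poly, 1000, 10_000))
print("fast decaying prior, slope:", local_slope(fast, 1000, 10_000))
print("exponential prior, q/g at i = 51 and 151:", expo[50], expo[150])
```

For the polynomial prior the slope should be close to −1, for the fast decaying prior close to the prior exponent −3, and for the exponential prior q(i)/g(i) should be nearly constant at 1 − e^{−β}.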

B. Noisy SSRPs
Noisy SSRPs are mixtures of a SSRP ψ and stochastic transitions between states that are not history-dependent. Following the previous scheme of the staircase picture, the noisy variant of the SSRP, denoted by $\psi_\lambda$, starts at N and jumps to any stair i < N, according to the prior probabilities q(i). At i the process now has two options: (i) with probability λ the process continues the SSRP and jumps to any j < i, or, (ii) with probability 1 − λ jumps to any point j < N, following a standard process of sampling without memory. 1 − λ is the noise strength. The process stops when stair 1 is hit. The transition probabilities for $\psi_\lambda$ read,
$$p_\lambda(i|j) = \begin{cases} \lambda\, \dfrac{q(i)}{g(j-1)} + (1-\lambda)\, q(i) & \text{if } i < j, \\ (1-\lambda)\, q(i) & \text{otherwise.} \end{cases}$$
Note that the noise allows moves from j to i, even if i > j.
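Proceeding exactly as before we get
$$\frac{p_\lambda(i+1)}{q(i+1)} \left(1 + \lambda\, \frac{q(i+1)}{g(i)}\right) = \frac{p_\lambda(i)}{q(i)},$$
where $p_\lambda(i)$ depicts the probability to visit state i in a noisy SSRP with parameter λ. As a consequence we obtain:
$$p_\lambda(i) = p_\lambda(1)\, \frac{q(i)}{q(1)} \prod_{1 < j \le i} \left(1 + \lambda\, \frac{q(j)}{g(j-1)}\right)^{-1}.$$
The product term can be safely approximated by
$$\prod_{1 < j \le i} \left(1 + \lambda\, \frac{q(j)}{g(j-1)}\right)^{-1} = \exp\left[-\sum_{1 < j \le i} \log\left(1 + \lambda\, \frac{q(j)}{g(j-1)}\right)\right] \approx \exp\left[-\lambda \int_1^i \frac{dg}{g}\right] = \left(\frac{g(1)}{g(i)}\right)^{\lambda},$$
where we used $q(j) \sim dg/dx|_j$ and log(1 + x) ∼ x for small x, assuming that $x = \lambda\, q(j)/g(j-1) \ll 1$. Finally, since g(1) = q(1), we get
$$p_\lambda(i) = \frac{p_\lambda(1)}{q(1)^{1-\lambda}}\, \frac{q(i)}{g(i)^{\lambda}}, \qquad (16)$$
where $p_\lambda(1)/q(1)^{1-\lambda}$ acts as the normalisation constant. λ plays the role of a scaling exponent. For λ → 1 (no noise), $p_\lambda$ recovers the standard SSRP ψ of equation (1). For λ = 0, we recover the case of standard random sampling, p → q. It is worth noting that continuous SSRPs display the same scaling behaviour (see Appendix A). The particular case of q(i) = 1/N that was studied in [13] shows that λ turns out to be the scaling exponent of the distribution $p_\lambda(i) \sim 1/i^{\lambda}$. Note that these are not frequency distributions but rank distributions. They are related, however. The range of exponents λ ∈ (0, 1] in rank represents the respective range of exponents α ∈ [2, ∞) in frequency, see e.g. [14] and Appendix B. For polynomial priors, $q(i) \sim i^{\alpha}$ (α > −1), one finds
$$p_\lambda(i) \sim \frac{i^{\alpha}}{i^{(\alpha+1)\lambda}} = i^{\alpha(1-\lambda) - \lambda}. \qquad (17)$$
The excellent agreement of these predictions with numerical experiments is shown in figure 2b. Finally, for exponential priors $q(i) \sim e^{\beta i}$ (β > 0) the visiting probability for the noisy SSRP $\psi_\lambda$ becomes $p_\lambda(i) \sim e^{(1-\lambda)\beta i}$, see Tab. I. Clearly, the presence of noise recovers the prior probabilities in a fuzzy way, depending on the noise levels. The following table summarizes the various scenarios for the distribution functions p(i) for the different prior distributions q(i) and noise levels.
Table I (only the column headers survive in this copy): prior | (sub-)logarithmic | polynomial | exponential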
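The recursion for the noisy SSRP can be iterated numerically and compared with the closed form of equation (16). The sketch below is an illustration, assuming the unnormalised choice p(1) = 1; the function names `noisy_ssrp_visits` and `closed_form` are hypothetical. Because the approximation log(1 + x) ∼ x is poor for the lowest stairs, the two results are expected to share the scaling exponent but can differ by a constant prefactor when 0 < λ < 1.

```python
import math

def noisy_ssrp_visits(q, lam):
    """Iterate p(i+1)/q(i+1) * (1 + lam*q(i+1)/g(i)) = p(i)/q(i), starting
    from p(1) = 1 (unnormalised). q is a list of priors for stairs 1..N."""
    p = [1.0]
    ratio = 1.0 / q[0]           # p(i)/q(i) for the current stair
    g = q[0]                     # g(i), cumulative prior up to stair i
    for q_next in q[1:]:
        ratio /= 1.0 + lam * q_next / g
        p.append(ratio * q_next)
        g += q_next
    return p

def closed_form(q, lam):
    """Equation (16) with p(1) = 1: q(i) / (g(i)^lam * q(1)^(1-lam))."""
    out, g = [], 0.0
    for qi in q:
        g += qi
        out.append(qi / g ** lam / q[0] ** (1.0 - lam))
    return out

if __name__ == "__main__":
    N, lam = 10_000, 0.5
    uniform = [1.0 / N] * N
    p = noisy_ssrp_visits(uniform, lam)
    slope = (math.log(p[N - 1]) - math.log(p[999])) / math.log(N / 1000)
    print("measured exponent:", slope, "compare with -lambda =", -lam)
```

For λ = 1 the iteration reproduces q(i)/g(i), and for λ = 0 it returns the prior itself, in line with the two limits stated above.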

III. DIFFUSION ON WEIGHTED, DIRECTED, ACYCLIC GRAPHS
The above results have immediate and remarkable consequences for the diffusion on DAGs [30] or, more generally, on networks with target, sink or absorbing nodes. We call this process targeted diffusion. In particular, the results derived above allow us to understand the origin of Zipf's law of node visiting times for practically all weighted DAGs, regardless of their degree and weight distributions. We first demonstrate this fact with simulation experiments on weighted DAGs and then, in section III B, we analytically derive the corresponding equations of targeted diffusion for the large class of sparse random DAGs, which explain that Zipf's law must occur in node visiting frequencies. In appendix B proofs are given for the cases of exponential and scale free networks.
We start with the observation that SSRPs with uniform priors can be seen as a diffusion process on a fully connected DAG, where nodes correspond one-to-one to the stairs of the above examples. This results in a Zipf's law of node visiting frequencies [13]. However, such fully connected networks are extremely unlikely to occur in reality. To create much more realistic structures, we generate arbitrary random DAGs following e.g. references [30,31]. Start with any undirected connected graph G(V, E), with V the set of nodes, E the set of edges, and P(k) the degree distribution, see figure 3a. Next, label each node in any desired way that allows an ordering, for example with numbers 1, ..., N, see figure 3b. The labelling induces an order that determines the directionality of links in the graph: if nodes i and j are connected, we draw an arrow from i to j, if i > j, or from j to i, if i < j, as seen in figure 3c. We denote the resulting DAG by $G_D(V, E_D)$. The order induced by the labelling mimics the order (or symmetry breaking) that underlies any SSRP. By definition, there exists at least one target node, "1". Noise can be introduced to this DAG construction as follows: if nodes i and j are connected in G and i > j, one can assign an arrow from i to j (as before) with probability λ, or place the arrow in a random direction with probability 1 − λ. This will create cycles that play the role of noise in the targeted diffusion process. This network is no longer a pure DAG since it contains cycles.
Figure 3 (caption, partial): ... [30,31]. Such a graph will have at least one target or sink node; in the depicted case this is node i = 1. A diffusion process on this graph, where random walkers are randomly placed on the graph and follow the arrows at every timestep, is called targeted diffusion with target node i = 1.
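The DAG construction and its noisy variant can be sketched as follows. This is a minimal illustration, not the procedure of references [30,31] in detail; the function names `orient_edges` and `sinks`, the seed handling and the example edge list are hypothetical.

```python
import random

def orient_edges(edges, lam=1.0, seed=0):
    """Orient undirected edges (i, j) between integer-labelled nodes.
    With probability lam the arrow points from the larger to the smaller
    label; otherwise its direction is chosen at random, which can create
    cycles. lam = 1 yields a pure DAG."""
    rng = random.Random(seed)
    directed = []
    for i, j in edges:
        hi, lo = max(i, j), min(i, j)
        if rng.random() < lam:
            src, dst = hi, lo
        else:
            src, dst = (hi, lo) if rng.random() < 0.5 else (lo, hi)
        directed.append((src, dst))
    return directed

def sinks(directed):
    """Nodes without outgoing arrows (target nodes), sorted by label."""
    nodes = {n for edge in directed for n in edge}
    sources = {src for src, _ in directed}
    return sorted(nodes - sources)

if __name__ == "__main__":
    # Hypothetical undirected graph: a ring on four nodes plus one chord.
    undirected = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
    print(orient_edges(undirected, lam=1.0))
    print(sinks(orient_edges(undirected, lam=1.0)))
    print(orient_edges(undirected, lam=0.5, seed=1))
```

With lam = 1.0 every arrow points from the higher to the lower label, so no cycle can exist and node 1 has no outgoing arrow.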

A. Targeted diffusion on specific networks
A diffusion process on $G_D$ is now carried out by placing random walkers on the nodes randomly, and letting them take steps following the arrows in the network. They diffuse according to the weights in the network until they hit a target node and are then removed. We record the number of visits to all nodes and sort them according to the number of visits, obtaining a rank distribution of visits [footnote 1]. We show the results from numerical experiments of $10^7$ random walkers on various DAGs in figure 4. In Figs. 4a and 4b we plot the rank distribution of visits to nodes for weighted Erdős-Rényi (ER) DAG networks. A weight $w_{ik}$ is randomly assigned to each link $e_{ik} \in E$ from a given weight distribution p(w). Weights either follow a Poisson distribution, figure 4a, or a power-law distribution, figure 4b. In both cases Zipf's law is obtained in the rank distribution of node visits. For the same network we introduce noise with λ = 0.5 and carry out the same diffusion experiment. The observed slope corresponds nicely with the predicted value of λ, as shown in figure 4a (red squares) for the Poisson weights.
Footnote 1: Rank ordering is not necessary to see the clear agreement with the theoretical predictions. Almost identical results are seen when we order nodes according to their numerical ordering.
Figure 4 (caption, partial): The weight distribution $w_{ik}$ follows (a) a Poisson distribution with average µ = 6, and (b) a power-law $p(w) \propto w^{-1.5}$ that is shown in the inset. In both cases the predicted Zipf's law is present (black dashed line), even though the networks are small. In (a) the DAG condition is violated (red squares) by assigning random directions to a fraction of 1 − λ links. This allows for the presence of cycles, which play the role of noise in a SSRP. A power law with the exponent λ is observed in the corresponding rank distribution, perfectly in line with the theoretical predictions (dashed black lines). (c) A targeted diffusion experiment on a DAG that is based on the citation network of the HEP ArXiv repository, containing $10^4$ nodes belonging to the $10^4$ most cited papers. (d) The results of the same experiment on an exponential network of the same size. The inset shows the respective degree distributions. Despite the huge topological difference between these two graphs, the rank distribution of visits to nodes is clearly of Zipf's type for almost four decades in both cases.
We computed rank distributions of node visits from diffusion on more general network topologies. In figure 4c we show the rank distribution of node visits where the substrate network is the citation network of High Energy Physics in the ArXiv repository [33,34], and the order is induced by the degree of nodes. Figure 4d shows the rank distribution of node visits from diffusion on an exponential DAG that is generated by non-preferential attachment [35], where the order of nodes is again induced according to the degree. Both networks show Zipf's law in the rank distribution of node visits. This is remarkable since both networks are drastically different in topological terms.
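The diffusion experiment described in this section can be sketched in Python as below. This is a hedged illustration rather than the simulation code behind figure 4: the function names `targeted_diffusion` and `rank_distribution`, the small ER-based example graph, the uniform weights and the number of walkers are all assumptions made for this example. The input is assumed to be acyclic (or at least to allow every walker to reach a sink), otherwise the loop need not terminate.

```python
import random
from collections import defaultdict

def targeted_diffusion(weighted_edges, n_walkers, seed=0):
    """weighted_edges: iterable of (source, target, weight) arrows.
    Walkers start at a uniformly random node, follow outgoing arrows with
    probability proportional to the link weight, and are removed at a node
    without outgoing arrows. Returns the visit count per node."""
    rng = random.Random(seed)
    out = defaultdict(lambda: ([], []))
    nodes = set()
    for s, t, w in weighted_edges:
        out[s][0].append(t)
        out[s][1].append(w)
        nodes.update((s, t))
    out = dict(out)                      # only nodes with outgoing arrows
    nodes = sorted(nodes)
    visits = dict.fromkeys(nodes, 0)
    for _ in range(n_walkers):
        node = rng.choice(nodes)
        visits[node] += 1
        while node in out:
            targets, weights = out[node]
            node = rng.choices(targets, weights=weights)[0]
            visits[node] += 1
    return visits

def rank_distribution(visits):
    """Visit counts sorted in decreasing order (rank 1 first)."""
    return sorted(visits.values(), reverse=True)

if __name__ == "__main__":
    rng = random.Random(42)
    n, r = 200, 0.1                      # hypothetical graph size and link density
    edges = [(i, j, rng.uniform(0.5, 1.5))
             for i in range(2, n + 1) for j in range(1, i) if rng.random() < r]
    ranks = rank_distribution(targeted_diffusion(edges, n_walkers=20_000))
    print(ranks[:10])
```

Plotting the returned rank distribution on double-logarithmic axes is the natural way to compare it with a power law of exponent −1.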

B. Analytical results for targeted diffusion on random DAGs
For diffusion on random DAGs it is possible to obtain analytic results that are identical to equation (1), showing that Zipf's law is generally present in targeted diffusion.
We first focus on the definition of the prior probabilities in the context of diffusion on undirected networks. As stated above, q(i) is the probability that state i is visited in a random sampling process, see Figs. 1b and 1c. In the network context this corresponds to the probability that node i is visited by a random walker. Assume that we have an undirected random graph G(V, E) and that the N nodes are labelled 1, ..., N. The probability that a random walker arrives at node i from a randomly chosen link of E, the network-prior probability of node i, is easily identified as
$$q_G(i) = \frac{k_i}{2|E|}, \qquad (18)$$
where |E| is the number of links in the graph; the factor 2 appears because a link contains 2 endpoints. If $\sigma_G \equiv \{k_1, ..., k_N\}$ denotes the undirected degree sequence, $q_G$ is a simple rescaling of $\sigma_G$, i.e., $q_G = \frac{1}{2|E|} \sigma_G$. Using the same notation as before, the cumulative network-prior probability distribution is $g_G(i) \equiv \sum_{\ell \le i} q_G(\ell)$. From equation (18) and by assuming that in sparse graphs the probability of self-loops vanishes, i.e., $p(e_{ii}) \to 0$, one can compute the probability that a link $e_{ij}$ exists in G [32],
$$p(e_{ij} \in E) = \frac{k(i)\, k(j)}{\sum_{\ell \le N} k(\ell)} = \frac{k(i)\, k(j)}{2|E|} = 2|E|\, q_G(i)\, q_G(j), \qquad (19)$$
where the second step is possible since $\sum_{\ell \le N} k(\ell) = 2|E|$. With this result, the out-degree of the node labelled i in the graph $G_D$ can be approximated by
$$k^{out}(i) \approx \sum_{j < i} p(e_{ij} \in E) = 2|E|\, q_G(i)\, g_G(i-1). \qquad (20)$$
Note that to compute $k^{out}(i)$ we only need to take into account the (undirected) links which connect i to nodes with a lower label j < i, according to the labelling used for the DAG construction outlined above.
We can now compute the probability that a random walker jumps from node i to node j on the DAG $G_D$,
$$p(j|i) = p(j|i, e_{ij} \in E)\; p(e_{ij} \in E). \qquad (21)$$
This is the network analogue of equation (1). Here $p(j|i, e_{ij} \in E)$ is the probability that the random walker jumps from i to j given that i > j and the link $e_{ij}$ exists in G. Clearly, this probability is
$$p(j|i, e_{ij} \in E) = \frac{1}{k^{out}(i)}. \qquad (22)$$
Using Eqs. (19) and (22) in equation (21), together with the out-degree $k^{out}(i) \approx 2|E|\, q_G(i)\, g_G(i-1)$, we get
$$p(j|i) = \begin{cases} \dfrac{q_G(j)}{g_G(i-1)} & \text{if } j < i, \\ 0 & \text{otherwise,} \end{cases} \qquad (23)$$
which has the same form as equation (1). Note that this expression only depends on $q_G$, i.e. the degrees of nodes in the undirected (!) graph G. The solution of equation (23) is obtained in exactly the same way as before for equation (1), and the node visiting probability of targeted diffusion on random DAGs is
$$p(i) = p(1)\, \frac{q_G(i)}{g_G(i)}, \qquad (24)$$
which is the network analog of equation (6). We finally show the results for a DAG that is based on an ER graph. For an ER graph, by definition, the probability for a link to exist is a constant r ∈ (0, 1], and $p(e_{ij} \in E) = r$. Again we label all nodes by 1, ..., N and build a DAG $G_D^{ER}$ as described above. It is not difficult to see that the out-degree of node i is $k^{out}(i) = (i − 1) r$, and, using this directly in equation (21), we get
$$p(j|i) = \frac{r}{(i-1)\, r} = \frac{1}{i-1} \quad \text{for } j < i,$$
which is the standard equation for a SSRP with uniform prior probabilities q [13]. This means that for the ER graph $q_G(i)$ is a constant and $g_G(i) \sim i$. Using this in equation (24), we find that the node visiting probability is exactly Zipf's law, with respect to the ordering used to build the DAG,
$$p(i) = \frac{p(1)}{i} \sim i^{-1}.$$
Note that this result is independent of r and, therefore, of the average degree of the graph.
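For a concrete ER-based DAG the node visiting probabilities can also be computed exactly, without sampling walkers. The sketch below is an illustration under stated assumptions: walkers start at node N (as in the staircase picture) and choose uniformly among outgoing links; the names `er_dag` and `visit_probabilities` are hypothetical, and nodes that would end up without any lower-labelled neighbour are given one extra link so that node 1 is the only sink, a tweak that is not part of the text.

```python
import random

def er_dag(n, r, seed=0):
    """Out-neighbour lists of a DAG built from an ER graph G(n, r): each pair
    i > j is linked with probability r and the arrow points from i to j."""
    rng = random.Random(seed)
    out = {i: [j for j in range(1, i) if rng.random() < r]
           for i in range(1, n + 1)}
    for i in range(2, n + 1):
        if not out[i]:                         # illustrative tweak, see lead-in:
            out[i] = [rng.randint(1, i - 1)]   # keep node 1 as the only sink
    return out

def visit_probabilities(out, n):
    """Probability that a walker started at node n visits each node when it
    follows a uniformly chosen outgoing link at every step."""
    v = [0.0] * (n + 1)
    v[n] = 1.0
    for i in range(n, 1, -1):                  # decreasing labels: topological order
        share = v[i] / len(out[i])
        for j in out[i]:
            v[j] += share
    return v

if __name__ == "__main__":
    n = 2000
    v = visit_probabilities(er_dag(n, r=0.2), n)
    for i in (10, 100, 1000):
        print(i, v[i], "compare with 1/i =", 1 / i)
```

For r = 1 the graph is the complete DAG and the computation returns 1/i for every node below N; for smaller r the values fluctuate around 1/i, most visibly for nodes with few outgoing links.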

IV. DISCUSSION
We have shown that if a system, whose states are characterized by prior probabilities q, is sampled through a SSRP, the corresponding sampling space gets deformed, in a way that Zipf's law emerges as a dominant attractor. This is true for a huge class of reasonable prior probabilities, and might be the fundamental origin of the ubiquitous presence of Zipf's law in nature. On the theoretical side we provide a direct link between non-ergodicity as it typically occurs in path-dependent processes and power laws in corresponding statistics. Formally, SSRPs define a microscopic dynamics that results in a deformation of the phase space. It has been pointed out that the emergence of non-extensive properties may be related to generic deformations of the phase space [36][37][38]. Consequently, SSRPs offer an entirely new playground to connect microscopic and macroscopic dynamics in non-equilibrium systems. Our results could help to understand the astonishing resilience of some scaling patterns which are associated with Zipf's law, such as the recent universality in body-mass scaling found in ecosystems [39].
We discussed one fascinating direct application of this process: the origin of scaling laws in node visit frequencies in targeted diffusion on networks. We demonstrated both theoretically and by simulations that the immense robustness of these scaling laws in targeted diffusion, and Zipf's law in particular, arises generically, regardless of topological details or weight distributions of the network. The corresponding exponents are related to the amount of cycles in a network. This finding should be relevant for a series of applications of targeted diffusion on networks where a target has to be found and reached, such as in traffic, transport or supply chain management. We conjecture that these findings and variations will apply for search processes in general.

Noiseless continuous SSRPs
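With the example of the staircase in mind, we can describe a SSRP ψ over a continuous sampling space, see figure (5). Let Ω = (0, N] be the sampling interval, q a probability density on Ω, and $\Omega_x$ the part of Ω below x. We start in the extreme of the interval, x = N, and we choose any point of Ω following the probability density q. Suppose we land in x < N. Then, at time t = 1 we choose at random some point $x' \in \Omega_x$ following a probability density proportional to q. We run the process until a point $z \in \Omega_1$ is reached. Then the process stops. The SSRP ψ can be described by the transition probabilities between the elements x, y ∈ Ω such that y > 1 as follows,
$$p(x|y) = \begin{cases} \dfrac{q(x)}{g(y)} & \text{if } x < y, \\ 0 & \text{otherwise,} \end{cases} \qquad (A2)$$
where g(y) is the cumulative density distribution evaluated at point y,
$$g(y) = \int_0^y q(x)\, dx. \qquad (A3)$$
We are interested in the probability density p which governs the frequency of visits along Ω after the sampling process ψ. To this end, we start with the following self-consistent relation for p,
$$p(x) = \int_x^N p(x|y)\, p(y)\, dy. \qquad (A4)$$
Recall that the integration limits from x to N represent the fact that a particular state x can only be reached from a state y > x. By differentiating this integral equation we obtain:
$$\frac{dp}{dx} = \frac{d}{dx} \int_x^N p(x|y)\, p(y)\, dy. \qquad (A5)$$
In agreement with equation (A2), p(x|y) = q(x)/g(y) if y > 1 and y > x. Equation (A5) can be expanded using the Leibniz rule:
$$\frac{dp}{dx} = -\frac{q(x)}{g(x)}\, p(x) + \frac{q'(x)}{q(x)} \int_x^N \frac{q(x)}{g(y)}\, p(y)\, dy. \qquad (A6)$$
This leads to a differential equation governing the dynamics of SSRPs under arbitrary prior probabilities q,
$$\frac{dp}{dx} = \left[\frac{q'(x)}{q(x)} - \frac{q(x)}{g(x)}\right] p(x). \qquad (A7)$$
The above equation can be easily integrated in the interval (1, N]. Observing that equation (A7) can be rewritten as
$$\frac{d}{dx} \log p(x) = \frac{d}{dx} \log \frac{q(x)}{g(x)}, \qquad (A8)$$
one finds:
$$\log p(x) = \log \frac{q(x)}{g(x)} + \kappa, \qquad (A9)$$
κ being an integration constant to be determined by normalisation. The above equation has as a general solution for points x ∈ (1, N]
$$p(x) = \frac{1}{Z}\, \frac{q(x)}{g(x)}, \qquad (A10)$$
where Z is the normalisation constant
$$Z = \int_1^N \frac{q(x)}{g(x)}\, dx. \qquad (A11)$$
This demonstrates how the prior probabilities q are deformed when sampled through the SSRP ψ in the region x ∈ (1, N]. This is the analogue of equation (6) of the main text.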
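As a worked example of the continuous result, consider a uniform prior density on Ω = (0, N]. The following short calculation is an illustration added here, assuming that p is normalised over the region x ∈ (1, N]; it recovers Zipf's law in its continuous form.

```latex
% Uniform prior density on (0, N] and its cumulative distribution
q(x) = \frac{1}{N}, \qquad g(x) = \int_0^x q(x')\,dx' = \frac{x}{N}
% The visiting density is proportional to q/g
\frac{q(x)}{g(x)} = \frac{1}{x}
% Normalisation over (1, N]
Z = \int_1^N \frac{dx}{x} = \ln N
% Result: a continuous Zipf law
p(x) = \frac{1}{x \ln N}, \qquad x \in (1, N]
```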

Continuous SSRPs with noise
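Suppose the interval Ω = (0, N] and let us define a probability density q on Ω as in equation (A1). The noisy SSRP $\psi_\lambda$ starts at x = N and jumps to any point x ∈ Ω, according to the prior probabilities q. From x the system has two options: (i) with probability λ the process jumps to any $x' \in \Omega_x$, i.e., $\psi_\lambda$ continues the SSRP we described above or, (ii) with probability 1 − λ, $\psi_\lambda$ jumps to any point x′ ∈ Ω, following a standard sampling process. The process stops when it jumps to a member of the sink set, namely to an x ≤ 1. The transition probabilities now read (∀y > 1),
$$p_\lambda(x|y) = \begin{cases} \lambda\, \dfrac{q(x)}{g(y)} + (1-\lambda)\, q(x) & \text{if } x < y, \\ (1-\lambda)\, q(x) & \text{otherwise.} \end{cases} \qquad (A12)$$
Note that the noise enables the process to move from y to x, even if x > y. As we did in equation (A4), we can find a consistency relation for the probability density $p_\lambda$ of visiting a given point of Ω along a noisy SSRP,
$$p_\lambda(x) = (1-\lambda)\, q(x) \int_1^N p_\lambda(y)\, dy + \lambda\, q(x) \int_x^N \frac{p_\lambda(y)}{g(y)}\, dy. \qquad (A13)$$
If we take the derivative
$$\frac{dp_\lambda}{dx} = \frac{d}{dx} \left[(1-\lambda)\, q(x) \int_1^N p_\lambda(y)\, dy + \lambda\, q(x) \int_x^N \frac{p_\lambda(y)}{g(y)}\, dy\right]$$
$$= (1-\lambda)\, q'(x) \int_1^N p_\lambda(y)\, dy + \lambda\, q'(x) \int_x^N \frac{p_\lambda(y)}{g(y)}\, dy - \lambda\, \frac{q(x)}{g(x)}\, p_\lambda(x)$$
$$= \frac{q'(x)}{q(x)} \left[(1-\lambda)\, q(x) \int_1^N p_\lambda(y)\, dy + \lambda\, q(x) \int_x^N \frac{p_\lambda(y)}{g(y)}\, dy\right] - \lambda\, \frac{q(x)}{g(x)}\, p_\lambda(x)$$
$$= \frac{q'(x)}{q(x)}\, p_\lambda(x) - \lambda\, \frac{q(x)}{g(x)}\, p_\lambda(x),$$
where the fourth step is performed taking the definition of $p_\lambda(x)$ given in equation (A13). We therefore have the following differential equation for $p_\lambda(x)$,
$$\frac{dp_\lambda}{dx} = \left[\frac{q'(x)}{q(x)} - \lambda\, \frac{q(x)}{g(x)}\right] p_\lambda(x),$$
which can be rewritten as
$$\frac{d}{dx} \log p_\lambda(x) = \frac{d}{dx} \log \frac{q(x)}{g(x)^{\lambda}}.$$
Integrating it over all x ∈ (1, N], we obtain
$$p_\lambda(x) = \frac{1}{Z_\lambda}\, \frac{q(x)}{g(x)^{\lambda}},$$
which again demonstrates how the noisy SSRP deforms the underlying prior probabilities q, $Z_\lambda$ being the normalisation constant. Interestingly, if λ < 1, i.e., if we consider a noisy SSRP, λ has the role of a scaling exponent. We observe that we recover the standard SSRP ψ described above in equation (A2) if λ → 1 (no noise) and the Bernoulli process following the prior probabilities q if we have total noise, as expected. The results for the continuous SSRPs are similar to the discrete case; compare equation (A15) and equation (16).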

Appendix B: Targeted diffusion on networks with different topologies
In the following we find the mapping between the degree distribution P (k) and the undirected ordered degree sequence. Once we know the degree sequence, we can compute the network prior probabilities q G thanks to equation (18). Then, we apply directly equation (24), which gives us the general form of statistics of node visits for targeted diffusion.
Without any loss of generality we assume that there is a labelling of the nodes of the graph G, such that the undirected degree sequence $\sigma_G$, given by
$$\sigma_G = \{k_1, k_2, ..., k_N\}, \qquad (B1)$$
is ordered, meaning that
$$k_1 \ge k_2 \ge ... \ge k_N. \qquad (B2)$$
In the following we will assume that the degree distribution P(k) is known and that we want to infer the formal shape of $\sigma_G$, if any. In general, a formal mapping from P(k) to $\sigma_G$ is hard or even impossible to find. However, it can be approximated. Let us assume that there exists a function $f(i) = k_i$ that gives the degree of the i-th node of the ordered degree sequence of the undirected graph G. Suppose, for the sake of notational simplicity, that $k_i = k$. Clearly, $f^{-1}(k) = i$. From this we infer that there are approximately i − 1 nodes whose degree is higher than k. The probability of finding a randomly chosen node whose degree is higher than k, $P_{<}(k)$, is $P_{<}(k) = \sum_{k' > k} P(k')$. The number of nodes with degree larger than k will thus be approximated by $N P_{<}(k)$.
Under the assumption that the number of nodes is large one can argue that
$$f^{-1}(k) \sim N P_{<}(k). \qquad (B3)$$
The identification of f from the knowledge of P(k) provides the functional shape of the ordered degree sequence and, consequently, the network-prior probability distribution.
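Exponential networks: Exponential networks have a degree distribution given by
$$P(k) \sim e^{-\chi k},$$
with χ > 0. The direct application of equation (B3) reads
$$f^{-1}(k) = i \sim N e^{-\chi k},$$
leading to
$$k_i = f(i) \sim \frac{1}{\chi} \log \frac{N}{i}.$$
Since we assumed that $k_i = f(i)$, and knowing, from equation (18), that $q(i) = k_i/2|E|$, the network-prior probabilities for exponential networks, $q_{exp}$, are given by
$$q_{exp}(i) \sim \frac{1}{2|E|\chi} \log \frac{N}{i}.$$
For large graphs we can approximate $g_G(i)$ by
$$g_G(i) \approx \frac{1}{2|E|\chi} \int_1^i \log \frac{N}{x}\, dx \sim \frac{i}{2|E|\chi} \left(1 + \log \frac{N}{i}\right),$$
and equation (24) asymptotically becomes
$$p(i) \sim \frac{1}{i}\, \frac{\log (N/i)}{1 + \log (N/i)} \sim i^{-1}.$$
Targeted diffusion on exponential DAG networks therefore leads to Zipf's law in node visiting frequencies.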
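Scale-free networks: Scale-free networks have a degree distribution $P(k) \sim k^{-\alpha}$. For α > 2, which is the most common case, one has
$$P_{<}(k) \sim k^{1-\alpha},$$
which implies
$$k_i = f(i) \sim \left(\frac{i}{N}\right)^{-\beta},$$
with $-\beta = (1 - \alpha)^{-1}$. Therefore, the network-prior probabilities for scale-free networks, $q_{SF}$, are given by
$$q_{SF}(i) \sim i^{-\beta}.$$
As a consequence the cumulative network-prior distribution, $g_{SF}$, is (approximating the sum with an integral)
$$g_{SF}(i) \sim \int_1^i x^{-\beta}\, dx \sim \frac{i^{1-\beta}}{1-\beta}.$$
Using equation (24), this leads to
$$p(i) \sim \frac{i^{-\beta}}{i^{1-\beta}} = i^{-1}.$$
Again Zipf's law appears in the node visiting probabilities.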
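The two mappings from degree distribution to ordered degree sequence can be checked numerically by inserting the approximate degree sequences into equation (24). The sketch below is illustrative only; the function names and the parameter values (N = 10^6, χ = 1, α = 3) are assumptions made for this example. The factor 1/(2|E|) cancels in the ratio q_G(i)/g_G(i), so unnormalised degrees are used.

```python
import math

def visits_from_degrees(degrees):
    """q_G(i) / g_G(i) for an ordered degree sequence (unnormalised)."""
    out, g = [], 0.0
    for k in degrees:
        g += k
        out.append(k / g)
    return out

def exponential_degrees(n_nodes, chi, i_max):
    """k_i ~ log(N/i) / chi for the first i_max nodes of the ordered sequence."""
    return [math.log(n_nodes / i) / chi for i in range(1, i_max + 1)]

def scale_free_degrees(n_nodes, alpha, i_max):
    """k_i ~ (i/N)^(-beta) with beta = 1 / (alpha - 1)."""
    beta = 1.0 / (alpha - 1.0)
    return [(i / n_nodes) ** -beta for i in range(1, i_max + 1)]

def slope(p, i, j):
    """Log-log slope of p between nodes i and j (1-based labels)."""
    return (math.log(p[j - 1]) - math.log(p[i - 1])) / math.log(j / i)

if __name__ == "__main__":
    n_nodes, i_max = 10**6, 10_000
    p_exp = visits_from_degrees(exponential_degrees(n_nodes, 1.0, i_max))
    p_sf = visits_from_degrees(scale_free_degrees(n_nodes, 3.0, i_max))
    print("exponential network slope:", slope(p_exp, 100, i_max))
    print("scale-free network slope:", slope(p_sf, 100, i_max))
```

Both slopes should lie close to −1 for labels well below N; the logarithmic correction in the exponential case makes the approach to −1 slower as i approaches N.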