The tails of rank-size distributions due to multiplicative processes: from power laws to stretched exponentials and beta-like functions

Although power laws have been used to fit rank distributions in many different contexts, they usually fail at the tail. Here we show that many different data in rank laws, like in granular materials, codons, author impact in scientific journals, etc are very well fitted by a β-like function ({a, b} distribution). Since this distribution is indeed ubiquitous, it is reasonable to associate it with some kind of general mechanism. In particular, we have found that the macrostates of the product of discrete probability distributions imply stretched exponential-like frequency-rank functions, which qualitatively and quantitatively can be fitted with the {a,b} distribution in the limit of many random variables. We show this by transforming the problem into an algebraic one: finding the rank of successive products of a given set of numbers.

ranks [8]. Such deviation is also found in many physical systems [9] and is known as the tail of the distribution [10]. As a matter of fact, when the highest rank of the data is finite, the power law is cut and finite size effects should be present. Usually, for each case a different ad hoc fitting function is proposed [3]. Another path is to construct a rank-size distribution from the cumulative distribution [10], by which method others have fitted the probability distributions with stretched exponential [9] and log-normal distributions [11]. However, for low ranks deviations are also observed, and unfortunately all of the previous expressions do not fit the data at both ending tails, at which different kinds of processes are set in once a crossover region is reached. Thus, multiscaling physical modeling seems to be a key issue as in turbulence, where Kolmogorov's power law is observed only in the inertial regimen. In one tail (small length scales) energy dissipation plays the main role, while energy injection dominates at big scales. One can conjecture that similar ideas are behind many other complex physical systems, since we report that many rank laws are extremely well parametrized, with a two exponent β function-like formula with parameters {a, b}, where a and b are fitted from the data, r is the rank and R is the maximal We will show that f (r ) is related to a kind of central limit theorem. In fact, Moyano et al [12] have commented that the rather ubiquitous presence of the Tsallis q-distributions is maybe due to a q-generalized central limit theorem for a class of non independent, correlated, product of probability distributions [13].
As an example of the phenomenology that we have found in rank laws, here we present three representative results. Figure 1 shows a semilog plot of the impact factor against the rank of scientific journals, taken from a recent study [14], compared with the fits given by equation (1). The fits are excellent, all with correlation coefficients above 0.98. Notice that we use a semilog plot instead of the usual log-log plot representation due to the fact that in such a way, the tails are easily observed since the ranks are treated in a linear fashion. A usual plot in the log-log representation will also reveal a significant deviation from the power law at the tails, as can be observed in several works [1]- [3].
Similar excellent fits are also obtained for codon usage in genomes, as shown in figure 2, where we plot the logarithm of the frequency of codons (normalized to 1000) as a function of the rank for different representative organisms, taken from a well known genome database [15]. Using equation (1), we have made similar plots for at least 10 organisms with a correlation parameter bigger than 0.99.
In figure 3, we show the rank-ordered distribution of stick-slip events in a slowly sheared granular media taken from [3], fitted using equation (1). Although a modified power law was proposed in [3] to explain the results, the present fit gives a better correlation coefficient. We have verified that equation (1) can be used with excellent results in order to correct the Gutenberg-Richter law, Bénard convection cells and in many different fields, like architecture, population, music or roads [16].
As the {a, b} distribution is indeed ubiquitous, it is reasonable to try to associate it with the product of correlated probability distributions [12]. We have not found such a class; however, here we will show that the product of discrete probability distributions imply stretched exponential-like frequency-rank functions, that qualitatively and quantitatively can be fitted very well with the {a, b} distribution.
In the dynamics of scientific journal impact factor there are many important issues: the ability to select a good problem for investigation, the gift for writing clear papers, etc. Similar comments would be valid for the dynamics of granular media. Perhaps, the presence of all these factors implies products of probabilities which obey the conditions of the hypothetical central limit theorem for the product of correlated probability distributions [12,17]. To be concrete, let us proceed in the same spirit of the central limit theorem, in which a given observable is just the result of many different random processes. Each realization of an observable, is determined by Rank-ordered distribution of stick-slip events in a slowly sheared granular media. Circles are data taken from [3], and the solid line is a fit using equation (1), with a = 1.08 and b = 0.40.
the actual values taken by the random variables in the involved random processes. For example, the distribution of heights in a population is determined by genetics, aliments, health care, etc, but the height of a particular person is just a realization with certain values of the random variables in each related processes. A similar thing happens in multiplicative processes, an observable is built from realizations made in each process. As in the case of the central limit theorem, each process has a different probability distribution. However, when many random variables are considered, only the first moments of these distributions turn out to be important, and as a consequence, a Gaussian appears with or without different probability distribution functions. Thus, in order to simplify the problem, here we will only consider the case of N processes that are identical, where each of them can have s different states with probability p j with j = 1, . . . , s. When N such processes are composed, the full state space may be considered to consist of all s N possible strings of length N , and there are s N possible states of the whole system. One can reduce the probability of each of these states to just (N + s − 1)!/(s − 1)!(N )! different values that we call the reduced probabilities x N (n 1 , n 2 , . . . , n s ). The multiplicity of the states is given by a multinomial coefficient N !/(n 1 !n 2 !n 3 ! . . . n s !), where n j is the number of subsystems in the state j. The probability of an observable for the whole system is, with n 1 + n 2 + n 3 + · · · + n s = N . However, we are interested in the rank of the different values of the resulting observable, not in the probability distribution. To tackle this problem, we notice that each different value of x N (n 1 , n 2 , . . . , n s ) corresponds to a different observable, since if one assumes that a certain observable (X ) is a one to one function of n 1 , n 2 , . . . , n s , then each value of X (n 1 , n 2 , . . . , n s ) can be mapped to x N (n 1 , n 2 , . . . , n s ) and X (n 1 , n 2 , . . . , n s ) = X (x N (n 1 , n 2 , . . . , n s )). From the previous considerations, it is clear that any rank hierarchy of x N (n 1 , n 2 , . . . , n s ) will be inherited to X (n 1 , n 2 , . . . , n s ) in most physical cases, where one can suppose that X (x N (n 1 , n 2 , . . . , n s )) can be expressed as a power series in x N (n 1 , n 2 , . . . , n s ), X (x N (n 1 , n 2 , . . . , n s )) = X 0 + X 1 x N (n 1 , n 2 , . . . , n s ) + · · · , where X 0 and X 1 are constants. Up to first order, this assumption means that X is proportional to x N (n 1 , n 2 , . . . , n s ). The rank features are thus reduced to study the hierarchy present in x N (n 1 , n 2 , . . . , n s ). In the general case of interacting processes, the addition of a new one leads to a relationship of the type x N +1 (n 1 , n 2 , . . . , n s ) = f (x N (n 1 , n 2 , . . . , n s )), while for independent processes, x N (n 1 , n 2 , . . . , n s ) = p n 1 1 p n 2 2 p n 3 3 . . . p n s s .
For the last case, the rank structure can be reduced to the following algebraic problem: take s numbers p 1 , p 2 , . . . , p s at random (normalization can be imposed at the end of the process), labeled in such a way that p 1 > p 2 > · · · > p s , and multiply each number by all the other ones. With these resulting numbers, repeat the process N times, to obtain a set of numbers that have the form p n 1 1 p n 2 2 p n 3 3 . . . p n s s , with the restriction n 1 + n 2 + · · · + n s = N . If the resulting set is arranged in decreasing order, we can assign a rank (r ) to each one according to its order in the hierarchy. The rank r = 1 is assigned to the number p N 1 , while the lowest r = R corresponds to p N s . For example, choose three random numbers p 1 , p 2 and p 3 and then form all of the possible products: p 2 1 , p 1 p 2 , p 1 p 3 , p 2 2 , p 2 p 3 , p 2 3 , then repeat the procedure. In figure 4 we present a plot of logx N (n 1 , n 2 , n 3 ) as a function of r for N = 40. Surprisingly, as shown in the figure, the resulting ranks are well fitted by the same two parameter β-like function. The message from this numerical experiment is simple: if this product is seen as a multiplicative process where each number is the probability of making a certain choice in during the process, the result has a well determined hierarchy. 6 The problem that remains is how to calculate x N (n 1 , n 2 , . . . , n s ) in terms of the rank. One can think of each set of values (n 1 , n 2 , . . . , n s ) as coordinates in an s-dimensional lattice, that live on a subspace of dimension [18] s − 1, and the rank is a parametrization of a path between the lattice points in such a way that ln x N (n 1 , n 2 , . . . , n s ) decreases in each step, ln x N (r ) = n 1 (r ) ln p 1 + n 2 (r ) ln p 2 + . . . + n s (r ) ln p s , where the starting point is always (N , 0, . . . , 0) and the end is at (0, 0, . . . , N ). For s = 2, the solution is easy to find. Using that n 1 + n 2 = N , it follows that, with C = |ln( p 2 / p 1 )|. Equation (6) shows that the numbers decay in a pure exponential way. The case s = 3 can be easily visualized as a trajectory in a triangle. The solution for any set p 1 , p 2 , p 3 is complicated, since the paths are usually complex. However, one can work out the cases p 1 ∼ p 2 p 3 and p 1 p 2 ∼ p 3 ; these cases provide the clue to solve others. Consider the limit p 1 ∼ p 2 p 3 , and δ 2 21 δ 31 , where we define δi j ≡ p i / p j . This rank sequence is similar to an odometer with an increased range after each turn due to the hierarchy 1 > δ 21 > δ 2 21 > δ 31 > δ 21 δ 31 > δ 2 31 > · · · > δ N 31 . For example, when N = 2 this leads to the following table that contains the number x N (r ) as a function of the rank and the corresponding path, x N (r ) n 2 n 3 r n 2M (r ) The sequence of the path goes as follows, first n 2 (r ) is increased one by one as n 3 (r ) remains constant, until it reaches a maximal value called n 2M (r ) which in fact determines the basic shape of the curve x N (r ), since δ 31 only produces small jumps (see figure 4). Once n 2 (r ) increases from zero to n 2M (r ), a new cycle begins with n 2 (r ) = 0 and n 3 (r + 1) = n 3 (r ) + 1. As a result, after a large number of steps, where R is the maximal rank. Then, by solving the resulting quadratic equation and since 1 N R, we get that n 2M (r ) ≈ N (R − r + 1) 1/2 . The corresponding value of n 3 (r ) can be obtained from the condition n 2 (r ) + n 3 (r ) N . Finally, the number as a function of the rank is given by, Furthermore, equation (8) can be written as an stretched exponential as follows, with D = N |ln( p 2 / p 3 )|. This curve shows an excellent agreement with the numerical results (see figure 4). Notice in figure 4 how this formula works better as the rank approaches R. The case p 1 p 2 ∼ p 3 can be tackled in a similar way. The result is, The comparison with the numerical results is also excellent (see figure 4).
In the general case, when p 1 , p 2 and p 3 have the same order of magnitude, as for example in figure 4, there are two tails for r → 1 and r → R. The tail at low r is basically produced by the hierarchy in the biggest probabilities, i.e. by numbers where n 1 ∼ N in which equation (10) gives the upward curvature. In a similar way, the tail for r near R is produced by the lowest probability hierarchy, n 3 ∼ N , controlled basically by equation (9). The main effect upon these tails when p 1 ∼ p 2 ∼ p 3 is that the sequence of ordering is not uniform [18], and there is a change in the exponent 1/2 that appears in the stretched exponential. Equation (10) is thus transformed into, where α < 1/2 and E s is a constant that depends on s. In a similar way, the exponent 1/2 in equation (9) is replaced by an exponent β < 1/2. Furthermore, these generic exponents for the tails also appear for s > 3 since from the polynomial equivalent to equation (7) one gets β ≈ 1/(s − 1). These changes are the result of the increasing number of cycles in the 'odometer' that we have discussed, as is also clear from the change in the exponents that transform from 1 to 1/2 as s goes from s = 2 to s = 3. A simple procedure to combine each of the tails represented by equations (9) and (10) is obtained by making the observation that only one of the exponentials dominates at a given tail, while the other tends toward a constant, i.e. the logarithm derivative of equation (10), is nearly zero if r → R for R 1. Analyzing the limit r → R gives a similar result for equation (9). From these considerations, a simple way to produce a function with the required tails at both ends when r → R and r → 1 is the following, where C 1 is a constant and D s is another constant that depends on s. Finally, equation (13) can be simplified when many processes are present, since s 1 and as a consequence α → 0 and β → 0. For this limit, in the logarithm derivative equation (12) one can neglect α with respect to 1 and ln x N (r ) ≈ −α E s ln(r/R). A similar procedure can be done in the tail r → 1, in which β is neglected with respect to 1. Combining both tails in a sole expression, we get the fitting function used in this work: and K can be obtained by self-consistency. The previous law is the limiting form of two stretched exponentials when the number of states of processes involved is big, as can be confirmed by comparing the cases with 3 and 4 numbers in figure 4. In conclusion, we have shown a simple formula that allows to fit many different rank phenomena which arises as a limiting case for products of random variables. A task that remains is how to get the coefficients a and b from physical principles, using for example master equations and the concept of multiscaling modeling. A key observation for such a study is that for expansion-modification algorithms in DNA models, a > b if the expansion probability of the genetic code is bigger than the mutation rate [19]. Thus, a and b represent the relative influence of two general mechanisms, where each of them dominate at a given tail. According to some preliminary results, a seems to be related to a certain funnel type of energy landscape, as in protein folding, which leads to a certain deterministic sequence, while b is associated with a many valley landscape, as seen in spin glasses. This last opposite effect provides much more variability in the sequence of results. Such correlation is consistent with associating b to the stochastic component of the dynamics and a with the most deterministic features [19]. In a forthcoming paper, we will analyze specific trends in a and b for different classes or systems.