Classification of complex systems by their sample-space scaling exponents

The nature of statistics, statistical mechanics and consequently the thermodynamics of stochastic systems is largely determined by how the number of states $W(N)$ depends on the size $N$ of the system. Here we propose a scaling expansion of the phasespace volume $W(N)$ of a stochastic system. The corresponding expansion coefficients (exponents) define the universality class the system belongs to. Systems within the same universality class share the same statistics and thermodynamics. For sub-exponentially growing systems such expansions have been shown to exist. By using the scaling expansion this classification can be extended to all stochastic systems, including correlated, constrained and super-exponential systems. The extensive entropy of these systems can be easily expressed in terms of these scaling exponents. Systems with super-exponential phasespace growth include important examples, such as magnetic coins that combine combinatorial and structural statistics. We discuss other applications in the statistics of networks, aging, and cascading random walks.


Introduction
Classical statistical physics typically deals with large systems composed of weakly interacting components, which can be decomposed into (practically) independent subsystems. The phasespace volume $W$, or the number of states of such systems, grows exponentially with system size $N$. For example, the number of configurations in a spin system of $N$ independent spins is $W(N) = 2^N$. For more complicated systems, however, where particles interact strongly, which are path-dependent, or whose configurations become constrained, exponential phasespace growth no longer occurs, and things become more interesting. For example, in black holes the accessible number of states does not scale with the volume but with the surface, which leads to non-standard entropies and thermodynamics [1,2,3]. A version of entropy that depends on the surface and the volume was recently suggested in [4].
Other examples include systems with interactions on networks, path-dependent processes, co-evolving systems, and many driven non-equilibrium systems. These systems are often non-ergodic and are referred to as complex systems. For these systems, in general, the classical statistical description based on Boltzmann-Gibbs statistical mechanics fails to make correct predictions with respect to the thermodynamic, the information-theoretic, or the maximum-entropy-related aspects [5]. Often the underlying statistics is then dominated by fat-tailed distributions, and power laws in particular. There have been considerable efforts to understand the origin of power-law statistics in complex systems. Some progress was made for systems with sub-exponentially growing phasespace. It was shown that systems whose phasespace grows as a power law, $W(N) \sim N^b$, are tightly related to so-called Tsallis statistics [6].
The tremendous variety and richness of complex systems has led to the question whether it is possible to classify them in terms of their statistical behavior. Given such a classification, is it possible to arrive at a generalized concept of the statistical physics of complex systems, or do we have to establish the statistical physics framework for every particular system independently? For sub-exponentially growing systems such a classification was attempted by characterizing stochastic systems in terms of two scaling exponents of their extensive entropy [7]. The first scaling exponent is recovered from the relation $S(\lambda W)/S(W) \sim \lambda^c$, which is valid if the first three Shannon-Khinchin axioms (see supplementary material) are valid (the fourth, the composition axiom, can be violated), and if the entropy is of so-called trace form, which means that it can be expressed as $S = \sum_{i=1}^{W} g(p_i)$, where $p_i$ is the probability for state $i$, and $g$ is some function. The second scaling exponent $d$ is obtained from a scaling relation that involves the re-scaling of the number of states $W \to W^a$. With these two scaling exponents $c$ and $d$ it becomes possible to classify sub-exponentially growing systems that fulfil the first three Shannon-Khinchin axioms [7]. Further, the exponents $c$ and $d$ characterize the extensive entropy, $S_{c,d} \sim \sum_{i=1}^{W} \Gamma(1 + d, 1 - c \log p_i)$. Practically all entropies that were suggested within the past three decades are special cases of this $(c, d)$-entropy, including Boltzmann-Gibbs-Shannon entropy ($c = 1$, $d = 1$), Tsallis entropy ($d = 0$), Kaniadakis entropy ($c = 1$, $d = 1$) [8], Anteneodo-Plastino entropy ($c = 1$, $d > 0$) [9], and all others that fulfil the first three Shannon-Khinchin axioms. In [10] it was then shown that the exponents $c$ and $d$ are tightly related to the phasespace growth of the underlying systems. In fact, they can be derived from the knowledge of $W(N)$: $1/(1 - c) = \lim_{N \to \infty} N W'(N)/W(N)$, with an analogous limit expression for $d$ (see [10]). For super-exponential systems such a classification is hitherto missing.
These systems include important examples of stochastic complex systems that form new states as a result of the interactions of elements. These are systems that, besides their combinatorial number of states (e.g. exponential), form additional states that emerge as structures from the components. The total number of states then grows super-exponentially with respect to system size, e.g. the number of elements. Stochastic systems with elements that can occupy several states (more than one) and that can form structures with other elements are generally super-exponential systems. It was pointed out in [11] that such systems might exhibit non-trivial thermodynamical properties.
An example of such systems is magnetic coins of the following kind. Imagine a set of $N$ coins that come in two states, up and down. There are $2^N$ states. However, these coins are "magnetic", and any two of them can stick to each other, forming a new bond state (neither up nor down). If there are $N = 2$ coins, there are five states: the usual four states, uu, ud, du, dd, and a fifth state 'bond'. If there are $N = 3$ coins, there are 14 states: the $2^3$ combinatorial states, and six states involving bond states. State 9 is a bond between coins 1 and 2 with the third coin up, state 10 is the same bond with the third coin down, state 11 is a bond between coins 1 and 3 with the second coin up, state 12 is the same bond with the second coin down, state 13 is a bond between coins 2 and 3 with the first coin up, and finally, state 14 is the bond between coins 2 and 3 with the first coin down. It can be easily shown that the recursive formula for the number of states is $W(N + 1) = 2W(N) + N W(N - 1)$, which, for large $N$, grows as $W(N) \sim N^{N/2} e^{2\sqrt{N}}$, see [11].
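The counting in this example can be checked numerically. The following is a minimal sketch, not taken from [11]: it iterates the recursion and compares it with a direct count over all ways of pairing up coins. The function names and the initial value $W(0) = 1$ (an empty system with a single state) are assumptions made for illustration; $W(0) = 1$ is the value needed for the recursion to reproduce the five states at $N = 2$.

```python
from math import comb


def coins_recursive(n_max):
    """Return [W(0), ..., W(n_max)] from W(N+1) = 2 W(N) + N W(N-1)."""
    # Assumed initial values: W(0) = 1 (empty system), W(1) = 2 (up or down).
    w = [1, 2]
    for n in range(1, n_max):
        w.append(2 * w[n] + n * w[n - 1])
    return w[: n_max + 1]


def coins_direct(n):
    """Count states directly: pick 2k coins, pair them into k bonds,
    and let each of the remaining n - 2k coins be up or down."""
    total = 0
    pairings = 1  # (2k - 1)!!, the number of ways to pair up 2k coins
    for k in range(n // 2 + 1):
        if k > 0:
            pairings *= 2 * k - 1
        total += comb(n, 2 * k) * pairings * 2 ** (n - 2 * k)
    return total


if __name__ == "__main__":
    print(coins_recursive(4))
    print([coins_direct(n) for n in range(5)])
```

Both routes give 5 states for two coins and 14 states for three coins, in agreement with the enumeration above.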
In this paper we show that it is indeed possible to find a complete classification of complex stochastic systems, including the super-exponential case. By expanding a generic phasespace volume $W(N)$ in a Poincaré expansion, we will see that for any possibility of phasespace growth, there exists a sequence of unique expansion coefficients that are nothing but scaling exponents that describe systems in their large size limit. The set of scaling exponents gives us the full classification of complex systems in the sense that two systems belong to the same universality class if it is possible to rescale one into the other with exactly these exponents. The framework presented here has been proposed in [12] and generalizes the classification approach of [7,10]. It includes the sub-exponential systems as a special case. We show further that these exponents can be used straightforwardly to express, with a few additional requirements, the corresponding extensive entropy, which is the basis for the thermodynamic properties of the system. Finally, we see in several examples that many systems are fully characterized by very few exponents. Technical details and auxiliary results are presented in the supplementary material. We refer to the supplementary material in the corresponding parts of the main text. However, readers may also go through the supplementary material before they continue reading. We use the following notation for applying a function $f$ for $n$ times: $f^{(n)}(x) = \underbrace{f(\dots(f(x))\dots)}_{n\ \text{times}}$.

Rescaling phasespace
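Suppose that phasespace volume depends on system size $N$ (e.g. number of elements) as $W(N)$. We use the Poincaré asymptotic expansion for the $(l+1)$-th logarithm of $W$,
$$\log^{(l+1)} W(N) = \sum_{j=0}^{n} c^{(l)}_j \phi_j(N) + \mathcal{O}(\phi_n(N)), \qquad (1)$$
where $\phi_j(N) = \log^{(j+1)}(N)$ for $N \to \infty$. A uniqueness theorem (see e.g. [13]) states that the asymptotic expansion exists and is uniquely determined for any $W(N)$ for which $\log^{(l+1)} W(N) = \mathcal{O}(\phi_0(N))$, see supplementary material. To see how the exponents $c^{(l)}_j$ correspond to scaling exponents, let us define a sequence of re-scaling operations,
$$r^{(n)}_\lambda(x) = \exp^{(n)}\left(\lambda \log^{(n)}(x)\right).$$
For example, $r^{(0)}_\lambda(x) = \lambda x$, $r^{(1)}_\lambda(x) = x^\lambda$, and $r^{(2)}_\lambda(x) = e^{(\log x)^\lambda}$. The scaling operations obey the composition rule
$$r^{(n)}_\lambda\left(r^{(n)}_{\lambda'}(x)\right) = r^{(n)}_{\lambda\lambda'}(x).$$
We can now investigate the scaling behavior of the phasespace volume in the thermodynamic limit, $N \gg 1$. The leading order of the scaling is given by the first rescaling $r^{(0)}_\lambda$. We show in the supplementary material that the rescaling of phasespace is asymptotically described by
$$W(\lambda N) \sim r^{(l)}_{\lambda^{c^{(l)}_0}}\left(W(N)\right) = \exp^{(l)}\left(\lambda^{c^{(l)}_0} \log^{(l)} W(N)\right), \qquad (4)$$
where $c^{(l)}_0 \in \mathbb{R}$ is the leading exponent, and $l$ is determined from the condition that $c^{(l)}_0$ should be finite. Thus, to leading order, the sample space grows as $W(N) \sim \exp^{(l)}\left(N^{c^{(l)}_0}\right)$. We now identify the scaling laws for the sub-leading corrections through higher-order rescalings $W(r^{(k)}_\lambda(N))$. We get (see supplementary material)
$$\frac{\log^{(l)} W\left(r^{(k)}_\lambda(N)\right)}{\log^{(l)} W(N)} \prod_{j=0}^{k-1} \left(\frac{\log^{(j)} r^{(k)}_\lambda(N)}{\log^{(j)} N}\right)^{-c^{(l)}_j} \sim \lambda^{c^{(l)}_k},$$
with $\log^{(0)}(x) = x$. To obtain the exponents $c^{(l)}_j$, take the derivative of Eq. (4) w.r.t. $\lambda$, set $\lambda = 1$ and consider the limit $N \to \infty$. For the leading scaling exponent we obtain
$$c^{(l)}_0 = \lim_{N \to \infty} \frac{N W'(N)}{\prod_{i=0}^{l} \log^{(i)} W(N)}. \qquad (6)$$
The scaling exponent corresponding to the $k$-th order is obtained in a similar way and reads
$$c^{(l)}_k = \lim_{N \to \infty} \log^{(k)} N \left( \log^{(k-1)} N \left( \cdots \left( \log N \left( \frac{N W'(N)}{\prod_{i=0}^{l} \log^{(i)} W(N)} - c^{(l)}_0 \right) - c^{(l)}_1 \right) \cdots \right) - c^{(l)}_{k-1} \right). \qquad (7)$$
This expression is not identically equal to zero, because the expression on the r.h.s. of Eq. (6) becomes $c^{(l)}_0$ only in the limit. As a result, the phasespace volume grows as
$$W(N) \sim \exp^{(l)}\left( \prod_{j=0}^{n} \left(\log^{(j)} N\right)^{c^{(l)}_j} \right), \qquad (8)$$
which is nothing but the Poincaré asymptotic expansion in Eq. (1).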
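As a small illustration of the re-scaling operations, the sketch below assumes they take the form $r^{(n)}_\lambda(x) = \exp^{(n)}(\lambda \log^{(n)} x)$, which is consistent with $W(r^{(1)}_\lambda(N)) = W(N^\lambda)$ used in the supplementary material. The helper names `iterate` and `rescale` are hypothetical, and the argument has to be large enough for the iterated logarithm to be defined.

```python
import math


def iterate(f, n, x):
    """Apply f to x n times, i.e. f^{(n)}(x); n = 0 returns x unchanged."""
    for _ in range(n):
        x = f(x)
    return x


def rescale(n, lam, x):
    """Assumed form of the n-th re-scaling: exp^{(n)}(lam * log^{(n)}(x))."""
    return iterate(math.exp, n, lam * iterate(math.log, n, x))


if __name__ == "__main__":
    x = 10.0
    print(rescale(0, 3.0, x))  # multiplication: lam * x
    print(rescale(1, 2.0, x))  # power: x ** lam
    # Composition: rescaling by 2 and then by 1.5 equals rescaling by 3.
    print(rescale(2, 1.5, rescale(2, 2.0, x)), rescale(2, 3.0, x))
```

The last line prints two numerically equal values, which is the composition rule for $n = 2$.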
In the supplementary material we show that the formulas for $c_j$, given by the theory of asymptotic expansions, correspond to the formulas for the scaling exponents $c^{(l)}_j$, and therefore it is indeed possible to express any $W(N)$ in terms of an asymptotic expansion that is based on the sequence $\phi_n(N)$. The expansion coefficients are scaling exponents determined by the rescaling of phasespace. Here $n$ denotes the minimal number of expansion terms. In typical situations, only a few scaling exponents are non-zero. If all exponents are non-zero, we can truncate the expansion after a few terms and still preserve a high level of precision. In many realistic situations it is enough to consider $n = 2$. The estimation of the leading-order exponent can be tricky, because finding the order $l$ involves the calculation of several infinite limits. Therefore, it is convenient to use an approach based on the corresponding extensive entropy.
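For a given order $l \geq 1$, the leading exponent can also be estimated numerically. The sketch below is an illustration under stated assumptions, not the procedure of the paper: it assumes the leading-order relation $\log^{(l)} W(\lambda N) / \log^{(l)} W(N) \sim \lambda^{c^{(l)}_0}$ and evaluates it at one large $N$. The function name `leading_exponent` and the choice of passing $\log W$ instead of $W$ (to avoid numerical overflow) are hypothetical.

```python
import math


def leading_exponent(log_w, l, n, lam=2.0):
    """Estimate c_0^{(l)} for l >= 1 from
    log^{(l)} W(lam * n) / log^{(l)} W(n) ~ lam ** c_0.

    log_w: function returning the natural logarithm of W(N).
    """
    def log_l(x):
        y = log_w(x)            # first logarithm of W
        for _ in range(l - 1):  # remaining l - 1 logarithms
            y = math.log(y)
        return y

    return math.log(log_l(lam * n) / log_l(n)) / math.log(lam)


if __name__ == "__main__":
    ln2 = math.log(2.0)
    n = 1e6
    # W = 2^N (random walk), W = 2^sqrt(N) (aging walk), W = 2^(N(N-1)/2) (networks)
    print(leading_exponent(lambda x: x * ln2, 1, n))
    print(leading_exponent(lambda x: math.sqrt(x) * ln2, 1, n))
    print(leading_exponent(lambda x: 0.5 * x * (x - 1.0) * ln2, 1, n))
```

When sub-leading logarithmic corrections are present, as for the magnetic coins, such a single-point estimate approaches its limit only logarithmically in $N$, which is one reason why the entropy-based approach is more convenient.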

The extensive entropy
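The extensive entropy can be obtained by following an idea exposed in [7,10]. Let us assume a so-called trace-form entropy for some probability distribution $P = (p_1, \dots, p_W)$,
$$S_g(P) = \sum_{i=1}^{W} g(p_i),$$
where $g$ is some function. The aim is to find a function $g$ for which the entropy functional $S_g$ is extensive for a given $W(N)$. Assuming that no prior information about the system is given, we consider uniform probabilities $p_i = 1/W$. The extensivity condition can be expressed by an equation for $g$, which is [10]
$$S_g(W(N)) = W(N)\, g\left(\frac{1}{W(N)}\right) \sim N. \qquad (10)$$
Alternatively, it is possible to define the extensive entropy as the solution of Euler's differential equation, see also [4],
$$N \frac{\mathrm{d} S_g(W(N))}{\mathrm{d} N} = S_g(W(N)).$$
The question now is how the scaling exponents of $W(N)$ are related to the scaling exponents of $S_g(W)$. We begin with the first scaling operation $r^{(0)}_\lambda$. One can show that for $N \gg 1$,
$$S_g(\lambda W) \sim \lambda^{d_0} S_g(W).$$
Again, it is possible to determine the relation for the $n$-th scaling exponent from the higher-order rescalings $S_g(r^{(n)}_\lambda(W))$. We can extract the scaling exponents $d_n$ by the same procedure as for $c^{(l)}_k$, by taking the derivative w.r.t. $\lambda$, setting $\lambda = 1$ and performing the limit. For the first exponent we get
$$d_0 = \lim_{W \to \infty} \frac{W S_g'(W)}{S_g(W)}.$$
De L'Hospital's rule and applying the extensivity condition of Eq. (10) gives
$$d_0 = \lim_{N \to \infty} \frac{W(N)}{N W'(N)}.$$
We mentioned this result already above. The $n$-th term can be found analogously to be
$$d_n = \lim_{N \to \infty} \log^{(n)} W \left( \log^{(n-1)} W \left( \cdots \left( \log W \left( \frac{W(N)}{N W'(N)} - d_0 \right) - d_1 \right) \cdots \right) - d_{n-1} \right). \qquad (16)$$
We can now relate the scaling exponents $c^{(l)}_k$ and $d_n$ by comparing Eqs. (7) and (16). Inverting the leading behavior of $W(N)$ gives $d_n = 0$ for $n < l$, $d_l = 1/c^{(l)}_0$, and $d_{l+k} = -c^{(l)}_k / c^{(l)}_0$ for $k \geq 1$. The corresponding extensive entropy can now be characterized by the function $g(x)$, which scales as
$$g(x) \sim x \prod_{n \geq 0} \left(\log^{(n)} \frac{1}{x}\right)^{d_n}, \qquad (18)$$
and the corresponding entropy scales as
$$S_g(W) \sim \prod_{n \geq 0} \left(\log^{(n)} W\right)^{d_n}. \qquad (19)$$
This equation is nothing but the asymptotic expansion of $\log S_g$ in terms of $\phi_{n+l}(N) = \log^{(n+l+1)}(N)$; the coefficients are again the scaling exponents that correspond to the rescaling of the entropy.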
Note that the entropy approach allows us to obtain additional restrictions for the scaling exponents if further information about the system is available. For example, many systems fulfil the first three of the four Shannon-Khinchin (SK) axioms, see supplementary material. There we also show that it is possible to find a representation of the entropy that obeys the three axioms and the scaling in Eq. (19). In this case $g(x)$ can be expressed in the form of Eq. (20), where $a_i$ are constants; possible choices for those are discussed in the supplementary material. The axioms impose restrictions on the range of scaling exponents. (SK2) requires that $d_l > 0$. The resulting entropy can be expressed by Eq. (12). One can trivially adjust the minimal value of the entropy, such that for the totally ordered state, $S_g(1) = 0$. This is obtained by a rescaling with $\lambda = \exp(g(1))$. Note that the form of the entropy in Eq. (20) is equivalent to the $(c, d)$-entropy for $c = 1 - d_0$ and $d = d_1$, and $d_j = 0$ for all $j \geq 2$.

Examples
We conclude with several examples of systems that are characterized by different sets of scaling exponents. Exponential growth: the random walk. Imagine the ordinary random walk with two possibilities at any timestep, a step to the left or a step to the right. The number of possible configurations (i.e. possible paths) after $N$ steps obeys $W(N + 1) = 2W(N)$, which means exponential phasespace growth, $W(N) = 2^N$. We obtain $l = 1$ and $c^{(1)}_0 = 1$, and correspondingly $d_0 = 0$, $d_1 = 1$, and $d_j = 0$ for $j \geq 2$. This set of exponents belongs to the class of $(c, d)$-entropies described in [7] for $c = 1 - d_0 = 1$ and $d = d_1 = 1$. They correspond to the scaling exponents of the Shannon entropy: from (18) we obtain that $g(x) \sim x \log x$ and from (19) we get $S(W) \sim \log W$, which is Boltzmann entropy. It is not immediately apparent what the entropy of a random walk should be. However, the random walk is equivalent to a spin system of $N$ independent spins: the $2^N$ different paths correspond one-to-one to the $2^N$ configurations in the spin model, where the role of entropy is clear. Obviously, for the random walk, (SK 1-3) are applicable.
Sub-exponential growth: the aging random walk. In this variation of the random walk we impose correlations on the walk. After the first random choice (left or right) the walker goes one step in that direction. The second random choice is followed by two steps in the same direction, the next choice is followed by three steps in the same direction, etc. For $k$ independent choices, one has to make $N = \sum_{i=1}^{k-1} i = \frac{1}{2} k(k - 1)$ steps. For this walk, the number of possible paths doubles only at each independent choice, which leads to $W(N) = 2^{N/k} \sim 2^{k/2}$. For $N \gg 1$, we have $k \approx \sqrt{N}$, and we obtain a stretched exponential (sub-exponential) asymptotic behavior, $W(N) \sim 2^{\sqrt{N}}$. The order is again $l = 1$ and the exponents are $c^{(1)}_0 = 1/2$, corresponding to $d_1 = 2$. Therefore, the three SK axioms are applicable and the resulting extensive entropy belongs to the class of entropies characterized by the Anteneodo-Plastino entropy, since we have $g(x) \sim x (\log x)^2$ and $S(W) \sim (\log W)^2$. This entropy is the special case of the $(c, d)$-entropy for $c = 1$ and $d = 2$, see [7].
Super-exponential growth: magnetic coins. Consider $N$ coins with two states (up or down). These coins are magnetic, so that any two can stick to each other to create a pair, which is a third state obtained by interactions of elements (one possible configuration). As mentioned before, in [11] it is shown that the phasespace volume can be obtained recursively, $W(N + 1) = 2W(N) + N W(N - 1)$. For $N \gg 1$, we get $W(N) \sim N^{N/2} e^{2\sqrt{N}}$, which yields $l = 1$ and the scaling exponents $c^{(1)}_0 = 1$ and $c^{(1)}_1 = 1$, corresponding to $d_0 = 0$, $d_1 = 1$, and $d_2 = -1$. For the entropy this means that $g(x) \sim x \log x / \log \log x$ and $S(W) \sim \log W / \log \log W$. This case is not contained in the class of $(c, d)$-entropies, because the third exponent, corresponding to the doubly-logarithmic correction, is not zero. Actually we obtain $c = 1$ and $d = 1$, which would naively indicate Shannon entropy. However, the correction makes the system clearly super-exponential. The SK axioms are still applicable, but the class of accessible entropy formulas is restricted by (SK2). For example, for the representative entropy Eq. (20) we find that $a_0 \geq 0$ and $a_1 \geq 0$, see supplementary material.
Super-exponential growth: random networks. Imagine a random network with $N$ nodes. When a new node is added, there emerge $N$ new possible links, which gives us $2^N$ new possible configurations for each configuration of the network with $N$ nodes. We obtain the recursive growth equation $W(N + 1) = 2^N W(N)$, which leads to $W(N) = 2^{\binom{N}{2}}$, as expected. For this phasespace growth, we obtain $l = 1$ and $c^{(1)}_0 = 2$, corresponding to $d_1 = 1/2$. The phasespace growth corresponds to the class of compressed exponentials, which are super-exponential; however, the entropy belongs to the class of $(c, d)$-entropies for $c = 1$ and $d = 1/2$. Because all exponents are positive, the entropy observes the SK axioms.
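The network count can be verified with exact integer arithmetic. The following minimal sketch assumes the growth rule $W(N+1) = 2^N W(N)$ (each of the $N$ new possible links is either present or absent) with $W(1) = 1$ for a single node without links; the function name is hypothetical.

```python
from math import comb


def network_states(n):
    """Number of link configurations of n labelled nodes from W(N+1) = 2^N W(N)."""
    w = 1  # assumed initial value: one node, no possible links
    for k in range(1, n):
        w *= 2 ** k  # adding node k + 1 opens k new possible links
    return w


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, network_states(n), 2 ** comb(n, 2))
```

For every $n$, the recursion and the closed form $2^{\binom{n}{2}}$ coincide.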
Super-exponential growth: the cascading random walk. Consider a generalization of the random walk, where a walker can take a left or right step, but it can also split into two walkers, one of which then goes left, the other to the right. Each walker can then go left, right, or split again (multiple walkers can occupy the same position). The number of possible paths after $N$ steps obeys $W(N + 1) = 2W(N) + W(N)^2$, where the first term reflects the left/right decisions, the second the splittings. We have $W(N) = 2^{(2^{N-1})} - 1$, and find that $l = 2$ and $c^{(2)}_0 = 1$. The corresponding extensive entropy is characterized by $g(x) \sim x \log \log(x)$ and scales as $S(W) \sim \log \log W$. Because the coefficients are not negative, the SK axioms are applicable. However, even though all correction scaling exponents are zero, the system cannot be described in terms of $(c, d)$-entropies, because $l = 2$. We would naively obtain that $c = 1$ and $d = 0$, which would wrongly correspond to Tsallis entropy. Alternatively, we can think of an example of a spin system with the same scaling exponents. In this case, $N$ would not describe the size of a system, but its dimension. For $N = 1$, we would have two particles on a line, for $N = 2$ we have 4 particles forming a square, for $N = 3$ we have a cube with 8 particles at its vertices, etc. In general, we can think of a spin system of particles sitting on the vertices of an $N$-dimensional hypercube. The number of particles is naturally $2^N$ and for two possible spins we obtain $W(N) = 2^{(2^N)}$.
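A short consistency check of the cascading walk is sketched below. It assumes a recursion of the form $W(N+1) = 2W(N) + W(N)^2$ (one term for the left/right decisions, one for the splittings) and the initial value $W(1) = 1$; both are assumptions chosen because they reproduce the closed form $2^{(2^{N-1})} - 1$, since the recursion is equivalent to $W(N+1) + 1 = (W(N) + 1)^2$. The function name is hypothetical.

```python
def cascade_states(n):
    """W(n) from the assumed recursion W(N+1) = 2 W(N) + W(N)^2, W(1) = 1."""
    w = 1  # assumed initial value implied by the closed form
    for _ in range(n - 1):
        w = 2 * w + w * w  # equivalent to w + 1 -> (w + 1) ** 2
    return w


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, cascade_states(n), 2 ** (2 ** (n - 1)) - 1)
```

Python integers have arbitrary precision, so the doubly exponential growth can be followed exactly for small $n$.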

Conclusions
We introduced a comprehensive classification of complex systems in the thermodynamic limit based on the rescaling properties of their phasespace volume. From a scaling expansion of the phasespace growth with system size, we obtain a set of scaling exponents, which uniquely characterize the statistical structure of the given system. Restrictions on the scaling exponents can be obtained with further information about the system. In this context we discuss the first three Shannon-Khinchin axioms, which are valid for many complex systems. The set of exponents further determines the scaling exponents of the corresponding extensive entropy, which plays a central role in the thermodynamics of statistical systems. Thermodynamics is not the only context where entropy appears. As was shown in [5], for many complex systems the functional expressions for entropy depend on the context, in particular on whether one talks about the thermodynamic (extensive) entropy, the information-theoretic entropy, or the entropy that appears in the maximum entropy principle. It remains to be seen if for super-exponential systems there exists an underlying relation between the scaling exponents of the extensive entropy and the exponents obtained from an information-theoretic or maximum entropy description of the same complex systems.

Rescaling in the thermodynamic limit
We first prove a theorem which determines the general form of rescaling relations in the thermodynamic limit for any general function.
Theorem. Let $g(x)$ be a positive, continuous function on $\mathbb{R}^+$. Let us define the function $z(\lambda) := \lim_{x \to \infty} g(\lambda x)/g(x)$ for $\lambda > 0$. Then $z(\lambda) = \lambda^c$ for some $c \in \mathbb{R}$.
Proof. From the definition of $z(\lambda)$, it is straightforward to show that $z(\lambda\lambda') = z(\lambda) z(\lambda')$, because
$$z(\lambda\lambda') = \lim_{x \to \infty} \frac{g(\lambda\lambda' x)}{g(\lambda' x)} \frac{g(\lambda' x)}{g(x)} = z(\lambda) z(\lambda').$$
The continuous solutions of this functional equation are power laws, $z(\lambda) = \lambda^c$. Applying the theorem to $W(N)$ gives the leading scaling exponent $c_0$. It may happen that $c_0$ is infinite. Thus, we may need to use higher-order scaling for the sample space, i.e., $r^{(l)}_{\lambda^{c_0}}(W(N))$, as shown in the main text. The order $l$ is determined by the condition that the scaling exponent should be finite. The first correction term is given by the scaling $W(r^{(1)}_\lambda(N)) = W(N^\lambda)$. To obtain the sub-leading correction, we have to factor out the leading growth term. The resulting scaling relation for the first sub-leading correction is again a consequence of the above theorem. To obtain the corresponding scaling relations for higher-order scaling exponents for the sample space (A.4), we need to factor out all previous terms corresponding to lower-order scalings. Because the left-hand side of the resulting relation has the form of the function $z$ appearing in the theorem, the relation is valid for $N \to \infty$. Similarly, we can deduce the relations for scaling exponents that are associated with the extensive entropy.

Asymptotic expansion in terms of nested logarithms
The asymptotic representation of $W(N)$ is obtained by the rescaling that corresponds to the Poincaré asymptotic expansion [13] of $\log^{(l+1)}(W)$ in terms of $\phi_n(N) = \log^{(n+1)}(N)$ for $N \to \infty$. Let us consider a function $f(x)$ with a singular point at $x_0$. It is possible to express its asymptotic properties in the neighborhood of $x_0$ in terms of the asymptotic series of functions $\phi_n(x)$, if $f(x) = \mathcal{O}(\phi_0(x))$ and $\phi_{n+1}(x) = \mathcal{O}(\phi_n(x))$. The series is given as $f(x) \sim \sum_{n} c_n \phi_n(x)$ for $x \to x_0$. The coefficients can be calculated from the formulas in [13].
In our case, i.e., for N → ∞ and φ n (N) = log (n+1) (N) the function log (l+1) (W ) can be expressed (for appropriate l) in terms of this series, and the coefficients c  The original function can be obtained by This transform turns an increasing/decreasing function to a convex/concave function, while the scaling for x → 0 remains unchanged. Let us write the function g in the form of the transform Axiom (SK3) means g(0) = 0. This requires that the integrand should not diverge faster than 1/x for x → 0. This can be fulfilled for is automatically concave if d n ≥ 0, since a product of positive, decreasing functions is also decreasing. However, for d n < 0, [1 + log] (n) (1/x) dn is an increasing function from zero to one and the whole product may not be decreasing. In order to solve this issue, we introduce a set of constants a i and write g(x) in the form The constants a i can be chosen to ensure that the integrand is a decreasing function. We assume a i ≥ −1 to avoid problems with powers of negative numbers. The second derivative of g(x), i.e., the first derivative of the integrand is an increasing function and For d l < 0, the entropy cannot be concave, so d l > 0 is the restriction given by (SK2). To obtain a negative second derivative on the whole domain [0, 1], it is therefore enough to investigate d 2 g(x) dx 2 | x=1 , which leads to the condition i > 0, one may even choose a i = −1. On the other hand, for the case of magnetic coin model, one obtains that for a 0 = 0, a 1 = 0 as well.
Finally, let us show the connection to the $(c, d)$-entropy derived in [7]. In this case, we assume that only $d_0$ and $d_1$ can be non-zero, which leads, for $c = 1 - d_0$ and $d = d_1$, to an expression that is nothing else than the Gamma entropy of [7].

Ordering of processes and classes of equivalence
The sets of scaling exponents form natural classes of equivalence with a natural ordering. Consider two discrete random processes $X(N)$ and $Y(N)$ with sample spaces $W_X(N)$ and $W_Y(N)$, respectively, each with its corresponding set of scaling exponents. One can introduce an ordering based on the scaling exponents. We write $X \prec Y$ if, at the first position where the two sets of exponents differ, the exponent of $X$ is smaller than that of $Y$. This is equivalent to lexicographic ordering. One can also introduce an ordering which takes into account only a certain number of correction terms. So, for example, $\prec_0$ compares only the leading term. Similarly, one can define $\prec_k$, which takes into account only $k$ correction terms. Additionally, it is possible to introduce an equivalence relation $\sim$ and also an equivalence $\sim_k$ up to a certain correction. As an example, for the magnetic coin model and the random walk we have that $X_{MC} \sim_0 X_{RW}$, but $X_{MC} \nsim X_{RW}$.
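The ordering and the truncated equivalence can be sketched in a few lines. This is an illustration under assumptions, not the definition used in the paper: a process is represented by a hypothetical pair `(l, [c_0, c_1, ...])`, the order $l$ is compared first, and exponents that are not listed count as zero.

```python
from itertools import zip_longest


def exponent_key(process, k=None):
    """Tuple (l, c_0, ..., c_k); k=None keeps all listed exponents."""
    l, exponents = process
    kept = list(exponents) if k is None else list(exponents[: k + 1])
    return (l, *kept)


def compare(x, y, k=None):
    """Lexicographic comparison: -1 if x precedes y, 1 if y precedes x, 0 if equivalent.

    With k given, only the leading exponent and k correction terms are compared.
    """
    for a, b in zip_longest(exponent_key(x, k), exponent_key(y, k), fillvalue=0):
        if a != b:
            return -1 if a < b else 1
    return 0


if __name__ == "__main__":
    random_walk = (1, [1])        # W ~ 2^N
    magnetic_coins = (1, [1, 1])  # log W ~ (N/2) log N
    aging_walk = (1, [0.5])       # W ~ 2^sqrt(N)
    cascade = (2, [1])            # W ~ 2^(2^N)
    print(compare(magnetic_coins, random_walk, k=0))  # equivalent to leading order
    print(compare(magnetic_coins, random_walk))       # differ in the first correction
    print(compare(aging_walk, random_walk), compare(random_walk, cascade))
```

With this representation the magnetic coins and the random walk are equivalent at leading order but not once the first correction term is included.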

Construction of a "representative process"
To understand the mechanism of how the scaling exponents correspond to the structure of a random process, let us discuss a simple procedure to generally obtain processes with given scaling exponents $c^{(l)}_k$. We start with a random variable $X_0$ with $N$ possible outcomes, so that $W_{X_0}(N) = \{1, \dots, N\}$. The scaling exponents of this process are naturally $l = 0$ and $c^{(0)}_0 = 1$. First we can create all possible subsets of $W_{X_0}(N)$. This defines a new variable $X_1 = 2^{X_0}$ with $W_{X_1}(N) = 2^{W_{X_0}(N)}$, and we get $l = 1$ and $c^{(1)}_0 = 1$, where $2^X$ denotes a variable on all subsets of $X$. One can easily show that this results in a shift of scaling exponents, $c^{(l)}_j \to c^{(l+1)}_j$. The interpretation of this transformation is the following: Consider an ordinary random walk with two possible steps. If $X_0(N)$ denotes the number of steps of a random walker, then $X_1(N) = 2^{X_0(N)}$ denotes the number of possible paths. When we apply the transform again, we obtain $X_2(N) = 2^{X_1(N)}$. This denotes the number of possible configurations of a random walk cascade, etc. As a result, by more applications of the operation $2^X$, we obtain processes with a more complicated structure of the respective phasespace.
To construct processes with arbitrary exponents, let us think about a procedure where we create only partial subsets, whose number $p(N)$ can be between $N$ (no partitioning) and $2^N$ (full partitioning). We denote this procedure by P. This process can be understood as a process corresponding to a correlated random walk. This means that not every step of the walk is independent, but some steps can be determined by the previous steps, which diminishes the number of possible configurations when compared to the uncorrelated random walk. The resulting random process is obtained as the composition of $l$ uncorrelated random walks (full partitioning) and a correlated random walk. In the limit of large $N$ we can assume that the function $s$ does not depend on $N$, i.e., it is a priori given by the scaling exponents of the system. Let us also assume, without loss of generality, that $s$ is an increasing function (we can neglect the last cell, because its size is determined by the size of previous cells). For $N \gg 1$, we approximate the sum by an integral.
Some examples for $s(x)$ for a corresponding $p(N)$ are
• $p(N) = 2^N$, i.e., full partitioning, corresponding to an uncorrelated random walk. In this case, we obtain that $s(x) = \text{const.}$, as expected.
• $p(N) = N$, i.e., no partitioning, corresponding to a maximally correlated random walk. We obtain that $s(x) \sim 2^x$, which can be seen from the relation $\sum_{i}^{N} 2^i \sim 2^N$.
• $p(N) = N \log N$, which corresponds to the correction in the magnetic coin model. In this case, $s(x) \sim 2^{W(x)} / \log(W(x))$, where $W(x)$ is the Lambert W-function.