Conservation of population size is required for self-organized criticality in evolution models

We study models of biological evolution and investigate a key factor to yield self-organized criticality (SOC). The Bak-Sneppen (BS) model is the most basic model that shows an SOC state, which is developed based on minimal and plausible assumptions of Darwinian competition. Another class of models, which have population dynamics and simple rules for species migrations, has also been studied. It turns out that they do not show an SOC state although the assumptions made in these models are similar to those in the BS model. To clarify the origin of these differences and to identify a key ingredient of SOC, we study models that bridge the BS model and the Dynamical Graph model, which is a representative of the population dynamics models. From a comparative study of the models, we find that SOC is found when the fluctuations of the number of species $N$ are suppressed, while it shows off-critical states when $N$ changes according to its evolutionary dynamics. This indicates that the assumption of the fixed system size in the BS model plays a pivotal role to drive the system into an SOC state, and casts doubt on its applicability to actual evolutionary dynamics.


I. INTRODUCTION
The Bak-Snpeppen (BS) model [1] is one of the simplest models of biological evolution and has attracted much attention as it shows self-organized criticality [2][3][4][5][6][7][8][9]. The self-organized criticality (SOC) denotes that the system goes into a critical state without the tuning of a parameter, showing large extinction avalanches [10] and intermittent dynamics [11,12], whose statistics are characterized by power laws [3,13]. It has been suggested that the mass extinctions and punctuated equilibria found in the history of life are consequences of the SOC of the ecosystem, though counterarguments also exist [14][15][16]. The BS model is developed based on the minimal and plausible assumptions of Darwinian competition: A system of a fixed number of species N is updated by successive extinctions of the least fit species and migrations of new species whose fitness and interactions are drawn randomly. By repeating this simple process, the system self-organizes into a critical state, seemingly implying that successive exclusion of unfit species always drives the system into the critical state.
On the other hand, another class of models for biological evolution has also been studied. This class of models has population dynamics of species (either at species level or at individual level) and rules for introducing new species, aiming at bridging ecological and evolutionary time scales. This class of models includes the Tangled-Nature models [17][18][19][20][21][22], the web-world model [23,24], the scale-invariant model [25,26], and the replicator * Current address. model [27]. Each of these models has its own functional form of the population dynamics and there seems to be no de-facto standard. For instance, some of them assume predator-prey interactions between species [21,26], while others allow more general inter-species interactions [18,20]. Interestingly, the statistical properties (such as the extinction-size distribution, the species lifetime distribution, and the intermittency of the time series) obtained for these models seems to be categorized into a few classes despite their wide variety in network size and network topology [28,29]. These statistical properties are dependent on a few key factors of the models, such as the introduction of a genome space [28] or demographic stochasticity [30]. For instance, the Tangled-Nature models assume that new species appear in the system by mutations of existing species, each of which is represented by a single bit flip in an L-bit "genome" string. These models show intermittent evolutionary dynamics, characterized by quasi-stable states interrupted by sudden and active reorganization of species. Among these various models, the simplest rule of introducing new species is the migration rule [28]. In the migration rule, new species, whose links as well as their weights are randomly drawn from a given probability distribution, are added to the system at a constant rate. Within this setting, a simple exponential extinction size distribution and "skewed" species-lifetime distribution are robustly found irrespective of the functional form of the population dynamics, meaning no sign of SOC is found. Although this class of models seems to be based on plausible assumptions similar to the BS model, the latter shows SOC behavior while the former does not. The aim of this paper is to understand the origin of these differences and to identify the key element in the model definitions that causes the SOC behavior in order to deepen the understanding of evolutionary dynamics and SOC phenomena in general.
In this paper, we focus on a comparison between the BS model and the Dynamical Graph (DG) model [31][32][33], which is a simplified variant of the migration models. The DG model is a simplistic model developed based on a minimal set of plausible assumptions in order to analyze the underlying mechanisms for the migration population dynamics models. In the DG model, an ecosystem is represented by a weighted directed network. The species' fitness is defined as the sum of the incoming link weights, and a species survives until its fitness becomes negative. (The detailed model definition is given in the next section.) Even though the model no longer has population dynamics equations, it reproduces the statistical properties of the corresponding population dynamics models, i.e., the model does not show any SOC behavior. Thanks to its simplicity, the mechanisms for the skewed species-lifetime distribution becomes clear and its functional form turns out to be a stretched exponential function with exponent 1/2 [31,33]. We use this model as a representative of the non-SOC models since it is as simple as the BS model.
We propose that the fixed number of species is a key building block for the SOC. In the DG model, the number of species may change with time through the evolutionary process, while the BS model has a fixed number of species. In the sandpile SOC model [34], it is known that conservation of the number of particles is a necessary factor for the SOC [35][36][37]. Although an analogous quantity for the evolutionary model is not known, we conjecture that conservation of the number of species is a necessary condition for SOC. To test this conjecture, we propose a model which suppresses the fluctuations of the number of species by a control parameter. The model is equivalent to the DG model in one limit of the parameter, while the model has a fixed number of species like the BS model in the other limit. We will see how SOC behaviors emerge as this parameter changes to suppress the population fluctuations.
This paper is organized as follows. In the next section, we review the DG model and the BS model in detail, discussing the key differences between these two models. In Section 3, we propose a generalized model which incorporates the DG model and the BS model as special cases, and we show the simulation results for this model. The last section is devoted to summary and discussions.

II. A REVIEW OF THE MODELS
A. Dynamical Graph model In the DG model, a directed weighted network is considered, where nodes and links represent species and interspecies interactions, respectively. The fitness of each species (node) is defined as the sum of its incoming links: where a ji is the weight of the link from node j to node i. The species goes extinct if its fitness becomes less than the threshold f th = 0 and survives otherwise. At each time step, the network is updated by migration of a new species and the subsequent extinctions. When a new species migrates, links between the new species and extant species are made with probability c for each direction, and its weight is randomly drawn from a Gaussian distribution with mean 0 and standard deviation 1. (See Fig. 2 in [31] or Fig. 1 in [32].) Because of the introduction of the new species, some of the existing species, including the new one, may have a negative fitness. The species with the lowest negative fitness in the system is removed. Because of the extinction of this species, the fitnesses of the neighboring species change accordingly. The fitnesses are calculated again, and the species having the lowest fitness is removed if its fitness is negative. The removal of species continues until all the species in the system have a positive or zero fitness, causing an avalanche of extinctions. Since extinctions may or may not happen, the number of species N changes with time. In this setting, the system reaches a statistically stationary state after a sufficiently long initialization period starting from an empty network.
We show the statistical properties of the DG model in Fig. 1(a). The distribution of the fitness P (f ) is shown in Fig. 1(a-1). It is positive only for f ≥ 0 by model definition. The extinction size for this model is defined as the number of the species that went extinct in one time step. The distribution of extinction size follows a simple exponential function as shown in Fig. 1(a-2). The interval between extinctions, which we define as the number of steps between consecutive extinctions (or the number of consecutive steps having no extinction), shows an exponential distribution as well ( Fig. 1(a-3)). The time series of the number of species for this model shows a 1/f 2 power spectral density, which indicates the dynamics is characterized by a simple Ornstein-Uhlenbeck process. All these statistics indicate that the extinction events occur randomly, characterized by a Poisson process. A non-trivial aspect of this model is found in the lifetime distribution of species. The lifetime of a species is defined as the duration between its immigration and extinction. The species lifetime distribution P (L) follows a stretched exponential function, P (L) ∝ exp (−(L/L 0 ) α ), with the exponent α close to 1/2. This functional form is explained by the modified Red-Queen hypothesis, where an age-independent and N -dependent mortality is assumed [31,33].

B. Bak-Sneppen model
Contrary to the DG model, the BS model assumes that an ecosystem has a fixed number of species N and each species i has a single fitness value f i . For each step, the species having the minimum fitness is removed from the system (extinction) and immediately replaced with a new one (migration). The fitness of the new species is uniformly randomly drawn from [0, 1], and the fitness of its neighbors are also updated to a randomly assigned value, modeling the interactions with the new species. Following Ref. [1], we assume that the fitness barrier that must be overcome before a species can go extinct is proportional to its fitness. This leads to the time before extinction of the least fit species, f min , and its replacement with a new species being of the form where t 0 and f 0 are constants. Here, f 0 must be sufficiently small to produce a broad region of critical scaling in the temporal dynamics. In the following, we use t 0 = 1 without loss of generality. So the actual time scale proceeds by τ ext (f min ) in each step. Although species were assumed to be aligned along a one-dimensional lattice in the first paper where the BS model is originally proposed [1], we mainly focus on the random-neighbor (or mean-field) version of the model [5], in which a new species interacts with randomly chosen species. This simplification is instrumental not only for analytical treatment [5,7] but for comparison with the DG model.
A statistically stationary state is obtained after a sufficient number of iterations. For comparison, we show the results for the BS model in Fig. 1(b). The distribution of f , P (f ), is shown in Fig. 1(b-1). It shows a profile similar to a step function: a positive plateau is found for f th < f < 1, while it drops to an infinitesimal value for 0 < f < f th . When we see the temporal dynamics of the extinction events, the dynamics shows an intermittent behavior characterized by a power-law inter-event time distribution ( Fig. 1(b-3)), reminiscent of punctuated equlibria. Here, the inter-event time is defined as the duration between two consecutive extinctions, i.e., τ ext . The extinction size is defined as the number of consecutive extinctions whose f min is smaller than f th . From the viewpoint of the time series, the extinction size is defined as the size of the "correlated bursts" [38,39]. We obtain the clusters of events, or the bursty train, by splitting the event sequence by interval ∆ = τ ext (f th ), and the number of events in each group is regarded as the extinction size. In Fig. 1(b-2), the extinction size is shown. It is well fitted by a power law of exponent −3/2.
The species lifetime can be defined as the time between a species' appearance and its extinction although it has not been studied in the literature to our knowledge. From our simulations, as shown in Figs. 1(b-4), the species lifetime distributions shows a power law followed by an exponential distribution. The exponent for the power law part is −1, which is same as that for the inter-event time distribution. The typical time scale for the latter part is approximately τ ext (f th ). Since the neighbor species and its renewed fitness is randomly selected, each species always has a positive finite probability of having a fitness below the threshold, which leads to its extinction. Thus, the species lifetime distribution has an exponential part although the overall profile is L −1 .

C. Key differences
We summarize the key differences in the model definitions between the BS and the DG models as follows.
1. While the number of species in the DG model fluctuates around its equilibrium value, the BS model has a fixed number of species which is given as a model parameter. 2.
The extinction threshold f th is predefined in the DG model. On the other hand, in the BS model, the threshold of the fitness is self-organized as a result of immigrations and extinctions.
3. The time required for immigrations and extinctions are different. In the DG model, immigrations occur regularly at each time step. The extinction immediately happens when a species has a negative fitness. By contrast, in the BS model, the time required for an extinction is dependent on the fitness value, which is followed by an instantaneous immigration of a new species.
4. The fitness is defined as the sum of the incoming interspecies interactions in the DG model while the fitness for the BS model is defined as an attribute of a species.
The first three factors are not mutually exclusive but dependent on each other. These factors are reminiscent of the difference between canonical and grand canonical ensembles. The canonical ensemble, where the number of particles is fixed and the chemical potential is obtained as an outcome, may be regarded as similar to the BS model, where the number of species is fixed and the threshold is obtained as an outcome. Similarly, the grand canonical ensembles, where the number of particles is obtained given the chemical potential, is analogous to the DG model, where the number of species is obtained given the extinction threshold. Thus, it is not easy to incorporate only one of these factors. The fourth factor, however, is independently testable by modifying the BS model so that the fitness of the species is defined as the sum of the interactions. In the following section, we will first show that the results for this "link-based" BS model remain qualitatively the same as those for the original BS model, implying that Factor 4 is not the key ingredient of the SOC. We will then generalize the model to interpolate between the link-based BS model and the DG model to see the effects of Factors 1, 2 and 3.

A. link-based BS model
We first investigate a variant of the BS model, where the fitness of a species is defined as the sum of the inter-species interactions. More specifically, the ecosystem is represented by a weighted directed network whose nodes and links represent species and interactions between them, respectively. At each time step, the species with the minimum fitness as well as its links are removed. Similar to the BS model, the extinction is followed by an immediate immigration of a new species, whose incoming and outgoing links are made with probability c and their weights are randomly independently drawn from a Gaussian distribution with mean 0 and variance 1. Thus, the fitness of the other species undergoes changes both by the eliminations of the links and by the introduction of the new links. The time required for the extinction to occur is the same as that for the BS model, i.e., τ ext (f ) = exp(f /f 0 ).
This modification does not alter the results significantly from the BS model. See Figs. 1(c) for the results. As shown in Fig. 1(c-1), the fitness distribution shows a sudden drop at a certain threshold f th ≈ 0 similarly to the BS model although the shape of the distribution and the threshold value are different. The extinction size is defined similarly to the BS model. The distribution of the extinction size P (s) shows a power law with exponent −3/2 as shown in Fig. 1(c-2), which is the same as that for the BS model. In Fig. 1(c-3), the inter-event time distribution P (τ ) is shown. It shows an approximate power law with exponent around −1.2, although the profile is slightly concave on the log-log scale. The lifetime distribution, shown in Fig. 1(c-4), is also approximated by a power law with exponent −1. Thus, we conclude that the model shows SOC behavior, which is essentially the same as the BS model. The link-based definition of species fitness does not alter the picture fundamentally.

B. generalized model
We propose the following model in order to study the effect of the key differences 1, 2, and 3 discussed in Section II C. We generalize the DG model, incorporating some ingredients of the BS model. The system is represented as a weighted directed network and a species' fitness is defined as the sum of its incoming links. Following the assumption made in the BS model, we assume that the time required for an extinction of a species depends on the fitness as τ ext (f ) = exp(f /f 0 ), where f 0 is a constant determining typical scale. In addition to this, we also consider the time between migration events in order to control the number of species. We assume that the time required for a migration depends on the current number of species and a parameter µ which controls the fluctuations in N : τ mig (N ) = exp (µ(N − N 0 )/f 0 ).
Here, N 0 is an input parameter of the model, denoting the expected number of species. When µ = 0, the migration time is always τ 0 = 1, irrespective of the current number of species. There is no external force controlling the number of species, which corresponds to the situation for the DG model. When µ is large enough, on the other hand, the number of species is controlled by an external force. Migrations occur frequently when N < N 0 , putting species into the system more often, while migration occurs less frequently when N > N 0 . Thus, the number of species is under stronger control yielding smaller fluctuations in the number of species around N 0 . In the limit of µ → ∞, immigration occurs immediately when N < N 0 , while it never occurs when N > N 0 . This results in a situation that immigrations and extinctions occur one after the other as in the BS model where N is fixed.
The other rules are kept the same as the DG model. When a migration of species occurs, a species whose interactions are randomly determined is added to the system. For each possible link between existing species, we make a link with probability c and the weight of the links are drawn from a Gaussian distribution with mean 0 and variance 1. The algorithm to update the system is summarized as follows.
1. Find the species with the minimum fitness in the system, f min .
• If τ ext < τ mig , an extinction of the least fit species occurs, i.e., the species is removed from the system. The current time t is increased by τ ext .
• If τ ext ≥ τ mig , a migration of a new species occurs. The current time t is increased by τ mig .
The model has a correspondence to the DG model when µ = 0. This is because species with negative fitness are removed during one migration as τ ext < τ mig for f min < 0. As long as there is a species with negative fitness, extinctions continue. When all the species have a positive or zero fitness value, the system accepts an immigrant. Although the actual time scale may be different from the DG model, the long-time behavior is approximately the same for a sufficiently small f 0 since τ ext is negligibly small. When µ → ∞, the model has a correspondence to the BS model. If N > N 0 , extinction of the least fit species always occurs since τ mig → ∞. If N < N 0 , on the other hand, migration of a new species instantly occurs. Thus, migration alternates with extinction, keeping the number of species nearly constant around N 0 . In other words, the least fit species are removed and immediately replaced by a new species. This dynamics is essentially similar to the BS model.
Typical time series for this model with µ = 0 and 0.1 are shown in Fig. 2. As expected, the number of species is controlled when µ = 0.1 while it fluctuates widely for µ = 0. Hereafter, the parameter N 0 is set to 950, which is comparable to the average number of species for µ = 0, in order to keep the number of species when changing µ and eliminate any side-effect caused by a change in N .
In Fig. 3(a), the probability distribution of f for various values of µ are shown. As shown in the figure, the distribution has positive values for f > 0 while it drops quickly for f < 0, similar to the link-based model. This profile does not change significantly by changing µ and the threshold value seems to be around zero.
We define the extinction size as the number of consecutive extinctions whose interval is less than ∆ = exp (f th /f 0 ). This definition is consistent with those in the DG and the BS models when µ = 0 and µ → ∞, respectively. Using f th = 0, the extinction size is calculated for various values of µ and is shown in Fig. 3(b). When µ = 0, the extinction size distribution decays exponentially. As we increase µ, the tail becomes heavier and an overall power-law distribution is found for a sufficiently large µ. When µ = 0.1, the distribution is approximately fitted by a power law with exponent −1.5. Although there is a slight deviation from a power law for large s (∼ 10 4 ) and a non-monotonic dependence on µ is found, it may be due to a finite-size effect since the extinction size is the same order as N 0 when it deviates from the power law.
The distribution of inter-event time, which is defined as the period between two consecutive extinctions, is shown in Fig. 3(c). When µ = 0, the distribution consists of an initial power-law decay for τ < 1 and an exponential decay for τ ≥ 1. The former part corresponds to the extinctions within a single migration event, which is represented as the data point for τ = 0 in Fig. 1(b-3). For τ ≥ 1, the curve decays exponentially. It is consistent with the results for the DG model. The tail of the distributions gets much broader and closer to a power law as we increase µ. In the figure, a line indicating a power law τ −1.2 is shown as a gray dashed line. The distribution is fitted fairly well by a power law although the curve is slightly concave in a log-log plot, indicating the time series are bursty. (One may also find the burstiness from Fig. 2.) [40] Similar transient behavior is also observed in the species lifetime distribution P (L). As shown in Fig. 3(d), the profile changes significantly with µ. When µ is small enough, the curve shows a quick decay while it has heavy tails for larger µ. The curves for small µ are well fitted by stretched exponential functions with exponent close to 1/2, which is similar to the DG model. As µ increases, the curve becomes closer to a power law. For µ = 10 −1 , the distribution is approximated by a power law L −1 .
Therefore, all these statistics indicate a crossover from a non-SOC behavior to SOC with increasing µ, showing that the constraint on the number of species is an essential factor for the emergence of SOC.

IV. SUMMARY AND DISCUSSION
In summary, we formulated and studied a model that bridges the DG model and the BS model in order to identify a key factor for generating the SOC phenomena in the biological evolution model. By a comparative study of a model that controls the fluctuations of N , we found that the constraint on the number of species significantly alters the model behavior. When N is fixed, an extinction of a species is followed by immediate replacement by When µ is small, an initial power law decay followed by a stretched exponential curve is observed. When µ is large enough, an approximate power law is observed. The dashed gray curves are a stretched exponential function and a power law. The parameters c = 0.01, N0 = 950 and f0 = 0.02 are used. Each simulation runs for 4 × 10 7 steps in addition to 6 × 10 5 initialization period which are not included in the statistics. The data are averaged over 5 independent runs. Error bars denote the standard errors.
a new species, i.e., the system is under a high pressure of potential new species trying to migrate into it. Such immediate introductions of new species is a necessary condition to keep the system in a critical state. If this condition is not met, the system goes to an off-critical state as N decreases, preventing critical avalanches of extinctions.
Although the BS model has been used to explain the origin of punctuated equilibria or large-scale extinctions, its applicability is questionable as the conservation of the system size is not satisfied in general. To realize an SOC state, it is required to have some external mechanism to maintain the system size at a constant level in addition to intrinsic Darwinian evolutionary competition, indicating that the BS model is valid only in limited situations. Future studies are needed to explore other possibilities for modeling evolutionary dynamics.
In this paper, we limited ourselves to the simplest cases, in which new species have completely random phenotype. Clearly, it is an idealized assumption and a different rule for introducing new species may alter the story as well. Actually, it is known that qualitatively different results are obtained for the "mutation" Tangled-Nature models [28]. In these models, a species represented by an L-bit genome sequence may mutate to another species by flipping one of the bits. The models yield a bursty time series characterized by 1/f -fluctuations and powerlaw species-lifetime distributions. It would be interesting to simplify these mutation models as in the DG model in order to understand the origin of the burstiness. Other models have also been studied that show SOC states while allowing the system size N to change [41][42][43]. Each of these has its own rules for introduction and eliminations of entities, and a unified understanding is still lack-ing. Further studies in this direction are expected, and we believe the present study provides a foundation for better understanding of more complex models.

ACKNOWLEDGMENTS
YM acknowledges support from MEXT as "Exploratory Challenges on Post-K computer (Studies of multi-level spatiotemporal simulation of socioeconomic phenomena)" and from Japan Society for the Promotion of Science (JSPS) (JSPS KAKENHI; grant no. 18H03621). PAR gratefully acknowledges hospitality at the RIKEN Center for Computational Science, and in the Department of Physics, Graduate School of Science, University of Tokyo.