A simple model for skewed species-lifetime distributions

A simple model of a biological community assembly is studied. Communities are assembled by successive migrations and extinctions of species. In the model, species are interacting with each other. The intensity of the interaction between each pair of species is denoted by an interaction coefficient. At each time step, a new species is introduced to the system with randomly assigned interaction coefficients. If the sum of the coefficients, which we call the fitness of a species, is negative, the species goes extinct. The species-lifetime distribution is found to be well characterized by a stretched exponential function with an exponent close to 1/2. This profile agrees not only with more realistic population dynamics models but also with fossil records. We also find that an age-independent and inversely diversity-dependent mortality, which is confirmed in the simulation, is a key mechanism accounting for the distribution.


Introduction
The dynamics of ecosystems in evolutionary timescales has attracted much attention since it is a well-known and good example of understanding complex open systems, where various interactions among species may cause collective behavior [1,2]. The development of theoretical models will help us to better interpret fossil records and further understand the mechanisms of evolutionary processes.
In this paper, we focus on the distribution of the lifetime of species. The lifetime of a species is the duration from its emergence to its extinction. The distribution of families (or genera) estimated from fossil records exhibits an interesting profile: it is neither a simple exponential nor a clear power law in the sense that it is convex on a log-log plot and concave on a semi-log plot [3]- [5] (also see figure 1). This implies that the distribution has a heavier tail than that of an exponential function but decays faster than a power law.
There are several theoretical explanations for lifetime distributions. The red-queen hypothesis states that environmental changes are so rapid and extreme that older species cannot obtain evolutionary advantage over a younger species; hence the mortality should be independent of time. This leads to an exponential function for the species-lifetime distribution: exp(−t/τ )/τ [6]. Another group of simple theories supports power-law distributions. Assume that the fitness of each species (which can be the number of individuals, or any other measure of the distance from extinction) follows a neutral random walk in one-dimensional space. Then the lifetime is the first return time of a random walk, which obeys the t −3/2 power-law distribution for large t. If the dynamics of a species is a critical Galton-Watson branching process [7] instead of a random walk, the return time distribution shows a power law with an exponent −2. A more general stochastic process predicts power laws with exponential cutoffs [8]. In addition to these stochastic processes, the so-called 'self-organized criticality' ('SOC') models predict power laws [1,9].
In addition to these simple models, population dynamics models are also proposed with the aim of bridging the differences in ecological and evolutionary descriptions. These include the tangled-nature models [10,11] and their variants [12]- [16], the web-world model [17,18], the scale-invariant model [19,20] and others [21,22]. All those models have population dynamics and rules for emergence and extinctions of species.
Recently, models with random migration rules were found to present the skewed specieslifetime distributions for various functional forms of population dynamics [16,19]. In these models, migrations are modeled as additions of new species whose interaction coefficients are randomly assigned from certain distributions irrespective of resident species. The resulting species-lifetime distribution shows a curved line that lies somewhere between a power law and an exponential distribution: concave on a log-log plot and convex on a semi-log plot. It is worth noting that this skewed profile seems to be observed universally with respect to the functional forms of population dynamics and model parameters. Although these models have different numbers of species and various community structures, they all share similar profiles of specieslifetime distributions.
While skewed profiles are observed not only for several population dynamics models but also for fossil records, a mechanism that gives rise to such a skewed profile is still unclear. The universality of the profile suggests that the underlying mechanism should be captured by a simple model. For this reason, we propose a further simplified model that reproduces the skewed profile, and we then analyze it in the following.   [4]. The original data contain the list of presence of families (present, absent or unknown) in each geological era. Circles (a) and triangles (b) represent the estimates where unknown cases are assumed as present and absent, respectively. (c-e) Species-lifetime distributions for (c) the scale-invariant model [19], (d) the tangled-nature model A and (e) the tangled-nature model B [16] with random migration rules. The data for mathematical models are shifted in the horizontal and vertical directions so that they collapse onto the fossil data. (f, g) Fittings (f) by a q-exponential function (equation (1)) with τ = 33.0 and q = 1.23 and (g) by a stretched-exponential function (equation (2)) with τ = 10.0 and β = 0.5.

Model
A community in our model is organized as successive migrations and extinctions of species. Each species has a state variable called 'fitness', which depends on interactions of that species with other species. The interaction of species i with species j is described by the coefficient a i j , and the fitness of the ith species is defined as the sum of all interactions the species i has, given by f i = j a i j . The species i can survive if f i is larger than or equal to zero. If the fitness is negative, that species becomes extinct and is eliminated from the system. Although the assumptions here are similar to those suggested in the model used in [23], our model accommodates changes in the number of species and suggests much simpler rules for driving evolution.
At each time step, a new species migrates into the system, which may cause successive extinctions. We take the following procedure for each time step. Firstly, a new species is added to the system. For each new interaction coefficient, we assign a nonzero random value from the Gaussian distribution with mean 0 and variance 1 with probability c and 0 with probability 1 − c. Coefficients for intra-species interactions a ii are zero. Secondly, after the migration, fitness of each species f i is calculated and a species with the minimum fitness is identified. If the minimum fitness is non-negative, no extinction happens at that time step. Otherwise, the species with the minimum fitness is eliminated from the system (extinction), and f i for all species are recalculated and another species with the minimum fitness is identified again. We repeat this process until all fitness variables f i become non-negative. Figure 2 illustrates an example of this process.
Fitness of a species changes only by migrations and extinctions of other species. In a migration event, a change in f i is caused by the a i j assigned between the species i and the new entrant. Such changes follow the Gaussian distribution with mean 0. In an extinction event, a change in f i is caused by a loss of a i j between species i and the species going extinct. The mean of lost a i j is expected to be positive since the sum of all interaction coefficients held by resident species must always be positive. In our simulation, the average of all interaction coefficients a i j of the emerging communities is confirmed to be slightly positive. An extinction of a certain species reduces the fitness of other species on average, and the dynamics of species' fitness can no longer be described by a neutral random walk. Figure 3(a) shows a temporal dynamics of the number of species, N for c = 0.2. It fluctuates in a finite range with its average 61.7. As shown in figure 3(b), the probability density function of N peaks at around 50 and decays exponentially for large N . The power spectrum for this time series is calculated and found to be approximated by 1/ f 2 with saturation at low frequency, which indicates that the evolutionary process reproduced by our model is an Ornstein-Uhlenbeck process. We next examined the distribution of extinction sizes, which is the number of species that go extinct at a single time step. The distribution exhibits a simple exponential form, which is different from the power laws typically obtained from the SOC models and similar to the one obtained from the population dynamics models with random migrations [16,20]. It is confirmed that the statistics discussed below does not change qualitatively against various values of c, although the characteristic number of species and the timescale change. Thus, we show the results only for c = 0.2 in the following. The species-lifetime distribution is shown in figure 4. One of the fitting functions proposed to account for this skewed profile is a q-exponential function [4],

Result
where q is an exponent to characterize the skewness and τ is a characteristic timescale at which p(t) crosses over from the exponential function to the power law t 1/(1−q) . This function is naturally derived from a simple age-dependent mortality function. If the mortality function m(t) is an inverse of a linear function of age, a q-exponential species-lifetime distribution is naturally derived. Another candidate of fitting is a stretched exponential function, where β and τ are fitting parameters. This function is often used phenomenologically to describe relaxation dynamics in disordered materials such as glasses [24]. Among the three candidates shown in the figure, the stretched exponential function fits best, while the q-exponential function also fits reasonably well. The exponent β for the stretched exponential function is approximately 1/2. The fitting parameter q for the q-exponential distribution is 1.16, which is smaller than the value estimated from fossil records, 1.23.

Discussion
While both stretched exponential and q-exponential functions exhibit a 'skewed' profile, the pictures these distributions stem from are different. A q-exponential function originates from a simple age-dependent mortality function [4]. In this picture, long-living species are expected to have advantageous features to survive. If a species survives for a longer duration, that species has a higher survival probability than other species surviving for a shorter duration.
On the other hand, a stretched exponential function is typically obtained by a linear combination of exponential functions with different decay times. If we assume that the dynamics of N is characterized by an asymmetric random walk, a stretched exponential function with exponent 1/2 is naturally obtained. Let us denote the probability of N becoming N + 1 by p  Fitting curves are also shown with a stretched exponential function, a qexponential function and a simple exponential function. Fitting parameters and the reduced chi-square σ 2 (the degrees of freedom is 2988) for the stretched exponential function are β = 0.576, τ = 36.8328 and σ 2 = 17.67. For the qexponential function, we obtain q = 1.16 and τ = 93.2 with larger deviation from the data: σ 2 = 123.0. For the simple exponential function, which has a smaller number of parameters and hence its degrees of freedom is 2989, we have the worst fit σ 2 = 430.3 with the fitting parameter τ = 134.622. In the inset, a fitting with a simple exponential function with τ = 350 is also shown as a guide to the eyes. and becoming N − 1 by 1 − p. Here we assume p < 1/2 and N is bound to be positive. In addition, we assume that p is independent of N and that every species goes extinct with an equal probability irrespective of their age. Such an asymmetric random walk model yields an exponential decay for probability density functions of N . Independence of p from N means that the extinction probability of each species living in a community of N species is proportional to 1/N ; hence the characteristic time to extinction τ is proportional to N . A species-lifetime distribution is then expressed as where K n (x) is the modified Bessel function of the second kind and b is a coefficient that depends on p. We call this set of assumptions in the above discussion the 'modified red-queen hypothesis', since it assumes an equal mortality for all the species in each moment. In order to investigate which picture is more reasonable for the proposed model, the mortality of each species is estimated from the simulations. Using the communities obtained from the migration-and-extinction process, the probability that each resident species goes extinct in the next time step (mortality) is calculated by adding a test immigrant many times (100 times in our simulation). The relationship between mortality and age of a species is shown in figure 5 for different numbers of species. For all N , mortalities show a sharp decrease at t ∼ 0 and then converge to a constant value for t 1. Mortality in the large t regime shows significant dependence on N , but not so much on t. And the mortality is approximately proportional to 1/N as shown in figure 5(b). These are consistent with the modified red-queen hypothesis: extinction risk of a species in the model is almost independent of its age and inversely related to the number of species of a resident community. Note also that this modified red-queen hypothesis is a good approximation, but not an exact reduction of the model. We can find a deviation in the fitting of simulation results with a huge number of samples, which appears as the large values of reduced chi-square.
The analytical treatment for the current model has not been obtained. The difficulty comes from the fact that obtaining a skewed distribution for species lifetime requires not only dynamics of each species' fitness but also the fluctuations of N . In a model with a constant N in which only one species with the minimal fitness is removed at each time step, the species lifetime shows an exponential distribution. The graph model we study in this paper is so simple that most of the ecological details are omitted, as in Hubbel's neutral theory [25]. In spite of the absence of population dynamics, the model shares a similar species-lifetime distribution with several population dynamics models [16]. This fact makes us speculate that a further fundamental mechanism might be shared by the graph model and population dynamics models. We believe that models with a more complex population dynamics model will show a similar skewed profile.
In our model, new species are assumed to have completely random phenotype. This is also an idealized aspect of the model. Since the empirical data shown in figure 1 are collected at the level of families, we speculate that phenotypic correlation between resident and new species 8 may be irrelevant. Effects of phenotypic correlation may alter the profile of species-lifetime distributions. The understanding of these effects on species-lifetime distributions is still open.
In summary, we proposed a simple model to explain a skewed species-lifetime distribution observed in the fossil record and population dynamics models. The skewed profile is fitted well by a stretched exponential function with exponent 1/2. The mortality of a species is found to be independent of age and inversely dependent on the number of species. Therefore, the skewness found in the lifetime distributions is compatible with the modified red-queen hypothesis. We believe that such a skewed profile will also appear in other fields of study such as sociology or economics because features of this model are not necessarily biology specific. Similar lifetime distributions are already observed in other empirical data [26], while data showing simple exponential functions are also common [27].