Progress to extinction: increased specialisation causes the demise of animal clades

Animal clades tend to follow a predictable path of waxing and waning during their existence, regardless of their total species richness or geographic coverage. Clades begin small and undifferentiated, then expand to a peak in diversity and range, only to shift into a rarely broken decline towards extinction. While this trajectory is now well documented and broadly recognised, the reasons underlying it remain obscure. In particular, it is unknown why clade extinction is universal and occurs with such surprising regularity. Current explanations for paleontological extinctions call on the growing costs of biological interactions, geological accidents, evolutionary traps, and mass extinctions. While these are effective causes of extinction, they mainly apply to species, not clades. Although mass extinctions is the undeniable cause for the demise of a sizeable number of major taxa, we show here that clades escaping them go extinct because of the widespread tendency of evolution to produce increasingly specialised, sympatric, and geographically restricted species over time.

The most direct expectation of weak directionality is that clades should become rich in specialised species as time passes. Hence, under this theory, we predict that most species in the late phase of clade evolution should have traits typical of specialists, such as small range size and high degree of sympatry 17,18,29,30 . Given high degree of sympatry and reduced range size are presumed to depress diversification, according to this model there should be some point in time where both the regime of range size evolution and the diversification process shift (Fig. 1). We tested this hypothesis by locating statistically significant shift points in the total range size, degree of sympatry, and net diversification rate curves. We then tested four hypotheses consistent with the early/late phases scenario to assess whether 1) our data support the existence and temporal coincidence of total range size, degree of sympatry, and diversification shifts, 2) the degree of sympatry increases and the net diversification rate decreases after the shifts, and 3) the average species range size decreases after the shifts (see Fig. 1) and 4) the degree of sympatry is negatively correlated to speciation rate and positively correlated to extinction rate, which would indicate the link between specialisation and the decrease in diversification.
To test these predictions we collected from the Paleobiology Database (https://paleobiodb.org/#/ on 2/10/2016) the fossil occurrence data on 21 extinct animal clades belonging to five different phyla (Cnidaria, Mollusca, Brachiopoda, Arthropoda, and Bryozoa, see supplementary information). The data included 14,430 The total (green solid line), average (blue solid line), and clade (gold solid line) range size curves for the focal clade. The total range curve is computed as the algebraic sum of individual species range sizes over time. The average range curve is computed dividing the total curve for the number of species present in each time bin. The clade range curve represents the range actually occupied by the entire clade, summed over consecutive time bins. According to weak directionality theory predictions, after the shift point (vertical light blue line) the total-and the clade range curves should diverge signficantly over time, as an effect of a progressive increased range overlap (sympatry). The area test (B) is devised to test such prediction. As species range sizes are expected to decrease, on average, after the shift points, the average range curve should take a negative slope after the shift (C). The slope test is devised to test such prediction.
For all of the analysed clades, we first computed the range size of each species per time bin, and the range size of the entire clade per time bin, which represents the union of individual species ranges (Fig. 2). We then summed individual species ranges within each time bin and then over consecutive time bins, to produce a "total range curve". The use of cumulative range values, rather than time bin data, is appropriate as it smooth off unequal sampling and allows calculating effectively changes in the regime of geographical evolution of clade (see below). The (slope of) total range curve plotted versus time indicates the velocity of range size accumulation at the level of clade. It is equivalent to the average size of species ranges times species richness cumulated over all time bins. The total range curve is best fitted by either sigmoid, or generalized logistic curves, while the linear model is rejected for all the examined clades. This indicates that the increase in total range size slows down towards the recent, according to saturation dynamic (Table 1). To quantify the degree of sympatry (Figs 1B and 2), we started by summing the geographic range of the entire clade over consecutive time bins. This "clade range curve" is different from the total range curve in that it depends on how much individual species ranges do overlap (for instance, if two species have range = 1 km 2 and do perfectly overlap the total range curve will be 1 + 1 = 2 km 2 , while the clade range curve will be 1 km 2 , Fig. 2). Then, we computed the area between total and clade range curves per unit time, under the specific hypothesis that the area difference between the two curves should be larger after the shift point, thereby indicating a higher degree of sympatry since (Fig. 1B). Eventually, we tested how often the difference between the two curves tends to increase after the shift points across clades, in keeping with our hypothesis 2, by means of the binomial distribution.
The net diversification rate was computed starting from the fossil record as the expected number of speciation/extinction events per lineage per Myr. Finally, we computed a second measure of the degree of sympatry at the level of bin. For each such bin, we took the ratio between the total range size (summed algebraically over all species in the bin) and the clade range size in the focal interval. This ratio represents the degree of overlap among individual species ranges. We assessed whether changes in the degree of sympatry throughout a clade's history significantly correlates with temporal variation in speciation and extinction rates. To this aim, we fitted birth-death models in which speciation and extinction rates respond to changes in sympatry by means of an exponential correlation with parameters γ λ and γ μ , respectively 11 .
Shift points in degree of sympatry among species, total range size and net diversification rate are statistically closer in time to each other than expected by chance in 20 out of 30 cases, and for 16 out of 21 clades (Table 2). Both figures are statistically different from chance according to the binomial distribution (Table 3), indicating that the existence and temporal coincidence of shift points are robust. We then took the average ages of the three shifts to get a single shift point, and tested hypotheses 2 and 3. In keeping with our predictions (hypothesis 2) the degree of sympatry increases after 28 out of 30 shift points. Even after excluding the earliest third of clade evolution (when a high degree of sympatry is expected because in a diversifying clade species tend to place close to each other on Earth's surface 5 ) we recover the same pattern. The increase is temporally coincident with shift points (Table 3).
Then, we tested the prediction that species average range size decreases after the shift points (hypothesis 3), by dividing the total cumulative range curve by the number of species present in each time bin. This allows testing the evolution of species average range size over time (Fig. 1A,C). After the shift points this curve has a slope significantly different from zero fifteen times, 12 of them being negative (i.e. the species average range size decreases towards the recent, Table 3), and in eight different clades. This is consistent with the idea that species after the shifts tend to be small-ranged and therefore specialists. Yet, after nearly one half of the averaged shifts (16/31, 51.6%) there was no significant pattern in average range size.
Net diversification rates decrease after the shift points in 28 out of 30 cases and for all of the clades, thus supporting our predictions.
Finally, as expected the extinction rates are positively correlated with the degree of sympatry in 12 clades out of 21, and the speciation rate is negatively correlated to sympatry in 13 clades (Table 4). Overall, 16 clades out of 21 show either decreased speciation or increased extinction as the degree of sympatry increases, in keeping with our prediction that sympatry (as a consequence of specialisation) depresses diversification to drive the clade extinct.
Overall, our results indicate that the distinction between an early and a late phase of clade evolution is useful, that the net diversification rate decreases consistently during the late phase, and that mainly specialist species, having high degree of range overlap to each other (sympatry), make up the majority of clade biodiversity after the shift points.
We repeated all of the analyses excluding species with less than 10 total occurrences, in order to rule out the possibility that what we perceive as rarity, is in fact lack of preservation. The results are available as supplementary information. On such a reduced dataset, we located 22 shift points for seventeen clades. The shift points are statistically significant in 19 out of 22 cases (86.4%), and net diversification rate is always lower after than before the shifts. Yet, the degree of sympatry after the shift is higher than before only 9 times, and the average range size is significant and negative just one time. Taken at face value, these latter results are not supportive of weak directionality. Yet, it must be noted that by excluding rare (or otherwise poorly sampled) species from the dataset, we effectively removed those species whose effect on clade range size evolution we were seeking to test.

Specialisation as the force promoting both the progress and death of clades
Weak directionality theory derives from the idea that natural selection incessantly fine-tunes species to their environment. Specialisation is an almost naïve consequence of such tinkering. Its payoff is evident in the short  9 . While this has long being noted when dealing with the fate of individual species, the impact of such a trap on the history of clades is massive. We show that the degree of geographic range overlap among species increases as clades grow older (and most specifically after the shiftpoints). In modern ecosystems, it was shown that niche specialisation promotes sympatry 17,18,30 . Hence, this result points to a non-random pattern of the incidence of specialized species over time.
We proved a direct link between sympatry and diversification, which taken together with the notion that sympatry tends to grow over time during the history of clades, is further, crucial evidence, that specialisation undermines clade survival in the long run. We expected sympatry to impact negatively on speciation rate, and positively on extinction rate, but only during the late phase of clade evolution (i.e. after the shiftpoints). In this regard, it is  Table 2. Average age of the shiftpoints (in Ma) and the percentage of time since clade inception to the shiftpoints. The probability that the distance among shift points is lower than expected by chance (p.dist). The ratio of the percentage of area per unit time between the total range and the clade range curves after the average shiftpoint as compared to the same figure before the shiftpoint (see Fig. 1 for further explanation)(area.test). The probability that net diversification rates decrease after the shiftpoint (net.t). The slope of the regression between the average species range and time after the shiftpoint (slope) and the probability that such a regression slope is different from zero (p.slope). Values in bold indicate compliance to weak directionality theory.  especially interesting to note that we found sympatry to correlate negatively to diversification rate 12 times, and positively only 3 times. As per speciation rate, we retrieved 13 negative correlations to the degree of sympatry, and only 4 positive correlations. Overall, this suggest that sympatry alone could be enough to explain the common observation that origination rate decrease over time for purely biological reason, even when diversity is low 4,31,32 .
The slow-down in geographic expansion that clades experience over time suggests that this mechanism is especially important when the available space for allopatric speciation becomes limited 33 , which our data suggest should happen after the shift points. Obviously, the net diversification rate decreases towards the present in all of the clades we analysed. Yet, what is important is that we found this decline is temporally coincident with the increased level of sympatry. Patterns of decreased diversification towards the present have also been frequently inferred from phylogenies [34][35][36] and explained in that context as the product of limited opportunities for speciation as clade history progresses towards the present 37,38 . However, our data suggest that increased habitat specialisation, hence higher sympatry are fundamental aspects of such trend in diversification. Our model is inherently simple and based on ideas that have roamed through the scientific literature for decades. The concept of weak directionality, for example, is directly based on Edward Cope's law of the unspecialised 13 (i.e. the idea that clade start with small generalist species to end up with large-bodied specialized types). We believe that the main obstacles to their further development until now have been the difficulties of estimating changes in clade range size from fossil data, the lack of adequate fossil databases, and a certain reluctance to use any concept of progress (i.e. strong directionality) in modern evolutionary thinking 39 . Sewall Wright's and George G. Simpson's concepts of the evolutionary landscape 7,40 are consistent with weak directionality theory. The landscape metaphor emphasizes the trend towards higher-fitness morphologies and genotypes as drivers of weak directionality. Here we have shifted the focus from how species evolve in coexistence by specialising on particular resources/environmental conditions as species richness builds up to the long-term, macroevolutionary effect of this process. Specialisation has often being envisaged as an evolutionary trap 9 but its potential contribution to clade extinction has been largely neglected since Cope first identified the link between specialisation and clade demise in his law of the unspecialised.
Liow 41 found that long-lived crinoid taxa are less specialised in morphology than more derived types (but see ref. 42). This bolsters the idea that specialisation increases extinction risk, and would support our interpretation here if such long-lived taxa are also stratigraphically older. Evidence for such a pattern comes from a study of ours on placental mammals 14 where we demonstrated that specialised types are geologically younger than their relatives. Statistical modelling of the evolution of specialisation further support the idea that natural selection on local adaptation and habitat choice always leads to specialists, implying the latter appear later, on average, than generalists 43 .
We will not push our model so far as to say that all of the clades have to follow the path it predicts. According to the weak directionality theory, after the shift point the geographic evolution of clades proceeds towards a  Table 4. Correlation between the degree of sympatry and speciation and extinction rates. Posterior sampled of the correlations parameters are summarized as mean values and 95% credible intervals (CI). The correlation parameters γ λ and γ μ quantify the correlation between temporal changes in the birth-death rates and changes in the degree of sympatry. For instance, the speciation rate at time t is λ t = λ 0 exp(γ λ s t ), where λ 0 is the estimated baseline speciation rate and s t is the degree of sympatry at time t (ref. 11). Values in bold indicate significant negative correlation with speciation and significant positive correlation with extinction rates.
condition where a few, highly-sympatric, species coexist within a relatively small range. With the exception of the prediction for reduced average range size after the (mean) shiftpoint, we found complete adherence to our predictions for 15 out of 21 clades (binomial distribution, p = 0.026, Table 3). This is especially robust considering that a number of clades went extinct during a mass extinction, which means their natural course towards extinction was abruptly interrupted by something unrelated to range size dynamics within clade of any other biological attribute 10 , and the effect that specialisation had upon them. Although this dynamic is true for most of the clades we analysed -rhynconelliform brachiopods (Strophomenida, Spiriferinida, Orthotedida, and Productida), tabulate corals (Favositida), stenolaemate bryozoans (Fenestrida, Cystoporida, Trepostomida, Rhabdomesida), pteriacean bivalves (Pterineidae), proetid trilobites (Proetida), murchinsonid (Lophospiridae), eogastropod (Euomphalidae) and bellerophontid gastropods (Bellerophontida), and ammonoid ammonites (Desmoceratida) -there are also clades whose total range size keeps growing after the shift points, since the number of species remains high then -Auloporida (tabulate corals), Rhabdomesida (stenolaemata bryozoans), Strophomenida and Orthotetida (rhynconellid brachiopods). Not surprisingly, these are all clades whose species richness was still high in the late phase of their evolution (see supplementary material for the geographic and net diversification rate paths of individual clades). It must be emphasized that whereas the evidence for increased sympatry and reduced diversification after the shift points is very robust (pertaining to more than 90% of the shifts), the comparable figure for the expected average range size decrease is extremely weak, as the results conform to the expectations only in 12 out of 30 shift points (40% of the cases). Indeed, the range size frequency distribution is usually right-skewed within clades 44 . This means that (relatively) large-ranged species are always expected to occur within a clade. In addition, after the shift point the decrease in species diversity implies that surviving species probably have the opportunity to occupy the range they left vacated. If the clade range size does not decrease in the latter bins, such dynamics would imply that average range size would not decrease in many cases. Thus, expectation of reduced average range size for species over time is too simplistic 45 .
Most of the clades mentioned so far conform to the predictions of the weak directionality model: they show high levels of sympatry, small average range size, and small total range size after the shift points. A manifold of taxa, though, show small total ranges, small average range and small degree of sympatry. These clades survived what according to the model should have been their final extinction moment because of species having disjunctive ranges. Thus, although rare, there are cases of clades whose species become rare overall in their former ranges, rather than becoming restricted to sympatry in a small residual range.
Although potential artifacts may result in waxing and waning of species ranges and diversity at the clade level 46 , our empirical data sets differ from the predictions of random models of clade evolution, in conforming to the predictions of higher levels of sympatry, hence high incidence of specialization, occurs during the late portion of a clade existence.
Weak directionality theory provides a consistent and widely applicable explanation for the extinction of clades. They diversify and survive because individual species become more and more specialised over time. Yet, the very reason for clade success also contains the seed of their decline. The path to extinction is neither simple nor monotonic. We identified more shift points than clades, implying that clades may survive moments of crisis. It will be interesting to know how phenotypic diversity evolves during clade existence, in order to understand whether phylogenetic or developmental constraints on evolvability might prevent clades from escaping the evolutionary trap that weak directionality represents 47,48 .

Methods
We retrieved from the Paleobiology Database (https://paleobiodb.org/#/) data on fossil occurrences for 58 clades of marine invertebrate species selecting the fields occurrences, "advanced options", taxonomic resolution: "species", Plate: "Scotese", output options: "collection", "coordinates", "methods", "paleolocation" (see supplementary information for full details). Each data point includes the paleocoordinates and the (estimated) minimum and maximum age of the fossil localities. Data pertain to five different animal phyla (Arthropoda, Brachiopoda, Bryozoa, Cnidaria, Mollusca) and cover some 480 million years of the fossil record. Overall, the database includes ca. 21,000 species. The fossil record of individual clades was divided in equal-length time bins. The length of such intervals was clade-specific, meaning that we applied bins of different lengths as to maintain as many bins as possible while avoiding producing bins containing less than three species with at least three occurrences per species, which is the minimum requirement to calculate a species range size estimate. We further removed the species and genera that lack a continuous stratigraphic range and dubbed with uncertain taxonomic classification (e.g. sp., cf.). After applying these selection criteria, we were left with 21 clades, including 14,431 species and 84,457 occurrences overall.
Species and clades range size estimation. For each species in each time bin, we estimated the range size (in km 2 ) by computing its Minimun Convex Polygon (MCP 49 ), under the software Quantum GIS 2.10 50 . Because of MCP represents the extent of occurrence of a species, land portions could be included in a marine species' geographic range erroneously. In this very case, we removed land portions from the MCP by using the digitized version of 19 world maps displaying the reconstructed position of landmasses and seas (http://cpgeosystems.com/ index.html) in the temporal interval from 550 to 70 Myr. In particular, when we had species geographic extent ranging less than 180 decimal degrees in longitude, we used the Lambert Azimuthal Equal Area projection. In all other cases, when we had species extent exceeding 180 decimal degrees of longitude and included within + − 60 latitudinal degrees, we used Mollweide Equal Area projection. If the longitudinal extent exceeded 180 decimal degrees and latitudinal extent exceeded 60 decimal degrees north or south to the equator, we then used the Albers Equal Area projection. In the very special cases not considered in the above criteria we splitted the polygons in Scientific RepoRts | 6:30965 | DOI: 10.1038/srep30965 order to have different regions to be projected by means of Lambert Azimuthal Equal Area projection and then summed the areas computed for every single polygon portion to get the original species polygon area.
After computing individual species ranges, we summed them to get the total range in the bin. Time-bin total species ranges were then summed over consecutive bins to get the total range curve (Fig. 1A, green line). For each bin and clade, we also computed the clade range (the range effectively occupied by the entire clade in the focal bin, Fig. 2). Clade ranges were summed over consecutive bins as well, to get the clade range curve (Fig. 1A,B gold line). The difference between the actual and the total ranges per bin depends on how much species ranges overlap to each other (i.e. on the degree of sympatry).
Finally, we divided the total range per bin by the number of species present in that bin. This gives the average species range size (Fig. 1A,C blue line) of the species of that particular clade during that particular time bin.
Diversification rate analyses using PyRate. We analyzed fossil occurrence data for each clade using the program PyRate 51,52 , which provides a joint Bayesian estimation of the preservation process and diversification dynamics. We modeled fossil preservation by a non-homogeneous Poisson process with rate estimated from the data and expressing the mean number of expected occurrences per lineage per time unit (in this case 1 Myr). Under the PyRate framework, the preservation process is used to infer times of origination and extinction (when applicable) of all lineages, which represent the result of an unknown underlying birth-death process. The parameters of the birth-death process, speciation and extinction rates, represent the expected number of speciation/ extinction events per lineage per Myr, and are estimated from the data along with the times of origination and extinction and preservation rate. We ran 10,000,000 Markov Chain Monte Carlo (MCMC) iterations to obtain posterior estimates of the parameters. We assessed the presence of significant shifts in speciation and/or extinction rates through time, using the birth-death MCMC algorithm 52 . Based on the estimated number of shifts and their temporal placement, we obtained marginal posterior distribution of speciation, extinction, and net diversification rates calculated within 1 Myr time bins through the lifespan of each clade. Finally, after discarding the initial 2,000,000 iterations as burnin, we summarized rates through time by calculating mean and 95% credible intervals from the posterior samples, and used them to produce rates-through-time plots.
Statistical testing of weak directionality theory. Shiftpoints estimation and testing: We identified shift points (in time) by using the cross-entropy method, which applies the Bayesian information criterion to locate significant changes in the trend along the net diversification, total range, and clade range curves 53 . Then, we computed the mean age distance among the three. We compared such mean distance to a family of random mean age distances, to test whether the sum of the time distances among them was smaller than expected by chance, which would imply the shift points are statistically coincident in time. To perform such comparison, we sampled at random 9,999 times two ages from time-bins midpoints (the degree of sympatry and cumulative total range curves are computed per time-bin), and one age from the net diversification rate, 1-my long sample. Then, we calculated the mean distance among the three random time points. Eventually, we counted how many times the random mean time distance is higher than the real distance between breakpoints, to calculate the p-value for the hypothesis that real time distances are smaller than random means.
Testing the increase in degree of sympatry after the shift point: After locating the shift points, we first tested whether the degree of sympatry increases after them (the area test, see Fig. 1B). To this aim, we computed the area difference (i.e. the area between the two curves) between the cumulative total and the cumulative actual ranges, both for the total time intervening from clade birth to the shift point, and from the shift point to the moment of clade death. Then, we divided each area differences to the corresponding lengths of time. This gives two area differences per unit time. The ratio of the two differences, expressed as a percentage of the unit area difference after the shift point to the unit area difference before the shift point, indicates whether the degree of sympatry either increases or decreases towards the present, so that a value < 100 points to a decrease, and a value > 100 to an increase, in the degree of sympatry over time.
Testing for average range size reduction after the shift point: According to weak directionality theory, the species average range size should decrease after the shift points. To test such a prediction, we regressed species average range size against time, selecting only those time bins more recent than the shift points. A significant and negative regression slope would indicate the expected trend towards range size reduction (slope test, see Fig. 1C).
Testing for net diversification rate decrease after the shift point: We partitioned the per-million year net diversification rate, in rates before and rates after the shift points, and applied to the two samples Wilcoxon's (Mann-Whitney) one-sided test, to asses the hypothesis that net rates significantly decrease after the shift points.
Testing for the correlation between diversification and the degree of sympatry: We used a time-varying birth-death model to investigate the intensity and significance of correlations between speciation and extinction rates and the degree of sympatry 52 . The model is implemented in PyRateContinuous (https://github.com/dsilvestro/PyRate) and assumes an exponential correlation between sympatry and the birth-death rates, parameterized by correlation parameters γ λ and γ μ , which are estimated in a Bayesian framework. A posterior estimate γ > 0 indicates positive correlation between sympatry and the speciation (or extinction) rates, and γ < 0 indicates negative correlation. We considered the correlations as significant when 0 did not fall within their 95% posterior credible intervals. We used the times of origination and extinction of each species estimated by PyRate (see above) as input data for the PyRateContinuous analyses, and ran 1,000,000 MCMC iterations sampling every 1000 th , to obtain posterior samples of the correlation parameters and baseline speciation and extinction rates. We calculated the posterior means and 95% credible intervals of the correlation parameters after removing the first 200 samples as burnin.