The origin and evolution of coral species richness in a marine biodiversity hotspot

The Coral Triangle (CT) region of the Indo‐Pacific realm harbors an extraordinary number of species, with richness decreasing away from this biodiversity hotspot. Despite multiple competing hypotheses, the dynamics underlying this regional diversity pattern remain poorly understood. Here, we use a time‐calibrated evolutionary tree of living reef coral species, their current geographic ranges, and model‐based estimates of regional rates of speciation, extinction, and geographic range shifts to show that origination rates within the CT are lower than in surrounding regions, a result inconsistent with the long‐standing center of origin hypothesis. Furthermore, endemism of coral species in the CT is low, and the CT endemics are older than relatives found outside this region. Overall, our model results suggest that the high diversity of reef corals in the CT is largely due to range expansions into this region of species that evolved elsewhere. These findings strongly support the notion that geographic range shifts play a critical role in generating species diversity gradients. They also show that preserving the processes that gave rise to the striking diversity of corals in the CT requires protecting not just reefs within the hotspot, but also those in the surrounding areas.

The flora and fauna of the Coral Triangle (CT; Allen and Werner 2002; Fig. 1A) have fascinated scientists since Alfred Russel Wallace (1863), and the surrounding oceans are widely recognized as a major marine biodiversity hotspot (Allen 2000, 2008; Renema et al. 2008). This region contains the highest diversity of corals and many other marine groups (Carpenter and Springer 2005; Kerswell 2006; Paulay and Meyer 2006; Bellwood and Meyer 2009; Veron et al. 2009), with species richness, most of it associated with coral reefs, decreasing away from the CT both latitudinally and longitudinally (Ekman 1935; Briggs 1974; Hoeksema 2007). Although the biodiversity gradients associated with the CT are increasingly well documented (Hughes et al. 2002; Connolly et al. 2003; Karlson et al. 2004), the processes underlying high species richness within the region remain poorly understood (Bellwood and Meyer 2009; Barber and Meyer 2015).
From a macroevolutionary perspective, large-scale spatial gradients in species richness are a function of origination rates, extinction rates, and changes in geographic distributions of species over time (Goldberg et al. 2005; Jablonski et al. 2006; Roy and Goldberg 2007). Simpler models assume that distributions of taxa are static over time and attribute observed differences in richness to differences in origination and extinction rates. For example, Stebbins' (1974) now famous dichotomy of the tropics being a cradle or a museum of biodiversity focuses exclusively on origination and extinction rates and has been a dominant paradigm in analyses of latitudinal diversity gradients (Jablonski et al. 2006; Mittelbach et al. 2007; Schluter and Pennell 2017). However, it is evident that a complete understanding of the processes that shape large-scale biodiversity gradients requires not only estimates of origination and extinction rates, but also an account of changes in species' geographic distributions over evolutionary time (e.g., Jablonski et al. 2006, 2013  et al. 2016). Furthermore, simulation models suggest that failure to take into account range shifts can lead to biased estimates of diversification rates and thus potentially erroneous conclusions about the macroevolutionary drivers of richness gradients (Roy and Goldberg 2007).
Existing hypotheses for high species richness in the CT invoke various combinations of the three macroevolutionary processes mentioned above. The center of origin hypothesis focuses on higher speciation rates inside the CT with subsequent range expansions out of this region (Briggs 1966; Stehli and Wells 1971; Briggs 1974), analogous to the "out of the tropics" model of Jablonski et al. (2006) proposed in the context of the latitudinal diversity gradient. The center of survival hypothesis is analogous to Stebbins' (1974) metaphor of a museum of biodiversity and states that high species richness in the CT is due to relatively low rates of extinction there (Bellwood et al. 2012). In contrast, the center of overlap hypothesis emphasizes the role of range expansions into the region rather than in situ speciation or extinction in generating high CT diversity (Woodland 1983; Jokiel and Martinelli 1992; Hughes et al. 2002; Bellwood and Meyer 2009; Hodge and Bellwood 2016). Finally, the center of accumulation hypothesis attributes higher richness within the CT to a combination of species expanding their ranges into the CT and persisting there over time due to lower extinction rates (Ladd 1960; Paulay 1990; Pandolfi 1992; Bellwood and Meyer 2009). Each of these hypotheses is essentially a statement about the relative contributions of speciation and extinction rates inside and outside the CT along with rates of range expansions into and out of this region (see summary in Table 1). Distinguishing between the scenarios based solely on current species richness, phylogenetic relationships and geographic distributions remains difficult (Goldberg et al. 2005; Bellwood and Meyer 2009; Cowman 2014; Barber and Meyer 2015), but some of the existing hypotheses make stronger testable predictions than others.
Table 1. Biogeographic scenarios associated with the high species richness of the Coral Triangle (CT), and regional rates and species ages predicted from each model.
The center of origin hypothesis makes the clear prediction that species in the CT would, on average, be younger than their relatives outside this region because the age distribution should be dominated by newly evolved species that have had less time to disperse or go extinct (Stehli and Wells 1971; Goldberg et al. 2005; Roy and Goldberg 2007). In contrast, the center of survival hypothesis (Bellwood et al. 2012) makes the opposite prediction about evolutionary age: species in the CT would, on average, be older than their relatives in the surrounding areas (Roy and Goldberg 2007). The remaining hypotheses make less clear-cut predictions about diversification within the CT, but in general imply older species within this area compared to the outside (Table 1). Investigating the relative contributions of origination, extinction and range dynamics in shaping large-scale diversity gradients is best achieved by integrating paleontological and phylogenetic information (Barber and Meyer 2015; Jablonski et al. 2017), but the fossil record of Indo-Pacific corals and other invertebrates remains poorly studied. In the absence of fossil data, fitting dynamic models to distributional and phylogenetic data can allow for estimating each of the regional rates (Goldberg et al. 2005, 2011), which can then be compared with the various "center of" hypotheses for the CT (Table 1).

Model
The CT encompasses almost 30% of the world's reefs (Burke et al. 2012), containing more than 600 species or ~75% of all living reef-forming stony corals (order Scleractinia) (Veron et al. 2009). Despite this strikingly high richness, only 2.1% of the 842 species are endemic to this region (Veron et al. 2009; Hughes et al. 2013) (Fig. 1B). Instead, most species (70.8%) have geographic distributions spanning the CT and its surrounding regions. This pattern also holds for individual groups; for example, among acroporid corals (family Acroporidae, a major scleractinian subclade; Fig. 2) only 3.5% of the 289 species are endemic to the CT, but 78.5% are shared with surrounding regions. High endemism has been interpreted as support for the center of origin model in an analysis using reef fish, with the assumption that endemics evolved in the region and are young (Mora et al. 2003). Under that assumption, the CT is not recognizable as a center of origin for corals. However, endemism is an unreliable proxy for sites of origin and thus, by itself, should not be used to test macroevolutionary models of diversity gradients (Goldberg et al. 2005; Bellwood and Meyer 2009; Cowman 2014; but see Hanly et al. 2017).
With the exception of some earlier studies of reef corals (e.g., Stehli and Wells 1971;Pandolfi 1992), exploration of the macroevolutionary dynamics underlying the CT species richness peak has been dominated by data from reef fish along with some isolated studies of invertebrates (reviewed in Cowman 2014; Barber and Meyer 2015;Evans et al. 2016). Given the ecological and conservation importance of reef-building corals, there is clearly a need to better understand the macroevolutionary underpinnings of the high coral species richness in the CT. Here, we undertake quantitative tests of competing explanations for the CT richness peak based on a phylogenetic tree of living reef corals. To evaluate alternative hypotheses, we first compared species age distributions for present-day CT endemics, outside endemics, and widespread species. We then fit a dynamic phylogenetic model of speciation, extinction, and geographic range shifts to estimate the contributions of each of these processes. Our results do not support the center of origin hypothesis. Instead, they indicate that the remarkable diversity of reef corals in the CT hotspot is driven primarily by range expansions of species that evolved elsewhere.

Phylogenetic trees
We used two sets of phylogenetic trees for our analyses of geographic range evolution. The first comprised 1000 Bayesian posterior trees that were completely sampled for all 1547 scleractinian coral species (Huang and Roy 2015). These trees were modified from earlier coral phylogenetic studies (Huang 2012; Huang and Roy 2013) based on the supertree method (Baum 1992; Ragan 1992). They integrate data from seven mitochondrial DNA loci, 13 published morphological trees (Hoeksema 1989; Pandolfi 1992; Hoeksema 1993; Cairns 1997; Pires and Castro 1997; Wallace 1999; Cairns 2001; Daly et al. 2003; Budd and Smith 2005), available taxonomic information, and fossil node calibrations following Stolarski et al. (2011).
The second set of trees contained only species belonging to Acroporidae for which molecular sequence data were available. In BEAST 1.8 (Drummond and Rambaut 2007;Kumar et al. 2009;Drummond et al. 2012), we performed a partitioned Bayesian analysis with the seven mtDNA genes used above and two additional nuclear DNA markers (internal transcribed spacers 1 and 2, as well as Pax-C; see Table S1). Fossil node calibrations (normal distribution priors) were based on the oldest fossil occurrences of Acropora (Paleocene; 66 ± SD 2 mya) (Carbone et al. 1993;Baron-Szabo 2006), according to Richards et al. (2013), and Acroporidae (late Cretaceous; 100.5 ± SD 2 mya) (Wallace 2012). Five Markov chain Monte Carlo (MCMC) runs of 30 million generations were carried out with a sampling interval of 1000. The first one-third of all posterior trees were discarded as burn-in after verifying MCMC convergence in Tracer 1.6 (Rambaut et al. 2014), and the remaining trees were subsampled to every 100,000 iterations to generate 1000 molecular trees.
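As a quick consistency check on the sampling scheme described above, the following Python sketch works through the arithmetic. It assumes (the text does not say so explicitly) that the one-third burn-in is removed from each run separately and that the thinned runs are then pooled; the function name is purely illustrative.

```python
from fractions import Fraction


def thinned_tree_count(n_runs, generations, sample_every, burnin_fraction, thin_every):
    """Count trees left after burn-in removal and thinning (illustrative helper)."""
    sampled_per_run = generations // sample_every            # trees written per run
    burnin = int(sampled_per_run * burnin_fraction)          # trees discarded per run
    kept_per_run = sampled_per_run - burnin
    step = thin_every // sample_every                        # keep every `step`-th tree
    return n_runs * (kept_per_run // step)


# Values taken from the text: 5 runs, 30 million generations, sampled every 1000,
# first one-third discarded, thinned to every 100,000 iterations.
total = thinned_tree_count(5, 30_000_000, 1000, Fraction(1, 3), 100_000)
print(total)
```

Under these assumptions each run contributes 200 trees, which matches the 1000 molecular trees reported.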

Species geographic ranges
Geographic range information for each species was obtained from Hughes et al. (2013), a database of digitized geographical range limits for 732 Indo-Pacific corals. Additional range information was obtained from the databases of Coral Geographic (Veron 2000; Veron et al. 2009, 2011) and IUCN Red List of Threatened Species (Carpenter et al. 2008). For species not represented in these databases, information was extracted from the Global Biodiversity Information Facility (GBIF; http://data.gbif.org). Ranges of recently described species not covered by the above sources were obtained from the original taxonomic descriptions (see Table S1). Each species was then characterized as being present only within the focal region (see below), only outside the focal region, or present both within and outside the focal region (widespread).
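The three-way characterization of species can be illustrated with a short Python sketch. The ecoregion labels below are hypothetical placeholders, not names from any of the databases cited, and the function is an assumption about how such a classification could be coded rather than the procedure actually used.

```python
def geographic_state(occupied, focal):
    """Classify a species from the set of ecoregions it occupies.

    `occupied` and `focal` are collections of ecoregion labels (hypothetical here).
    """
    occupied, focal = set(occupied), set(focal)
    if not occupied:
        raise ValueError("species has no recorded ecoregions")
    inside = bool(occupied & focal)       # present in at least one focal ecoregion
    outside = bool(occupied - focal)      # present in at least one other ecoregion
    if inside and outside:
        return "widespread"
    return "focal_only" if inside else "outside_only"


focal_regions = {"ct_1", "ct_2"}          # placeholder labels for focal ecoregions
print(geographic_state({"ct_1"}, focal_regions))
print(geographic_state({"ct_2", "red_sea"}, focal_regions))
print(geographic_state({"red_sea"}, focal_regions))
```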
For our main analysis, we used the geographic data of Hughes et al. (2013) as the primary source and the CT as the focal region. To delineate the boundaries of the CT, we followed Veron et al. (2009, 2011), who defined 141 ecoregions (including 16 that comprise the CT) based on distributional data for all coral species. Each of these CT ecoregions reportedly contains more than 500 coral species.
To assess the sensitivity of our results to the exact definition of the focal region and the small number of endemics within it, we repeated our main analysis with an alternative geographic division. In this second division, the focal area was the Central Indo-Pacific (CIP) realm as defined by Spalding et al. (2007). The CIP covers a much larger area, comprising 58 of the 141 ecoregions defined by Veron et al. (2009, 2011), with the CT embedded inside it. An alternative compilation of species ranges was the Coral Geographic (Veron et al. 2009, 2011). In this older dataset, 10 more species were considered endemic to the CT, with four fewer species present only outside the CT. To determine if our results were sensitive to variations in species ranges between this database and Hughes et al. (2013), we repeated our main analysis using ranges from Coral Geographic.

Species ages
As an initial assessment of whether the CT had been a center of origin for reef corals, we compared the ages of CT endemics to those of other reef coral species by extracting the terminal branch lengths from each supertree using the R package ape 3.0 (Paradis 2012). Species ages estimated from a molecular phylogeny will generally differ from ages determined from the fossil record, which are based on the first appearance of a species rather than the most recent reconstructed branching time (Huang et al. 2015). For example, if a widespread parent species produced a narrow-ranged daughter, both lineages would be assigned the same, younger age using a molecular tree (assuming both survive until the present without speciating). Nevertheless, ages under both definitions reflect regional differences in speciation rates in qualitatively the same manner (Roy and Goldberg 2007).
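To make the notion of a tree-based species age concrete, here is a minimal Python sketch that reads terminal branch lengths off a toy tree. The study itself used the R package ape; the edge-list representation, node names, and branch lengths below are invented for illustration only.

```python
def terminal_branch_lengths(edges):
    """Return {tip: branch length} for a tree given as (parent, child, length) edges."""
    parents = {parent for parent, _, _ in edges}
    # A tip is any child node that never appears as a parent.
    return {child: length for _, child, length in edges if child not in parents}


# Toy ultrametric tree: A and B are sister species that split 3 Ma ago,
# and their common ancestor split from C 8 Ma ago.
toy_tree = [
    ("root", "n1", 5.0),
    ("root", "C", 8.0),
    ("n1", "A", 3.0),
    ("n1", "B", 3.0),
]
ages = terminal_branch_lengths(toy_tree)
print(ages)
```

In this toy tree the sister species A and B receive the same age (3.0), which is the behavior described above for a parent and daughter species on a molecular tree.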
We also generated lineages-through-time (LTT) plots (Nee et al. 1992) of species partitioned by their geographic state. For each of the states (endemic to the CT, only outside the CT, and widespread), we pruned the tree to contain only lineages for which extant species have that state. The LTT plots of these pruned trees reflect not only the branching times associated with each lineage's geographic state, but also the number of species pruned away (Nee et al. 1994).
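As a rough illustration of what an LTT curve encodes, the sketch below converts the internal node ages of an already pruned tree into (age, lineage count) points. It assumes a strictly bifurcating, ultrametric tree and does not perform the pruning step itself; the function name and node ages are hypothetical.

```python
def ltt_points(node_ages):
    """Return (age, number of lineages) pairs from internal node ages.

    Assumes a bifurcating ultrametric tree: the root split yields two lineages
    and every later split adds one more.
    """
    ordered = sorted(node_ages, reverse=True)   # oldest split first
    return [(age, i + 2) for i, age in enumerate(ordered)]


print(ltt_points([3.0, 8.0]))
```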

Geographic range model
To examine geographic range evolution and diversification in the CT versus all the other reef ecoregions combined, we applied a geographic state speciation and extinction model (GeoSSE; Goldberg et al. 2011) that expands a previous model of range evolution (Ree et al. 2005; Ree and Smith 2008) to allow inference of region-dependent diversification and range shifts from phylogenetic data. The GeoSSE model belongs to the BiSSE family of models, which compute the likelihood of a phylogeny and tip state data based on an evolving trait that affects the chances of speciation and extinction (Maddison et al. 2007). In our analyses, the GeoSSE model works with three geographic states: at any time, a species may be endemic to the CT, present only outside the CT, or widespread (i.e., present in both the CT and outside). Its parameters allow species to transition among these states, which can affect rates of speciation and extinction (Fig. 3A). The six model parameters are per-lineage rates at which species originate (speciation inside or outside the CT, s_CT and s_out, respectively), go locally or globally extinct (from inside or outside the CT, x_CT and x_out), or expand their ranges from one region to the other (from inside or outside the CT, d_CT→out and d_out→CT). The model assumes that these stochastic rates are constant over time and across lineages, and that the dynamics in the two regions and among species operate independently. At any time in the past, the chance that an endemic lineage had split or gone extinct thus depends only on the region in which it was present. Similarly, each widespread species can speciate in either region or contract its range. The model thus allows the combined processes of speciation, extinction, and range modifications to play out through the history of the clade, and its fit based on present-day data (i.e., phylogeny and geographic states) infers how common each process was per lineage.

Figure 3. (A) Schematic summary showing transitions among the three geographic states induced by processes in the model. (B) Estimates of the regional speciation (s), extinction (x), and range expansion (d) GeoSSE model parameters (analysis #1). Bars and shaded areas mark the 95% credibility intervals of the marginal rate posterior distributions. Gray lines denote exponential priors.
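The state transitions of the six-parameter model can be summarized in a small Python sketch. This is an illustration of the model structure as described in the text, not the Diversitree implementation; the dictionary keys, state labels, and numerical rates are hypothetical, and the assumption that within-region speciation of a widespread species yields one widespread and one endemic daughter follows the usual GeoSSE formulation rather than anything stated explicitly here.

```python
def geosse_events(state, rates):
    """List the events open to one lineage in a given geographic state.

    Each entry is (label, rate, descendant states); an empty tuple of
    descendants means the lineage is lost entirely.
    """
    s_ct, s_out = rates["s_CT"], rates["s_out"]
    x_ct, x_out = rates["x_CT"], rates["x_out"]
    if state == "CT":
        return [
            ("speciation in CT", s_ct, ("CT", "CT")),
            ("global extinction", x_ct, ()),
            ("range expansion out of CT", rates["d_CT_out"], ("widespread",)),
        ]
    if state == "outside":
        return [
            ("speciation outside", s_out, ("outside", "outside")),
            ("global extinction", x_out, ()),
            ("range expansion into CT", rates["d_out_CT"], ("widespread",)),
        ]
    if state == "widespread":
        return [
            ("speciation in CT", s_ct, ("widespread", "CT")),
            ("speciation outside", s_out, ("widespread", "outside")),
            ("local extinction in CT", x_ct, ("outside",)),
            ("local extinction outside", x_out, ("CT",)),
        ]
    raise ValueError(f"unknown state: {state}")


# Arbitrary illustrative rates (per lineage per Ma), not estimates from the study.
example_rates = {"s_CT": 0.1, "s_out": 0.2, "x_CT": 0.05, "x_out": 0.03,
                 "d_CT_out": 0.4, "d_out_CT": 0.8}
total_widespread = sum(rate for _, rate, _ in geosse_events("widespread", example_rates))
print(total_widespread)
```

Note that range contraction of a widespread species reuses the regional extinction parameters, which is why the model needs only six rates.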
Phylogenetic models like GeoSSE are best applied to monophyletic clades, but reef corals within the Scleractinia are paraphyletic because the order also includes deep-water corals (Le Goff-Vitry et al. 2004;Fukami et al. 2008;Kitahara et al. 2010;Stolarski et al. 2011). Consequently, we restricted our model analyses to the monophyletic Acroporidae (289 species, including Alveopora) (Dai and Horng 2009;Wallace 2012) (Fig. 2), which is similar to Scleractinia in its patterns of endemism and species ages ( Fig. 1B and C). To perform a comparable analysis on Scleractinia as a whole would require additional states and rate parameters, complicating the model fitting and distracting from our central questions.
For our main analysis (#1), we used the Acroporidae supertrees and the Hughes et al. (2013) geographic ranges, with the CT as the focal region. We first fitted the full seven-parameter model (including a speciation parameter for widespread species to produce one CT and one non-CT daughter species, s_AB), one six-parameter model (Fig. 3A) and three further-constrained five-parameter models (equal speciation, extinction, or dispersal for the two regions) via maximum likelihood (ML). Model comparisons provided substantial support (Akaike information criterion, AIC, within three units of best-fitting model) for the six-parameter model on 87.1% of trees (Fig. S1). There was considerably less support for the seven-parameter model (ΔAIC > 3 on 54.3% of trees; s_AB mean = 0.00649 per Ma, median = 0 per Ma). To further check that omitting s_AB did not affect our results, we fitted the seven-parameter model with the MCMC procedure described below and confirmed that our conclusions about the relationships among the other parameters were unaffected. Among the five-parameter models, none emerged as clearly better than the others or than the six-parameter model. We therefore used the six-parameter model as the basis for our Bayesian inference because it was the most inclusive of the well-supported models. We also fitted this six-parameter model to our data with the CIP as the focal region (analysis #2), as well as to geographic ranges relative to the CT based on the Coral Geographic (Veron et al. 2009, 2011; analysis #3). For each dataset, MCMC utilizing broad exponential priors (rates of 0.5/Ma) was initialized with the ML estimates. To assess the sensitivity of our results to the priors themselves, we repeated the MCMC for analysis #1 using a broad uniform prior (rates between 0 and 100/Ma; analysis #4). Analyses were carried out using the R package Diversitree 0.9-9 (FitzJohn 2012).
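The AIC-based comparison on a single tree can be sketched as follows, using the standard definition AIC = 2k − 2 ln L. The log-likelihood values are invented for illustration, and treating "within three units" as an inclusive threshold is an assumption.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def supported_models(fits, threshold=3.0):
    """Return {model: delta AIC} for models within `threshold` of the best one.

    `fits` maps a model name to (log-likelihood, number of parameters).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()
            if score - best <= threshold}


# Hypothetical fits for one tree (not values from the study).
fits = {
    "seven_parameter": (-500.0, 7),
    "six_parameter": (-500.2, 6),
    "five_parameter_equal_s": (-504.0, 5),
}
print(supported_models(fits))
```

In this made-up example the six-parameter model has the lowest AIC, the seven-parameter model is within three units of it, and the constrained five-parameter model is not.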
We also carried out a separate analysis on the Acroporidae phylogeny built on mtDNA and nuclear DNA data (analysis #5). This allowed us to assess the CT dynamics with a larger molecular dataset and a different set of fossil node calibrations, and without concerns about supertree construction based on morphology and large unresolved clades. Unlike the fully sampled supertree, only 41% of species were present on this phylogeny. In the GeoSSE likelihood function, the probability of an extant species being present on the tree in its observed geographic state was thus modified to incorporate the probability of it being sampled (FitzJohn et al. 2009;Goldberg et al. 2011).
The GeoSSE model assumes each of the six estimated rates is constant over time and across lineages, except for the effects of the geographic state, so estimates effectively summarize typical rates over this time period. Because rate heterogeneity is likely to be less pronounced over shorter temporal windows, we also fitted the model to the subclade consisting of the genera Acropora and Isopora, estimated to have evolved in the Paleocene (Wallace and Rosen 2006; Wallace 2008, 2012; Simpson et al. 2011; analysis #6).
Overall, rates for each of the 1000 ultrametric supertrees, pruned to Acroporidae (analyses #1, #2, #3 and #4) or Acropora + Isopora (analysis #6), and the 1000 molecular trees (analysis #5) were estimated via ML and used for model comparisons. Our main results compare the posterior distributions of the rates, based on the concatenation of 1000 post-burn-in MCMC steps on each of the 1000 supertrees.
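One way to compare two rates from pooled posterior samples is to count the fraction of joint samples in which one rate is smaller than the other. The Python sketch below illustrates this under the assumption that each MCMC step stores both rates together; the sample values are invented, and this is not necessarily how the reported posterior probabilities were computed.

```python
def pool_samples(per_tree_samples):
    """Concatenate the post-burn-in samples from every tree into one list."""
    return [sample for tree in per_tree_samples for sample in tree]


def posterior_probability_less(pairs):
    """Fraction of joint samples (a, b) in which a < b."""
    if not pairs:
        raise ValueError("no samples supplied")
    return sum(a < b for a, b in pairs) / len(pairs)


# Two hypothetical trees with two (rate_a, rate_b) samples each.
per_tree = [
    [(0.1, 0.2), (0.3, 0.2)],
    [(0.1, 0.4), (0.2, 0.5)],
]
pooled = pool_samples(per_tree)
print(posterior_probability_less(pooled))
```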

Null comparisons
Models like GeoSSE can be a powerful means of inferring macroevolutionary processes from phylogenetic data, and they have performed well in simulation testing under a variety of conditions (Maddison et al. 2007;FitzJohn et al. 2009;FitzJohn 2010;Goldberg et al. 2011;Goldberg and Igić 2012;Magnuson-Ford and Otto 2012;Davis et al. 2013). Recently, however, concerns have been raised about the reliability of results using the BiSSE family of models. To address these concerns directly, and also to assess the power of our nonmodel-based species age analyses, we undertook a suite of sensitivity and null analyses.
The first concern about models like GeoSSE is that, although the analysis is explicitly phylogenetic, it does not properly treat nonindependence in the species' states that arises from shared ancestry (Maddison and FitzJohn 2015; Rabosky and Goldberg 2015). No solution to this issue exists, but we do not expect it to be the main driver of our results because none of the geographic states are clustered in one or few subclades.
The second concern is that models like GeoSSE may incorrectly report high confidence that the trait affects rates of speciation and extinction because there is no outlet in the model for other processes-beyond the focal trait-that may cause diversification rate heterogeneity (Rabosky and Goldberg 2015). Our analyses described above already addressed this concern to some extent. Analysis #6 considered a younger subtree, which reduced the opportunity for processes to operate outside the GeoSSE assumptions. Analyses #1 and #5 used phylogenies constructed and dated with different techniques (morphological vs. molecular data, supertree vs. supermatrix methods, and different fossil node calibrations). Tree construction artifacts could cause branch lengths to deviate from the assumptions of the GeoSSE model, but obtaining similar results from these different phylogenies partially alleviates this concern. We also did not infer the importance of state-dependent diversification merely by rejecting simple models that lacked it (Beaulieu and O'Meara 2016); our main results were instead based on parameter estimates under the model that allowed for state-dependent diversification.
To further address the concern of a suitable alternative model, we compared the GeoSSE model fit against a model that does allow for state-dependent diversification but does not attribute it to the geographic state. We extended the character-independent modeling approach (Beaulieu and O'Meara 2016) to a six-state trait, with the three geographic states replicated across two hidden states. Speciation and extinction rates were not allowed to differ inside and outside the focal region, but they were allowed to differ between the two states of an unspecified trait (analysis #7; Fig. S2).
We also used null simulations to address more directly the concern that diversification rate heterogeneity embedded in our phylogeny could cause spurious inferences about the evolution of our geographic trait (Pulido-Santacruz and Weir 2016). Each of the following simulation-based tests involved generating a neutral trait (one known not to affect rates of speciation or extinction) on each of the 1000 posterior trees, and then performing our LTT and GeoSSE analyses on it.
First, we randomly shuffled the empirical geographic states among the tips (analysis #8). This did not reflect an evolutionary process, but it approximated a very high rate of trait evolution and was most prone to generate incorrect significant statistical associations between the character state and diversification rate (Rabosky and Goldberg 2015). The shuffling procedure also allowed us to characterize how the unequal state frequencies may shape our results.
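The shuffling step can be sketched in a few lines of Python. The species names and the seed are placeholders; the point is only that shuffling reassigns states among tips while leaving the state frequencies unchanged.

```python
import random
from collections import Counter


def shuffle_tip_states(tip_states, seed=None):
    """Randomly reassign the observed states among tips, preserving frequencies."""
    rng = random.Random(seed)
    tips = list(tip_states)
    states = [tip_states[tip] for tip in tips]
    rng.shuffle(states)                      # permute states in place
    return dict(zip(tips, states))


# Hypothetical tip states for five species.
observed = {"sp1": "CT", "sp2": "outside", "sp3": "widespread",
            "sp4": "widespread", "sp5": "widespread"}
shuffled = shuffle_tip_states(observed, seed=1)
print(Counter(shuffled.values()) == Counter(observed.values()))
```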
Second, we simulated the evolution of geographic range under the assumption of symmetric transitions between geographic regions via range expansion (d_A→B = d_B→A, where A and B represent the two regions for the neutral trait) and contraction (x_A = x_B); these rates were fixed at one event per million years (analysis #9). One of our conclusions from the real data was that range shifts were predominantly in one direction, and this neutral trait allowed us to test whether that conclusion might be an artifact of correlations among the parameters and incorrect inference of region-dependent speciation.
Third, we simulated the evolution of geographic range under rates inferred from the real data, but without allowing the trait to affect lineage diversification (analysis #10). The rates for simulation were obtained by fitting to the real geographic states, on each tree, a three-state Mk model (Lewis 2001) that allowed range expansion and contraction but precluded transitions between the two endemic states. This neutral trait allowed us to test whether our finding that species originated primarily in one region was a statistical artifact.
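The structure of such a three-state Mk model can be written as a transition rate matrix in which the two endemic states are not directly connected. The sketch below assumes four separate rates (two expansions, two contractions); the text does not state how many free parameters were used, and the numerical values are arbitrary.

```python
def range_rate_matrix(expand_from_a, expand_from_b, contract_to_a, contract_to_b):
    """Rate matrix for states ordered (A only, widespread, B only).

    Direct transitions between the two endemic states are fixed at zero,
    so a lineage must pass through the widespread state.
    """
    return [
        [-expand_from_a, expand_from_a, 0.0],
        [contract_to_a, -(contract_to_a + contract_to_b), contract_to_b],
        [0.0, expand_from_b, -expand_from_b],
    ]


# Arbitrary illustrative rates, not values fitted in the study.
q = range_rate_matrix(0.4, 0.8, 0.03, 0.05)
for row in q:
    print(row)
```

Each row sums to zero, as required for a continuous-time Markov rate matrix.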
For each of these neutral traits, we repeated our analysis of whether species ages differ among geographic states. We fitted the six-parameter GeoSSE model to each neutral trait and computed the 95% highest posterior density (HPD) interval for the differences in regional rates (s_A − s_B, x_A − x_B, r_A − r_B, and d_A→B − d_B→A, where net diversification rate r = s − x). We then summarized the GeoSSE results as the proportion of simulations in which the 95% HPD interval for each rate difference was wholly smaller or larger than zero, versus overlapping zero. If the model was indeed performing appropriately, we expected that the HPD intervals of the speciation and extinction rate differences should include zero for most of these simulations.
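A common way to approximate an HPD interval from MCMC samples is to take the narrowest window that contains the required fraction of the sorted samples. The sketch below implements that approach and the three-way summary relative to zero; it is an illustration under that assumption, not the routine used in the study, and the sample values are made up.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("no samples supplied")
    k = max(1, math.ceil(mass * n))          # samples the interval must cover
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def relation_to_zero(interval):
    """Summarize whether an interval lies below, above, or across zero."""
    low, high = interval
    if high < 0:
        return "below zero"
    if low > 0:
        return "above zero"
    return "overlaps zero"


# Hypothetical rate differences; a 50% interval keeps the example easy to follow.
diffs = [1, 2, 3, 4, 10, 20, 30, 40]
interval = hpd_interval(diffs, mass=0.5)
print(interval, relation_to_zero(interval))
```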

SPECIES AGES
Terminal branch lengths of acroporids differ significantly among species endemic to the CT, those that are found only outside the CT, and those that are widespread (analysis of variance, F = 22.10, P << 0.01). On average, the oldest species are CT endemics (Fig. 1C). Results are qualitatively the same for the entire scleractinian clade (F = 4.86, P < 0.01). When the geographic states were randomized on the tree (analysis #8), we found essentially no significant differences in species ages among the CT endemics, species outside the CT, and widespread species (P > 0.05 for 92.0% of trees; Fig. S3). Furthermore, as expected, there are no significant differences in species ages for neutral traits simulated without regional differences in speciation, extinction, or dispersal (analysis #9; P > 0.05 for 99.3% of trees; Fig. S4). When we fitted the real geographic states to the three-state Mk model (analysis #10), frequencies of the simulated neutral trait mirror those of the real data, and yet there are no significant differences in species ages (P > 0.05 for 98.8% of trees; Fig. S5). Thus, the phylogenetic signal and age distribution of geographic ranges do not seem to be an artifact of the low number of CT endemics, and instead likely reflect species tending to originate outside the CT.
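For readers unfamiliar with the test statistic, the following Python sketch computes a one-way analysis of variance F value from grouped data using the standard between-group and within-group sums of squares. The three groups are toy numbers, not species ages from the study.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data standing in for three geographic states.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(toy_groups))
```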
LTT plots for subtrees comprising each of these geographic groups reveal that lineage accumulation was slowest for CT endemics, with differences particularly pronounced over the last 20 Ma (Fig. 2 for acroporids; Fig. S6 for all scleractinians). However, these patterns are not different from lineage accumulation observed in 1000 sets of shuffled geographic states (Fig. S7).
Overall, these plots provide no support for a higher rate of speciation inside the CT.

GEOGRAPHIC RANGE MODEL
Consistent with our inferences based on species ages, estimates from the GeoSSE model show that the speciation rate within the CT has been lower than elsewhere (s_CT < s_out with posterior probability of 0.97; Figs. 3B and S8; Table 2), contrary to the center of origin hypothesis. Furthermore, the estimated rate of extinction is higher within the CT (x_CT > x_out with posterior probability of 0.98), which is at odds with the center of accumulation and center of survival hypotheses. However, richness in the CT remains high despite this surprising combination of low speciation and high extinction because of a high rate of range expansion into the CT from the surrounding regions (d_out→CT). This rate surpasses range shifts away from the CT (d_out→CT > d_CT→out with posterior probability of 0.87) and exceeds the rate of extinction in the CT (d_out→CT > x_CT with posterior probability of 1.00).
Results are qualitatively unchanged when the focal region is enlarged from the CT to the Central Indo-Pacific (analysis #2; Fig. S9), or when distributions of coral species are derived from a different biogeographic database (analysis #3; Fig. S10), demonstrating that our conclusions are not sensitive to the precise definition of the two regions and to revisions of species geographic distributions.
These findings are robust to the statistical methods used to fit the GeoSSE model; similar estimates were obtained with a different prior distribution in the Bayesian analysis (analysis #4; Fig. S11) and via ML (Fig. S1). All of our results also incorporate phylogenetic uncertainty by estimating the model parameters across a posterior set of 1000 trees. Furthermore, our inferences are also similar when performed on the Acroporidae phylogeny reconstructed solely from molecular sequence data and using different fossil node calibrations (analysis #5; Fig. S12), showing further robustness to phylogenetic uncertainty.
To assess the sensitivity of our results to the possibility of temporal rate variation, we repeated the analyses on a younger subset of species (the clade comprising Acropora and Isopora), which has experienced much less geological and environmental change than the family as a whole (analysis #6). The results remain qualitatively the same (Fig. S13).

NULL COMPARISONS
To allow for the possibility that a factor other than geographic range shapes region-dependent diversification, we fitted a character-independent model in which rates of speciation and extinction did not depend on geographic range, but could potentially depend on another unspecified trait (analysis #7). This model does not perform significantly better or worse than GeoSSE on most of the trees (Fig. S14), though when one model is preferred Table 2

All analyses are based on geographic range information obtained from Hughes et al. (2013) except for analysis #3, which is based on data from the Coral
Geographic (Veron et al., 2009(Veron et al., , 2011. Acroporidae supertrees are used for all analyses except analysis #5, which employs a molecular tree based on seven mitochondrial and two nuclear markers, and analysis #6, which is based on subclade Acropora + Isopora. (AIC > 2), it is GeoSSE with conclusions as reported above. It is thus possible that geographic range is not the only trait that affects speciation and extinction rates, but even so, this analysis also does not support a higher speciation rate in the CT. Finally, to test the robustness of our main analysis without relying on the assumptions of the more complex characterindependent model, we used our neutral trait simulations to assess whether our findings of region-dependent rates could be explained merely as an artifact of GeoSSE failing to include other important processes. Regardless of whether neutral traits have been obtained by shuffling tip states (analysis #8), simulating range evolution without region-dependent effects (analysis #9), or simulating range evolution under rates derived from the data (analysis #10), we do not find spurious associations between neutral traits and rates of lineage diversification or range shifts. That is, for nearly all simulated traits, the differences between regions in speciation, extinction, and net diversification rates have 95% credibility intervals that include zero (Table 3). This is in contrast to rate differences for the true geographic range trait, which generally exclude zero in consistent directions across trees (analysis #1; Table 3). The same is true for differences in dispersal rates, except when they are allowed to differ in the neutral simulations (analysis #10). This suggests that the influence of other factors on rates of speciation and extinction is not embedded in the shape of our phylogeny in a manner that irreparably misleads GeoSSE, nor are our conclusions driven solely by tip state frequencies. 
We thus conclude that the region-dependent rates of speciation, extinction, and range expansion that we infer likely reflect real differences in these processes.
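The zero-inclusion check used to compare regional rates can be sketched in a few lines. The following is a minimal illustration, not the authors' R scripts: it assumes posterior samples of a rate inside and outside the CT are already available, computes an empirical 95% highest posterior density (HPD) interval for their difference, and reports whether that interval excludes zero. The function names and all numeric values are hypothetical and chosen only to make the example run.

```python
import random


def hpd_interval(samples, mass=0.95):
    """Return the shortest interval containing `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, int(round(mass * n)))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def excludes_zero(interval):
    """True if the interval lies entirely above or entirely below zero."""
    low, high = interval
    return low > 0 or high < 0


def rate_difference_hpd(rate_ct, rate_out, mass=0.95):
    """HPD interval of the paired difference (inside CT minus outside)."""
    diffs = [a - b for a, b in zip(rate_ct, rate_out)]
    return hpd_interval(diffs, mass)


# Hypothetical posterior draws (made-up values, not results from the study).
rng = random.Random(0)
s_ct = [rng.gauss(0.05, 0.01) for _ in range(2000)]
s_out = [rng.gauss(0.10, 0.01) for _ in range(2000)]
neutral_a = [rng.gauss(0.08, 0.01) for _ in range(2000)]
neutral_b = [rng.gauss(0.08, 0.01) for _ in range(2000)]

real_hpd = rate_difference_hpd(s_ct, s_out)
neutral_hpd = rate_difference_hpd(neutral_a, neutral_b)
print("region-dependent trait:", real_hpd, excludes_zero(real_hpd))
print("neutral trait:", neutral_hpd, excludes_zero(neutral_hpd))
```

In this toy setting the interval for the region-dependent trait lies entirely below zero, whereas the interval for the neutral trait spans zero, which is the pattern the neutral simulations are designed to distinguish.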

Discussion
The fossil record of Scleractinia, and Acroporidae in particular, indicates lower richness in the Indo-Pacific relative to Europe and the Mediterranean prior to the Oligocene (Rosen 1988; Wilson and Rosen 1998; Wallace and Rosen 2006; Wallace 2008, 2012), but closure of the Tethys Seaway shifted the diversity center to the newly formed Indo-Australian Archipelago by the Early Miocene (Wallace 2000; Renema et al. 2008). Despite the high diversity of corals in the CT, our results do not support the hypothesis that this diversity hotspot is a center of origin for reef corals. Age distributions of species endemic to the CT are not biased toward younger ages (Fig. 1C), and neutral trait simulations show that these patterns of regional species ages are not driven solely by the different frequencies of the geographic states, in particular the rarity of CT endemics. Similarly, LTT plots do not show a higher rate of diversification within the CT (Figs. 2 and S6), although the shallower slope of the CT endemics can be explained simply by their rarity, rather than supporting a lower speciation rate in the CT. Our model-based approach indicates that speciation rate inside the CT has been lower than that outside, and results are also inconsistent with the center of survival hypothesis: the extinction rate within the CT is consistently higher than that outside in all of our model results (Table 2).
Instead, our results indicate a biogeographic dynamic in which species originate outside the CT and expand their ranges into the region, but afterward are more likely to go extinct inside the CT than outside (Table 1). Thus, while the high richness within the CT is primarily a product of a high rate of range expansion into the region, the diversity peak is also underlain by temporal turnover of species. This directionality of range expansion is consistent with the center of accumulation hypothesis, but the higher extinction rate within the CT is not. Of all the macroevolutionary hypotheses about the origin of CT species richness, the center of overlap, which posits range expansion into the CT but does not make explicit predictions about speciation and extinction, comes closest to our results. However, because our analyses do not subdivide the region outside the CT, we cannot test in more detail the hypothesis that species with nonoverlapping ranges outside the CT come to co-occur in the CT through range expansions. Alternatively, these results can be interpreted as being consistent with a new hypothesis that the CT is a macroevolutionary sink for corals, where a steady immigration of species from the surrounding areas drives up species richness, but some of that influx is offset by extinction.

Results of the main GeoSSE analysis (#1), simulations based on shuffled states with the same tip frequencies as Hughes et al. (2013) (analysis #8), and neutral traits (analyses #9 and #10), summarizing distributions of the 95% highest posterior density (HPD) interval for the difference in regional rates relative to zero (s = speciation, x = extinction, r = diversification, d = range expansion). Columns: Analysis; HPD; s_CT − s_out; x_CT − x_out; r_CT − r_out; d_CT→out − d_out→CT.

As in any model-fitting analysis, our conclusions from the GeoSSE model are dependent on its assumptions. Specifically, as is common in most macroevolutionary analyses using phylogenies of living species, we make the simplifying assumption that rates of speciation, extinction, and range expansion depend only on geographic distribution, not on lineage identity or time. Consequently, the diversification and range dynamics quantified in this study represent a time-averaged summary covering the entire evolutionary history of the clade analyzed here, although the stochastic nature of the underlying process allows for occasional deviations from the general trend. Although our finding of region-dependent diversification is unlikely to simply be an artifact of this assumption, based on the null simulation analyses, caution is warranted for the extinction rate estimate in particular. It is well recognized that estimating extinction rates solely from phylogenies of living species is inherently difficult, especially when such rates may vary over time (Quental and Marshall 2009; Rabosky 2010; Beaulieu and O'Meara 2015; Rabosky 2016; Rabosky and Huang 2016). Therefore, rather than interpreting the absolute extinction rate estimates we obtained, we have focused on net diversification rates and the relative differences in these rates between our focal areas. Nonetheless, the regional variation in extinction rates inferred here is best viewed as a hypothesis and should be tested in the future with direct estimates from the coral fossil record when they become available.
Lower speciation and higher extinction rates within the CT may in part be explained by the dimensions of the two designated areas (Fig. 1A). Habitat heterogeneity aside (see Kiessling et al. 2010), the smaller size and more contiguous structure of the CT region could reduce population subdivision and ultimately speciation within it, while also increasing chances of extinction from the region. It is also possible that the CT has a larger per-area rate of speciation or smaller per-area rate of extinction, which is obscured by the sizes of the regions in our analysis. Furthermore, interspecific competition has been argued to be an important driver of speciation and extinction rates (Sepkoski 1996), and increasing competition resulting from an influx of coral species into a small region could suppress speciation rate and increase extinction rate (e.g., Rabosky 2013).
The asymmetry in rates of range expansion, however, is not easily explained by geometric differences between the regions. If increases in geographic range sizes were equally likely for species at any location within either region, the smaller size of the CT would bolster transitions to the widespread state for CT endemics relative to non-CT species. This null expectation is rejected by our findings, suggesting that the signal of enhanced dispersal into the CT cannot simply be the result of neutral changes in species distributions over geological time. The higher rate of dispersal into, rather than out of, the CT could have been due to a variety of factors including past climatic and sea level changes such as those during the Pleistocene that altered the distribution of shallow water habitats, as well as paleoceanographic changes that transported larvae into this region (Kerswell 2006; Bellwood and Meyer 2009).
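The geometric null expectation in this paragraph can be illustrated with a deliberately simplified one-dimensional toy model. This sketch is not part of the GeoSSE analysis, and every quantity in it is an assumption made for illustration: each region is a line segment that touches the other region at one end, a species sits at a uniformly random distance from that shared boundary, and its range grows by a fixed step in a random direction. Under these assumptions the probability of crossing into the other region is 0.5 × min(1, step / region length), so the smaller region exports species at the higher rate.

```python
import random


def analytic_crossing_probability(region_length, step):
    """Closed-form crossing probability for the toy model described above."""
    return 0.5 * min(1.0, step / region_length)


def simulated_crossing_probability(region_length, step, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(trials):
        # Distance of the species from the boundary shared with the other region.
        distance = rng.uniform(0.0, region_length)
        # The range expands toward or away from the boundary with equal chance.
        toward_boundary = rng.random() < 0.5
        if toward_boundary and step >= distance:
            crossings += 1
    return crossings / trials


# Hypothetical sizes in arbitrary units: a small "CT" and a larger "outside".
ct_length, out_length, step = 1.0, 5.0, 0.2
p_ct_to_out = simulated_crossing_probability(ct_length, step)
p_out_to_ct = simulated_crossing_probability(out_length, step)
print("CT -> widespread:", p_ct_to_out)
print("outside -> widespread:", p_out_to_ct)
```

With neutral range changes, the toy model therefore predicts more expansion out of the smaller region than into it, which is the opposite of the asymmetry inferred from the data.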
Although large-scale dispersal of reef corals in the Indo-Pacific remains poorly understood, recent studies have suggested that dispersal out of the CT may increase during episodes of global warming (Burrows et al. 2011; Kiessling et al. 2012), and the CT may be a net larval source under current environmental conditions (Wood et al. 2014). A population-level analysis of 45 Indo-Pacific reef-associated species also found that population establishments closer to the CT were generally older than those at peripheral locations, indicating that the CT has been a net source of biodiversity for the Indo-Pacific region (Evans et al. 2016). In reconciling these more recent patterns with our results, it is important to note the difference in temporal and taxonomic scales involved and also that range expansions such as those documented here necessarily involve not only arrival, but also establishment and persistence of populations in a new locality. Furthermore, distributional limits of Indo-Pacific coral and fish species have recently been shown to be better predicted by geological features such as tectonic plates and mantle plume tracts than by contemporary environmental conditions (Leprieur et al. 2016), which suggests that changes to the physical environment in the deeper geological past may also have played an important role in driving distributional shifts that ultimately produced the CT diversity peak. Similarly, Pellissier et al. (2014) reconstructed refugial coral reef habitats during Quaternary climatic changes that helped preserve reef fish diversity. Although much of the speciation and extinction patterns estimated here are older than the Quaternary, the range dynamics would have been influenced by the spatial distribution of refuges, especially during periods of global cooling.
Despite the growing body of evidence for a strong imprint of past geological change on present-day marine diversity gradients, we still know relatively little about how changes in the abiotic environment during the Cenozoic may have interacted with biotic processes such as competition to generate today's reef coral richness gradient. Better understanding the processes driving the range expansions we describe here will require integrating evolutionary models with paleoceanographic and modern environmental data at a much higher spatial resolution than attempted here. Regardless of specific drivers of change, our results provide strong support for the idea that insight into the dynamics of geographic range expansions is crucial for understanding the origin and maintenance of diversity gradients (Jablonski et al. 2013).
The lack of support for the center of origin hypothesis in reef corals, along with varied results from reef fish and other groups, strongly suggests that there may not be a single macroevolutionary explanation of why the CT is so species rich (Bowen et al. 2013; Cowman 2014; Barber and Meyer 2015). Some of the differences in results could certainly reflect variations in methodologies or patterns exhibited by subclades versus more inclusive groups. For example, a previous study argued for a high speciation rate in the CT for reef fish based primarily on their elevated level of endemism in the CT (Mora et al. 2003), but other analyses have found areas with even higher endemism outside the CT that are not as species rich (Hughes et al. 2002; Connolly et al. 2003). More importantly, these centers of endemism are not unambiguous sites of species origin because they contain a mixture of neo- and paleoendemics (Bellwood and Meyer 2009; Cowman 2014; Cowman et al. 2017). Applying the same dynamic GeoSSE model to the clownfish clade, Litsios et al. (2014) showed that these reef fish originated and diversified in the Central Indo-Pacific region, but an independent radiation followed in the Western Indian Ocean, resulting in surprisingly comparable diversification rates between the two regions. It is worth noting that the definition of endemism is tied to the spatial scale of analysis. Here, we apply this concept broadly to the regions examined (within the CT and outside the CT) to infer the rate differentials between our focal areas that have similar species richness despite being very different in size (Fig. 1). It is conceivable that diversification rates are higher in CT subareas, such as Raja Ampat and Cenderawasih Bay at West Papua that host many range-restricted species, than at specific localities outside the CT (Tornabene et al. 2015), but such a pattern needs to be verified with more spatially constrained analyses.
Temporal trends in origination, extinction, and dispersal rates are likely to vary among taxa due to differences in ecologies and life histories. In fact, distinct evolutionary histories have been shown to underlie congruent global species richness gradients for birds and mammals (Hawkins et al. 2012). For three diverse and abundant reef fish families (Labridae, Pomacentridae, and Chaetodontidae), whereas species origination and dispersal patterns have been temporally congruent throughout the Cenozoic (Cowman and Bellwood 2013a; see also Cowman and Bellwood 2011; Hodge et al. 2014), vicariance histories associated with the Tethys seaway closure, Isthmus of Panama, and East Pacific Barrier are distinct between families (Cowman and Bellwood 2013b). Therefore, future studies of the macroevolutionary drivers of the CT biodiversity hotspot should focus on model-based estimates of origination, extinction, and dispersal rates during specific time periods rather than choosing a single best "center of" hypothesis (Cowman 2014; see also Goldberg et al. 2005; Roy and Goldberg 2007).
Coral reefs are under threat from a variety of anthropogenic stressors (Hoegh-Guldberg et al. 2007; Knowlton and Jackson 2008; Pandolfi et al. 2011; Huang and Roy 2013), and better understanding of the processes that generate and sustain high species richness within the CT is essential for developing effective management strategies in the face of such threats. Our work here takes a macroevolutionary perspective, showing that the evolutionary roots of most CT coral species actually lie elsewhere, and that range expansions into the CT have been critical in maintaining diversity there. Thus, although measures to conserve reefs within the CT are increasingly being implemented (White et al. 2014), preserving the macroevolutionary processes that have given rise to this hotspot also requires protecting the biodiversity of surrounding areas (Bellwood and Meyer 2009; Bowen et al. 2013). Such a broader view that recognizes the importance of protecting peripheral regions may be important for ensuring the long-term viability of species in the CT region and the ecosystem services they provide.

AUTHOR CONTRIBUTIONS
DH and KR conceived of the study. DH collected the data. DH, EEG and KR designed the simulations and analyzed the data. All authors contributed to the writing of this paper.

ACKNOWLEDGMENTS
We thank T. Hughes, S. Keith, and C. Veron for sharing their biogeographic databases; G. Rouse and N. Budd for support and advice; as well as I. Sanmartín, W. Kiessling, and two anonymous reviewers for constructive comments that substantially improved the manuscript. EEG was funded in part by National Science Foundation (NSF) grant DEB-1120279, and KR was funded by a grant from the National Aeronautics and Space Administration. This work was supported by the National Research Foundation, Prime Minister's Office, Singapore under its Marine Science R&D Program (MSRDP-P03). The authors have no conflict of interest to declare.

DATA ARCHIVING
All datasets and R scripts are available at the Dryad Digital Repository: https://doi.org/10.5061/dryad.8395f.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Table S1. List of 1547 species used to reconstruct the phylogeny of Scleractinia.
Figure S1. Maximum likelihood estimates of the regional speciation (s), extinction (x), and range expansion (d) GeoSSE model parameters.
Figure S2. Schematic summary showing transitions among the geographic states induced by processes in the GeoSSE and character-independent models.
Figure S3. Species ages based on phylogenetic tip length of CT, non-CT, and widespread species for shuffled geographic states with the same tip frequencies as Hughes et al. (2013) (analysis #8).
Figure S4. Species ages based on phylogenetic tip length of CT, non-CT, and widespread species for neutral traits simulated without regional differences in speciation, extinction, or dispersal (analysis #9).
Figure S5. Species ages based on phylogenetic tip length of CT, non-CT, and widespread species for neutral traits obtained by fitting real geographic states to the three-state Mk model (analysis #10).
Figure S6. Lineages-through-time plots partitioned by region for all scleractinian corals.
Figure S7. Averaged lineages-through-time plot of each region compared to 1000 sets of shuffled geographic states (gray).
Figure S8. GeoSSE estimated parameters computed with MCMC for 1000 randomly resolved trees, showing variability among trees (1000 steps per tree) for analysis #1 (CT vs. out).
Figure S9. GeoSSE parameter estimates computed with MCMC for Acroporidae trees in analysis #2 (CIP vs. out).
Figure S10. GeoSSE parameter estimates computed with MCMC for Acroporidae trees based on the Coral Geographic in analysis #3.
Figure S11. GeoSSE parameter estimates computed with MCMC for Acroporidae trees based on a broad uniform prior (rates between 0 and 100/Ma) in analysis #4.
Figure S12. GeoSSE parameter estimates computed with MCMC for posterior trees based on Acroporidae molecular data in analysis #5.
Figure S13. GeoSSE parameter estimates computed with MCMC for trees of the clade Acropora + Isopora in analysis #6.
Figure S14. AIC of the character-independent model in which rates of speciation and extinction do not depend on geographic range, compared against the GeoSSE model (Fig. S2).