What affects power to estimate speciation rate shifts?

The development of methods to estimate rates of speciation and extinction from time-calibrated phylogenies has revolutionized evolutionary biology by allowing researchers to correlate diversification rate shifts with causal factors. A growing number of researchers are interested in testing whether the evolution of a trait or a trait variant has influenced speciation rate, and three modelling methods—BiSSE, MEDUSA and BAMM—have been widely used in such studies. We simulated phylogenies with a single speciation rate shift each, and evaluated the power of the three methods to detect these shifts. We varied the degree of increase in speciation rate (speciation rate asymmetry), the number of tips, the tip-ratio bias (ratio of number of tips with each character state) and the relative age in relation to overall tree age when the rate shift occurred. All methods had good power to detect rate shifts when the rate asymmetry was strong and the sizes of the two lineages with the distinct speciation rates were large. Even when lineage size was small, power was good when rate asymmetry was high. In our simulated scenarios, small lineage sizes appear to affect BAMM most strongly. Tip-ratio influenced the accuracy of speciation rate estimation but did not have a strong effect on power to detect rate shifts. Based on our results, we provide suggestions to users of these methods.

The diversification rate of a lineage is the difference between its speciation rate λ and extinction rate μ. Testing hypothesis of diversification rate variation is underpinned by the ability to decouple and accurately estimate these rates. Estimates of extinction rates from phylogenies of extant taxa appear to be error prone (Rabosky, 2010;Laurent, Robinson-Rechavi & Salamin, 2015;; but see Stadler, 2013;. On the other hand, speciation rate estimates are generally considered to be more robust. One of the most common themes in macroevolutionary studies over the last decade has been to test whether a trait (or trait variant) has increased speciation rates (e.g. Sundue, Testo & Ranker, 2010;Claramunt et al., 2012;Escudero et al., 2012;Litsios et al., 2012;Horn et al., 2014;Rainford et al., 2014;Xiang et al., 2014;Gubry-Rangin et al., 2015;Igea et al., 2017, Wiens et al., 2017Sahoo et al., 2017, Seeholtzer et al., 2017, with a suite of analytical tools providing the framework to infer speciation rate variation across the phylogeny. Historically, analyses testing the effect of a trait on diversification relied on comparisons of species richness of sister clades (e.g. Mitter, Farrell & Wiegmann, 1988;Zeh, Zeh & Smith, 1989). This method cannot distinguish between λ and μ, is prone to Type II error (non-detection of significant diversification rate differences), and does not effectively utilize information from clades with mixed character states (Maddison, Midford & Otto, 2007). The most recent methods aim to utilize information in the branching patterns of dated phylogenies to decouple λ from μ (Stadler, 2011). The BiSSE (Binary State Speciation and Extinction) (Maddison, Midford & Otto, 2007) modelling approach has been especially popular for hypothesis testing because it estimates λ and μ associated with character states, i.e state-dependent diversification rates. BISSE specifies a stochastic model where λ and μ can depend on the character state of a lineage at each time point, and the rates of character state change are allowed to vary (Maddison, Midford & Otto, 2007;Fitzjohn 2009). Inferences about speciation and extinction rates in relation to character state are made by comparing the maximum likelihood scores of competing models using likelihood ratio tests or AIC (Akaike Information Criterion) scores. BISSE requires a completely resolved dated phylogeny with information on character states of tips as input, and can take into account incomplete sampling (Fitzjohn et al., 2009 Goldberg & Igić, 2012) integrate cladogenetic and anagenetic trait evolution.  proposed the HiSSE (Hidden States SSE) model, which attempts to account for unmeasured ('hidden') factors impacting diversification rates of a known trait or character state.
In contrast to the BiSSE family of methods, character-independent diversification methods attempt to identify the number and location of rate shifts in speciation and extinction across the tree, without a priori information on the mechanism of rate variation. Once the locations of rate shifts are found, the researcher can test for associations with traits of interest. MEDUSA (Modeling Evolutionary Diversification Using Stepwise Akaike Information Criterion; Alfaro et al., 2009), is one such framework that has been very popular. MEDUSA incrementally assigns rate shifts to all branches of the tree, and uses stepwise AIC to determine the number and location(s) of rate shifts that best fit the data. Rate shift estimates are thus agnostic of the cause of rate variation among lineages.
BAMM (Bayesian Analysis of Macroevolutionary Mixtures; Rabosky, 2014) is the most widely used character-independent diversification method. BAMM assumes that λ and μ are heterogeneous across the phylogeny, and that changes in these parameters across branches occur under a compound Poisson process. It uses reversible-jump Markov chain Monte Carlo to explore models varying in the number of shifts in diversification parameters. Estimates of λ and μ, and inferences on the number of rate shifts are based on posterior distributions. Both BAMM & MEDUSA require a dated phylogeny and can accommodate incomplete sampling.
Although BiSSE, BAMM and MEDUSA have been very popular, recent critical evaluations of their performance have highlighted potential shortcomings particular to each method. Using empirical datasets, Rabosky & Goldberg (2015) found that BiSSE is prone to high Type I error rates, wherein diversification-neutral traits are often found to be significantly associated with speciation rate. Surprisingly, such false associations appear to be detected even for traits with weak phylogenetic signal (Rabosky & Goldberg, 2015). The ability of the BiSSE method to detect state-specific diversification rates has been shown to be affected by factors such as tree size (number of tips) (Davis, Midford & Maddison, 2013;Gamisch, 2016), tree age (Simpson et al. 2018) and tip-ratio bias (i.e. ratio of tips with one character state versus another) (Davis, Midford & Maddison, 2013): the method appears to perform better with large trees and low tipratio biases.  used extensive simulations to understand the statistical behavior of MEDUSA, and showed that the method is prone to a very high rate of false inferences of rate shifts (ca. 30% on average), and that the estimated diversification parameters are biased. The probability of rate shift detection in MEDUSA depends on the number of terminals in the tree (Laurent, Robinson-Rechavi & Salamin, 2015). Moore et al. (2016) showed that the accuracy of BAMM is strongly affected by the priors specified, and that the estimates diversification rate parameters are unreliable (although see Rabosky et al., 2017). Meyer & Wiens (2018) found that BAMM underestimated the number of rate shifts, and overestimated diversification rates for small clades.
Analyses of simulated phylogenies have proved to be a valuable tool to assess the performance of the various modelling approaches. We simulated phylogenies with a single speciation rate shift each, and evaluated the power of BiSSE, BAMM & MEDUSA to detect these shifts. We varied the degree of increase in speciation rate, the number of tips, the tip-ratio bias and the relative age in relation to tree age when the rate shift occurred. We found that all methods have good power to detect rate shifts under many conditions, but also identified some scenarios under which these methods have low power.

p1) Simulation of phylogenetic trees with single diversification rate shifts
p1a) General procedure The basic workflow of the simulation process is outlined in Figure 1. To obtain a phylogeny with a single shift in speciation rate, we first simulated two trees -a basetree and a subtree wherein the subtree had a greater speciation rate (λ1) compared to that of the basetree (λ0). We used the package TESS (Höhna, May & Moore, 2015) to simulate trees in R (R Core Team 2016). The package implements tree simulation based on a global, time-dependent birth-death process conditioned either on number tips or age (Höhna, May & Moore, 2015). We generated phylogenetic trees, both basetree and subtree, under a constant birth-death process by conditioning on age using the function tess.sim.age, which simulates trees given the age (Bage or Sage) and diversification rate parameters (λ and µ). A subtree of a given age (Sage) was grafted onto the basetree following pruning of a randomly chosen basetree clade with approximately the same age (Sage ±2.5%) (Figure 1), using a custom written function (available in Figshare). This generated a composite tree with a single speciation rate shift (λ0 to λ1). The relative age of the subtree in relation to that of the overall tree was varied by varying the age of the subtree. We ran two simulation sets, each with different goals. p1b) Simulation Set 1: Effects of speciation rate asymmetry, overall tree size and relative subtree age Simulation Set 1 aimed to test the relative effects of speciation rate asymmetry (λ1/λ0), relative subtree age (100*Sage/Bage) and overall tree size (Table 1), and therefore these three parameters were varied. Tree size variation was incorporated by defining specific target overall tree size classes (50±10, 100±10, 150±10, 200±10, 300±10 and 500±10) a priori, and for each tree size class, simulating trees with all combinations of relative subtree age (20%, 40% and 60%) and speciation rate asymmetries (1.5X, 2.5X, 3.5X, 4.5X and 5.5X). Speciation rate asymmetry was varied by altering λ1, while keeping extinction rates constant (µ0=µ1). Thus, higher asymmetry values indicate a greater degree of speciation rate increase in the subtree. Pilot runs indicated that a λ0 value of 0.2 and Bage values between 5 and 35 units generated trees with the required parameters. Details of the pilot runs are described in Supplemental Information 1. We simulated 50 composite replicate trees each for all combinations of tree size class, relative subtree age and speciation rate asymmetry ( Table 1). Details of how each replicate tree was simulated are described in Supplemental Information 1. Thus, this simulation set comprised 1500 trees each for three relative subtree ages. Simulated trees were subsequently used in the diversification analyses where BAMM, MEDUSA and BiSSE were used to detect the simulated rate shifts (Section 2).

p1c) Simulation Set 2: Effects of tip-ratio and number of tips in basetree and subtree
Simulation Set 2 aimed to test the relative effects of tip-ratio bias (basetree:subtree number of tips), basetree size and subtree size. We simulated trees with three tip-ratio values (1:9, 9:1 and 1:19), all with a 2X speciation rate asymmetry. We chose 2X because our pilot simulations indicated that power was neither very high nor very low. For each tip-ratio value, we simulated trees with different overall tree sizes (50±10, 100±10, 200±10, 400±10 and 800±10). In order to achieve the target tip numbers and tip-ratios, Bage was allowed vary between 5 and 35 units and relative subtree age between 30 and 70%, while λ0 had values 0.2, 0.26 or 0.28 (Table 1). These ranges were decided based on pilot simulations described in Supplemental Information 1. Hundred replicate trees were generated for each parameter combination (overall tree size class, tip-ratio value), with all replicates for a given tip-ratio plus tree size combination having the same Bage, λ0 and relative subtree age (see Supplemental Information 1). We were unable to simulate trees with 1:9 ratio for the 100±10 size class, and therefore used a 1:8 ratio as an approximation.

p2) Estimation of diversification rate parameters and power of modelling approaches
We used BiSSE, MEDUSA and BAMM to estimate diversification rate parameters (λ0, µ0, λ1 and µ1) of the simulated composite trees and detect rate shifts. Power was calculated for each parameter combination as the proportion of the replicate trees in which a significant rate shift was detected at the node where the subtree was attached to the basetree.

p2a) BiSSE
For the BiSSE analysis, a character was assigned to be present in all subtree tips, but absent in all basetree tips. Therefore, diversification rate estimates of the BiSSE model will reflect the   164  165  166  167  168  169  170  171  172  173  174  175  176  177  178  179  180  181  182   183  184  185  186  187  188  189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 speciation and extinction rate estimates of the subtree (i.e. λ1 and µ1) and basetree (λ0 and µ0). We recorded AIC scores and the maximum likelihood values for all possible BiSSE models for a given tree: I. λ1 ≠ λ0, µ1 ≠ µ0, q01 ≠ q10 (All rates not equal; full model) II. λ1 = λ0, µ1 ≠ µ0, q01 ≠ q10 (Only speciation rates equal) III. λ1 ≠ λ0, µ1 = µ0, q01 ≠ q10 (Only extinction rates equal) IV. λ1 ≠ λ0, µ1 ≠ µ0, q01 = q10 (Only transition rates equal) V. λ1 = λ0, µ1 = µ0, q01 ≠ q10 (Only speciation and extinction rates equal) VI. λ1 ≠ λ0, µ1 = µ0, q01 = q10 (Only extinction and transition rates equal) VII. λ1 = λ0, µ1 ≠ µ0, q01 = q10 (Only speciation and transition rates equal) VIII. λ1 = λ0, µ1 = µ0, q01 = q10 (All rates equal; null model) Because we were interested in testing whether speciation rates differ between basetree and subtree, we chose the best among the models where the speciation rates are unequal (model I, III, IV and VI) based on the minimum AIC score, and compared this with the best model (selected based on minimum AIC score) where speciation rates are equal (model II, V, VII and VIII) using a likelihood ratio test. If P<0.05, the rate shift was considered to have been detected, while the rate shift was not considered detected if P≥0.05. Our motivation was to mirror how a typical researcher may infer a speciation rate shift using BiSSE on empirical data sets. Using the approach outlined above, a user will infer a shift in speciation rate, disregarding whether or not there are changes in extinction and transition rates. In the Results section, we refer to this approach of model selection as the 'approach 1' Because the true model, especially with respect to transition rates, is unclear, we also calculated power by comparing models III with V (referred to as 'approach 2'), as well as VI with VIII (referred to as 'approach 3').

p2b) MEDUSA
We performed MEDUSA analyses using the function MEDUSA (available from GitHub as an R package https://github.com/josephwb/turboMEDUSA , downloaded August 2017). We specified the model of tree evolution to be a birth-death process to estimate the diversification rate parameters. We recorded the node where the rate shift was detected and the estimated diversification rates. The diversification rate shift in a tree was considered to be correctly detected if a model with >1 diversification rates was chosen as the best model, and the rate shift was located at the node where the subtree was attached to the basetree.

p2c) BAMM
We performed BAMM analyses with the default parameter settings using the control file available from the BAMM website (http://bamm-project.org/quickstart.html#control-file accessed: August 2017). The priors used to estimate the speciation and extinction rates were generated using the setBAMMpriors for each tree using the BAMMtools package  in R. We ran the MCMC analysis for 2 million iterations and checked for convergence using ESS metrics (>200). The bammdata object was generated using the getEventData function from the BAMMtools package after discarding the first 10% of samples as burnin. The bammdata object was then used to calculate the diversification rates.
We estimated speciation and extinction rates for the subtree (λ1 and µ1) as the average rate of the clade using the function getCladeRates from the BAMMtools package . We estimated λ0 and µ0 using the same function by specifying the common ancestor node of the composite tree, but excluding the rates of the subtree. The best diversification rate shift configuration was identified using the function getBestShiftConfiguration with the expected number of shifts set to 1. The diversification rate shift was considered to have been correctly detected if the best rate shift configuration included a rate shift at the node where the subtree was attached to the basetree.

p3) Accuracy of estimated asymmetry and speciation rates
To compare the difference between the true and estimated asymmetry in Simulation Set 1, we calculated the estimated asymmetry as the median of the λ1/λ0 values of the 50 replicate trees for a given parameter combination, and error was calculated as.
Error in estimation of asymmetry = (Estimated asymmetry-True asymmetry) / True asymmetry.
Thus, a value of 0 represents no error, values <1 represent underestimation and those >1 represent overestimation.
We calculated error in estimation of λ0 and λ1 for trees in Simulation Set 2 as the median of estimated/true λ for the 100 replicate trees of each combination of tree size class and tip-ratio. This was done independently for subtree and basetree. Thus, a value of 1 represents no error, values >1 indicate overestimation of λ, and values < 1 indicate underestimation.

1)
Effect of speciation rate asymmetry, overall tree size and relative subtree age on power In analyses of trees from Simulation Set 1, power tended to increase both with asymmetry and tree size across all three subtree ages (Figure 2), but the overall effect of asymmetry and tree size depended on subtree age (Sage).
For BiSSE, when using 'approach 1' model selection, power was generally high (>0.75) for all combinations except when tree size was 50, and 1.5X was in combination with 20% Sage ( Figure  1A, D, G). A tendency for increase in power with increasing tree size was most apparent when Sage was 60%. Results were similar when power was calculated using two additional approaches of model selection (approaches 2 & 3; Supplemental Information 2). In the case of MEDUSA and BAMM, for Sages 20% and 40%, power was close to zero for an asymmetry of 1.5X irrespective of tree size, while power was very high (nearly 1) for all tree sizes when asymmetry was 4.5X or 5.5X ( Figure 2B, C, E, F). The exceptions were the set of BAMM analyses on trees of Sage 40% ( Figure 2F), where 4.5X and 5.5X asymmetries had low power (less than 0.5) for very small tree sizes. For intermediate asymmetries (2.5X and 3.5X), power correlated strongly with tree size for both methods. At Sage 60%, no asymmetry level resulted in uniformly high or low power across all tree sizes. Rather, there was a strong association between power and tree size for all asymmetries ( Figure 2H, I).
The number of basetree tips, subtree tips and the ratio of basetree:subtree tips (tip-ratio) for a given tree size class depended on Sage (Supplemental Information 3). Both the number of basetree tips and tip-ratio bias decreased, but subtree size increased, with increasing Sage. The effect of these three parameters was explored further in Simulation 2. Figure 3 depicts the relationship between true (i.e. simulated) and estimated asymmetry ratios. BAMM had a strong tendency to underestimate asymmetry (negative error values), whereas BiSSE tended to overestimate asymmetry (positive error values). MEDUSA generaly overestimated asymmetry, but tended to underestimate this when the true asymmetry was low and for lower tree sizes. Figure 4 depicts results from analyses of trees simulated in Set 2, all of which had a 2X asymmetry, overall size ranging from 50 to 800, and one of three tip-ratios, either biased towards the basetree (9:1) or the subtree (1:9 and 1:19). Figure 4A): We present only the results from 'approach 1' of model selection because all three approaches produced similar results. Power was nearly 1 for all tree sizes with 9:1 ratio (blue bars), but ranged from ca. 0.7 to 1 among trees of 1:9 tip-ratio (red bars) and from ca. 0.60 to ca. 0.9 among trees with 1:19 tip ratio (green bars). For the latter two tip-ratios, power tended to increase when basetree size (5, 10, 20, 40 or 80) increased, but did not increase when basetree size remained constant and only the subtree size increased (comparisons between adjacent green and red bars). Power was nearly 1 for all tree sizes when tip-ratio was basetree biased (9:1 tip-ratio, blue bars). Thus, power was generally higher for tip-ratio 9:1 than for 1:9 for the same overall tree size (comparison of adjacent blue and red bars). 3b) MEDUSA ( Figure 4B): Power was strongly correlated with overall tree size for all three tip-ratios. Power ranged from ca. 0.2 to 1 for the 1:9 tip-ratio (red bars), and from ca. 0.5 to 1 for the 1:19 tip-ratio (green bars). For smaller tree sizes, power increased when basetree size remained constant and subtree size increased, for e.g. comparison of trees with 5-45 and 5-90 basetree-subtree tips, or 20-180 and 20-380. Power ranged from ca. 0.1 to ca. 1 for the basetree biased tip-ratio (blue bars), and was overall higher for 1:9 compared to 9:1 (comparison of adjacent blue and red bars).  Figure 4C): Power remained at ca. 0.25 for both subtree biased tip-ratios when tree size ≤ 200 tips (red and green bars). For trees of the 400 size class, power was ca. 0.4 when there were 20 subtree tips, but increased to ca 0.5 when there were 40 subtree tips. Power was low (>0.25) for the basetree biased tip-ratio irrespective of tree size (blue bars). Power was highest for trees with 800 tips and subtree biased tip-ratios (0.75 or 1). Power was overall higher for tip-ratio 1:9 than for 9:1 (comparison of adjacent blue and red bars).

4) Error in speciation rate estimates in Simulation Set 2
BiSSE had a tendency to underestimate λ0 and λ1 (error values <1) at ratios 1:19 (green bars) or 9:1 (blue bars), and overestimate these two parameters at 1:9 (red bars; Figure 4D,G). MEDUSA had a tendency to underestimate λ0 and λ1 (error values <1) at 1:19 (green bars) and to overestimate λ0 for the smallest trees at 1:9 ratio (red bars; Figure 4E). At 1:9 ratio, λ0 tended to be overestimated, while λ1 tended to be underestimated. BAMM tended to overestimate λ0 for all tree sizes and tip-ratios, and this effect was strongest at the 1:9 tip-ratio (red bars; Figure 4F). All three methods tended to overestimate λ1 when the tip-ratio was 1:9 (red bars), and underestimate this for the 9:1 (blue bars) and 1:19 (green bars) tip-ratios ( Figures 4G-I).

Discussion
Previous studies have identified shortcomings specific to particular modelling approaches for estimation of rate shifts in phylogenies (e.g. Rabosky, 2010;Davis, Midford & Maddison, 2013;Laurent, Robinson-Rechavi & Salamin, 2015;Rabosky & Goldberg, 2015;Gamisch, 2016;Moore et al., 2016;Kozak & Wiens, 2016;Meyer & Wiens, 2017;Burin et al. 2018;Simpson et al., 2018). We simulated large sets of trees where speciation and extinction rates remained constant throughout the tree (basetree), apart from an increase in speciation rate at a single node (subtree), and analyzed these trees using three widely used modelling approaches. We are therefore able to assess the relative performance of the three methods, and identify issues that are common to these methods. We find that power to detect speciation rate shifts is strongly influenced by rate asymmetry and tip number for all methods.

Effect of rate asymmetry
Not surprisingly, power increased as the speciation rate asymmetry increased, which has also been reported in other studies (e.g. Davis, Midford & Maddison, 2013;Laurent, Robinson-Rechavi & Salamin, 2015). All methods performed poorly when the subtree speciation rate increased by only 50% (1.5X) relative to the basetree, suggesting that moderate rate increases are difficult to detect. However, even with a high asymmetry of 5.5X, a significant proportion of rate shifts were undetected by all methods (Type II error), especially in smaller trees (Figure 2). Davis, Midford & Maddison (2013) and Gamisch (2016) simulated complex evolutionary scenarios with multiple increases and decreases in diversification parameters at random points across the tree, and assessed the effect of overall tree size. They showed that BiSSE analyses on  351  352  353  354  355  356  357  358  359  360  361  362  363  364  365  366  367  368  369  370  371 small trees are prone to high Type II error. Laurent, Robinson-Rechavi & Salamin (2015) tested the performance of MEDUSA for both simple scenarios with a single speciation rate shift and more complex scenarios with multiple shifts and mass extinctions. Interestingly, they found that overall tree size increased power in the complex scenarios (their Figure 5a), but not for the single rate shift scenario (their Figure 3b). In their single rate shift scenario, power increased with the size of the lineage in which the rate shift occurred (i.e., the subtree). This suggests that power may be affected not by the overall tip number, but by the number of tips in the basetree or the subtree, both of which are correlated with overall tip number. We discuss this in more detail in the section 'Effects of lineage tip number and tip-ratio bias'.
The performance of all three methods was generally similar at high asymmetry values, although BAMM tended to have lower power at Sage 60% even with high asymmetry. Overall, BiSSE appears to perform better than other two -power tended to be high except when the lowest asymmetry was combined with the lowest Sage. MEDUSA appears to perform better than BAMM when asymmetry is low. When Sage was 20%, BAMM rarely detected a rate shift at 2.5X asymmetry even in the largest trees. At 40% Sage, BAMM had lower power than the other two methods at 2.5X asymmetry, irrespective of tree size. Furthermore, BAMM performed worse than the other two when the second set of simulations with 2X asymmetry were analyzed ( Figure  4). Power was negligible for both BAMM and MEDUSA at the weakest asymmetry (1.5X) in Simulation Set 1.
Power should increase if the speciation rate asymmetry is more accurately estimated. This was apparent for BAMM and MEDUSA, which both strongly underestimated asymmetry for smaller trees and at lower asymmetry values (Figure 3). BiSSE, on the other hand, tended to overestimate asymmetry (Figure 3), but there was no clear relationship between error and power for this method.

Effect of subtree age
The power of all methods was lower at Sage 60% compared to Sages 40% and 20%. This is not due to tree size, because the simulated tree sizes were the same for all Sages. However, for a given asymmetry value and overall tree size, subtree sizes are necessarily larger for higher Sages (Supplemental Information 3), and therefore basetree sizes need to be smaller to accommodate the larger subtrees. Thus, subtree size, basetree size and tip-ratio bias were all influenced by Sage, and one or all of these factors may explain why power was compromised at Sage 60% compared to the younger Sages.

Effects of lineage tip number and tip-ratio bias
The second set of simulations explicitly attempted to tease apart the effects of overall tree size, subtree size, basetree size and tip-ratio. We simulated three tip ratios, two of which were subtree biased (1:9 and 1:19) and the third, basetree biased (9:1). We varied basetree and subtree size from 5-720, and the overall tree size from 50-800. We were thus able to not only test whether tipratio bias affected power, but also compare power for trees with the same subtree or basetree size, but varying overall tree size. As expected, overall tree size generally correlated positively with power for all methods. However, although both 1:9 and 9:1 tip ratios had the same tip-ratio bias, these ratios differed in power within the same tree size class; this was the case for all methods ( Figure 4A-C; comparison of red and blue bars). For BiSSE, there was either no difference in power between these two tip-ratios or power tended to be greater at 9:1, but power was always greater at 1:9 for MEDUSA and BAMM. Furthermore, power varied significantly among tree sizes for any given tip-ratio. This was so even when tree sizes were very large, and therefore when power may be expected to be uniformly high. This indicates that the number of tips in the lineages with distinct speciation rates (i.e. basetree and subtree) may play a stronger role than tipratio bias per se. Indeed, both subtree and basetree sizes independently had a strong effect on power for all three methods.
However, the methods differed in terms of how influential subtree and basetree size were. For BiSSE, power did not increase when tree size doubled at the same basetree size, for e.g, when basetree was 10 and subtree size changed to 190 from 90. If lineages on a tree differ in speciation rates, estimates of these rates should be more error prone, because these lineages are smaller. Generally, if lineages differ in diversification rates, the size of such lineages with distinct diversification parameters will be larger in larger trees. This may explain why power increases with tree size, as has been found here and in other studies (e.g. Davis, Midford & Maddison, 2013;Laurent, Robinson-Rechavi & Salamin, 2015;Gamisch et al., 2016). The effect of lineage size also explains results in Davis et al. (2013), where power initially increased with increasing asymmetry but later decreased with further asymmetry increase, and this pattern was consistent for all overall tree sizes (their Figure 1a). They concluded that the positive effect of asymmetry was counteracted by the negative effect of increasing tip-ratio bias as asymmetry increased. However, at higher asymmetries (and thus at stronger tip-ratios bias), the number of tips available for parameter estimation is likely to have been the limiting factor. For e.g., a tree with 500 tips (the largest tree size they simulated) and an asymmetry of 10X had a tip-ratio of 90:1, and thus one of 'character states' (i.e. all lineages sharing the same diversification rate parameter) would have been represented by <6 tips. Therefore, we suggest that their results are better explained by the effect of the number of tips in each state rather than tip-ratio per se (although we do not argue that tip-ratio has no effect), and that stronger asymmetries will always improve power of detection by BISSE as long as there are enough tips of every character state.
For MEDUSA, there was a tendency for increase in power when subtree size increased without a change in basetree size ( Figure 4B; comparison of red and green bars), indicating that overall tree size may partially offset the deleterious effect of small lineage size. This effect was also seen in the BAMM analyses, where the basetree remained at 40, but subtree size doubled.
Inferences can be drawn about the relative importance of subtree and basetree sizes by comparing 9:1 and 1:9 tip-ratios ( Figure 4; comparison of red and blue bars for the same tree size). For BiSSE, power tended to be higher for 9:1 (blue bars), suggesting that small basetree size is more detrimental compared to small subtree size. On the other hand, power was higher at 1:9 (red bars) than at 9:1 (blue bars) for both MEDUSA and BAMM, indicating that the three methods are   414  415  416  417  418  419  420  421  422  423  424  425  426  427  428  429  430  431  432  433  434  435  436  437  438  439  440  441  442  443  444  445  446  447  448  449  450  451   452  453  454  455  456 differently affected by basetree and subtree sizes. Both BAMM and MEDUSA performed poorly at 9:1 -while BAMM rarely detected rate shifts at this tip-ratio even in the largest trees, MEDUSA had very low power when trees had less than 400 tips. The low power maybe related to the fact that both methods strongly underestimated subtree speciation rates ( Figures 4H,I).
Given that BiSSE had the best power and BAMM the worst, it was not surprising that BiSSE had the lowest error and BAMM the highest. One may perhaps expect that speciation rates are more accurately estimated in the monophyletic subtree compared to the paraphyletic basetree. However, we found no strong differences in error between subtree and basetree. An exception was BAMM at 1:9, where error was clearly much higher for the basetree (comparison of red bars in Figure 4F & I).

Recommendations
In practice, a researcher intending to analyze diversification rate shifts may only have information about overall tree size, and not the sizes of lineages with distinct rates. In such cases, we agree with recommendations of Davis and colleagues (Davis, Midford & Maddison, 2013) who suggested that inferences based on BiSSE analyses of trees with fewer than 300 tips should be made very cautiously. We extend the same recommendation to BAMM and MEDUSA. Thus, if no rate shift is detected when using BiSSE, BAMM or MEDUSA on small phylogenies, users should be careful when concluding that speciation rate has been constant.
Davis and colleagues (Davis, Midford & Maddison, 2013) recommended cautious interpretation when analyzing datasets where <10% of species are of one character state. We conclude that the number of tips with a particular character state are a better predictor of power, rather than proportion, especially when using BiSSE. For instance, if 5% of the tips have a character state (i.e., 1:20 tip-ratio when there are only two character states), power of detection may not be compromised as long as there is strong rate asymmetry, there are at least 100 tips with the character state and the character state is not phylogenetically overdispersed. Generally, we suggest that tip-ratio may not be a big problem when analyzing very large trees (>1000 tips).