TRACE: Tennessee Research and Creative TRACE: Tennessee Research and Creative Exchange Exchange Ecological Drivers of Song Evolution in Birds: Disentangling the Ecological Drivers of Song Evolution in Birds: Disentangling the Effects of Habitat and Morphology Effects of Habitat and Morphology

Environmental differences influence the evolutionary divergence of mating signals through selection acting either directly on signal transmission (“sensory drive”) or because morphological adaptation to different foraging niches causes divergence in “magic traits” associated with signal production, thus indirectly driving signal evolution. Sensory drive and magic traits both contribute to variation in signal structure, yet we have limited understanding of the relative role of these direct and indirect processes during signal evolution. Using phylogenetic analyses across 276 species of ovenbirds (Aves: Furnariidae), we compared the extent to which song evolution was related to the direct influence of habitat characteristics and the indirect effect of body size and beak size, two potential magic traits in birds. We find that indirect ecological selection, via diversification in putative magic traits, explains variation in temporal, spectral, and performance features of song. Body size influences song frequency, whereas beak size limits temporal and performance components of song. In comparison, direct ecological selection has weaker and more limited effects on song structure. Our results illustrate the importance of considering multiple deterministic processes in the evolution of mating signals.

A prominent issue is that ecological diversity drives the evolution of mating signals in two distinct ways. First, differences in the transmission properties of habitats can lead to divergence in mating signals as a result of direct habitat-dependent selection for effective signal transmission (Morton, 1975), a process termed "sensory drive" (Endler, 1992). Second, ecological selection can influence mating signals indirectly by causing divergence in traits related to signal production and modification (Endler, 1993), such as body size (Gil & Gahr, 2002) and beak size (Podos & Nowicki, 2004b) in birds. Such traits have been termed "magic traits" because under divergent ecological selection, they give rise "as if by magic" to signal divergence, and ultimately nonrandom mating, resolving a long-standing difficulty in models of ecological speciation (Gavrilets, 2004;Thibert-Plante & Gavrilets, 2013).
In the following sections, we outline evidence for direct and indirect ecological selection on acoustic mating signals and then address their relative contribution and potential interaction in the evolution of birdsong.

| Sensory drive
Selection should favor signal traits that optimize transmission of information from signaler to receiver (Endler, 1993). In long-distance signals, the physical properties of habitats may affect sound transmission, leading to the adaptation of signals to specific environments (Morton, 1975). For example, acoustic signals in forests are subject to scattering effects by vegetation, whereas in more open habitats, they are affected by wind (Richards & Wiley, 1980;Wiley & Richards, 1978). Consequently, acoustic signals of forest species tend to have slower pace, lower frequencies (e.g., Morton, 1975;Ryan & Brenowitz, 1985;Wiley, 1991), and more pure tones (e.g., Richards & Wiley, 1980;Wiley, 1991;Wiley & Richards, 1978) than those of species found in open, grassland habitats. This form of sensory drive (often termed "acoustic adaptation") has shaped the evolution of bird song in most species examined (reviewed by Slabbekoorn & Smith, 2002a). However, a meta-analysis found support for habitat shaping spectral rather than temporal features of song, and the overall effect of habitat on signal structure was small (Boncoraglio & Saino, 2007).

| Magic traits
Animal signals are subject to indirect sources of selection because they are produced by traits with multiple functions (Nowicki, Westneat, & Hoese, 1992). For example, divergent ecologies can select for differences in body size (Grant, 1968), which in turn places limits on the fundamental frequency of sounds (Wallschäger, 1980). Because the fundamental frequency of birdsong is determined by the vibrating frequency of the syringeal membrane (Nowicki & Marler, 1988), larger birds tend to produce lower frequency song (Palacios & Tubaro, 2000;Ryan & Brenowitz, 1985;Tubaro & Mahler, 1998).
Similarly, the beak is under strong selection in the context of foraging and food manipulation (Grant, 1968;Herrel, Podos, Huber, & Hendry, 2005) and is used in coordination with vocal tract movements to modify sound (Goller, Mallinckrodt, & Torti, 2004;Westneat, Long, Hoese, & Nowicki, 1993). This has particular relevance to the widespread trade-off between rates of sound production and the frequency bandwidth of sounds Podos, 1997).
This trade-off has a triangular distribution because sounds produced at a slow rate can have a wide or a narrow frequency bandwidth, whereas as the rate of sound production increases, frequency bandwidth narrows. Ability to perform this trade-off (i.e., "vocal performance") may be affected by beak size through trade-offs in jaw biomechanics, namely between maximal force and velocity (Herrel, Podos, Vanhooydonck, & Hendry, 2008;Herrel et al., 2005) and/or between torque and angular velocity (Palacios & Tubaro, 2000). In support of this hypothesis, morphological adaptation is associated with variation in song structure and performance capabilities in many species of birds (Badyaev, Young, Oh, & Addison, 2008;Ballentine, 2006;Derryberry, 2009;Derryberry et al., 2012;Huber & Podos, 2006;Podos, 2001;Seddon, 2005;Tobias et al., 2014).

| Relative roles of sensory drive and magic traits
Despite extensive research on both direct and indirect sources of ecological selection on bird song, we are only aware of two studies considering both possibilities in tandem (Mason & Burns, 2015;Seddon, 2005). The first study demonstrated that indirect and direct selection both played a role in the evolution of song in antbirds (Thamnophilidae) (Seddon, 2005), although a species-level molecular phylogeny was not available. More recently, Mason et al. (2017) found that body size was more important than habitat in the evolution of song in tanagers (Thraupidae), but no information was available regarding beak size. Thus, we still have only a limited understanding of the relative roles of these mechanisms, partly because comprehensive information on phylogenetic relationships, signal design, morphology, and ecology are rarely available for large radiations.
In this study, we use phylogenetic comparative techniques to assess the relative roles of direct and indirect ecological selection on song diversification across 285 species of ovenbirds (Furnariidae), a diverse clade with comprehensive data on phylogenetic relationships, morphology, and song Tobias et al., 2014).
We used model comparison to assess the relative roles of direct ecological selection via sensory drive and indirect ecological selection via magic traits. To test the role of sensory drive, we predicted that species found in more closed habitats would produce songs at slower rates, with lower frequency characteristics and narrower bandwidths.
To test the "magic traits" hypothesis, we predicted that species with larger body size would produce lower frequency songs (Nowicki & Marler, 1988) and that species with larger beaks would produce songs at a slower pace, narrower bandwidth, and lower vocal performance (Huber & Podos, 2006;Podos, 2001). Finally, several studies have highlighted the prominent role of stochasticity in explaining signal variation within and between species (Irwin, Thimgan, & Irwin, 2008;McCracken & Sheldon, 1997;Mundinger, 1982;Price & Lanyon, 2002), and thus, song divergence may simply be related to evolutionary time since speciation (Pagel, 1999;Tobias et al., 2010). Combining data on habitat, morphology, and phylogenetic relationships allowed us to test the relative influence of sensory drive and magic traits against this stochastic null model.

| Study species
Ovenbirds (Furnariidae) are insectivorous passerine birds occurring in nearly every terrestrial habitat throughout Central and South America.

| Song data
Many species of ovenbirds have a wide vocal repertoire including calls and so-called loudsongs-a consistently patterned, multiplenote vocalization typically repeated at regular intervals (Willis, 1967).
Observational studies on ovenbirds suggest that loudsongs function in territory defense, mate attraction, and pair bonding (Ippi, Vasquez, Van Dongen, & Lazzoni, 2011;Kratter & Parker, 1997;Roper, 2005;Zimmer, Robbins, & Kopuchian, 2008), in common with other tracheophone suboscine birds in which function has been tested experimentally (Tobias, Gamarra-Toledo, Garcia-Olaechea, Pulgarin, & Seddon, 2011). As tracheophone suboscine loudsongs are therefore functionally equivalent to songs produced by oscines, we refer to them hereafter as "songs." We measured song structure from recordings of 1,826 individuals To examine the predicted trade-off between the rate at which sounds are produced and the frequency bandwidth of those sounds, we then plotted frequency bandwidth as a function of pace for all individuals for which we had both values (n = 1,826). We first used the traditional approach for estimating upper bounds for triangular distributions between two variables (Blackburn, Lawton, & Perry, 1992;Podos, 1997). We binned pace into 2-Hz increments (0-2 Hz, 2-4 Hz … 38-40 Hz). Within each bin, we chose the song with the maximum bandwidth. We then calculated a linear regression using these maximum values (n = 20) to determine the equation for this upper-bound regression. Sampling limitations inherent in this traditional upperbound regression method make it prone to false positives (Wilson, Bitton, Podos, & Mennill, 2013). We therefore used a second analytical method to validate our findings using the more traditional method. We used a sliding binning window to identify the 90th percentile of the frequency distribution data. To avoid sampling error due to outliers, we dropped bins that included fewer than 32 samples. We then used the remaining data to estimate how changes in song pace affected the 90th percentile of the frequency distribution data. Both methods recovered the predicted trade-off between trill rate and bandwidth (see Results).
To calculate a measure of vocal performance, we used the upperbound regression following Podos (1997) and measured the minimum (orthogonal) distance of each song from this regression. This measure is referred to as "vocal deviation" following Podos (2001). Higher values of vocal deviation reflect low vocal performance and lower values reflect high vocal performance (VP; Figure 1), although it is important to note that there has been some questioning of the use of the word "performance" in sexual selection research as performance is a nonneutral term (Kroodsma, 2017). Experimental tests have shown that this measure of vocal performance has biological relevance in a number of species (Ballentine, Hyman, & Nowicki, 2004;Draganoiu, Nagle, & Kreutzer, 2002;Illes, Hall, & Vehrencamp, 2006;Moseley, Lahti, & Podos, 2013;Pasch, George, Campbell, & Phelps, 2011; but see Kroodsma, 2017).
We calculated a mean value for each song variable for each species. Species in this family show little variation in song structure within or between individuals, no repertoires, and low regional variation in song ; thus, we do not include measures of song variance in our analyses. We log-transformed all song variables prior to statistical analyses, to meet parametric assumptions of normality and homogeneity of variance. We reduced song variation using phylogenetic PCA (PPCA) (Revell, 2009). Vocal performance was not included in the song PPCA, as it is calculated from variables already included in the PPCA (pace and bandwidth).

| Morphological data
We obtained morphological measures ( Figure 1) for the same 276 species from museum specimens (see Tobias et al., 2014). To capture morphological variation potentially associated with constraints on song production and modification, we used two variables to represent body size-body mass and tarsus length-the latter being the best univariate index of body size (Freeman & Jackson, 1990). Body mass data were from Dunning (1992). We also measured three beak characters: beak length, measured from the anterior border of the nostril to tip of the beak, and beak width and depth (vertically) at the anterior border of the nostrils. The same person (S. Claramunt) took all beak measurements.
All morphological variables were log-transformed. We computed body size as the mean of the two log-transformed body variables and beak size as the mean of the three log-transformed beak variables. We assumed that overall beak size is related to the trade-off between force and velocity and that beak dimensions provide a measure of beak moment (indicative of a trade-off between torque and angular velocity).
This allows us to consider the specific effect of angular momentum of the jaw on constraining song modification. The beak's moment of inertia can be defined as the amount of torque required to move the beak at a certain rate of angular acceleration. Beaks with higher moment of inertia will require more torque to move rapidly. We can approximate the beak's moment of inertia as beak width × depth × length, with length to an unknown power. We leave the power unknown because the exact power of length is dependent on beak shape. Thus, we describe the vector parameter of "beak moment" as beak size and logtransformed beak length (Beak Size, Beak Length; model signal "Beak Size + Beak Length").

| Habitat data
We classified the primary habitat of all lineages using standard published sources ( Figure 2). Categories were (1)  to provide an index of habitat structure for each lineage, following standard procedures . In the case of generalists, we used literature and published range maps to identify the "primary habitat" as that preferred by the species over the largest geographical area. In practice, classification was simplified by the fact that our habitat categories are broad, with almost all ovenbird species predominantly occurring in one such habitat category.      As an alternative, we treated habitat variation as a continuous variable using bioclimatic data extracted from the geographical range of each species (Seeholzer, Claramunt, & Brumfield, 2017). We gathered 23,588 georeferenced locality records (mean = 79.4 records/species, range = 1-786) representing all study taxa. We obtained the locality records from three general sources: specimens, recordings, and observational records. Specimen records were obtained from ORNIS (www.ornisnet.org). Recording records were obtained from Macaulay Library of Natural Sounds (Cornell Lab of Ornithology) and Xeno-Canto (www.xeno-canto.org). The coordinates of all documented records (both specimens and recordings) included in this study were vetted for accuracy using gazetteers. The third group of records came from observational data gathered by the eBird citizen science initiative (May 2013 release, Sullivan et al., 2009) which are extensively vetted by expert review (www.ebird.org). To further ensure accuracy, we applied additional filters to the observational records. For each species represented by ten or more localities, we thinned all localities so that no two occurred within 1 km of each other, which is the resolution of the climatic data.
For each locality record, we extracted elevation and 19 bioclimatic variables from the BioClim database of present-day climatic conditions (Hijmans, Cameron, Parra, Jones, & Jarvis, 2005) and obtained each variable's mean value for all species. To reduce redundancy in the climatic data set, we calculated pairwise Pearson correlation coefficients for the temperature and precipitation variables separately. We retained temperature and precipitation variables that had a Pearson correlation coefficient <0.90 with respect to mean annual temperature (Bio1) and mean annual precipitation (Bio12). Interpretability was increased by purposefully retaining Bio1 and Bio12. We retained four temperature and five precipitation variables: annual mean temperature (Bio1), mean diurnal range (Bio2), isothermality (Bio3), temperature annual range (Bio7), annual precipitation (Bio12), precipitation of driest month (Bio14), precipitation seasonality (Bio15), precipitation of warmest quarter (Bio18), and precipitation of coldest quarter (Bio19). These nine climatic variables were analyzed with the prcomp function in the R Language for Statistical Computing (R-Core-Team, 2016). Because the bioclimatic variables were in fundamentally different units for temperature (°C) and precipitation (mm), we used the correlation matrix as opposed to the covariance matrix (Flury, 1997). We used the Kaiser Criterion (eigenvalues greater than one) and retained principal components 1-2, which explained 75% of the climatic variation. Factor loadings, eigenvalues, and percent variance are presented in Appendix S1. For analyses, we retained only the first eigenvector which explained ~60% of the climatic variation (hereafter, "Environment PC1"; Figure 2) because PC2 explained only 14% of the variance and summarized isothermality and precipitation seasonality, which are less generalizable metrics.

| Phylogeny
We used a calibrated species-level phylogeny of the Furnariidae ( Figure 2) inferred using three mitochondrial (ND3, CO2, and ND2) and three nuclear genes (RAG-1, RAG-2, and Bf7). To calibrate the tree, biogeographic events were used to place priors on the age of the root (split between Tyrannoidae and Furnarioidea of 61 ± 2.8 Ma (Barker, Cibois, Schikler, Feinstein, & Cracraft, 2004)) and on the divergence times of the most recent common ancestor of 12 sets of taxa using two biogeographic events: the closure of the Panamanian Isthmus (3 ± 0.5 Ma following (Weinstock et al., 2005)) and the uplift of the Eastern Cordillera of the northern Andes (3.6 Ma (Gregory-Wodzicki, 2000) with a 95% age interval of 0.8-16 Ma). We allowed for bidirectional uncertainty in these events. We ran analyses for a total of 150 million generations across seven independent runs. We identified and discarded the burn-in of each run (total approximately 1 million generations). Converged runs were used to estimate the posterior distribution of topologies and divergence times. We selected the maximum clade credibility (MCC) tree based on a partitioned, Bayesian search of topology and divergence times in BEAST version 1.5.2 (Drummond & Rambaut, 2007). We also sampled 500 trees from the posterior distribution. Details on data collection, phylogenetic inference, as well as the resulting alignment and tree files can be found in Derryberry et al. (2011) and TreeBASE S11550.

| Phylogenetic comparative analyses
All phylogenetic comparative analyses were conducted in R 3.3.0 (R-Core-Team, 2016). We used phylogenetic generalized least squares models (PGLS) to test the ability of different factors to predict variation in song structure. The dependent factors included the first three principal components from the PPCA used to reduce song variation as well as the individual song traits. The predictors included Environment PC1, Habitat, Body Size, Beak Size, and Beak Moment.
We fitted models that included one measure of habitat and one measure of morphology as main factors to reduce issues of collinearity (see below). We analyzed interaction factors between measures of morphology and the categorical measure of habitat, only. We included an interaction term because we predicted that the strength, but not the direction, of the relationship between morphology and song structure may vary across different types of habitats. For example, as habitat becomes more open, and trill rate less limited in acoustic space, a relationship between beak size and trill rate may become more apparent.
We include an interaction model only in analyses using categorical measures of habitat variation and not in analyses with habitat treated as a continuous variable as we have no a priori prediction of how particular values of "Environment_PC1" might relate to constraints on song structure. We included a constant model as a point of comparison. One strong outlier was removed from the data set prior to analyses.
Some of the predictors are moderately to highly correlated (λ branch transformation: Beak Size and Beak Moment = 0.85, Beak Size and Body Size r = .81, Environment PC1 and Habitat r = .72, and Body Size and Beak Moment r = .63). Collinearity is common in ecological data sets, and combining or eliminating predictors can underestimate the effects of the included predictor and result in mismodelling the underlying determinants of a given behavior (Freckleton, 2011). We thus model collinear predictors, as AIC information theory methods are generally robust to collinearity. The largest problem arises when one predictor is weak but strongly correlated with a predictor of strong effect-the weak predictor is overestimated and the strong predictor underestimated (Freckleton, 2011). We are thus careful not to overinterpret the effect of weaker correlated predictors. We also minimize effects of collinearity by avoiding stepwise regression (Burnham & Anderson, 2002a) and interactive models between collinear variables (Freckleton, 2011).
The modified GLS approach simultaneously estimates and uses the best branch length transformation to adjust for the degree of phylogenetic nonindependence in the model residuals (Freckleton, Harvey, & Pagel, 2002;Revell, 2010). We used the caper (Orme et al., 2012) library to run PGLS for four models of branch length transformation: Brownian motion (unconstrained random walk), lambda (strength of phylogenetic effects), kappa (speciational change), and delta (exponential accelerating or decelerating change). We used the APE (Paradis, Claude, & Strimmer, 2004) and nlme (Pinheiro, Bates, Debroy, & Sarkar, 2015) libraries for a fifth model, an Ornstein-Uhlenbeck (OU) process (constrained random walk) with a single optimum. We used the MCC tree as our phylogenetic hypothesis. The sample size in all analyses reflects the number of taxonomic units for which we had the appropriate data.
In all model fitting, diagnostic plots were used to check that points on the Q-Q plot approximately fit a straight line and that residual points were randomly scattered. Model fit was evaluated using Akaike Information Criterion corrected for sample size (AICc) (Akaike, 1973;Burnham & Anderson, 2002a). Models greater than two AICc units from the top model (ΔAICc of >2) were considered to have less support, following Burnham and Anderson (2002a). To search for the most parsimonious model, we then removed models within two AICc units of the top model that differed from a higher-ranking model by the addition of one or more parameters. These were rejected as uninformative, as recommended by Arnold (2010). For traits that we could not identify a most parsimonious model, we averaged the 95% cumulative weight models (including those with uninformative extra terms following Garamszegi (2014)) across a sample of 500 trees from the posterior distribution of trees. We then computed AICc weights for each of the models (1) to determine the total weight of a particular branch length transformation for a particular signal and (2) to determine an average model using the weighted average of the individual model parameters (e.g., the intercept). Parameters are treated as 0 when not present in a given model.
For each song trait, we provide information on the 95% cumulative weight models. We discuss either the most parsimonious model (if one was selected) or the average model with the weight of each signal. We present coefficients (β) and measures of support for models, including model weight (w i ), which is the probability that the model of interest is the best model in the set and the evidence ratio in relation to the constant model (ER = w i /w constant model ). For all song traits, we discuss total parameter weights from models fit to the MCC tree. We discuss β from either the most parsimonious model using the MCC tree or from the average model across the posterior distribution, depending on the context. We report the weight of the simplest model as the total weight of the models with a constant signal for the five branch length transformations.

| Song traits
A PPCA on song data yielded three principal components with eigenvalues greater than one. Song frequency measures load strongly onto PC1 (larger values of PC1 indicate lower peak, max and min frequencies, and narrower bandwidth), duration and number of notes load onto PC2 (larger values of PC2 indicate shorter songs with fewer notes), and pace loads onto PC3 (larger values of PC3 indicate faster songs) (Table 1).

| Beak morphology, habitat structure, and signal design
We report AICc for all signals and all branch length transformations (Appendix S2). For all song traits, we provide AICc, model weight, and ER for the 95% cumulative weight models (Appendix S3) (Burnham & Anderson, 2002b). We also report the top model and models within two AICc (Table 2) and their coefficients of variation (Appendix S4), dropping models with uninformative parameters (Anderson & Burnham, 2002). Finally, we provide the parameter total weights (Table 3) and coefficients of variation averaged across the posterior distribution of trees (Table 4) (Garamszegi, 2014).
As predicted, body size best explained variation in spectral characteristics of song. The most parsimonious model for Song PC1 was Body Size under the λ branch length transformation and garnered 45% of the model weight (Table 2). All remaining models individually had less than 16% of the total weight. As a parameter, Body Size had 76.2% of the weight across all candidate models, providing strong support for this parameter explaining variation in song spectral characteristics. Birds with larger bodies sang lower frequency songs (ER > 178, β = 6.58 units of Song PC1/unit of Body Size; Figure 3). We found only weak support for other morphological or environmental parameters explaining variation in Song PC1 (parameter total weights: Habitat = 23.2%, Environmental PC1 = 20.9%, Beak Size = 23.4%, Beak Moment = 9.6%).
Our findings for individual song spectral traits were generally consistent with results for Song PC1. Body Size received strong support as the most parsimonious model for peak frequency (ER > 581; w i = 0.42) and maximum frequency (ER > 99; w i = 0.46). In addition, the total weight for Body Size as a parameter was high for most spec- Overall, we find that spectral features of song vary with body size, and any association with habitat is not consistent with predictions under sensory drive.
We did not have a priori predictions under the sensory drive or magic traits hypotheses regarding song length or number of notes and did not find strong evidence of habitat or morphology explaining variation in these two features of song (all ER < 8). The most parsimonious model for Song PC2 was the simplest model under the delta branch length transformation. We also found weak evidence for the most parsimonious models for both song length (ER < 2.5) and number of notes (ER < 8). Consistent with predictions, we found that both morphology and habitat explain variation in song pace. For Song PC3, we found strong evidence for top additive models including the parameters Beak Size, Beak Length, Body Size, and Habitat (ER range: 9,364). Considering the average additive model, Beak Size received the highest weight (71.3%) followed by Habitat (66.2%) and Beak Length (51%). However, Body Size has low parameter weight (28.7%), and considering the known effect of collinearity on model selection (Freckleton, 2011), it is unlikely that Body Size is an important factor explaining variation in Song PC3. Because we approximate beak moment as a vector parameter (Beak Size, Beak Length), an average additive model that includes these two terms is effectively beak moment. Therefore, we find that birds with larger beak moment produce songs  Table 3). Given the collinearity of beak size and beak length, we plotted the region of predicted Song PC3 values for 95% of the observed beak measurements within each habitat type to inspect the direction of the relationship (Figure 3). We constructed this plot for an approximation of beak moment as beak width × depth × length 2 . We also checked against an approximation with length 3 and noted very little difference in predicted relationships.
T A B L E 1 Eigenvalues and loadings of song traits on principal components (PC) from PPCA. Significant loadings in bold  (Table 3). Finally, we found little support for the continuous measure of habitat (Environment PC1) explaining variation in any of the song traits.
Environment PC1 did not garner more than 25% of the total weight for any individual song trait or for any of the Song PCs (Table 3). Habitat (categorical measure) had high weight as a parameter for a number of song traits, and yet Environment PC1 did not.

| DISCUSSION
We have shown that variation in ovenbird songs arises through a combination of direct selection on signal design via transmission properties of the environment and indirect selection on song characters as a byproduct of selection on morphological traits associated with diversification into different ecological niches. Indirect selection is the primary force shaping spectral features of song, whereas both direct and indirect selection act on song tempo and performance. Together, these findings suggest that ecological selection on morphology indirectly drives the evolution of songs in ovenbirds, whereas habitat structure mediates the strength of indirect selection on song tempo and performance.

| Magic traits
We found strong evidence that multiple magic traits influence the diversification of most song traits in ovenbirds. Body size was the most T A B L E 3 For all song traits, parameter total weights. Parameters included in competitive models are in black. See Table 2  important parameter for spectral features of song, whereas beak size was more important for temporal and performance features. These results make sense because body size is primarily thought to affect sound production, specifically the frequencies of sound which birds can produce efficiently, whereas beak size is primarily thought to affect sound modification.
Our analyses indicated that larger birds produce songs at lower peak, maximum and minimum frequencies, in agreement with previous empirical studies (Mason & Burns, 2015;Ryan & Brenowitz, 1985;Seddon, 2005;Tubaro & Mahler, 1998;Wallschäger, 1980) and consistent with the traditional understanding of how birds produce sound (Nowicki & Marler, 1988). Thus, the diversification of ovenbird body size has contributed to the diversification of spectral components of their song.
We also found that temporal and performance features of ovenbird song correlate with beak size, such that birds with larger beak size produce slower paced songs at lower performance. Our findings agree with other studies showing an effect of beak size on the pace of sound production (Huber & Podos, 2006;Seddon, 2005) and on the performance of a trade-off between song pace and bandwidth (Ballentine, 2006;Huber & Podos, 2006;Podos, 2001), including in the woodcreepers, a subclade of the ovenbird family, as currently defined . Specifically, we find that larger beak size is associated with slower paced and lower performance songs.
Thus, our results suggest that as ovenbirds have diversified in beak size, they have also diversified in some temporal and performance components of song.
Birds with larger beaks are thought to face a limitation on producing high-performance songs because of a trade-off between force and velocity. This idea has been examined-and supported-extensively in Darwin's finches (Herrel et al., 2008;Podos & Nowicki, 2004a). In these finches, species with larger beaks have more developed musculature allowing them to crack larger seeds, an increased capacity for force that trades off with velocity, such that larger beaked birds are only able to open and close their beaks relatively slowly. However, this trade-off between force and velocity seems less likely to constrain song production in ovenbirds, most of which are specialist insectivores (Wilman, Belmaker, Simpson, De La Rosa, & Rivadeneira, 2014) with beak musculature adapted to softer food items. Small seeds are only thought to make up >20% of the diet in five ovenbird species (Geositta punensis, G. antarctica, Asthenes dorbignyi, A. arequipae, and A. huancavelicae) (Wilman et al., 2014). In support of the alternative idea that song modification may be constrained by the angular momentum of the jaw (Palacios & Tubaro, 2000), we found evidence that beaks with higher moment of inertia (as we approximated it) are more limited in vocal performance, such that ovenbirds with larger beak moment produce slower songs with lower performance. Our findings do not rule out effects of force and velocity but suggest that diversification in features of the beak that affect angular momentum of the jaw may constrain song diversification.
The key mechanism underlying "magic trait" speciation (Gavrilets, 2004;Thibert-Plante & Gavrilets, 2013) is the linkage between a trait used to recognize mates (here, songs) and a trait under ecological T A B L E 4 For all song traits, β values for all parameters averaged across a sample of 500 trees from the posterior distribution of trees. Parameters included in competitive models are in black.
See Table 2

| Sensory drive
The sensory drive hypothesis is widely accepted on the basis of case studies across a number of different animal groups (Cummings, 2007;Endler, 1992Endler, , 2000Wiley & Richards, 1982), yet its relevance across larger samples of species has been questioned, particularly in birds (Boncoraglio & Saino, 2007;Ey & Fischer, 2009  . Minimum frequencies of ovenbird songs are above 1 kHz, and the peak and maximum frequencies of most songs are lower than 4-5 kHz, thus occupying the band of intermediate frequen- cies (1-4 kHz) that do not suffer much variation in attenuation between habitats (Linskens et al., 1976). While direct ecological selection may have played a role in limiting the overall frequency range of ovenbird songs, it is not possible to determine whether sensory drive has selected against songs outside this frequency band, or alternatively whether song phenotypes have not diversified completely into potential acoustic space (e.g., frequencies above 5 kHz in open habitats) because of morphological constraints or conservatism of ancestral traits. However, it seems plausible that restriction to the ideal frequency window limits the strength of habitat-mediated sensory drive on spectral components of song in ovenbirds.
Although habitat as a categorical measure was an important parameter for a number of song traits, our continuous measure of habitat (Environment PC1) did not help to explain variation in any feature of song. Our categorical habitat scores and Environment PC1 were correlated, and the fact that associations with song were weakened when using a continuous variable underscores our general finding that direct environmental effects on song structure are relatively limited.

| Interactions among mechanisms
Our findings suggest that variation in the signaling environment and constraints on sound modification act independently on song pace. In contrast, we found evidence of both direct and indirect selection interacting to explain divergence in song performance. A measure of the trade-off between torque and velocity (beak moment) was the most important parameter fitted to vocal performance, yet this relationship varied across habitats, such that vocal performance was more sensitive to increases in beak moment in more open habitats. These findings confirm that strong interactions between habitat and morphology are fundamental in governing the magnitude and direction of song divergence and thus suggest that direct and indirect mechanisms of signal evolution cannot be considered in isolation.

| Stochasticity
Our phylogenetic comparative analyses suggest that shared ancestry and stochastic processes explain a large component of song evolution in ovenbirds, consistent with previous findings that evolutionary age explains a large proportion of song divergence in the family . However, we are able to rule out the possibility that stochasticity alone explains the diversification of most song traits in our study. The main exceptions are song length, number of notes, and bandwidth. We did not have a priori expectations that song length and note number would vary with morphological traits or habitat structure, although we did expect that song bandwidth would decrease in more closed habitats and for birds with larger beaks. For all three of these song traits, the evidence for fitted models was very low. Moreover, we only explored parameters associated with ecological selection on these traits, and thus, we may have overlooked a role for social or sexual selection, particularly as song length, note number, and bandwidth have all been shown to be under sexual selection via mate choice in other species (reviewed in Andersson, 1994;Catchpole & Slater, 2008;Searcy & Andersson, 1986).

| CONCLUSIONS
Two deterministic processes-sensory drive and correlated evolution-shape acoustic signals in ovenbirds. These pathways of song divergence act both independently and in concert, with ecological selection on beak and body size playing the most widespread role.
Although body size is particularly important in explaining how spectral features of song evolve and beak size is important in explaining how temporal features of ovenbird songs evolve, morphology alone is not the best predictor. Key temporal and performance measures of song are best explained by both beak size and habitat. Thus, we conclude that a combination of sensory drive and correlated evolution drives signal evolution, with the outcome tightly linked to ecology. In addition, we have demonstrated separate roles for body size and beak size via their constraints on both signal production and signal modification, respectively, providing new evidence that different potential "magic traits" can have contrasting effects on signal diversification. Our work highlights the importance of both direct and indirect sources of ecological selection as critical factors that need to be considered together in models of mating signal evolution.

ACKNOWLEDGMENTS
We thank J. Hadfield and three anonymous reviewers for helpful comments on earlier drafts of the manuscript. We are grateful to many individuals who collected specimens, tissue samples, and sound recordings, as well as to the institutions granting access to these materials; complete lists of data sources are given in Derryberry et al.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
EPD, NS, and JAT conceived the study. EPD, NS, SC, GFS, RTB, and JAT provided data. EPD, GFS, and GED analyzed data. EPD and JAT wrote the manuscript. All authors contributed to revisions.