Behavioral, morphological, and ecological trait evolution in two clades of New World Sparrows (Aimophila and Peucaea, Passerellidae)

The New World sparrows (Passerellidae) are a large, diverse group of songbirds that vary in morphology, behavior, and ecology. Thus, they are excellent for studying trait evolution in a phylogenetic framework. We examined lability versus conservatism in morphological and behavioral traits in two related clades of sparrows (Aimophila, Peucaea), and assessed whether habitat has played an important role in trait evolution. We first inferred a multi-locus phylogeny which we used to reconstruct ancestral states, and then quantified phylogenetic signal among morphological and behavioral traits in these clades and in New World sparrows more broadly. Behavioral traits have a stronger phylogenetic signal than morphological traits. Specifically, vocal duets and song structure are the most highly conserved traits, and nesting behavior appears to be maintained within clades. Furthermore, we found a strong correlation between open habitat and unpatterned plumage, complex song, and ground nesting. However, even within lineages that share the same habitat type, species vary in nesting, plumage pattern, song complexity, and duetting. Our findings highlight trade-offs between behavior, morphology, and ecology in sparrow diversification.


INTRODUCTION
Behavioral, morphological, and ecological traits have been used historically to reconstruct evolutionary relationships, and many taxonomic groups were originally designated on the basis of shared, homologous characters (e.g., Hamilton, 1962; Storer, 1955; Wolf, 1977). For example, similarities in syringeal and cranial morphology, plumage, nesting behavior, and foraging mode were used to establish generic limits and hypothesize relationships in tyrannid flycatchers, one of the world's largest and most diverse avian radiations (Lanyon, 1984; Lanyon, 1985; Lanyon, 1986; Lanyon, 1988a; Lanyon, 1988b). Likewise, Hamilton (1962) inferred species relationships and the origin of sympatry in the avian genus Vireo by comparing species-specific characteristics of distribution, habitat preference, foraging ecology, and external morphology. While contemporary studies of evolutionary relationships now rely largely on genetic data, studies of trait evolution in a phylogenetic framework continue to shed light on patterns of phenotypic evolution and diversification.
Molecular phylogenies facilitate tests of how traits evolve within clades. For example, mitochondrial DNA (mtDNA) sequences for the Empidonax group of tyrant flycatchers are congruent with morphological, behavioral, and allozymic traits, although some behaviors such as nesting and migratory tendency have stronger phylogenetic signal than others such as foraging mode (Lanyon, 1986; Cicero & Johnson, 2002a). In Icterus orioles, song and plumage evolution are highly labile between species but conserved across the genus as a whole. Furthermore, Icterus songs are more labile than those in closely related oropendolas (Psarocolius, Ocyalus), which tend to have conserved song characteristics (Price, Friedman & Omland, 2007).
The New World sparrows (Passerellidae, formerly Emberizidae) are a large, diverse lineage of songbirds that are well suited for studies of trait evolution in a phylogenetic framework. Evolutionary studies of New World sparrows include analyses based on morphology, plumage, soft-part colors, behavior, egg coloration, allozymes, mitochondrial and nuclear gene sequences, and phylogenomic data from ultraconserved elements (Wolf, 1977; Patten & Fugate, 1998; Carson & Spicer, 2003; DaCosta et al., 2009; Klicka & Spellman, 2007; Klicka et al., 2014; Bryson Jr et al., 2016; Sandoval et al., 2017). Although comparisons among these studies are complicated by differences in taxon and character sampling, together they provide a valuable framework for studying sparrow evolution. Whether behavioral traits are more labile (Blomberg, Garland Jr & Ives, 2003) or conserved (Brumfield et al., 2007) than morphological traits in sparrows remains an open question that deserves study.
One especially interesting group of New World sparrows is the historical genus Aimophila, which has been plagued by taxonomic uncertainty due to extensive morphological variation. Members of this group were united originally by characteristics of the bill, wings, tail, and feet (Swainson, 1837;Baird, 1858), but other ornithologists have long thought that they represent species from distantly related lineages (Ridgway, 1901;Dickey & Van Rossem, 1938;Storer, 1955;Wolf, 1977). Within the past decade, molecular predict that traits are more conserved among species in similar habitats than among those in different habitats.
We extracted genomic DNA from tissue using a modified salt extraction procedure (Miller, Dykes & Polesky, 1988), and PCR-amplified and sequenced four protein-coding mitochondrial genes (cyt-b, ND2, ATPase 8, COI) and three nuclear gene regions (intron 5 of transforming growth factor beta 2 [TGFb2] and beta-fibrinogen [Fib5], recombination activating gene RAG-1) using various combinations of primers (Table S2). We focused on the core ingroup taxa and putative allies for all loci (total of 5,344 bp: 3,495 mtDNA, 1,849 nDNA), and added mtDNA sequences from GenBank to fill out the taxon sampling (Table S1). PCR-amplification and sequencing were generally successful except for a few samples at some loci. We amplified DNA in 25 µL reactions with a mixture of 2 µL dNTPs (2 mM), 2.5 µL BSA (10 mM), 1.5 µL of each primer pair (10 mM), 2.5 µL of buffer (10×) pre-mixed with MgCl2, 0.1 µL of Taq polymerase, 1 µL of DNA, and double-distilled water. Amplification steps included an initial denaturation at 93 °C for 4 min followed by 30-35 cycles of denaturation (93 °C for 30 s), annealing (42-50 °C for 30 s), and extension (72 °C for 45 s), and a final extension at 72 °C for 5 min. Reactions had at least one negative and often a positive control, and we visualized PCR products on agarose gels stained with ethidium bromide. Following amplification, we cleaned the PCR products with Exonuclease I and Shrimp Alkaline Phosphatase (ExoSAP-IT, US Biochemical Corp.), and sequenced the purified products in both directions using Big Dye terminator chemistry v. 3.1 and an AB PRISM 3730 DNA Analyzer (Applied Biosystems). We checked and aligned all sequences using CodonCode Aligner v. 4.0.3 (CodonCode Corporation).

Phylogenetic analyses
We constructed phylogenetic trees using all 80 ingroup and 4 outgroup samples (Table S1) with mitochondrial and nuclear loci by performing Bayesian and maximum-likelihood concatenated analyses alongside species-tree inference. For the concatenated analyses, we first identified the best-performing model of sequence evolution for each locus and codon position for gene regions via Akaike Information Criterion with MrModeltest v. 2.3 (Nylander, 2004). We then constructed Maximum Likelihood (ML) phylogenies using RAxML v7.0.4 (Stamatakis, 2006; Stamatakis, Hoover & Rougemont, 2008), in which we performed 100 iterations of rapid bootstrapping while simultaneously finding the best tree in a single run with a GTR + I + G model of nucleotide substitution for each locus or gene region. We used BEAST v2.5.1 (Drummond & Rambaut, 2007; Drummond et al., 2012; Bouckaert et al., 2014) on the CIPRES Science Gateway (Miller, Pfeiffer & Schwartz, 2010) to conduct concatenated analyses in a Bayesian framework, in which we linked an uncalibrated clock model across loci but applied a separate HKY + I + G model of nucleotide substitution to each locus. We linked the tree prior for all loci and implemented a Yule model of speciation. We selected the Yule model because it is the simplest model of speciation, in which each lineage is assumed to have the same constant speciation rate, and is also appropriate for inferring phylogenies among species rather than among populations within species (https://www.beast2.org). We ran the BEAST analysis for 1 × 10^8 generations while sampling every 1,000 generations. We discarded the first 10% of sampled generations as burn-in, and assessed convergence and mixing by ensuring that ESS scores for each parameter exceeded 300 in Tracer v1.7.5.
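The AIC-based model choice described above can be illustrated with a short Python sketch. This is not the MrModeltest implementation; the function names, log-likelihoods, and parameter counts are hypothetical values chosen only to show how the criterion ranks candidate substitution models.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Return the model name with the lowest AIC.

    fits maps a model name to (log-likelihood, number of free parameters).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits for a single locus; NOT values from this study.
# Parameter counts here cover substitution-model parameters only.
example_fits = {
    "HKY+I+G": (-5230.4, 6),
    "GTR+I+G": (-5210.1, 10),
}

print(best_model(example_fits))
```

In this toy case the richer model wins because its gain in log-likelihood outweighs the penalty for four extra parameters.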
We conducted a species-tree analysis using the *BEAST package within BEAST v2.5.1 (Bouckaert et al., 2014). For this analysis, we implemented a Yule speciation model and a constant population model with estimated population sizes for each gene tree and the resultant species tree. We ran the species-tree analysis for 1 × 10^9 generations and removed the first 10% as burn-in. For both the BEAST and *BEAST analyses, we subsequently generated maximum clade credibility trees from a thinned set of 5,000 trees that was sampled every 20,000 or 200,000 generations, respectively.

Trait reconstructions
We scored 12 trait variables (9 binary and 3 multi-state) for each species (Table S3). Of these, 11 traits were described in detail by Wolf (1977) and we followed his scheme in assigning values as closely as possible. These included range size, typical habitat, plumage "brightness" (hereafter referred to as patterning), completeness of the postjuvenal molt, presence of a prenuptial molt, nest position, timing of skull ossification, group breeding, song structure, duetting, and duet type. We added geographic distribution as an additional trait in order to reconstruct its history in our focal clades. We assigned trait values based primarily on published information, which we took directly from Wolf (1977) for the species that he included, but we had to interpret and standardize definitions for some traits (e.g., range size, plumage patterning, song complexity) and for species not studied by Wolf. We used available audio recordings (Wolf, 1977 LP of audio recordings; Macaulay Library) to characterize song structure.
Binary traits used in trait reconstructions and tests of phylogenetic signal are described as follows: (1) Plumage patterning: Unpatterned species are generally black or tan, but may show small patches of color or clearly delineated markings, such as the facial patterns on Aimophila sumichrasti. Patterned species have large patches of color that differ from the rest of the body.
(2) Postjuvenal molt: This molt is complete in species where individuals molt the entire plumage at this life stage, and incomplete in species where individuals molt only part of the plumage.
(3) Prenuptial molt (also known as prealternate molt, which occurs before breeding in certain birds): This molt is present in some species and absent in others.
(4) Skull ossification: Normal species have fully ossified skulls by the end of the first year. Skull timing is delayed in species where this process takes longer than one year.
(5) Nest position: Ground nesters typically build their nests on the ground. All species that build nests off the ground, regardless of height, are considered to have raised nests.
(6) Group breeding: Species where more than a pair of adults occur together during the breeding season are considered to have groups (Emlen, 1997). For example, Wolf (1977) characterized Aimophila ruficauda as having groups because he observed one female, one adult male, and additional first year males in the same breeding flock. This differs from other species where a single pair occurs on a territory. We scored group breeding as present only if the species frequently or regularly breeds in groups.
(7) Song structure: Song structure determinations followed Wolf (1977), other published reports (e.g., Rodewald, 2015), and examination of sound files (Table S3). Simple songs consist of one to four note types, although the notes may be repeated many times. They include songs with consistent syntax, including those that begin with a few introductory notes followed by a trill. Complex songs include a variable array of frequency-modulated note types and syntactical constructions.
(8) Duetting: Two individuals duet when they time their vocalizations to occur simultaneously or alternately in a predictable manner.
(9) Geographic distribution: Northern Temperate species have breeding ranges in North America. Middle American species breed from Mexico through Panama.
Multi-state traits used in trait reconstructions are described as follows: (10) Range size: We followed Wolf's (1977) characterization of species as having small, medium, or large breeding ranges, which we measured from his published distribution maps as ca. 240 km long at the longest diameter, 240-800 km long, and over 800 km long, respectively. This trait, in combination with distribution, reflects geographic patterns of diversity and environmental tolerance (Stevens, 1989). Some species such as Aimophila notosticta, which is confined to the mountains of central and northern Oaxaca, have much more restricted ranges than other widespread taxa.
(11) Habitat: Arid scrub species coincide with Wolf's (1977) "thorn scrub" category and live mostly in dry environments characterized by low, bushy vegetation. Pine-oak species live in woodlands that may be dominated by pine and/or oak trees. Grassland species live in open environments with predominantly grassy, herbaceous vegetation. Although this character has three states, we converted it to binary for trait correlation analyses. We designated grassland as "open" habitat, and both thorn scrub and pine-oak as "closed" habitat (Boncoraglio & Saino, 2007).
(12) Duet type: We used Wolf's (1977) named duet types to indicate duet structure, and only included character states for species that he coded because these designations are somewhat subjective. Squeal duets have broadband elements that sound like squeals. Chitter and chatter duets have similar brief broadband ticking elements. Aimophila carpalis gives a unique "warbled" (Wolf, 1977) duet.
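The recoding of the three-state habitat character into open versus closed habitat, as described under trait (11), amounts to a simple lookup. The Python sketch below is illustrative only; the dictionary and function names are hypothetical, and the state labels are taken from the definitions above.

```python
# Mapping from the three habitat states to the binary coding used
# for trait correlation tests (grassland = open; others = closed).
HABITAT_TO_BINARY = {
    "grassland": "open",
    "arid scrub": "closed",   # Wolf's (1977) "thorn scrub" category
    "pine-oak": "closed",
}


def binarize_habitat(habitat):
    """Return 'open' or 'closed' for a three-state habitat label."""
    key = habitat.strip().lower()
    if key not in HABITAT_TO_BINARY:
        raise ValueError(f"unknown habitat state: {habitat!r}")
    return HABITAT_TO_BINARY[key]


print(binarize_habitat("Grassland"))
print(binarize_habitat("pine-oak"))
```

Raising an error on unrecognized labels, rather than defaulting to one state, keeps taxa with unknown or ambiguous habitat from being silently assigned to a category.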
We used the BEAST maximum clade credibility tree (ML showed the same topology) with the concatenated mtDNA and nDNA dataset to estimate character transition rates and reconstruct ancestral character states. We reconstructed character states on our tree with all samples as well as in the two clades of Peucaea (colored blue in Fig. 1) and Aimophila plus the closely related genera Melozone and Pipilo (colored pink in Fig. 1). We performed ancestral state reconstructions of our categorical traits using a model-fitting approach that allowed for polymorphic character states within the package corHMM (Beaulieu, Oliver & O'Meara, 2017). Polymorphic character states were assigned likelihoods following the methods of Felsenstein (2004), with each possible character assigned an equal probability. This allowed us to estimate the phenotype of ancestral nodes while incorporating uncertainty in species' phenotypes that were based on missing or incomplete data. We implemented an 'equal rates' model, in which transition rates between any character state were assumed to be equal with an upper bound of 100, while the character state of the root for each group was estimated following differential equations put forth by Maddison, Midford & Otto (2007) and FitzJohn, Maddison & Otto (2009). After estimating the transition rate matrix, we subsequently calculated the marginal likelihood states at each node.
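To make the equal-rates reconstruction above more concrete, the following Python sketch computes marginal state probabilities at a single ancestral node with two descendant tips, one of which is polymorphic. It is a toy illustration, not the corHMM implementation: the function names, the rate, and the branch lengths are hypothetical; the root prior is assumed flat rather than weighted as in the approach cited above; and a polymorphic tip is given a conditional likelihood of 1 for every compatible state, which yields the same marginal probabilities as any other equal weighting because constant factors cancel on normalization.

```python
import math


def er_transition_prob(i, j, rate, t, k=2):
    """Equal-rates (Mk) probability of going from state i to j over time t."""
    decay = math.exp(-k * rate * t)
    if i == j:
        return 1.0 / k + (k - 1.0) / k * decay
    return (1.0 - decay) / k


def tip_vector(compatible_states, k=2):
    """Conditional likelihoods at a tip; polymorphic tips allow several states."""
    return [1.0 if s in compatible_states else 0.0 for s in range(k)]


def node_conditional(children, rate, k=2):
    """Pruning step: children is a list of (likelihood vector, branch length)."""
    cond = []
    for s in range(k):
        value = 1.0
        for child_vec, t in children:
            value *= sum(
                er_transition_prob(s, x, rate, t, k) * child_vec[x]
                for x in range(k)
            )
        cond.append(value)
    return cond


def marginal_states(cond):
    """Normalize conditional likelihoods under a flat prior (an assumption)."""
    total = sum(cond)
    return [c / total for c in cond]


# Hypothetical cherry: tip A is state 0, tip B is polymorphic (0 or 1).
tip_a = tip_vector({0})
tip_b = tip_vector({0, 1})
ancestor = node_conditional([(tip_a, 1.0), (tip_b, 1.0)], rate=0.5)
posterior = marginal_states(ancestor)
print(posterior)
```

Because the polymorphic tip is compatible with both states, it contributes no information here, and the ancestor's marginal probabilities are determined entirely by the unambiguous tip.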
We used Pagel's (1994) correlation method in Mesquite (Maddison & Maddison, 2003) to test Wolf's hypothesis that individual traits vary in association with habitat for Peucaea. In this group, we tested for associations of prenuptial molt, nest location, song structure, and plumage patterning with open and closed habitat. We did not test other traits such as molt or skull ossification because we had no a priori predictions about their relationships, and we lacked the required information for all taxa. Because this test requires binary character states, we did not test non-binary traits. All members of the Aimophila clade live in closed habitat, so within-clade tests for effects of habitat are uninformative; however, we tested for a relationship between song complexity and plumage patterning in that group. We ran tests with 10 extra iterations over 10,000 simulations. Extra iterations implement additional searches within the maximum likelihood framework, and the simulation number is used to estimate statistical significance, with higher numbers above 100 returning better p-value estimates based on simulation output (Maddison & Maddison, 2018). Because the tests of Wolf's specific hypotheses were done on small samples, we followed up on some of the associations they revealed by using the same correlation method to relate song with plumage and habitat use for all species in the tree. We were unable to evaluate additional traits in this way because of missing data across the full tree.
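The role of the simulation count in the significance estimate can be sketched as follows. This is a generic Monte Carlo p-value, given as an assumption about the general approach rather than a description of Mesquite's internal code; the function names and the example statistics are hypothetical.

```python
def simulation_p_value(observed_stat, simulated_stats):
    """Fraction of null simulations at least as extreme as the observed value."""
    if not simulated_stats:
        raise ValueError("need at least one simulated statistic")
    exceed = sum(1 for s in simulated_stats if s >= observed_stat)
    return exceed / len(simulated_stats)


def p_value_resolution(n_simulations):
    """Smallest nonzero p-value that n simulations can report."""
    return 1.0 / n_simulations


# Hypothetical null distribution of a test statistic (not study data).
null_stats = [1.0, 2.0, 3.0, 4.0]
print(simulation_p_value(3.0, null_stats))
print(p_value_resolution(10_000))
```

The second function shows why more simulations give finer p-value estimates: the estimate can only change in steps of one over the number of simulations.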
We examined trait lability among all species in the full tree for a subset of behavioral and morphological traits by calculating the D statistic, which is suitable for binary, categorical traits (Fritz & Purvis, 2010), using the function phylo.d within the caper package in R (Orme, 2018). Binary traits in these analyses included plumage patterning, postjuvenal molt, prenuptial molt, skull ossification, nest position, group breeding, song structure, and duetting. The bounds of the D statistic depend on the number of tips in the phylogenetic comparative analysis, but in general, more negative values imply stronger phylogenetic signal (Fritz & Purvis, 2010). The D statistic is calculated by comparing the sum of observed sister-clade differences in the evolutionary history of the binary trait (d_obs) to simulated data sets of sister-clade differences generated by randomly shuffling the tip values of the phylogeny (d_r) and another simulated data set generated by Brownian motion (d_b). Thus, D is comparable across data sets such that when D is equal to 1, the binary trait in question has a phylogenetically random distribution across the tips of the phylogeny. In contrast, when D is equal to 0, the distribution of binary values across the tips is equal to that expected under Brownian motion (Fritz & Purvis, 2010). Furthermore, values of D can fall outside of the range of 0 to 1, such that negative values indicate phylogenetic conservatism beyond that expected by Brownian motion, while values greater than 1 indicate phylogenetic dispersion beyond that expected by random shuffling of tip values (Fritz & Purvis, 2010). This method also allows one to calculate two separate one-tailed probabilities (i.e., p values) that the observed D statistic is greater than 0 and less than 1.
For each trait, we omitted taxa with unknown or ambiguous character states.
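The scaling of the D statistic described above can be written as D = (d_obs - mean(d_b)) / (mean(d_r) - mean(d_b)), following Fritz & Purvis (2010). The Python sketch below illustrates that scaling and the two one-tailed probabilities. It is not the caper code: the function names and input values are hypothetical, and the strict inequalities used for the probabilities are an assumption about how ties are handled.

```python
from statistics import mean


def d_statistic(d_obs, d_random, d_brownian):
    """Scale observed sister-clade differences between the two null means."""
    mean_r = mean(d_random)
    mean_b = mean(d_brownian)
    return (d_obs - mean_b) / (mean_r - mean_b)


def one_tailed_probs(d_obs, d_random, d_brownian):
    """Return (P for D < 1, P for D > 0) from the two simulated null sets."""
    # Share of random-shuffle simulations that are more clumped than observed.
    p_less_than_1 = sum(1 for d in d_random if d < d_obs) / len(d_random)
    # Share of Brownian simulations that are less clumped than observed.
    p_greater_than_0 = sum(1 for d in d_brownian if d > d_obs) / len(d_brownian)
    return p_less_than_1, p_greater_than_0


# Hypothetical sums of sister-clade differences (not study data).
random_sims = [10.0, 11.0, 9.0, 10.0]
brownian_sims = [2.0, 3.0, 1.0, 2.0]

print(d_statistic(10.0, random_sims, brownian_sims))  # matches random mean
print(d_statistic(2.0, random_sims, brownian_sims))   # matches Brownian mean
print(d_statistic(0.5, random_sims, brownian_sims))   # more conserved than Brownian
```

An observed sum equal to the mean of the shuffled data gives D = 1, one equal to the Brownian mean gives D = 0, and a smaller sum gives a negative D, matching the interpretation in the text.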

Sequence variation
The complete data set of 84 individuals from 47 species and up to 5,344 bp of sequence contained 1,740 variable (32.6%) and 1,546 (28.9%) potentially parsimony-informative sites. The two clades for which we reconstructed character states had 1,324 (24.8%) variable and 1,188 (22.2%) parsimony-informative sites. Average nucleotide composition for the mitochondrial genes cyt-b and ND2 was similar to values reported in previous studies of this group and related taxa (Klicka & Spellman, 2007; DaCosta et al., 2009), with an excess of cytosine (36%) and a deficiency of guanine (10-13%). Average uncorrected sequence distances among core taxa for the mitochondrial gene regions were 11% in Peucaea (6.6-14.8%) and 4.9% in Aimophila (3.7-6.1%). The mean distance between Aimophila and the closely related genera Melozone and Pipilo was 9.1% (range of 7.7% to 11.5%).
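An uncorrected sequence distance of the kind reported above is the proportion of aligned sites at which two sequences differ. The Python sketch below shows one way to compute it. The function name and the toy sequences are hypothetical, and skipping sites with gaps or ambiguity codes is an assumption, since the handling of such sites is not stated here.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected distance: differing sites / compared sites.

    Sites where either sequence has a gap or ambiguity code are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned sequences (not study data): 1 difference in 10 sites.
print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```

Averaging this quantity over all pairs of taxa within a clade gives a mean within-clade distance comparable in form to the percentages quoted above.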

Phylogeny
Maximum likelihood (Fig. S1) and Bayesian methods (Fig. 1) of phylogenetic reconstruction produced similar phylogenetic hypotheses, with the strongest support obtained for the concatenated analysis of mtDNA and nuclear sequence data (Fig. 1). With the exception of three genes (ATPase 8, Fib5, TGFb2), the best model was GTR + I + G for the data partitioned by loci, mtDNA partitioned by codon position, and combined mtDNA and nDNA sequences. With all samples combined, taxa grouped into two lineages that received high to moderate support in the phylogenetic analyses. The first lineage included Peucaea, Rhynchospiza, Arremonops, and Ammodramus. Within that lineage, the eight species of Peucaea formed a monophyletic group that was strongly supported and distinct from Rhynchospiza and the other genera. The second lineage included species in multiple genera, with a strongly supported clade that united species retained in Aimophila with species of Melozone and Pipilo. The species quinquestriata was sister to Amphispiza bilineata in a lineage that included Chondestes and Spizella, and those taxa were distant from the clade containing Aimophila. The species tree analyses generated a phylogeny that was concordant with the concatenated approaches and many of the same relationships were recovered (Fig. S2). However, the resultant species tree did not have strong posterior probability values for the large majority of nodes, which likely reflects the relatively small number of loci and the small number of individuals per species used in the coalescent-based species tree analysis (Camargo et al., 2012; Fig. S2). A species tree constructed with many more loci also was not able to resolve all relationships within the family (Bryson Jr et al., 2016).

Trait reconstructions: Peucaea and Aimophila clades
Ancestral state reconstructions (Fig. 2 through Fig. 5) show that both the Peucaea and Aimophila clades originated in Middle America (Fig. 2), with some members of each clade shifting their ranges northward into the Northern Temperate zone. Aimophila species descended from a common ancestor that is predicted to have a large geographic range and a preference for pine-oak (closed) habitat (Figs. 2 and 3). We were unable to reconstruct the geographic range and habitat preference of ancestral Peucaea species unequivocally. Molt patterns, plumage patterning, and timing of skull ossification showed different histories in the two clades. The ancestral species in both clades had partial postjuvenal molts, but they differed in the presence (Peucaea) or absence (Aimophila) of a prenuptial molt (Fig. 4). Prenuptial molt has been lost once in Peucaea, and gained twice within the broader Aimophila clade. Evolutionary patterns of plumage coloration likewise differed between clades (Fig. 4). The ancestral Peucaea had unpatterned plumage, and there has been a single transition to patterned coloration in one descendant lineage. In contrast, the Aimophila clade shows more uncertainty, with multiple probable transitions between unpatterned and patterned plumage. While the ancestral Aimophila species had normal skull ossification timing, the skull timing of the Peucaea ancestor is uncertain and there is diversity in this trait among modern lineages (Table S3). Three Peucaea species form a clade with normal skull timing, three species form a clade with delayed skull timing, and a third clade is split with one species in each category.
Ancestral state reconstructions of behavioral traits also showed different patterns. Most species in the two clades live in pairs and do not form larger social groups (Table S3). The only exceptions are P. ruficauda and P. humeralis. Because their close relative P. mystacalis does not form groups, the presence of groups in P. ruficauda and P. humeralis may represent separate gains of the trait or a single gain with a subsequent loss of the trait in P. mystacalis. The ancestral nest type for Peucaea is a raised nest (Fig. 3), while the ancestral nest type for Aimophila is equivocal. However, members of both clades use both nest locations. Simple songs are the ancestral condition in both clades, with complex songs evolving once among the Peucaea group and twice among the Aimophila group (Fig. 5). Many members of both clades produce vocal duets (Fig. 5). Duetting clearly represents an ancestral condition among Peucaea species that is highly conserved, while duets have been lost at least twice within the Aimophila clade (A. notosticta, Pipilo). Furthermore, duet type shows phylogenetic conservatism in acoustic structure (Fig. 5). Peucaea species all sing rapidly modulated "chitter", "chatter", or "warble" duets, while all members of the Aimophila group with well-described duets produce broadband "squeal" duets.

Trait correlations
Pagel's (1994) correlation tests showed that preference for closed habitat is correlated with patterned plumage (p = 0.011) and simple songs (p = 0.010) in the Peucaea clade. In contrast, open habitat preference is correlated with unpatterned plumage (p = 0.0069) and complex songs (p = 0.011), as well as with ground nesting (p = 0.010), in this clade. Open habitat use is not correlated with prenuptial molt (p = 0.11). All species in the Aimophila clade occur in closed habitat, where they exhibit a negative association between vocal and visual signals such that simple song is correlated with patterned plumage (p = 0.021). Across our full tree (Fig. 1), unpatterned coloration is correlated with transitions into open habitats (p = 0.026), mirroring the results within our two focal clades. Song structure did not correlate with transitions to or from open habitat (p = 0.746). Plumage patterning correlated with song complexity such that patterned birds tended to have simpler songs (p = 0.045) across all species in our tree.

Measures of trait lability
To examine the lability of behavioral and morphological traits among sparrows through time, we estimated character state changes for eight traits using our full tree that included a broader sampling of our two focal clades and related taxa without missing data (Table 1).

Phylogenetic relationships of the Peucaea and Aimophila clades
We found similarities and differences from prior phylogenies of New World Sparrows (DaCosta et al., 2009; Klicka et al., 2014; Bryson Jr et al., 2016; Sandoval et al., 2017). Overall, our results support division of the former "Aimophila" into Peucaea, Rhynchospiza, and Aimophila, but the phylogenetic details differ. For one, we found Peucaea carpalis and P. sumichrasti to be sister to the remaining Peucaea with over 95% posterior probability (PP) support in the concatenated analysis (Fig. 1), while DaCosta et al. (2009) could not resolve this relationship; however, support was lower in our species tree (Fig. S2) and in the maximum likelihood tree of Klicka et al. (2014). Another difference was in the clade containing Aimophila rufescens, A. ruficeps, and A. notosticta. While DaCosta et al. (2009) and Klicka et al. (2014) found strong support for a sister relationship between A. notosticta and A. ruficeps based on mtDNA when all three taxa were included, we recovered a sister relationship between A. ruficeps and A. rufescens using both mtDNA and nuclear markers in our concatenated analysis (PP > 0.95). Our species tree analysis, on the other hand, was unable to resolve the relationships between these three taxa. The different studies all supported a sister relationship between Aimophila, Melozone, and Pipilo, although Sandoval et al. (2017) did not recover monophyly within Melozone (i.e., some species are more closely related to Aimophila than other congeners) with more intensive sampling of that genus. We also confirmed that quinquestriata is the sister to Amphispiza bilineata, although these taxa are separated by a deep branch and both are distantly related to both "Aimophila" and Artemisiospiza (formerly Amphispiza) belli; none of the prior studies included all three taxa in their analyses. Finally, we found Peucaea and relatives to be sister to other sparrows sampled, while Bryson Jr et al. (2016) found the Amphispiza lineage to be sister to other sparrows, including Peucaea, based on UCE sequence data. Together, these studies offer a compelling overview of species relationships among Aimophila, Peucaea, and related sparrow taxa, although additional work is needed to resolve some relationships. Furthermore, they clarify relationships in the three ecological complexes that Wolf (1977) defined, including support for a close affinity between the Aimophila ruficeps complex and species in the genus Melozone.

Table 1. Estimates of phylogenetic signal and the sum of sister-clade differences in binary behavioral and morphological traits using the phylogeny depicted in Figure 1. The D statistic indicates the amount of phylogenetic signal present in the binary trait. When D = 0, the phylogenetic signal of a given trait is equal to Brownian motion. When D = 1, trait evolution is random with respect to phylogeny. Thus, more negative D values indicate stronger phylogenetic signal and fewer changes between sister clades, while higher D values indicate less signal and more changes between sister clades. Values in the P(D > 0) column indicate the probability that trait evolution exhibits less phylogenetic signal compared to a null distribution of values under Brownian motion. Values in the P(D < 1) column indicate the probability that trait evolution exhibits more phylogenetic signal compared to a null distribution of values when trait evolution is random with respect to phylogeny. Each null distribution was generated with 1,000 permutations.

Trait evolution within the Aimophila and Peucaea clades
All species in the Aimophila and Peucaea clades have Middle American ancestors. The ancestor of the Aimophila clade had a large range size, but range size was equivocal in the Peucaea clade and reflected high variability among those species. Wolf (1977) noted that species in this clade had ranges centered around Mexico, with possible Middle America origins, and pointed out that closely related species varied in range size. Our analyses support these ideas and highlight the variability in range location and size within the group. Four of the eight Peucaea species have expanded (3) or moved (1) their ranges from ancestral Middle America to Northern Temperate locations. Six of the twelve Aimophila/Melozone/Pipilo species also have expanded (4) or moved (2) their ranges into Northern Temperate regions. Anecdotally, none of the species that showed range shifts are long-distance migrants, but northern temperate species tend to have larger ranges (Howell & Webb, 1995). These results fit with recent work showing that the common ancestor of all species in Passerellidae was likely a tropical endemic (Winger, Barker & Ree, 2014). The findings also support Rapoport's rule, which states that high-latitude species tend to have larger ranges than low-latitude species (Stevens, 1989; Cicero & Johnson, 2002b).
Ancestors of Aimophila and Peucaea sang simple songs and formed pair bonds. Subsequently, group living evolved only in Peucaea humeralis and P. ruficauda, while complex songs evolved three times and are now present in seven of our modern focal species (Fig. 5). Wolf (1977) used song and duet similarity as a justification for grouping species together, and our phylogeny supports those groupings while confirming that shifts in song form occur primarily between but not within groups. Likewise, Marshall (1964) concluded that voice is a good predictor of relationships within the "brown towhee" complex (Melozone fusca, M. crissalis, M. aberti, M. albicollis), especially when used with other attributes. Song structure is known to vary widely across avian species, and other work has shown that song traits may be both conserved and divergent within and among groups (Price & Lanyon, 2002; Price, Friedman & Omland, 2007; Snyder & Creanza, 2019). Importantly, Wolf's (and hence our) divisions of songs into "simple" and "complex" reflect only two potential measures of complexity: syllable type diversity and syntax. Because we followed Wolf's trait assignments, these categories are qualitative. More detailed and quantitative song-form analyses would be a valuable follow-up to this work, and might show that elements of song complexity are differentially conserved or labile through evolutionary time (Benedict & Najar, 2019).
Ancestral habitat use and nesting behavior varied between clades, as did skull ossification timing, molt patterns, and plumage. The Aimophila common ancestor might have had patterned plumage, while the Peucaea ancestor was likely unpatterned. Modern species in both groups show a range of plumage patterns, which appear to be relatively labile, suggesting that color patterning can both appear and disappear. Similar trends have been found in other avian species and across birds more generally (Price, Friedman & Omland, 2007;Hofmann, Cronin & Omland, 2008;Dunn, Armenta & Whittingham, 2015;Maia, Rubenstein & Shawkey, 2016;Shultz & Burns, 2017;Marcondes & Brumfield, 2019), showing that evolution may favor elaborate plumage or drabness depending on selective pressures. In addition, there appears to be a negative association between plumage patterning and song complexity, both within our focal clades and across our full phylogeny. Two lineages that contain species with complex songs (Peucaea cassinii-P. aestivalis-P. botteri and Aimophila rufescens-A. ruficeps-A. notosticta) are characterized by unpatterned plumage, while species in other lineages with simple songs (e.g., Peucaea mystacalis, P. humeralis, P. ruficauda) have patterned plumage. Other studies on the evolution of plumage and song complexity in birds have shown that some groups (e.g., cardueline finches; Badyaev, Hill & Weckworth, 2002) exhibit a similar trade-off, whereas other groups (e.g., tanagers; Mason, Shultz & Burns, 2014) do not show a correlation between song and plumage elaboration. Such mixed results suggest that the relationship between song and plumage likely depends on a variety of factors, which may include physiological processes (Shutler, 2010) or ecological interactions.
Song complexity may be greater in open versus densely vegetated habitats because of the acoustic properties of those habitats (Morton, 1975;Wolf, 1977;Wiley, 1991;Derryberry, 2009;Mason & Burns, 2015;Derryberry et al., 2018;Crouch & Mason-Gamer, 2019; but see Karin et al., 2018;Hill, Pawley & Ji, 2017). Within the Aimophila and Peucaea clades, we found that complex songs are significantly associated with open grassland habitat, and simple songs are associated with closed (arid scrub or pine-oak) habitat. Such a relationship may result from habitat structure, but might also arise because more grassland species (Peucaea botteri, P. cassinii, P. aestivalis) occur in Northern Temperate latitudes where they experience higher environmental variability, which is known to influence bird song complexity (Medina & Francis, 2012; but see Najar & Benedict, 2019). We did not, however, recover the same relationship when all species were included. Therefore, we have tentative support for Wolf's (1977) hypothesis that habitat drives song features within the focal clades, but his observed trend is not universal. It is possible that the observed correlations between habitat and song within the Aimophila and Peucaea clades result from small sample sizes, because a small number of trait transitions drive these correlations (Maddison & FitzJohn, 2015).
Color evolution is often driven by habitat type, with natural selection favoring certain colors, patterns, or lack of patterning (Dunn, Armenta & Whittingham, 2015;Shultz & Burns, 2013;Marcondes & Brumfield, 2019;Miller et al., 2019). However, a global analysis showed that habitat does not predict plumage patterns across birds as a whole (Somveille, Marshall & Gluckman, 2016). We found that unpatterned plumage correlated with open grassland habitat among members of the Aimophila and Peucaea clades, as well as when trait correlation analyses were run using the full tree. Thus, unlike Wolf's (1977) ideas about the influence of habitat on song, his hypotheses regarding habitat and plumage evolution appear to apply broadly within the Passerellidae. Unpatterned coloration can be advantageous for crypsis in open grassland habitats (Hill & McGraw, 2006). Our findings, along with studies of other specific groups such as woodpeckers (Miller et al., 2019) and ovenbirds (Marcondes & Brumfield, 2019), suggest that the influence of habitat on plumage patterning may be clade-specific.

Lability versus stability of behavioral and morphological traits
Although behavioral traits are expected to be more labile than morphological traits (Blomberg, Garland Jr & Ives, 2003), we found that the behavioral traits identified by Wolf (1977) exhibited stronger phylogenetic signal across our full tree than the morphological traits (Revell, Harmon & Collar, 2008). In particular, prenuptial molt and plumage patterning showed low phylogenetic signal and high lability. This result is counterintuitive for prenuptial molt, because molt strategies in birds are integral to their life history (e.g., Terrill, 2017;Terrill, 2018) and are not predicted to be highly labile. In contrast, concordant with our findings, studies on diverse taxa have shown that plumage patterning is generally quite labile across avian clades (Omland & Lanyon, 2000). Lability in this trait is associated with a variety of biotic and abiotic attributes, such as variation in mating systems (Møller & Birkhead, 1994;Price & Whalen, 2009) and light environments (Shultz & Burns, 2013;Marcondes & Brumfield, 2019). The species studied here all have similar monogamous mating systems, but patterning was correlated with habitat across Passerellidae, providing a potential selective factor shaping patterning. Future work studying this variability would be informative.
Song structure, duetting, nest location, group breeding, skull ossification, and postjuvenal molt are all traits with strong phylogenetic signals. The most highly conserved trait was duetting, which was frequent across the tree but had few evolutionary origins. Both song structure and duet type tended to be conserved, such that close relatives used similar sounds. Complex song is often attributed to sexual selection (Andersson, 1994), while duetting is associated with pair-bond maintenance and territory defense (Logue & Hall, 2014). For song structure, the phylogenetic signal in our focal clades came primarily from the derivation and maintenance of complex song in two lineages (Fig. 5). Conservation of complex song is sometimes found in other groups (Price & Lanyon, 2002;Tietze et al., 2015; but see Price, Friedman & Omland, 2007). For this study, we followed Wolf (1977) in defining song complexity based on the number and variety of note types in the species-typical song. Although debate exists about what metrics of song best describe "complexity" (Pearse et al., 2018;Najar & Benedict, 2019;Benedict & Najar, 2019), increased complexity reflects higher syllable diversity in the species we studied and is conserved in related lineages. This result might suggest that closely related species are under similar selective pressures for maintenance of song structure, potentially relating to visual signaling or habitat as discussed above (Panhuis et al., 2001;Boncoraglio & Saino, 2007).
Duet vocalizations are derived and maintained in many of the focal species in our study. Avian duets have been shown to perform a range of functions, including joint resource defense, mate defense, and pair coordination (Hall, 2009;Dahlin & Benedict, 2014). Work on the genera Melozone and Peucaea has demonstrated that duets of different species have similar functions in resource defense, providing a possible selective pressure maintaining this trait (Benedict, 2010;Sandoval, Méndez & Mennill, 2013;Illes, 2015;Sandoval, Juárez & Villarreal, 2018). Similarly, studies of other New World avian clades have shown that vocal duet presence and form are often evolutionarily conserved (Mann et al., 2009;Mitchell et al., 2019). This pattern is likely driven by life-history traits such as monogamy, territoriality, and sedentariness, which are shown by many of the species included in our analysis (Benedict, 2008;Logue & Hall, 2014). Most strikingly, duet type (Fig. 5) in addition to duet presence is conserved, as noted by Wolf (1977). Our focal species therefore provide a valuable system for future analyses examining how territorial behavior throughout the year and the length of pair bonds might promote evolutionary stability in behavioral traits. Overall, the strong phylogenetic signal found for vocal traits and other behaviors, including nest location and group breeding, counters a general assumption that behavioral traits are more labile than morphological traits (Blomberg, Garland Jr & Ives, 2003).

CONCLUSIONS
Our study elucidated relationships among New World sparrows and showed that behavioral traits such as vocal duetting and nest placement can exhibit stronger phylogenetic signal than morphological traits. Habitat appears to be an important driver of trait evolution within Aimophila and Peucaea, but its influence is not consistent within the Passerellidae. While habitat does not predict song evolution reliably across New World sparrows, the correlations of unpatterned plumage with open habitats and complex songs do hold broadly in sparrows. These outcomes suggest that New World sparrows provide a fertile testing ground for future studies of avian trait evolution.