Genomic divergence in allopatric Northern Cardinals of the North American warm deserts is linked to behavioral differentiation

Abstract Biogeographic barriers are considered important in initiating speciation through geographic isolation, but they rarely indiscriminately and completely reduce gene flow across entire communities. Explicitly demonstrating which factors are associated with gene‐flow levels across barriers would help elucidate how speciation is initiated and isolation maintained. Here, we investigated the association of behavioral isolation on population differentiation in Northern Cardinals (Cardinalis cardinalis) distributed across the Cochise Filter Barrier, a region of transitional habitat which separates the Sonoran and Chihuahuan deserts of North America. Using genomewide markers, we modeled demographic history by fitting the data to isolation and isolation‐with‐migration models. The best‐fit model indicated that desert populations diverged in the Pleistocene with low, historic, and asymmetric gene flow across the barrier. We then tested behavioral isolation using reciprocal call‐broadcast experiments to compare song recognition between deserts, controlling for song dialect changes within deserts. We found that male Northern Cardinals in both deserts were most aggressive to local songs and failed to recognize across‐barrier songs. A correlation of genomic differentiation and strong song discrimination is consistent with a model where speciation is initiated across a barrier and maintained by behavioral isolation.

While the pattern of isolation by barriers is well known, the attributes that allow organisms to pass through biogeographic filters are comparatively understudied. The meeting of taxa at distinct geographic areas of secondary contact, or suture zones, has been documented across North America and other regions (Remington, 1968;Swenson, 2006;Swenson & Howard, 2004, 2005. While suture zones have shown that gene flow can occur in some taxa (Remington, 1968), they may also provide insight into why other taxa do not show introgression during secondary contact. Possible factors include variations in dispersal ability, differences in niche breadth or preferences, pre-and post-zygotic reproductive barriers, or a combination of these. Understanding the importance of any of these mechanisms in preventing gene flow will clarify the link between the genetic structuring of populations and the subsequent initiation of speciation.
The Cochise Filter Barrier, which is a geological and ecological formation separating the Sonoran and Chihuahuan deserts in the southwestern United States and northern Mexico, is one example of the heterogeneous effects that filter barriers can have on the surrounding biota. The barrier formed during the uplift of the Sierra Madre Occidental and the Pliocene and Pleistocene glacial cycles (Morafka, 1977). Pleistocene glacial-interglacial cycles caused the Sonoran and Chihuahuan deserts to expand and contract repeatedly, alternately connecting the deserts via an arid corridor and separating the deserts with woodlands during colder glacial periods (Van Devender, 1990;Van Devender, Betancourt, & Wimberly, 1984). Under contemporary climatic conditions, the deserts are connected by a corridor of xerophytic vegetation (Holmgren, Norris, & Betancourt, 2007;Van Devender, 1990;Van Devender et al., 1984).
In birds, a particularly salient prezygotic reproductive barrier comes in the form of song. Male songbirds sing to defend their territories and attract mates and behave aggressively toward conspecific songs and intruders (Catchpole & Slater, 1995;Gill & Lanyon, 1964). Juveniles learn their songs from nearby singing adults (e.g., Jenkins, 1978) and thus slight variations in dialect are retained in very localized areas as a consequence of low dispersal (Lanyon, 1979;Lemon, 1975;Marler, 1997;Slater, 1989). Females can discriminate against song types and dialects to choose mates (O'Loghlen & Rothstein, 1995;West, King, & Eastzer, 1981).
Divergence in male traits may be associated with speciation, either directly through male-male competition or as mediated by female choice (Burdfield-Steel & Shuker, 2018;Tinghitella et al., 2018;Uy, Irwin, & Webster, 2018). Assessing male-male competition is more tractable than female choice in experiments on wild birds because males more readily respond to experimental stimuli. Most studies done to date assumed that males and females behave similarly to each other in terms of response to male song (e.g., Derryberry, 2011;Dingle, Poelstra, Halfwerk, Brinkhuizen, & Slabbekoorn, 2010). The few studies that have assessed both sexes have found no evidence that males are more discriminating than females, supporting such an assumption (Uy et al., 2018). Thus, to obtain larger samples sizes in order to tease apart the effects of ancestry and geography on song discrimination, we focused on male responses in this study.
If males on either side of a barrier sing different songs, females may not recognize a novel song as a reproductive signal, reducing interactions between populations and preventing successful gene flow (Hunt, Breuker, Sadowski, & Moore, 2009;Lipshutz, 2017b;Searcy & Andersson, 1986). Populations would thus become isolated, and differentiation would be maintained by behavioral isolation. Like other songbirds, Northern Cardinal males are generally less aggressive in response to unfamiliar songs, and the species is sensitive to dialect changes that can occur over tens of kilometers, responding with decreased levels of aggression to more distant dialects (Anderson & Conner, 1985;Lemon, 1966Lemon, , 1974. Given this sensitivity, Northern Cardinals are likely to use song recognition as a means of species recognition, making them a candidate for investigating the relationship between genetic connectivity and song divergence. At present, however, the impact of song variation across the deserts has not been examined with respect to the impact of dialect changes due to geography. Such an assessment would disentangle the roles of dialect changes due to geographic distance versus dialect changes due to allopatry across a barrier, sexual selection, or reproductive character displacement.
The Northern Cardinals in the Sonoran and Chihuahuan deserts are a tractable model for testing these changes as they are currently allopatric without a known contact zone. Northern Cardinals have a fragmented distribution across the Cochise Filter Barrier, being separated by a gap of ~200 km that corresponds to the elevational and environmental change of the Cochise Filter Barrier. Due to this, there should be no contemporary impact on the dialects in this region either from song learning or from reinforcement. Thus, this system allows for the study of the early stages of speciation that are unbiased by ongoing secondary contact, and our approach controls for dialect changes over large distances without relying on contact zones as proxies for connectivity.
Prior work has shown that this species shows phenotypic (Ridgway, 1901) and genetic (Smith & Klicka, 2013;Smith et al., 2011) differentiation across the barrier. At present, however, the amount of gene flow that occurs across the barrier, either currently or historically, has not been quantified. There are multiple potential factors that could have led to the separation of these lineages. From a pure dispersal standpoint, Northern Cardinals should readily be able to cross this region over evolutionary time assuming no environmental or behavioral barriers. The Northern Cardinal has undergone dramatic contemporary range expansions into the northern United States and Canada and there are numerous records of individuals that have dispersed a similar distance or greater across unsuitable habitat (Halkin & Linville, 1999). Further, vagrant Northern Cardinals are regularly observed well outside their species' resident distribution (Sullivan et al., 2009). Reconstruction of Pleistocene species distribution models also indicates that there were more suitable climatic conditions across the Cochise Filter Barrier during glacial cycles (Smith et al., 2011). Thus, it is unlikely that the observed differentiation across the Cochise Filter Barrier is due solely to limitations on this species' dispersal capabilities. Instead, it seems more likely that other traits have impacted divergence in this species.
How focal populations respond to vocal dialects is expected to be linked to the magnitude and direction of gene flow across the Cochise Filter Barrier. Individuals who successfully migrate should exchange their genetic material and dialect with the local population.
When no gene flow occurs across the barrier (i.e., pure isolation), both populations should respond aggressively to their own dialect and should ignore the other desert's dialect. Likewise, if gene flow occurs equally across the barrier in both directions (i.e., isolation with symmetric gene flow), then focal populations should respond equally aggressively to both their own dialect and to the other desert population's dialect. When gene flow is biased in one direction (i.e., isolation with asymmetric or unidirectional gene flow), one population is exposed to both dialects while the other is exposed only to their own. Because of this, the population receiving more migrants should respond aggressively to both dialects. However, the population that receives fewer migrants should respond less aggressively to the foreign dialect, or even ignore it. Populations that have come into secondary contact are predicted to show equal aggression to dialects if focal populations are tested within the contact zone, and ignore foreign songs outside of the secondary contact zone.
Here, we tease apart these complex scenarios by integrating demographic modeling of genomewide genetic variation and fieldbased experiments to test how a barrier regulates speciation. First, we characterized population structure and fit genomic data to pure isolation and isolation-with-migration models. From these analyses, we inferred the depth of divergence and timing of gene flow across the barrier. Second, we performed call-broadcast experiments in each desert to assess male aggression to local and non-local songs.
If song discrimination is a reproductive filter, then we predict that isolation and the extent of gene flow will be correlated with male aggression to non-local songs (Figure 1). By combining genomic estimates of isolation and introgression with responses of wild birds to song differences involved in mate choice, we explored whether behavioral isolation helps regulate gene flow across filter barriers.

| Collection of genomic data
We sequenced genomewide markers from vouchered genetic samples collected east and west of the Cochise Filter Barrier (Figure 2; Table A1). Northern Cardinals are sparse in the region of the barrier itself and as such we lack sampling there (Sullivan et al., 2009). All of the western samples occur in the Sonoran Desert (N = 54) while the eastern samples include the Chihuahuan Desert and adjacent areas in New Mexico and Texas (N = 31). For simplicity, we designated western individuals the Sonoran population and eastern individuals the Chihuahuan population, though they include individuals outside of the deserts proper. These correspond to the igneus and F I G U R E 1 Illustration of hypothesized relationships between gene flow and song (left panel) and geographic distance and song (right panel). If song acts as a reproductive filter (solid black line), male aggression to non-local songs should be high if gene flow between their populations is high (and thus if genetic isolation is low). However, if song discrimination does not act as a reproductive filter (dashed gray line), there should be no correlation between gene flow among populations (or genetic isolation) and aggression. Irrespective of whether song acts as a filter, populations that are at larger geographic distances should show lower aggression to songs due to dialect changes Geographic distance Gene flow Aggression to songs Aggression to songs Song is not a filter cardinalis lineages identified in previous work (Ridgway, 1901;Smith & Klicka, 2013;Smith et al., 2011). We also included three individuals of C. cardinalis carneus from the Pacific Coast of Mexico and one of C. sinuatus as outgroups.
We used a Qiagen DNeasy blood and tissue kit to isolate pure genomic DNA for later sequencing following the built-in protocol with minor modifications. Briefly, in the final elution step, we added two elutions of 200 μl of water to ensure all DNA had been removed from the filter, then concentrated the elution into a volume of 32 μl.
We quantified the concentration of the extracted DNA using Qubit We characterized genetic structure with a STRUCTURE analysis (Hubisz, Falush, Stephens, & Pritchard, 2009;Pritchard, Stephens, & Donnelly, 2000), running five runs each of clusters K = 1 to K = 5 for 100,000 generations of burn-in and 500,000 generations of run time. Note that for the ingroup-only dataset, we ran 10 runs each instead of five runs each. This technique assigns individuals to a specified number of K-clusters and outputs the log-likelihood of the K-cluster in question. These runs were automated with the program StrAuto 0.3.1 (Chhatre & Emerson, 2017). We evaluated the best K value using both the Evanno, Regnaut, and Goudet (2005) method and the Puechmaille (2016) method, as implemented in StructureSelector (Li & Liu, 2018). For the Puechmaille (2016) method, we used a mean membership threshold value of 0.5 after testing values between 0.5 and 0.9 (see Appendix 1 for those results).

| Demographic modeling
We modeled demographic history (N E = effective population size, T = time of divergence, and m = gene flow between populations) and performed model selection using fastsimcoal2 version 2.52.21 (Excoffier & Foll, 2011;Excoffier, Dupanloup, Huerta-Sánchez, & Foll, 2013). Using SNP data in variant call format (VCF) from PyRAD, we generated an unfolded joint site frequency spectrum (SFS) using ∂a∂i (Gutenkunst, Hernandez, Williamson, & Bustamante, 2009) and simulated from it in fastsimcoal2, using a custom script avail-  Nam et al., 2010) and generation time of 1 year based on the year of first breeding of the species (Halkin & Linville, 1999). We tested six demographic models ( Figure  For each model, we ran 25 iterations of 100,000 simulations on 100 parameter sets, selected the iteration with the highest estimated maximum likelihood, and chose the best model by comparing Akaike information criterion (AIC) scores. We considered a model that improved the AIC score (∆AIC) by 2.0 to be a significant improvement, with an improvement of 10.0 or more highly significant.
We then chose the best model and ran 100 bootstraps (100,000 simulations of 100 parameter sets) to calculate mean and 95% confidence interval estimates for effective population sizes, gene flow rates, time of divergence, and time of secondary contact.
The distribution and sampling ranges for model parameters were as follows (distribution; range): effective population sizes (log-uniform distribution; 50,000-1,000,000 haploid individuals), migration (log-uniform distribution; 0.001-20 individuals per generation), the Sonoran-Chihuahuan divergence time (uniform distribution; 500,000-1,000,000 years), and the time of secondary contact (uniform distribution; 0-21,000 years). Note that in fastsimcoal2, upper bounds on the ranges are soft boundaries and parameter values higher (but not lower) than these bounds can be estimated. Finally, we fixed the split between the Sonoran-Chihuahuan clade and C. c.
carneus to 2,000,000 years (Smith et al., 2011) to calibrate the parameters into absolute values.

| Testing behavioral isolation across the Cochise Filter Barrier
We used a playback (call-broadcast) experiment to examine the response of males to simulated intruders (Peters, Searcy, & Marler, 1980;Derryberry, 2007;Derryberry, 2011 Figure 3). Local recordings came from the same population whose responses we measured (i.e., the focal population). Distant recordings came from the same desert as the focal population, but from a large geographic distance (~450-625 km).

Across-Barrier recordings came from the other desert lineage, which
were necessarily at a large geographic distance. Distant songs and Across-Barrier songs should have been novel to the Local population for both desert populations (Lemon, 1975). Control recordings were that of a Cactus Wren (Campylorhynchus brunneicapillus), which is a distantly related bird common in both deserts. We chose these recordings to compare the effects of distance and presumed genetic relatedness on a population's response to a recording. bits. For each recording, we created a stimulus set of approximately 3-min duration (range: 2:59-3:10 min). Each stimulus set included three 50-s bouts of song, with 10-s periods of silence following each bout. We added these silent periods to mimic multiple singing bouts as Northern Cardinals in these areas rarely sang continuously for 3 min (see Appendix 2). We used 5-7 stimulus sets for each treatment except for the Control, in which we had one single Cactus Wren recording. We used an HTC6500LVW cell phone and an HMDX Jam Bluetooth speaker (model HX-P230GRA) with a ~1-m-long 3.5-mm audio cable to play the stimulus sets. We placed the speaker on a tripod ~1 m off the ground. Stimulus sets were standardized to 80-85 dB as measured 1 m away from the speaker. We broadcasted multiple stimulus sets per recording locality (or Treatment) to control for pseudoreplication (Kroodsma, 1989). The Control stimulus set was played at every site (N = 67 Sonoran, 61 Chihuahuan). We played the Sonoran stimulus sets 11-15 times each, except for two Across-Barrier stimulus sets which were played once and twice, respectively. The Chihuahuan stimulus sets were played 9-17 times each.

| Behavioral study sites
To choose sites for behavioral experiments, we placed GPS points in habitat known to have territorial males. We did not verify that points had resident males before beginning playback experiments on those sites. When males were found, we did not mark individuals but instead assumed that males found on territories were the territory holders. Sonoran Northern Cardinal territories in Pima County, AZ, average 1.56 ha (Gould, 1961) while Chihuahuan territories in Nacogdoches County, TX, have a mean ± standard deviation of 0.64 ± 0.14 ha (Conner, Anderson, & Dickson, 1986). Territory sizes range within the species from 0.21 to 2.60 ha (Halkin & Linville, 1999) with size partially dependent on foliage (Conner et al., 1986).
Assuming circular territory shape, this species forms territories with radii averaging 70.5 m in the Sonoran Desert and 45.1 m in the Chihuahuan Desert (range: 25.8-91.0 m). We assumed that distances between sites were sufficient to minimize territorial overlap and that each site comprised a single territory. Distances between Sonoran nearest-neighbor sites ranged from 38 to 205 m. Distances between Chihuahuan nearest-neighbors were 92-204 m. We generated transects of 6-13 sites each whose playbacks we always completed in the same order, barring dangerous conditions such as sudden inclement weather (see Tables A2 and A3). We did this Males had at least ~24 hr between playbacks to return to a non-disturbed state. Two Chihuahuan sites form exceptions to these rules as they had multiple playbacks per day, were done substantially later in the day, and with less than an hour between playbacks.
We conducted four playbacks at each site, randomly selecting one stimulus set from each of the four Treatment localities (Local, Distant, Across-Barrier, and Control). We observed the site for 3 min before playback began ("Pre-Playback" period). We then played the 3-min stimulus set ("Playback" period) and continued to observe the male during a silent post-playback period of 3 min ("Post-Playback" period). In most analyses, we combined the Playback and Post-Playback periods a posteriori into a "Response" period. During each period, we recorded multiple aggression measures: (a) the number of flights >1 m, or "Flybys," within the site; (b) the presence of alarm calls, or "Calls," within the site; (c) the number of male songs produced within the site, or "Close Songs"; (d) the number of male songs produced outside the site, or "Far Songs"; and (e) the distance of males from the speaker recorded every 10-s, or "Distance by Time." Distance to playback equipment is a known proxy for avian aggression and mating signal recognition (Searcy, Anderson, & Nowicki, 2006). We categorized distances into distance bins (0-1 m, 1-2 m, 2-4 m, 4-8 m, 8-16 m, 16-24 m, and >24 m) using markers placed 8 m and 16 m from the speaker. We chose these bins as they were easily estimated by observers while also accurately capturing distance variation of male Northern Cardinals during preliminary trials. We localized birds that disappeared from sight to their nearest distance bin by sound if we could track them without ambiguity. If the localization was ambiguous, we classified the individual as >24 m (out of the site). From these distance records, we computed the minimum distance to speaker, or "Closest Distance," for each period.
The five aggression measures were reduced to a single composite aggression measure via principal components analysis (PCA). We calculated the principal components using Response period (Playback and Post-Playback combined) data, then used the resulting loadings to calculate a posteriori the first principal component, PC1, for the Pre-Playback period, or "Pre-Treatment Aggression" measure. We used generalized linear mixed models (GLMMs) to determine whether there was a correlation between aggression (PC1) and Treatment. We set Treatment and Pre-Treatment Aggression as fixed effects and included the random effects of site and stimulus set in our models to control for individual site differences and differential responses to the various stimulus sets from the same locality.

| Raw genomic results from ddRAD pipeline
We collected 67,191 raw loci from our ddRAD pipeline. After filtering for paralogs, this was reduced to 33,626 processed loci, from which we extracted 28,798 unlinked SNPs out of a total of 361,011 variable sites and 148,370 parsimony-informative sites. Two individuals had high amounts of missing data which led to spurious assignment to groups in preliminary analyses; as such, these individuals were removed from further analyses (see Appendix 1). All other individuals had between 3,353-9,752 loci associated with them (mean ± standard deviation 7,000 ± 1,490, median 6,917).

STRUCTURE analyses on Sonoran and Chihuahuan individuals
showed strong population assignments, with Sonoran individuals always assigned to one cluster and Chihuahuan individuals never assigned to that cluster ( Figure 4). The highest log-likelihood support values were for K = 2 (mean ±standard deviation −181,543 ± 56) and K = 3 (−186,576 ± 131) populations. Comparisons of ∆K (Evanno et al., 2005) show highest support for K = 3. However, when K = 3, the third cluster never achieves more than 25% assignment probability in any individual. Using the methodology suggested by Puechmaille (2016) implemented in Structure Selector (Li & Liu, 2018), the highest support for all metrics is K = 2.
Our demographic analyses with fastsimcoal2 found that a model with asymmetric gene flow ( Figure 5) (Table A4). Though the models with gene flow received higher support than models without (Table A4), the actual estimated gene flow rates across the barrier were minute for both the Sonoran-to-Chihuahuan (mean 1.18 × 10 −5 ; 95% CI 2.11 × 10 −6 -3.60 × 10 −5 ) and Chihuahuan-to-Sonoran routes (mean 1.02 × 10 −5 ; 95% CI 7.24 × 10 −6 -1.38 × 10 −5 ). We compared the effects of GLMMs with and without recording locality (Treatment) as a predictor and found that Treatment had a significant effect on male aggression in both deserts (Table 1).

| Levels of aggression to playbacks from different localities
For playbacks on Sonoran individuals, the model with Treatment was a better fit to the Aggression data than the model without For playbacks on Chihuahuan individuals (Table 1)  TA B L E 1 Models including treatment locality explain responses to song dialects better than models without in both deserts show minimal aggression toward Across-Barrier songs, treating the songs of a foreign desert as heterospecific.

| D ISCUSS I ON
We found that the Northern Cardinal had low gene flow and strong male song discrimination across the Cochise Filter Barrier. Further, we showed that song discrimination between deserts is greater than song discrimination within deserts, indicating that these birds exhibit divergence in song beyond what would be expected through dialect changes alone. This strong discrimination against songs from an allopatric population may mediate the degree of gene flow permeability among deserts. Analogous studies to this one frequently examine differences between groups that are adjacent to one another (e.g., Dingle et al., 2010;Lipshutz, Overcast, Hickerson, Brumfield, & Derryberry, 2017), allowing for tests along axes of sympatric versus parapatric or within versus outside a hybrid zone. However, Northern Cardinals have no known contemporary contact zone, and we found no support for ongoing gene flow, across the Cochise Filter Barrier.
The two allopatric populations have higher connectivity within than between deserts, though the latter could still have been historically high in short bursts. This is in contrast to a hybrid zone system in which connectivity is relatively high in the localized area of contact.
Instead of focusing on what occurs during secondary contact, studying allopatric populations allowed us to show how behavioral divergence evolves in isolation and impacts genomic divergence.

| Potential modes of speciation across the Cochise Filter Barrier
We found that song discrimination and gene flow between allopatric populations were correlated. One potential relationship between these two factors is that song could act as a driver of divergence, with or without allopatry (Uy et al., 2018). The song differences observed in this study may have directly caused genomic divergence through assortative mating during periods of contact (Bensch, Hasselquist, Nielsen, & Hansson, 1998;Coyne & Orr, 2004;Price, 2008), or perhaps reinforced existing divergence (Grant & Grant, 1996;Hoskin, Higgie, McDonald, & Moritz, 2005;Lachlan & Servedio, 2004;Lynch & Baker, 1994;Mason et al., 2017). Relative to the accumulation of novel genetic markers or other traits, behavioral evolution via exchange of learned song can be rapid (Allender, Seehausen, Knight, Turner, & Maclean, 2003;Duckworth, 2009;West-Eberhard, 1983 Yamaguchi, 1999), and so understanding their responses forms one of the critical next steps for this system. Anecdotally, during our experimental trials, female Northern Cardinals occasionally responded to playbacks (N = 9 trials). These females typically behaved similarly to the focal male of the trial and appeared more likely to investigate the speaker when hearing the Local dialect, rather than a novel dialect, tentatively suggesting that males and females behave similarly in this regard (K. L. Provost unpublished data).

All in all, the Cochise Filter Barrier structures Northern Cardinal
populations both genetically, phenotypically, and behaviorally. Given our findings, the barrier appears to be facilitated, at least in part, by strong dialect differences that have evolved between the Sonoran and Chihuahuan deserts. These dialect differences affect song discrimination in male Northern Cardinals more potently than would be expected from geographic distance alone. Across the entire bird community, it is likely that different species have developed greater or fewer dialect differences across the barrier, which may be impacting the observed genetic semipermeability. We suggest that the song discrimination and genetic divergence we found in this species directly interact with each other to create the pattern of differentiation we see across the Cochise Filter Barrier. Future studies of birds codistributed across this barrier may find similar evidence for this pattern, and investigating many different mechanisms across multiple species at once would give insight into how the Cochise Filter Barrier, and other such barriers, worked to create the species diversity seen today.

ACK N OWLED G M ENTS
We are grateful to G. Voelker ( McKenzie, L. Bryan, and numerous others.

CO N FLI C T O F I NTE R E S T
None declared.

Bioinformatic processing of genomic reads
The raw genomic data were processed using PyRAD version 3.0.66 (Eaton, 2014 We ran PyRAD with these settings twice: once with all individuals, and once excluding all four outgroup individuals of C. sinuatus and C. c. carneus. For the latter run, we also excluded one individual in which only three loci successfully sequenced. Only analyses using former run are reported in the main text; however, results from these two datasets did not differ substantially. In addition to having two datasets run independently in PyRAD, we also had no F I G U R E A 1 Outgroups, Sonoran Desert, and Chihuahuan Desert birds form distinct clusters. Results from STRUCTURE runs K = 2 (top) to K = 5 (bottom) are presented. Each vertical bar represents an individual bird, with the proportion of each color indicating assignment to that population. Groupings of bars separated by white gaps represent, from left, (a) Cardinalis sinuatus, (b) C. cardinalis carneus, (c) Sonoran C. cardinalis, and (d) Chihuahuan C. cardinalis. Cardinalis sinuatus is predominantly orange/red, C. c. carneus are predominantly red, Sonoran individuals are predominantly green, and Chihuahuan individuals are predominantly blue. Note that Chihuahuan group includes all samples east of the Cochise Filter Barrier, including individuals collected outside the Chihuahuan Desert proper. Dataset used is the outgroupincluded dataset, which retains two individuals with high missing data (indicated with asterisks). Order of bars is the same as in Main Text Figure 4, excepting outgroups and removed low-data individuals qualitative differences in results when we manually removed C. sinuatus and C. c. carneus from the outgroup-included dataset and used that for analyses (results not shown).

Testing different thresholds for StructureSelector
We evaluated the best K value for our STRUCTURE runs using both the Evanno et al. (2005) method and the Puechmaille (2016) method, as implemented in StructureSelector (Li & Liu, 2018). For the Puechmaille (2016) method, we tested mean membership threshold values of 0.5, 0.6, 0.7, 0.8, and 0.9. We report only results from the 0.5 threshold in the main text.

Genomic structure results
From the dataset in which the outgroup was excluded, we collected 66,254 raw loci. After filtering for paralogs, this was reduced to 32,832 processed loci, from which we extracted 27,701 unlinked SNPs out of a total of 314,308 variable sites and 132,216 parsimonyinformative sites. Each individual had between 2,177 and 9,786 loci associated with it (mean ±standard deviation 7,058 ± 1,515, median 7,110). The individual with 2,177 loci was one of the individuals that was consistently mis-assigned in STRUCTURE analyses (see below); all other individuals had at least 3,376 loci. The main text describes these results for dataset in which the outgroup was included.
Differentiation between the Sonoran and Chihuahuan populations was relatively high. Nei's G ST was 0.33, while Hedrick's G' ST was 0.41.
As mentioned in the main text, two individuals had particularly high amounts of missing data (3 and 2,250 loci, or 99% and 92% missing, respectively, in the outgroup-included dataset). We ran a preliminary STRUCTURE analysis including these individuals as well as with the outgroup taxa. The highest log-likelihood support values for this run were for two (mean ± standard deviation −221,993 ± 81) and three (−197,641 ± 130) population clusters. Using the Puechmaille (2016) method, MedMed and MedMax always support K = 3 across all mean membership thresholds. However, MedMean and MaxMean support K = 3 for a threshold of 0.5, K = 2 for thresholds 0.6-0.7, and K = 1 for thresholds 0.8-0.9. Comparisons of ∆K (Evanno et al., 2005) showed highest support for K = 3, grouping C. sinuatus and C. c. carneus as one group separate from both the Sonoran and Chihuahuan individuals, which were clearly separated from each other except for the individuals with high missing data. The individual with 99% missing data was assigned with approximately equal probability to all groups irrespective of the analysis. The individual with 92% missing data was assigned to the Chihuahuan cluster for K = 2 and then assigned to outgroup clusters for K = 3 through K = 5 ( Figure A1).
Because of high missing data, we chose to remove these individuals from further analyses.
For the dataset described in the main text, with outgroups and individuals with high missing data removed, we tested five mean membership threshold values (Puechmaille, 2016) ranging from 0.5 to 0.9.

The role of male competition and female preference
Whether male song affects gene flow is contingent on the assumptions involving song, fitness, and female choice. If these assumptions bear out and individuals who do not match the nearby dialect are selected against, the song differences that have evolved between deserts may act as a potential reproductive barrier. One key assumption is that males and females have similar responses to dialect differences such that songs that males recognize as conspecific rivals would be attractive to females (Hunt et al., 2009;Uy et al., 2018). This requires that both sexes have the same discriminatory abilities and use the same mechanisms, which is not always the case (Lipshutz, 2017b). Females of different populations may prefer the same group of males in spite of divergent phenotypes (Baldassarre & Webster, 2013;Ryan & Rand, 1993;Coyne & Orr, 2004;Price, 2008). Similarly, for selection to occur against individuals that do not match the local dialect, females must not prefer rare males (Knoppien, 1985;Partridge, 1988).
Captive-raised Northern Cardinal females show no preference for familiar or novel conspecific songs, despite discriminating strongly against heterospecific songs (Yamaguchi, 1999), but it is unclear how females regard individuals from highly diverged genetic lineages of the same species. In this study, males treat songs from across the Cochise Filter Barrier as heterospecific, and given evidence that females are more discriminatory than males (Uy et al., 2018), it is likely that females would behave similarly. Anecdotally, females observed while performing playback experiments appeared more likely to investigate the speaker when hearing the Local dialect, rather than a novel dialect, suggesting that males and females behave similarly in this regard (K. L. Provost unpublished data).
Either scenario (differences in discriminatory ability between the sexes or female preference for novel males) could lead to a disconnect between behavior and fitness if female choice is the dominant method of sexual selection. There is evidence that differences in male-male competition can impact fitness without being directly mediated by female choice, but this evidence has proven to be troublesome to interpret. Males on high-quality territories have higher breeding success (Wolfenbarger, 1999a(Wolfenbarger, , 1999b, but the link between songs and territories appears convoluted. Simple, short song structures were found in Northern Cardinal males on high-quality territories (Conner et al., 1986;Narango & Rodewald, 2015, but these same song qualities may also lead to delayed mating success (Ritchison, 1988) and appear to be heavily influenced by the urbanization of habitat and density of individuals (Narango & Rodewald, 2015. There may also be fitness consequences to songs learned mediated by parasites, as found in other species, which could reflect trade-offs between male competitive ability and ecological adaptation (MacDougall-Shackleton, Derryberry, & Hahn, 2002;Qvarnström, Vallin, & Rudh, 2012); however, there is no current evidence that songs and parasite loads are directly linked in Northern Cardinals. Rather than impacts on territory quality, song structure's impacts on song matching between individuals may be more important for this species. Female Northern Cardinals that match the songs of their mates decrease nestling provisioning by the male, while singing a different song has the opposite effect (Halkin, 1997). Song matching happens frequently between neighboring males, apparently even more so during territorial disputes (Lemon, 1968). In either scenario, whether song structure is linked to territory quality, defense, or nest provisioning, it appears likely that vagrant males should encounter difficulties in forming and maintaining territories (Searcy & Nowicki, 2005) and providing for their mates and offspring effectively.

Because of the large geographic separation, Distant songs and
Across-Barrier songs should have been novel to the Local population for both desert populations (Lemon, 1975). We chose these localities to compare the effects of both distance and presumed genetic relatedness on a population's response to a recording. Comparing the Local and Distant treatments allowed us to assess the impact of distance while controlling for genetic relatedness, as there is no significant mitochondrial or nuclear (Smith & Klicka, 2013;Smith et al., 2011)  Additionally, for the Chihuahuan treatments, songs were compressed to increase volume using Audacity's "compressor" function and we wanted to mimic a natural encounter as closely as possible.
After processing, we had five stimulus sets for each treatment except for the Rattlesnake Springs, NM, and Control treatments. We processed seven Rattlesnake Springs, NM, and one Control treatment stimulus set.
TA B L E A 2 Playback experiment transect data for the Sonoran Desert season (Gould, 1961); therefore, major territorial shifts were not expected during the trials and we assumed that individuals found on territories were the territory holders. Sites were geographically spaced apart using a Garmin GPSMAP 62s to minimize territorial

Playback protocol details
We conducted four playbacks at each site, randomly selecting one stimulus set from each of the four treatment localities (Local, Distant, Across-Barrier, and Control). We randomized the order of the treatments at each site prior to beginning any playbacks. Post hoc, we tested the impact of individuals potentially hearing themselves or their neighbors (e.g., Brooks & Falls, 1975a;1975b;Falls & Brooks, 1975

Analysis of behavioral data details
We compared the responses to the four treatments to determine Aggression" measure. This measure served as a control for aggression present at a site before the recording began playing, which could influence the final aggression levels. The combined-deserts PC1 and Pre-Treatment Aggression measures were used only to assess whether there were differences between the experiments in total aggression levels. To accomplish this, we performed a two-factor analysis of variance using the aov function in the stats package in R, setting up our models as follows: PC1 (or

Pre-Treatment Aggression) ~ Treatment + Population Tested +
Treatment: Population Tested, with the final term in the model representing the interaction between those two variables.
We used generalized linear mixed models (GLMMs) to determine whether there was a correlation between aggression (PC1) and treatment. PC1 was the dependent variable in these models, with stimulus set and the Pre-Playback control variable Pre-Treatment Aggression as fixed effects. We included the random effects of site and stimulus set in our models to control for individual site differences and differential responses to the various stimulus sets from the same locality. GLMMs were run using the Sonoran-only PCA and the Chihuahuan-only PCA data, not the combined-deserts PCA data.
We generated GLMMs using the glmer function in the R package lme4 version 1.1-10 (Bates, Mächler, Bolker, & Walker, 2015) with Gamma error structure and log-link as detected by the qqp function in the package MASS version 7.3-33 (Venables & Ripley, 2002).
When assessing Gamma error structure, the qqp function requires all response values to be non-negative, so we adjusted PC1 and Pre-Treatment Aggression accordingly. For the Sonoran data, we adjusted PC1 to have a mean of 2.0 rather than a mean of 0.0 by adding 2.0 to each value. For the Chihuahuan data, we adjusted PC1 to have a mean of 1.0 rather than 0.0 by adding 1.0 to each value. In addition, Sonoran PC1 was positively correlated with aggression while Chihuahuan PC1 was negatively correlated with aggression-as such, we multiplied the Chihuahuan data by −1.0 to run in the GLMMs. We compared the results from two different optimizer algorithms BOBYQA (Powell, 2009) and L-BFGS-B (Byrd, Lu, Nocedal, & Zhu, 1995) using the R package optimx version 2013.8.7 (Nash, 2014;Nash & Varadhan, 2011). However, both optimization algorithms gave similar parameters for all four datasets, so we only report results from the BOBYQA algorithm.

Testing sensitivity of analyses to failed detections
As mentioned in the main text, we failed to detect males at some sites, which we defined as never detecting a male within our study site either by sight or by sound. We analyzed our resulting data both with and without these sites to see whether it impacted our results.
Here, we present the results from those analyses alongside the results from the main text.
For the Sonoran data, the GLMM including Treatment was a better fit to the aggression data than the model without Treatment When sites without detections were excluded, the GLMM with treatment was significantly supported over the model without (∆AIC C = 4.79), though the model without treatment had greater explanatory power according to adjusted R 2 (Null = 0.71, Full = 0.67).