Comparative analysis examining patterns of genomic differentiation across multiple episodes of population divergence in birds

Abstract Heterogeneous patterns of genomic differentiation are commonly documented between closely related populations and there is considerable interest in identifying factors that contribute to their formation. These factors could include genomic features (e.g., areas of low recombination) that promote processes like linked selection (positive or purifying selection that affects linked neutral sites) at specific genomic regions. Examinations of repeatable patterns of differentiation across population pairs can provide insight into the role of these factors. Birds are well suited for this work, as genome structure is conserved across this group. Accordingly, we reestimated relative (FST) and absolute (dXY) differentiation between eight sister pairs of birds that span a broad taxonomic range using a common pipeline. Across pairs, there were modest but significant correlations in window‐based estimates of differentiation (up to 3% of variation explained for FST and 26% for dXY), supporting a role for processes at conserved genomic features in generating heterogeneous patterns of differentiation; processes specific to each episode of population divergence likely explain the remaining variation. The role genomic features play was reinforced by linear models identifying several genomic variables (e.g., gene densities) as significant predictors of FST and dXY repeatability. FST repeatability was higher among pairs that were further along the speciation continuum (i.e., more reproductively isolated) providing further insight into how genomic differentiation changes with population divergence; early stages of speciation may be dominated by positive selection that is different between pairs but becomes integrated with processes acting according to shared genomic features as speciation proceeds.

both selection and gene flow to explain heterogeneous patterns of differentiation. Specifically, this model holds that divergent selection at loci involved in reproductive isolation protects some regions of the genome from gene flow, elevating an otherwise homogenized landscape of differentiation (Nosil et al. 2009;Nosil and Feder 2012). The second model proposes that selection alone can generate variation in differentiation by accelerating lineage sorting at some regions of the genome. In other words, genomic regions that do not show elevated differentiation simply continue to share ancestral polymorphism (Noor and Bennett 2009;Turner and Hahn 2010;Cruickshank and Hahn 2014). We refer to this model as selection in allopatry and note that there are variants on this model related to when selection acts (Cruickshank and Hahn 2014;Delmore et al. 2015;Irwin et al. 2016).
One common feature of both the divergence with gene flow and selection in allopatry models is that features of the local genomic landscape should contribute to variation in differentiation. For example, genomic regions with lower rates of recombination, higher rates of mutation and elevated gene densities can promote linked selection, defined as any form of selection that influences variation at nearby neutral sites (Charlesworth et al. 1993;Lohmueller et al. 2011;Charlesworth 2012;Cutter and Payseur 2013;Enard et al. 2014). Linked selection can be positive, acting on new or existing mutations (genetic hitchhiking, Maynard Smith and Haigh 1974) or purifying, removing deleterious mutations from the population (background selection, Charlesworth et al. 1993). Measures of differentiation like F ST include a term for within population variation and can be inflated by the reductions in variation that often accompany linked selection (Charlesworth 1998). Low recombination rates make it difficult for linked neutral sites to escape the effects of new mutations via recombination. Higher mutation rates and gene densities provide more targets for selection.
It was recently suggested that patterns of genomic differentiation will reflect features of the local genomic landscape more at later stages of speciation, as drift and selection at these features will take time to influence differentiation (Burri 2017). Comparative analyses examining genomic differentiation across multiple population pairs are ideal for both implicating features of the local genomic landscape in generating genomic differentiation and examining their temporal effects. For example, if genomic variables are conserved across pairs, constraints imposed by processes like linked selection in these regions should generate correlated or repeated patterns of genomic differentiation. Comparative analyses are beginning to accumulate but are often limited to a closely related group of species or populations, precluding an evaluation of temporal effects and introducing statistical nonindependence if a limited number of pairs are included (e.g., sticklebacks, Jones et al. 2012;sunflowers, Renaut et al. 2014;guppies, Fraser et al. 2015; songbirds, Irwin et al. 2016;Van Doren et al. 2017;copepods, Pereira et al. 2016). In addition, working at broader taxonomic scales may eliminate the role shared selective regimes play in generating repeatable patterns of differentiation, isolating the effects of genomic constraints.
Here, we overcome these limitations using new and archived genomic data to estimate genomic differentiation between eight pairs of birds that span a broad taxonomic range (the most recent common ancestor to them all was ß52 MYA, http://www.timetree.org/). We look for (1) correlated patterns of genomic differentiation across these pairs (referred to as "repeatability") and (2) an association between repeatability and the location of pairs along the speciation continuum (i.e., their level of reproductive isolation). We also (3) use linear models to implicate specific genomic features in generating repeatable patterns of genomic differentiation; these features include proxies for both recombination and mutation rates, gene density, chromosome size and proximity to chromosome ends and centromeres. We quantified the speciation continuum using hybrid zone width and genetic distance between pairs. Chromosome size, proximity to chromosome ends and centromeres may influence genomic differentiation as they have shown associations with recombination rates (Butlin 2005;Smukowski and Noor 2011). We also include linkage disequilibrium (LD) as a predictor in linear models; if linked selection is generating repeatable patterns, LD should be higher where repeatability is higher. Birds are well suited for this work as a considerable amount of information is known about speciation in this group (Price 2008) and genomic features are highly conserved across this group; birds have stable chromosome numbers, low rates of interchromosomal rearrangements, high synteny, and similar recombination landscapes (Dawson et al. 2007;Griffin et al. 2007;Backström et al. 2008;Stapley et al. 2008;Ellegren 2010;Kawakami et al. 2014;Zhang et al. 2014;Singhal et al. 2015;Kawakami et al. 2017).
Thus far we have only discussed how the local genomic landscape can affect F ST , a relative measure of differentiation that is inflated by reductions in within population variation. Many studies are beginning to include d XY in their analyses. This is an absolute measure of differentiation that does not include a term for within population variation. Under the divergence with gene flow model of speciation, those regions that contribute to reproductive isolation should have elevated d XY compared to background levels of absolute differentiation. Under the selection in allopatry model of speciation, linked selection should have no effect on d XY or reduce it compared to background levels (Nachman and Payseur 2012). The latter reductions could occur in response to recurrent linked selection in ancestral populations (which eventually results in reduced genetic distance between populations, Cruickshank et al. 2014) and/or selective sweeps of globally adaptive alleles (e.g., Delmore et al. 2015;Irwin et al. 2016). Given increasing interest in d XY and its potential to reflect the local genomic landscape, we include it in our analyses as well. All of the population pairs included in the present study occur in the temperate region where they have likely experience periods of allopatry with glacial expansions (Hewitt 2000). Accordingly, in analyses for objective 3 where we identify specific regions that show repeatable patterns we will focus on those at the bottom of the d XY distribution.
To gain an overview of the relationship between these pairs, we constructed a phylogeny for the group using whole-genome sequence data from all autosomal chromosomes (Fig. 1). The crow is the most distantly related species; all the remaining species cluster into two groups. One group includes the greenish warbler, willow warbler and blackcap while the other includes the flycatcher, stonechat, thrush, and blue/golden-winged warblers. Disregarding sister-pair relationships, the most closely related species are flycatchers and stonechats, and greenish and willow warblers. This topology is what we expected based on previous phylogenetic studies (e.g., Jetz et al. 2012, birdtree.org).

DIFFERENTIATION
To estimate repeatability in patterns of genomic differentiation across pairs we organized scaffolds from each species' reference into chromosomes using synteny with the flycatcher and estimated F ST and d XY between populations in each pair using the same 100 kb windows (Fig. 1). An initial comparison across pairs suggests that patterns may only be modestly consistent but stronger when considering d XY . We evaluated this observation by correlating windowed estimates of F ST and d XY across pairs. In accordance with our observation, correlation coefficients varied from -0.02 to 0.18 for F ST and 0.04 to 0.51 for d XY (Table 1). Squaring the highest coefficients for F ST and d XY , these results suggest that up to 3 and 26% of the variation can be explained by correlations of F ST and d XY between pairs respectively.
We used a second method to quantify repeatability between pairs based on the overlap of outlier windows. We identified outlier windows for each pair as those in the top 5 percentile of the F ST distribution and bottom 5 percentile of the d XY distribution and compared the number of outlier windows that were shared (or overlapped) across pairs to the expected number based on the hypergeometric distribution (see Methods). Similar to results from correlations, estimates of overlap were higher and more significant for d XY (Table S3A). This was also the case when we combined strings of outlier windows into peaks, acknowledging the fact outlier windows may not be independent of one another (Table S3B).

SPECIATION CONTINUUM
We looked for an association between repeatability and the speciation continuum using four different measures for the speciation continuum: hybrid zone width, the percentage of hybrids in these zones and genetic distance from both cytb and autosomal sequences. Starting with hybrid zone width and the percentage of hybrids, we obtained estimates for each pair from the literature and assumed reproductive isolation is greater in narrow hybrids zones with fewer hybrids (Barton and Hewitt 1985). Hybrid zone width ranged from 0 km for greenish warblers to 600 km for blue/golden-winged warblers; the percentage of hybrids ranged from 0% for greenish warblers to 70% for willow warblers (Table S4). We transformed these values into distance matrices and compared them with the correlation matrices generated using windowed estimates F ST and d XY above. Controlling for genetic distance between pairs, both hybrid zone width and the percentage of hybrids were negatively correlated with the with the correlation matrix based on F ST but not d XY . In other words, pairs with more narrow hybrid zones and fewer hybrids showed higher repeatability in F ST (Partial Mantel tests, hybrid zone width, R = -0.48, CI = -0.67-(-0.21), P = 0.02; percentage of hybrids, R = -0.42, CI = -0.85-(-0.0082), P = 0.02) but not d XY (Partial Mantel test, hybrid zone width, R = 0.18, CI = -0.21-0.39, P = 0.23; percentage of hybrids, R = -0.18, CI = -0.44 to 0.47, P = 0.45).
There are some caveats associated with using hybrid zone width and the percentage of hybrids as a proxies for reproductive isolation (e.g., differences in dispersal distance may affect estimates of hybrid zone width and there is considerable variation in how the percentage of hybrids is estimated, see Discussion). Accordingly, we reran these analyses using genetic distance between populations within each pair as a measure of reproductive isolation and assuming pairs that are more reproductively isolated from one another exhibit greater genetic distances. We estimated genetic distance using both cytb and autosomal sequences and similar to results using parameters from hybrid zones, we found a significant relationship between genetic distance within pairs and the correlation matrix based on F ST but not d XY ; repeatability increased with genetic distance between pairs for F ST   Values are correlation coefficients comparing windowed estimates of F ST (below diagonal) and d XY (above diagonal) between each set of population pairs. Table S3. Note, we reran these analyses replacing repeatability estimated by correlation coefficients (Table 1) with values based on the overlap of outlier windows and found similar associations (Table S3; e.g., results for F ST and hybrid zone width, R = -0.41, CI = -0.72-(-0.15), P = 0.03; cytb, R = 0.33, CI = 0.05-0.52, P = 0.05; autosomal, R = 0.41, CI = 0.22-0.59, P = 0.006). This is an important finding, as it suggests the associations we documented are not related to the fact that there is a greater range of F ST values at later stages of speciation.

PREDICTORS OF REPEATABILITY
The repeated patterns of F ST and d XY we documented above suggest that variation in genome-wide estimates of differentiation is influenced by conserved features of the local genomic landscape. We used generalized linear models (GLMs) to evaluate the role-specific genomic features play in generating these patterns. Repeatability was quantified for each window as the number of pairs in which the window was considered an outlier (recall outlier status was determined using the top 5 percentile of the F ST distribution and bottom 5 percentile of the d XY distribution). Separate GLMs were run for each species pair using seven predictor variables (estimated for each pair separately): the proportion of GC bases (proxy for recombination rate), synonymous mutation rate (d s ; proxy for mutation rate), gene count, LD, and three variables related to where windows are located in the genome (microor macrochromosomes [chromosome size], proximity to chromosome ends, and centromeres).
Results from these GLMs can be found in Figure 2 (with blackcaps as predictor) and Table S5 (for each case as predictor). GC content, gene density, LD, and proximity to both chromosome ends and centromeres were significant predictors of repeatability in F ST across all species pairs. Each of these variables was positive predictors of repeatability, except GC content that was negative. In the case of position, this means that windows near the center of chromosomes showed higher repeatability. Linkage disequilibrium, proximity to centromeres, and chromosome size were significant predictors of repeatability in d XY across all species pairs. In the case of chromosome size, this means that windows on microchromosomes show more consistent patterns. GC content, proximity to chromosome ends, gene density, and d s were also significant predictors of repeatability in d XY but not for all species pairs.
Note that while the predictor variables used in these models are species-specific (e.g., gene density estimate for each pair separately) the response variable is the same--a summary variable quantifying repeatability across species pairs. Accordingly, results from these models are not entirely independent.

Discussion
We used genomic data from eight population pairs of birds that span a broad taxonomic scale to study the contribution of local genomic features to variation in genome-wide estimates of differentiation. We rely on the fact genomic features are conserved across birds to draw inferences from our analyses and discuss these findings below, including the potential temporal effect these features can have of genomic differentiation.
Our first objective was to determine if patterns of genomic differentiation were correlated (or repeated) across population pairs. Our results suggest that up to 3% of the variation in F ST and 26% of the variation in d XY can be explained by correlations across pairs. Shared features of the genomic landscape likely contribute to these correlations. population variation, inflating F ST . Many of these genomic features are conserved across birds (Dawson et al. 2007;Griffin et al. 2007;Backström et al. 2008;Stapley et al. 2008;Ellegren 2010;Kawakami et al. 2014;Zhang et al. 2014;Singhal et al. 2015;Kawakami et al. 2017) and likely generated the repeatable patterns we observed. In support of this suggestion, recombination rates (approximated by GC content) and gene densities, two genomic features that are preserved across birds and influence linked selection were consistent predictors of repeatability in F ST . Genetic drift may also contribute to the correlations we documented. For example, drift in a low recombination region can cause it to show consistently high or low F ST . Nevertheless, drift is generally expected to reduce genetic diversity genome-wide. Note that the amount of variation explained by correlations across pairs is not as high as other studies (e.g., 49-77% of the variation in F ST explained by correlations across subspecies pairs of greenish warblers in Irwin et al. 2016). Nevertheless, most of these studies focus on pairs that are much more closely related than those in the present study, such that pairs share more genomic features and are subject to similar selective forces. Our second objective was to determine if there was a positive association between repeatability and the location of pairs along the speciation continuum. We measured the speciation continuum using hybrid zone width, the proportion of hybrids in these zones and genetic distance and found that pairs with narrower hybrid zones and greater genetic distances exhibited more similar patterns of F ST . The latent effects of drift and selection may explain this pattern. Specifically, when population divergence begins, allele frequencies will be roughly equal and F ST will be close to zero. Drift and selection will start acting on standing genetic variation and any new mutations that arise. The effects of these processes, especially drift and background selection, may take time to accumulate (Burri 2017). Accordingly, positive selection and genetic hitchhiking may be the primary forces affecting differentiation early in speciation. If the selective context of speciation is different for the pairs under study, this should lead to less repeatable (or correlated) patterns of differentiation. As speciation proceeds, drift and background selection will begin to affect differentiation as well and, combined with positive selection and genetic hitchhiking, these processes could result in the landscape of differentiation reflecting genomic features more directly. This scenario was described by Burri (2017) and is in line with recent work showing that linked selection (positive or purifying in nature) may generate repeated patterns of differentiation at longer time scales (Phung et al. 2016;Dutoit et al. 2017;Van Doren et al. 2017;Vijay et al. 2017). Note that the beginning stages of speciation may be less repeatable even without different selective pressures. For example, there may be more than one way to respond to selection and the chance positive selection affects the same genomic region may be low.
Our findings related to the speciation continuum will require additional study. To begin with, we used Partial Mantel tests for these analyses that may be prone to Type I errors (Harmon andGlor 2010, but see Diniz-Filho et al. 2013). Alternatives exist (e.g., Redundancy Analyses or correlograms) but require larger sample sizes. Our use of hybrid zone width to quantify the speciation continuum also assumes that dispersal distance is the same across all pairs. This is a common assumption among songbirds as it is difficult to obtain unbiased estimates of dispersal for this group that is based on large sample sizes. Analyses using the percentage of hybrids are not affected by dispersal distance but are associated with another set of assumptions, including that each study had the same resolution to identify hybrids and used similar sampling strategies. Regardless, the association we documented between repeatability and the speciation continuum is intriguing and was documented using not only parameters from hybrid zones but also genetic distance between populations in each pair.
Thus far we have focused mainly on results for F ST ; results for d XY require careful explanation. Concerning the repeatable patterns we documented (first objective, up to 26% of the variation in d XY explained by correlations across pairs), at the outset we argued that speciation in the pairs we studied would have been punctuated with periods of allopatry as all pairs occur in the temperate region where glacial advances would have isolated populations in different refugia (Hewitt 2000). Under this scenario (i.e., without gene flow), d XY should reflect the amount of sequence divergence that has been acquired since populations split (along with variation that existed in the common ancestor) and linked selection should have no effect on d XY or reduce it (e.g., if recurrent linked selection in ancestral populations removes variation from populations prior to their split; Nachman and Payseur 2012;Cruickshank and Hahn 2014). Consistent with the above scenario, patterns of d XY were repeatable across pairs and several genomic features including gene densities and chromosome size predicted repeatable patterns at the bottom of the d XY distribution. Nevertheless, it is important to note that during periods of secondary contact gene flow will reduce d XY in some regions, generating peaks of differentiation. Accordingly, some of the repeatable patterns we documented may also be related to elevated d XY (Nachman and Payseur 2012).
Continuing with d XY repeatability (i.e., results for the first objective), correlation coefficients were higher for d XY than F ST , with the highest correlation coefficient for d XY being more than twice that for F ST (0.18 vs 0.51). This finding could be related to the fact that d XY reflects processes that have accumulated over multiple speciation events (see below for additional explanation) while F ST mainly reflects processes in extant populations. Accordingly, if population pairs are sampled too early in speciation, F ST may not reflect local genomic features yet as it will take time to accumulate (Burri 2017). This suggestion follows from the argument described above about the latent effects of drift and background selection. Nevertheless, additional explanations for increased d XY repeatability are also possible. For example, d XY shows a strong relationship with mutation rates (Geneva et al. 2015;Rosenzweig et al. 2016). Accordingly, much of the pattern we documented may be related solely to variation in mutation rates. It is also important to note that d XY is estimated using far more sites than F ST (variant and invariant for d XY vs just variant for F ST ). Accordingly, these estimates may be more precise, leading to stronger correlation coefficients.
Finally, while we found a correlation between the speciation continuum and repeatability in F ST (second objective) we did not document this association for d XY . Again, this finding may be related to the fact that d XY reflects processes that have accumulated over multiple speciation events while F ST mainly reflects process in extant populations, including speciation. For example, if the recombination landscape has remained the same for millions of years, recurrent linked selection in areas of low recombination has likely been reducing variation over the same time period and these reductions will be passed down over speciation events (Burri 2017). Under this scenario, it will not matter what stage of differentiation the population pairs under study are at, these reductions will be reflected in estimates of d XY . As we have already discussed, there are situations where d XY will reflect processes in extant populations (especially if gene flow is occurring) but the underlying effect of ancestral diversity appears to override any effect these processes have on d XY repeatability and the speciation continuum.
We documented modest but significant repeatability in relative and absolute differentiation across eight population pairs of birds and showed that several genomic features predicted this repeatability. As genomic features are conserved across birds, these results suggest that at least moderate amounts of variation in genome-wide differentiation can be attributed to processes acting at genomic features, including linked selection that may derive from both positive and purifying selection. A considerable amount of the remaining variation in genomic differentiation is likely related to processes specific to each episode of population divergence. This is especially true for pairs early in the process of speciation, as our observation that repeatability increases with the location of pairs along the speciation continuum suggests processes acting at shared genomic features become more important later in this processes. To the best of our knowledge, this is the first empirical support for a temporal role of genomic features in structuring genomic differentiation and we encourage future studies incorporating additional pairs to study this association and the genetics of speciation. Studies focused on a single system and points in time provide only a snapshot of this extensive and often prolonged process.

STUDY SPECIES AND DATASETS
We searched the Sequence Read Archive (https://www.ncbi. nlm.nih.gov/sra) and European Nucleotide Archive (http://www. ebi.ac.uk/ena) for genomic data collected from birds. We limited our search to species for which a draft reference genome had been assembled for the target species or one that was closely related. This search resulted in the inclusion of eight pairs (Table S1). The only pair we did not have a reference genome for was the Vermivora warblers but a reference for the closely related yellow-rumped warblers (Setophaga coronata) is available and was used in the original publication for these data . All pairs are from the order Passeriformes (perching birds or songbirds) and breed in temperate regions (Fig. 1).

GENERATING CONSENSUS REFERENCE GENOMES
The reference genomes we acquired were all assembled into scaffolds, except the collared flycatcher's genome, which was organized into chromosomes based on linkage map for the species and synteny with zebra finch (Ellegren et al. 2012). Accordingly, we used this genome to ensure all windows compared across species were orthologous. To maintain chromosomal synteny, we aligned the scaffolds of each genome individually against the flycatcher genome with SatsumaSynteny (default parameters; Grabherr et al. 2010). We then used bash scripts to parse the output, obtaining information on the order and orientation of query scaffolds and conducted a final alignment with the LASTZ plugin in Geneious (Harris 2007;Kearse et al. 2012). We merged these scaffolds into pseudochromosomes by calling the query base where alignments occurred and Ns in the presence of a gap. Details on consensus genome coverage can be found in Table S2.

CONSTRUCTING PHYLOGENETIC NETWORK
We used ANGSD (Korneliussen et al. 2014) to obtain consensus fasta sequences for populations from each species pair (-doFasta 2 -doCounts 1 -minQ 20 -setMinDepth 10) and IQ-TREE (Nguyen et al. 2016) to construct a maximum-likelihood tree from these sequences. Patristic distance between pairs was estimated using this tree using the cophenetic.phylo function from the R package "ape." To compare relative divergence across species pairs we used the chronos function from the same R package, using the default of correlated rate model and generating an ultrametric tree.

ESTIMATING DIFFERENTIATION
We focused on SNPs for the present study and used a common reference-based bioinformatics pipeline to call them. Details can be found in Delmore et al. (2015Delmore et al. ( , 2016. Briefly, we trimmed reads with trimmomatic (TRAILING:3 SLIDINGWINDOW: 4:10 MINLEN:30) and aligned them to consensus genomes using bwa mem (Li and Durbin 2009) using default settings. We then used GATK (McKenna et al. 2010) and picardtools (http://broadinstitute.github.io/picard) to identify and realign reads around indels (RealignerTargetCreator, IndelRealigner, default settings) and removed duplicates (MarkDuplicates, default settings) for all datasets except greenish warbler that consisted of GBS data.
We used two estimates of differentiation in our study: F ST and d XY . We estimated F ST for datasets comprised of individuals using ANGSD, estimating site frequency spectrums for each population separately (-dosaf 1, -gl 1, -remove bads, -unique only, -minMapQ 20, -minQ 20, -only proper pairs 1, -trim 0) and using these to obtain joint frequencies spectrums for population pairs. These joint frequency spectrums were then used as priors for allele frequencies at each site to estimate F ST . For datasets comprised of pools we estimated F ST using Popoolation2 (Kofler et al. 2011; -min-coverage 30 for Swainson's thrushes and 10 for stonechats, -min-count 3, -minq 20). We summarized F ST into windows of 100 kb and limited analyses to windows with data from all pairs. We excluded the Z chromosome from all analyses as some of the pairs included females where systematic biases related to coverage could affect estimates of differentiation.
We estimated d XY for datasets comprised of individuals using ANGSD as well. First, we estimated allele frequencies at each SNP for both populations of each pair combined -doMajorMinor 4, -doMaf 2, -gl 1, -doCounts 1, -remove bads, -unique only, -minMapQ 20, -minQ 20, -only proper pairs 1, -trim 0, -SNP pval 1e-6). We then reran the program by population using only the SNPs that passed the previous step, to ensure SNPs fixed in one population were not lost. Once we had these allele frequencies, we estimate d XY at each SNP using a script provided with ANGSD (https://github.com/mfumagalli/ngsPopGen//scripts/calcDxy.R) and as (p1 * (1−p2))+(p2 * (1−p1)) where p is the allele frequency of a given allele in populations 1 and 2, respectively and averaged these values in the same 100 kb windows used for F ST . Estimates of d XY by SNP have to be normalized by the number of sites (variant and invariant) in a window. We obtained these values using ANGSD to estimate read depth at all sites (-doCounts 1, -dumpCounts 1, -remove bads, -unique only, -minMapQ 20, -minQ 20, -only proper pairs 1, -trim 0) and excluded sites with coverage less than three times the sample size and more than three times the average coverage to ensure roughly three reads per individual and exclude sites that may have mapping problems resulting from copy number variants. Analyses were limited to windows with data from all pairs and windows with more than 5000 callable sites, as d XY can be highly variable with small sample sizes and coverage (e.g., Clarkson et al. 2014). This filter precluded the use of greenish warblers in analyses of d XY as data for this pair were derived from reduce-representation sequencing and did not have high coverage in windows of 100 kb. This was not a problem for the original publication ) as windows were defined by SNPs rather than base pairs.
For stonechats and thrushes, for which we used pooled sequencing data, we calculated d XY with a custom script. We excluded sites with coverage below 10 for stonechats and below 30 for thrushes. We estimated d XY by multiplying allele frequencies for each base as above and averaging across sufficiently covered bases in each window.

GENOMIC DIFFERENTIATION
We estimated overall repeatability between pairs by correlating windowed-estimates of differentiation across pairs. We also estimated repeatability using information on outlier status and overlap. Specifically, we identified outlier windows for each species pairs as those above the top 5% quantile for F ST and below the bottom 5% quantile for d XY . For each comparison across pairs (e.g., flycatcher to stonechat, flycatcher to greenish warbler, etc.), we counted the number of outlier windows that were shared (or overlapped) and compared this to the expected number of overlapping windows using the hypergeometric distribution, assuming that each window had an equal probability of being considered an outlier. We used these values (observed and expected) to calculate z scores for each comparison and calculated one-sided P-values (i.e., the probability of obtaining an overlap value as extreme or more extreme than our observed value). Z scores are effect sizes that correspond to the number of standard deviations above the average expectation in each comparison, allowing for direct comparison across studies. Note that it is also possible that the outlier windows we identified are not independent of one another. Accordingly, we reran this analysis modifying our overlap approach by combining outlier windows into peaks if they occurred next to one another and using the number of peaks as our estimate of overlap. We used a permutation test to quantify the significance of these values, holding the number and size of outlier regions constant while randomly permuting their location 1000 times and calculating one-sided P-values again.

CONTINUUM
We used four methods to place pairs along the speciation continuum (i.e., to quantify the level of reproductive isolation), starting with the width of hybrid zones and percentage of hybrids in these zones, which we obtained from the literature. The more narrow a zone and the fewer hybrids present the higher the reproductive isolation (Barton and Hewitt 1985;Moore and Dolbeer 1989;Paradis et al. 1998). If hybridization is extremely rare (e.g., with the greenish warblers), we set the width as zero. We also used genetic distance within pairs, which we obtained by aligning ctyb sequences (downloaded from NCBI https://www.ncbi.nlm.nih. gov/) using MEGA as p-distance (the proportion of nucleotide sites that differ between groups) and the distance matrix generated by IQTREE for autosomal chromosome alignments (see "Constructing phylogenetic network"). We used Mantel tests to compare distance matrices quantifying the speciation continuum with distance matrices quantifying repeatability. We accounted for small sample sizes in these tests by using permutation tests to quantify significance and the non-parametric Spearman rank correlation coefficient.

AND THEIR EFFECT ON REPEATABILITY
We looked at the relationship between repeatability and seven structural features of the genome: recombination rate, mutation rate, gene density, chromosome size, proximity to chromosome ends, and centromeres and linkage disequilibrium. We estimated these features for each species and ran separate generalized linear models (GLMs) with repeatability as the response variable with a Poisson distribution for each species. These models were run with the glm function in base R and the ANOVA function was used to evaluate the significance of each predictor variable. We visualized results with the "visreg" package and estimated correlation coefficients (and confidence intervals) for each model by regressing observed repeatability to repeatability predicted by each model.
We used GC content as a proxy for recombination. Recombination affects the patterns of local base composition via the unbalanced transmission of "strong" (GC) over "weak" (AT) alleles at double-strand breaks . This process is termed GC-biased gene conversion and direct support was recently presented in birds (Smeds et al. 2016). Positive correlations between recombination and GC content have also been documented in birds (Kawakami et al. 2014;Burri et al. 2015). Synonymous mutations occur in the exon of genes but have no effect on the sequence of amino acids. The use of d s for mutation rate analysis assumes these sites do not experience selection (Eyre-Walker and Keightley 1999). We used a phylogenetic framework to obtain these estimates; details can be found in the Supplementary Methods. Briefly, we annotated each consensus genome with MAKER and identified potential homologues for high quality transcripts (AED < 0.05) using a Blastn search against all transcripts from the flycatcher (flycatcher was searched against zebra finch). We aligned codons from each pair of sequences using PRANK (http://wasabiapp.org/software/prank) and calculated d N /d s with PAML v4.8. All d N /d s calculations were performed pairwise, comparing all the species with the flycatcher and this in turn, compared to zebra finch. Estimates of d s were extracted for GLMs. We used PLINK 1.9 (Chang et al. 2015) to estimate linkage disequilibrium for one population from each pair (the same population for which we had a reference genome), as the squared correlation coefficient (r 2 ) between pairs of SNPs. SNPs were output from ANGSD using the same filters described above for F ST and d XY and including an additional filter for minor allele frequency, requiring SNPs have minor allele frequencies greater than 0.05. PLINK was run with the command line with the command line "-ld-window 100 -ld-window-kb 100 -ld-window-r2 0" to limit the analyses to SNPs with fewer than 100 variants between them and no more than 100 kilobases apart and report pairs with r 2 values below 0.2 as well. We determined the midpoints for all SNP pairs, binned them into the same 100 kb windows used for F ST and d XY and calculated average values for each window.
Avian genomes are composed of micro-and macrochromosomes. We considered microchromosomes those that are less than 20 Mb (Ellegren 2013) and macrochromosomes those that are greater than 40 Mb. We identified the position of each window along the chromosome by dividing chromosomes in half and generating a variable that increased by a value of 1 for each window from the end of the chromosome to its center. We standardized this measure by dividing these values by half the number of windows on each chromosome, so values increased to 1 at the center of chromosomes. We inferred the location of centromeres using methods employed by Ellegren et al. (2012) and Delmore et al. (2015). Specifically, we identified FISH probes from Warren et al. (2010) on either side of centromeres in the zebra finch genome and used NCBI's blastn (Altschul et al. 1990) to find their location in our genome. We considered sequences between FISH probes as "centromeric regions" and note this method only gives us a rough approximation for the location of centromeres.

Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher's website: Table S1. Datasets included in the present study. Table S2. Coverage of consensus genomes. Table S3. Repeatability measured using outlier status and overlap values. Table S4. Data used to measure the speciation continuum. Table S5. Results from GLMs examining the relationship between repeatability and predictor variables related to genomic features.