Asymmetry matters: A genomic assessment of directional biases in gene flow between hybridizing spruces

Abstract Assessing directional bias in interspecific gene flow might be important in determining the evolutionary trajectory of closely related species pairs. Using a set of 300 single nucleotide polymorphisms (SNPs) having variable propensity to cross species boundary, we evaluated the genomic extent and direction of interspecific gene flow in a progenitor‐derivative spruce species pair (black spruce and red spruce). A higher rate of gene flow was found from black spruce toward red spruce purebreds than vice versa. This asymmetry could reflect the historical gene flow between the two taxa at the time of species inception and during postglacial colonization. A clear asymmetry in introgression was depicted by a greater gene flow between red spruce and hybrids than between black spruce and hybrids. While backcrossing toward red spruce was invariably high across the genome, the actual species boundary is between hybrids and black spruce where gene flow is impeded at those genomic regions impermeable to introgression. Associations between hybrid index and climatic variables (total annual precipitation and mean annual temperature) were tested, as these might indicate a role for exogenous selection in maintaining the species boundary. While an apparent association was found between the hybrid index and precipitation, it collapsed when considered in light of the directional bias in interspecific gene flow. Hence, considering asymmetrical patterns of introgression allowed us to falsify an apparent role for exogenous selection. Although this was not formerly tested here, we suggest that this pattern could result from asymmetrical endogenous selection, a contention that deserves further investigations.

So far, most studies that have investigated asymmetrical patterns of interspecific gene flow have relied on a modest number of presumably neutral markers, under the assumption that these should reflect the species' entire genome, thus conceived as a cohesive unit.
However, just as the overall level of introgression is heterogeneous across the genome, the level of asymmetry in interspecific gene flow might also be variable across the genome (Fitzpatrick et al., 2010).
We might specifically expect that the directional bias in introgression should be stronger at less permeable genomic regions, that is, at these key loci where gene flow is actually impeded between species (Currat et al., 2008). Accordingly, asymmetrical introgression between European oak species (Quercus robur and Quercus petraea) was reported when analyzing a subset of 60 interspecific outlier loci (from an F ST -outlier analysis) but not when using a presumably selectively neutral subset of 166 nonoutlier markers (Guichoux et al., 2013).
The two taxa would have then come into secondary contact during the Late Holocene (Lindbladh, Jacobson, & Schauffler, 2003;Perron et al., 2000). The current natural range of black spruce spans the transcontinental boreal forest of North America whereas red spruce has a narrow distribution in the montane forests along the Appalachian Mountains of the United States, the Acadian forest of the Maritime Provinces of Canada, and the mixed forest of southern Québec where the climate is typically more humid and temperate. The two taxa cooccur in sympatry over a large geographic area in northeastern North America where their ranges overlap ( Figure 1). Frequent occurrences of hybrid trees have been reported within the contact zone along the F I G U R E 1 Sampling of 33 populations across the contact zone between Picea mariana and Picea rubens in eastern North America. Ranges of P. mariana and P. rubens are shown in green and red, respectively, with the sympatric area in brown. Pie charts indicate mean genomic proportion of P. mariana (black) and P. rubens (red) ancestry in each sampled population based on 300 SNPs using the Bayesian clustering algorithm implemented in STRUCTURE (K = 2) St. Lawrence River valley in Southern Québec, testifying of the extensive gene flow between the two spruce taxa (de Lafontaine et al., 2015;Morgenstern & Farrar, 1964;Perron & Bousquet, 1997).
It is generally recognized that the maintenance of species integrity in the face of interspecific gene flow can be driven by exogenous or endogenous selective pressures (i.e., ecological selection versus selection against heterozygotes or Bateson-Dobzhansky-Muller incompatibilities; Arnold, 1997;Barton & Hewitt, 1985;Moore, 1977;Nosil, 2012).
Attempts at disentangling the roles of environment-dependent and environment-independent factors in driving patterns of introgression between black spruce and red spruce have given mixed results. On the one hand, transplant experiments have shown that red spruce is more shade-tolerant but less cold and drought-tolerant than black spruce, and that black spruce has higher allocation to stem wood, while red spruce has higher allocation to roots (Major, Barsi, Mosseler, Campbell, & Rajora, 2003;Major, Mosseler, Barsi, Campbell, & Rajora, 2003a,b;Manley & Ledig, 1979). These results suggest that the two taxa are ecologically distinct, each exhibiting physiological traits that favor different ecological niches. Red spruce has a greater competitive advantage in warm and mesic conditions while black spruce outperforms in cold and dry environments. Hence, these observations suggest a role for exogenous selection whereby interspecific gene flow is impeded as a result of different selective advantages under contrasted levels of ecological (i.e., moisture and temperature) conditions. On the other hand, controlled crosses suggested that endogenous selective pressures might also be important drivers of hybridization and introgression. Indeed, first-generation hybrids had lower fitness, as illustrated by higher rates of empty and aborted seeds, lower germination rates, lower CO 2 uptake, and lower juvenile survival (Major et al., 2003b(Major et al., , 2005Manley & Ledig, 1979). No record of reproductive phenological barrier between these closely related species is reported in the literature.
Heterogeneous patterns of interspecific gene flow have recently been uncovered across the genome of hybridizing black spruce and red spruce (de Lafontaine et al., 2015). In this previous study, 300 single nucleotide polymorphism (SNP) loci spanning the 12 chromosomes of black spruce were grouped into three classes of markers on the basis of their increasing capacity to cross the boundary between the two spruce taxa (impermeable, neutral, and highly permeable marker subsets, respectively) indicating variable states of introgression along the genome. The main objective of the present study was to assess whether interspecific gene flow between P. mariana and P. rubens is asymmetrical, and if so, whether the level of asymmetry is heterogeneous across the genome. We expected that the directional bias in introgression will vary across the three classes of markers as a result of the differential outcome of historical, demographic, and natural selection processes on the three marker subsets (Guichoux et al., 2013). As a complementary objective, we explored the putative role of ecological (exogenous) selective pressures in maintaining the hybrid zone and the integrity of species in the face of introgression. Because there is no possibility to replicate samples of the studied hybrid zone in different settings (as in Lexer et al., 2010), the geographic context of the hybrid zone between black spruce and red spruce does not lend itself easily to a direct assessment of the relative importance of exogenous versus endogenous (genetic) selective processes (see de Lafontaine et al., 2015). Hence, while formally disentangling the role of environmental and genetic factors in the maintenance of this hybrid zone remains beyond the scope of this study, we did hypothesize that uncovering directional biases in introgression could perhaps provide further useful insights regarding the relative importance of ecological selective pressures. Specifically, given that these two taxa are associated with contrasted ecological niches (Major, Barsi, et al., 2003;Major et al., 2003a,b;Manley & Ledig, 1979), we expected that variation in environmental conditions should reflect the extent and direction of introgression if ecological selection is an important determinant of the genetic boundary between species.

| MATERIALS AND METHODS
A total of 385 adult individual trees from 33 populations spanning the allopatric and sympatric zones of P. mariana and P. rubens ( Figure 1 and The intersection of at least two of three population genomic approaches (namely F ST -outlier analysis, geographic, and genomic clines) was used to classify the sampled loci according to their capacity to introgression. Of the 300 SNPs, 23 were classified as impermeable to introgression, 238 were assumed to be neutral, and 39 were highly permeable to interspecific gene flow (de Lafontaine et al., 2015).
Here, we assessed the direction of interspecific gene flow estimated from each of the three marker subsets and to evaluate whether variation in environmental conditions reflects the direction of introgression. Detailed methods regarding SNP discovery, DNA extraction, and genotyping can be found elsewhere (de Lafontaine et al., 2015).

| Detection of admixed individuals
The proportion of black spruce and red spruce genome admixture for each sampled individual (hybrid index; q-value) was estimated based on all SNPs, using the Bayesian clustering algorithm implemented in STRUCTURE 2.3.3 (Pritchard, Stephens, & Donnelly, 2000). Five independent runs, each with 1,000,000 Markov chain Monte Carlo (MCMC) iterations after 50,000 burn-in periods, were performed for K (number of groups) set to 2, which corresponded to the optimal partitioning (de Lafontaine et al., 2015) and to the actual number of taxa.
Expected levels of admixture are 0 and 1 for purebreds, 0.5 for firstgeneration (F 1 ) hybrids, and 0.25 and 0.75 for backcrossed individuals. Hence, theoretically, threshold values of 0.125 and 0.875 should be optimal for distinguishing between purebreds and first-generation backcrossed individuals. We used simulations to assess the efficiency (proportion of individuals in a category that are correctly identified) and accuracy (proportion on an identified category that truly belongs to that category) of the STRUCTURE analysis to tease apart purebreds from admixed individuals (Vähä & Primmer, 2006). Simulated P. mariana, P. rubens, first-generation hybrids (F 1 ) and backcrosses with each parental (F 1 × P. mariana and F 1 × P. rubens) were obtained using HYBRIDLAB 1.0 (Nielsen, Bach, & Kotlick, 2006). We used the multilocus genotypes (300 SNPs) of individuals from the allopatric population group of each taxon as reference parental genotypes (n = 92 and 98 individuals for P. mariana and P. rubens, respectively). Simulated P. mariana, P. rubens, F 1 , F 1 × P. mariana, and F 1 × P. rubens (n = 1,000 simulated individuals per category) were analyzed in STRUCTURE with five runs of 500,000 MCMC iterations after 10,000 burn-in periods.

| Estimation of asymmetrical gene flow
The program MIGRATE-N was originally designed and is still most typically used to estimate effective population sizes and past migration rates among n populations (i.e., movement of individuals within a metapopulation) (Beerli & Felsentein, 1999, 2001Beerli & Palczewski, 2010). The analysis uses maximum-likelihood or Bayesian inference to jointly estimate the parameters M (M = m/μ; where m and μ are the migration and mutation rates, respectively) describing the mutation-scaled asymmetrical migration rate, and Θ (Θ = 4N e μ) describing the mutation-scaled effective population size in a coalescent framework by which alleles are traced back in time to a single ancestral copy. Importantly, Nm in MIGRATE-N has the same meaning as Nm in the context of F ST . As seen in any other population genetics method, m is seen as a backward immigration rate, which, in other words, is the probability that a randomly chosen individual (i.e., a multilocus genotype) in a given generation came from a population (i.e., a gene pool) different from the one in which it is currently observed in the preceding generation. In recent years, instead of assessing spatial movement of individuals among populations, an increasing number of studies have successfully used the MIGRATE-N approach to test various scenarios of asymmetrical gene flow between closely related sympatric species and their hybrids based on STRUCTURE assignments of individuals (e.g., Andrew, Ostevik, Ebert, & Rieseberg, 2012;Field et al., 2011;Ortego, Gugger, & Sork, 2016;Scascitelli et al., 2010;Starr et al., 2013). Following this novel approach, we estimated the magnitude and direction of long-term gene flow between the two spruce taxa and their hybrids using the maximum-likelihood coalescent-based approach implemented in MIGRATE-N 3.6 (Beerli & Felsenstein, 1999, 2001. The 195 individuals from the sympatric zone Search parameters included a first set of seven replicate runs with 10 short chains, each visiting 10,000 genealogies and using 500 trees to improve the driving values for the next chain as well as three long chains, each visiting 100,000 samplings with a burn-in of 5,000 trees. To insure convergence, the following three runs were longer (×5) and consisted of 10 short chains of 50,000 trees, followed by three long chains of 500,000 steps, with the first 10,000 trees discarded as burnin. This search strategy was deemed satisfactory because the results (Θ and M parameter estimates) were largely congruent among runs (Beerli, 2009). However, the 95% confidence intervals generated by any individual run of MIGRATE-N were too narrow, a caveat already reported on the basis of simulations (Abdo, Crandall, & Joyce, 2004).
Hence, we used the variation across the MLE obtained from our 10 replicate runs to compute the mean and standard deviation of Θ and M parameter estimates. Differences among these parameters were tested by analyses of variance (ANOVAs) followed by post hoc analyses using Tukey's HSD (honest significant difference) test for multiple comparisons. False discovery rate (FDR; Benjamini & Yekutieli, 2001) control was performed to adjust p-values for all multiple hypotheses tests (α = 0.05), and all reported p-values were FDR-adjusted.

| Climatic data analysis
The latitude, longitude, and elevation of sampling locations were input into BIOSIM 10 ( Régnière, 1996) which interpolates daily climatic conditions using a simulation model that integrates climatic data recorded between 1970 and 2000 at the four closest weather stations of each sampling location within a range of 50 km. Nine climatic variables were initially assessed: the annual mean temperature (°C), the annual mean of the daily minimal temperature, the annual mean of the daily maximal temperature, the annual number of degree-days above 0°C, above 5°C and above 10°C, the number of frost days, the total annual amount of precipitation (in mm of water including rain and snow), and the annual amount of snow (in mm of water). All climatic variables were highly correlated with either mean annual temperature or total annual amount of precipitation (as previously reported in Prunier, Gérardi, Laroche, Beaulieu, & Bousquet, 2012); hence, only these two latter climatic factors were retained for further analysis (Table   S1). In order to assess whether variation in environmental conditions reflects the direction of introgression, significant differences in the mean of temperature and precipitation variables among geographic/ genotypic classes (i.e., allopatric P. mariana, sympatric P. mariana, hybrids, sympatric P. rubens, allopatric P. rubens) were tested by analysis of variance (ANOVA) followed by post hoc analysis using Tukey's HSD (honest significant difference) test for multiple comparisons along with FDR control. Furthermore, the relationship between the hybrid index (q-value) and each climatic variable within the sympatric zone was tested with linear least-squares regression.

| Detection of admixed individuals
Simulations were used to assess the efficiency (proportion of individuals in a category that are correctly identified) and accuracy (proportion on an identified category that truly belongs to that category) of the STRUCTURE analysis to distinguish purebreds from admixed individuals (i.e., hybrids sensu lato) using the theoretical threshold hybrid indexes (q-values of 0.125 and 0.875; see Materials and methods).

STRUCTURE assignments of simulated individuals based on all 300
SNPs resulted in almost complete lack of overlap among the five categories assayed (simulated P. mariana, F 1 × P. mariana, F 1 , F 1 × P. rubens, and P. rubens; Figure 2a). Hence, simulation results indicated high accuracy and efficiency of the STRUCTURE analysis (both values above 99.8%) to tease apart purebreds from first-generation backcrosses using the theoretical thresholds (Figure 2a).
Based on these thresholds, assignment scores of the actual 385 sampled spruce trees indicated that 21% were admixed individuals ( Figure 2b). Within the sympatric zone, the proportion of hybrids increased to 38%. While these admixed individuals belong to a wide range of hybrid generations, the distribution of the hybrid indexes was clearly a skewed ( Figure 2b). Specifically, 74% of admixed individuals were backcrossed toward red spruce. The remaining hybrids were distributed at low frequency across the various levels of admixed ancestries, including recent-generation hybrids (11%) and backcrosses toward black spruce (11%), along with advanced-generation hybrids falling in between the simulated categories (4%). In practice, the individuals assigned as purebreds by STRUCTURE analysis (79% of total sample) should indeed include mostly pure P. mariana or P. rubens individuals, as well as some late-generation backcrosses toward the parental taxa.  Table S2. For all three sets of markers, the mutation- There was a higher rate of unidirectional gene flow from black spruce to red spruce than the opposite (M P.mariana→P.rubens > M P.rubens→P.mariana ).

| Estimation of asymmetrical gene flow
This asymmetry was significant with the impermeable and neutral markers subset (p-values = .002 and.0006, respectively) but not with highly permeable markers (p-value = .15). Bidirectional gene flow was high between red spruce and hybrids (M P.rubens↔Hybrids ) and did not vary significantly among the three marker subsets (p-values = .07).
By contrast, bidirectional gene flow between black spruce and hybrids (M P.mariana↔Hybrids ) increased among the marker subsets with negligible gene flow at impermeable loci, intermediate levels at presumably neutral loci, and high gene flow at highly permeable loci (pvalue < .0001). Hence, there was a clear asymmetry in introgression with greater overall (bidirectional) gene flow between red spruce and hybrids than between black spruce and hybrids (M P.rubens↔Hybrids > M P. mariana↔Hybrids ; p-value < .0001). However, no significant unidirectional gene flow asymmetry was found between purebreds and hybrids (i.e., F I G U R E 2 Frequency distribution of hybrid indexes (STRUCTURE q-values) of multilocus genotypes (n = 300 SNP loci) of (a) simulated purebreds, first-generation backcrosses, and first-generation (F 1 ) hybrid individuals (n = 1,000 simulated individuals per category) and (b) adult spruce tree individuals (n = 385 individuals) sampled across the allopatric and sympatric zones of Picea mariana and Picea rubens. Sampled spruce tree individuals were assigned to one of three genotypic classes on the basis of their STRUCTURE q-values: pure P. mariana (q-values < 0.125), pure P. rubens (q-values > 0.875), or admixed individuals (i.e., hybrids sensu lato; q-values between 0.125 and 0.875) p-value = .20).

| Climatic data analysis
Significant differences in both temperature and precipitation varia-

| DISCUSSION
Uncovering the patterns of gene flow between closely related taxa is important for understanding the processes that maintain distinct species. Studies reporting morphological identifications of red spruce and black spruce and molecular assessments of the extent of introgression between these taxa have been quite controversial (Bobola et al., 1996;Gordon, 1976;Manley, 1972;Perron & Bousquet, 1997).
Yet, in the present study, simulations indicated that the STRUCTURE clustering analysis used with a set of 300 SNP markers allowed to tease apart purebreds from first-generation backcrosses using the theoretical thresholds with high accuracy and efficiency (Vähä & Primmer, 2006). Using this approach, the proportion of admixed individuals (i.e., hybrids sensu lato) reached 38% within the sympatric zone, supporting previous morphological and genetics studies reporting extensive interspecific gene flow between black spruce and red spruce (Morgenstern & Farrar, 1964;Perron & Bousquet, 1997).
However, the level of introgression between these two spruce taxa is not homogeneous across the genome and, more specifically, gene flow varied among the 300 markers assayed in the present study (de Lafontaine et al., 2015). This property was further explored here in order to assess whether the level of asymmetry in interspecific gene flow between P. mariana and P. rubens is also heterogeneous across the genome.
Within the sympatric zone, there was a higher rate of unidirectional gene flow from black spruce to red spruce purebreds than the opposite (M P.mariana→P.rubens > M P.rubens→P.mariana ). Such asymmetry, based solely on data from purebred individuals, could reflect the current different population sizes of the two taxa. According to Burgess et al. (2005), higher gene flow should be expected from the more abundant species (black spruce) to the rare species (red spruce). However, under this density-dependant scenario, we should also expect that first-generation hybrids will be more likely to mate with the more abundant species (Lepais et al., 2009), which, in this instance, would lead to increased directional introgression toward black spruce. Because we did not observe such a pattern and also because MIGRATE-N delivers long-term migration estimates (Beerli, 2009), we conclude that the migration rate between pure types should reflect the general trend in historical gene flow between the two taxa at the time of species inception as well as during postglacial colonization (Currat et al., 2008;Guichoux et al., 2013). First, red spruce originated from a glaciationinduced isolation of a preexisting black spruce population most likely during the Pleistocene (Perron et al., 2000). As a consequence, the ge-  . Hence, given the progenitor-derivative nature of this species pair, the direction of gene flow during the speciation was necessarily higher from black spruce (progenitor) toward red spruce (derivative) than vice versa. Second, based on pollen paleorecords of eastern North America, Lindbladh et al. (2003) suggested the replacement of a black spruce forest by a red spruce-dominated forest during the Late Holocene (since ca. 1,500 cal year BP), when red spruce massively invaded the area. According to the theoretical neutral model of introgression during range expansions (Currat et al., 2008), introgression occurs almost exclusively from the resident to the invading species.
Our genetic data corroborate this scenario for the two spruce species whereby the genome of the resident taxa (black spruce) infiltrated the genomic background of the colonizing taxa (red spruce) more than the opposite. Another prediction of the neutral model of introgression is that the asymmetrical pattern of introgression from resident to invasive taxa should be strongest at markers experiencing reduced gene flow (Currat et al., 2008;Guichoux et al., 2013). Correspondingly, the asymmetry of gene flow between black spruce and red spruce was found at both impermeable and presumably neutral markers, but not at highly permeable markers, where gene flow appeared virtually unimpeded in both directions.
The most striking pattern is a clear asymmetry in introgression depicted by a greater overall (bidirectional) gene flow between red spruce and hybrids than between black spruce and hybrids (M P. rubens↔Hybrids > M P.mariana↔Hybrids ). As expected, this asymmetry was heterogeneous across the genome, but this heterogeneity was entirely due to the genomic variability of the gene flow between black spruce and hybrids. Indeed, while gene flow between red spruce and hybrids (i.e., backcrossing toward P. rubens) remained invariably high even for impermeable loci subset, there was an increase of total gene flow between black spruce and hybrids (i.e., backcrossing toward P. mariana) according to the permeability of the marker subsets (i.e., negligible gene flow at impermeable loci, intermediate levels at presumably neutral loci, and high gene flow at highly permeable loci). At highly permeable loci, the asymmetrical pattern of introgression was no longer detected, that is, at these loci, backcrossing was equally high toward both parental species.
We have argued above that the directional gene flow observed between the two parental species (black spruce toward red spruce) might reflect the known long-term demographic history of the two spruce taxa (i.e., at the time of species inception and during postglacial colonization). Yet, it has been suggested that both historical and contemporary processes can act in concert to influence and maintain patterns of introgression (Abbott et al., 2013;Hamilton & Aitken, 2013). The higher level of backcrossing toward P. rubens than toward P. mariana might reflect more contemporary evolutionary processes including exogenous or endogenous selection. Indeed, in natural plant populations, it is commonly assumed that selection remains the primary mechanism implicated in determining contemporary patterns of hybridization and introgression (Fitzpatrick et al., 2010;Hamilton & Aitken, 2013;Hamilton et al., 2013a;Lexer et al., 2005;Martin et al., 2006;Scascitelli et al., 2010). Because red spruce and black spruce are associated with contrasted ecological niches (Major, Barsi, et al., 2003;Major et al., 2003a,b;Manley & Ledig, 1979) and as there is no evidence of reproductive phenological barrier, we expected that variation in environmental conditions should reflect the extent and direction of introgression if exogenous selection is a key determinant for the maintenance of the boundary between species.
Overall, both temperature and precipitation variables (mean annual temperature and total annual amount of precipitation) had higher values for red spruce than for black spruce, with intermediate values for hybrids. These results are largely congruent with transplant experiments suggesting that red spruce has a greater competitive advantage in warm and mesic conditions while black spruce outperforms in cold and dry environments (Major, Barsi, et al., 2003;Major et al., 2003a,b;Manley & Ledig, 1979 Hamilton et al., 2013a,b), there was no significant relationship between hybrid index and temperature within the sympatric zone of black spruce and red spruce. By contrast, such a relationship was found between the hybrid index and precipitation, albeit with very low explanatory power. Precipitation was higher for P. rubens purebreds than for hybrids and P. mariana purebreds. Interestingly, this apparent association collapses when considered in light of the directional bias in interspecific gene flow. Indeed, within the sympatric zone, a difference in total annual amount of precipitation was only observed between red spruce and hybrids; yet, this cannot stand out as a strong selective filter because the gene flow was invariably high between these two genotypic classes. Furthermore, while an actual barrier to interspecific gene flow was strictly found between P. mariana and hybrids, the amount of precipitation did not differ between these.
While formally disentangling the role of environmental and genetic factors in the maintenance of the hybrid zone between black spruce and red spruce was beyond the scope of this study, uncovering directional biases in interspecific gene flow did provide some useful insights. Specifically, it allowed us to falsify an apparent association between hybrid index and precipitation we initially detected before considering asymmetrical patterns of introgression. Although we cannot rigorously discard the possibility of exogenous selection based solely on our exploratory analysis, we found no evidence that two of the most important climatic factors, namely temperature and precipitation, played a key role in the maintenance of species boundary between black spruce and red spruce. This is in sharp contrast with other hybridizing spruce species pairs from western North America for which temperature and precipitation determined the distribution and structure of hybrid zones and where climatic differences were greater than in our study (De La Torre et al., 2014Hamilton & Aitken, 2013;Hamilton et al., 2013a,b). However, it must be acknowledged that there could be other exogenous selective pressures that were not assessed in the present study, such that a different ecological factor is more important than precipitation or temperature for preventing black spruce backcrosses. Hypothetically, assuming that (1) there is no reproductive phenological barrier, (2) selection (rather than purely neutral processes) is the primary mechanism implicated in determining contemporary patterns of introgression, and (3) exogenous selection is not a key determinant for maintaining the spruce hybrid zone, endogenous selective pressures (such as Bateson-Dobzhansky-Muller incompatibilities) could perhaps represent another asymmetrical mechanism preventing ongoing backcrossing toward black spruce at impermeable genomic regions. These genomic incompatibilities would be asymmetrical as backcrossing toward red spruce remains virtually unimpeded. It is well established that such asymmetrical incompatibilities constitute a possible outcome of the Bateson-Dobzhansky-Muller models (e.g., when unidirectionally inherited markers are involved in the incompatibility) and represent Darwin's corollary to Haldane's rule (Turelli & Moyle, 2007;Welch, 2004;Wu & Beckenbach, 1983).
In a previous study, we uncovered the semipermeable nature of the barriers to gene flow between black spruce and red spruce by identifying loci representing impermeable, neutral, and highly permeable genomic regions (de Lafontaine et al., 2015). With the present study, we moved forward our understanding about species delimitation of these progenitor-derivative taxa by showing that the barriers to gene flow are asymmetrical. First, we suggested that the higher rate of gene flow from black spruce to red spruce than the opposite likely reflects the known long-term demographic history of the two spruce taxa. Second, while backcrossing toward red spruce was invariably high across the genome, the actual species boundary is between hybrids and black spruce, where gene flow is impeded at those genomic regions impermeable to introgression. Hence, in this progenitor-derivative species pair, the barrier to interspecific gene flow is asymmetrical and prevents introgression of the derivative species (red spruce) back into the genomic background of the progenitor species (black spruce). Uncovering directional biases in interspecific gene flow provided some insights about the relative importance of the selective pressures maintaining the hybrid zone and the integrity of closely related species. Specifically, considering asymmetrical patterns of introgression allowed us to falsify an apparent association between hybrid index and precipitation. Although our study did not completely rule out the role of exogenous selection in maintaining the species boundary between red spruce and black spruce, it does however advise caution when inferring exogenous selection pressure from a given ecological variable based on environment-genetic associations alone.