Identifying loci under selection via explicit demographic models

Abstract Adaptive genetic variation is a function of both selective and neutral forces. To accurately identify adaptive loci, it is thus critical to account for demographic history. Theory suggests that signatures of selection can be inferred using the coalescent, following the premise that genealogies of selected loci deviate from neutral expectations. Here, we build on this theory to develop an analytical framework to identify loci under selection via explicit demographic models (LSD). Under this framework, signatures of selection are inferred through deviations in demographic parameters, rather than through summary statistics directly, and demographic history is accounted for explicitly. Leveraging the property of demographic models to incorporate directionality, we show that LSD can provide information on the environment in which selection acts on a population. This can prove useful in elucidating the selective processes underlying local adaptation, by characterizing genetic trade‐offs and extending the concepts of antagonistic pleiotropy and conditional neutrality from ecological theory to practical application in genomic data. We implement LSD via approximate Bayesian computation and demonstrate, via simulations, that LSD (a) has high power to identify selected loci across a large range of demographic‐selection regimes, (b) outperforms commonly applied genome‐scan methods under complex demographies and (c) accurately infers the directionality of selection for identified candidates. Using the same simulations, we further characterize the behaviour of isolation‐with‐migration models conducive to the study of local adaptation under regimes of selection. Finally, we demonstrate an application of LSD by detecting loci and characterizing genetic trade‐offs underlying flower colour in Antirrhinum majus.


| INTRODUC TI ON
Elucidating the genetic basis of adaptation and identifying genetic determinants of population and species divergence are key foci in evolutionary biology. In natural systems, genetic variation is shaped by the demographic history (driven by the neutral processes of mutation, migration and drift) together with natural selection on loci underlying adaptive traits. While all gene genealogies are constrained by the demographic history of the population, the genealogies of loci affected by selection are perturbed and may differ in key characteristics compared to those evolving under neutrality, though converging patterns can arise (Bierne et al., 2011;Edmonds et al., 2004;Li et al., 2012;Slatkin & Excoffier, 2012).
A multitude of methods have been developed that identify loci under selection as those whose summary statistics deviate from the genome-wide distribution. These "outlier" approaches can generally be grouped into three classes: those that (a) detect regions of elevated differentiation between populations (via e.g. F ST -related statistics), (b) detect regions of perturbed site frequency spectrum (SFS) via diversity or diversity-related estimators (e.g., π, Tajima's D) and (c) detect regions of extensive linkage disequilibrium (LD) via haplotype statistics (e.g., extended haplotype homozygosity [EHH], integrated haplotype score) (Beaumont & Nichols, 1996;Biswas & Akey, 2006;Luikart et al., 2003;Oleksyk et al., 2010;Sabeti et al., 2002;Vitti et al., 2013). While in empirical studies inference of selection is often achieved through corroboratory evidence from multiple measures, the choice of a particular class and hence summary statistic is motivated by the type of selection one aims to infer; with the first geared towards divergent selection between populations and the others towards footprints of selection within single populations.
Under the premise that adaptive genetic variation is a function of both selective and neutral forces, accounting for the demographic history of the study system is critical for the correct identification of selected loci François et al., 2016;Hoban et al., 2016;Hofer et al., 2009). This is commonly achieved by contrasting locus-specific statistics against an estimate of the expected distribution of these statistics under demography alone, with the power of such an approach being a function of both the summary statistics used and the accuracy with which the neutral distribution is inferred. In the context of identifying local adaptation, the canonical statistics employed is F ST , and the first methods to infer its neutral distribution used simulations under an island model calibrated by matching the observed heterozygosity at each locus (Beaumont & Nichols, 1996;. Under island models, the distribution of sample allele frequencies is also well captured by Pólya distributions (Balding & Nichols, 1994;Rannala & Hartigan, 1996), which can be learned using likelihood-based methods that jointly classify loci into neutral and selected classes (Beaumont & Balding, 2004;Foll & Gaggiotti, 2008;Galimberti et al., 2020). While generally powerful, these methods suffer from high false-positive rates in the case of asymmetric divergence between populations (Galimberti et al., 2020;Lotterhos & Whitlock, 2014;Luu et al., 2017), which violates a key assumption of island models. This can be alleviated by using hierarchical island models Galimberti et al., 2020) or a more flexible distribution to capture neutral allele frequencies (e.g., principal components analysis [PCA], Luu et al., 2017). For some natural systems, however, these approaches may still be insufficient to capture the demographic history of the population and a (potentially complex) demographic model should be used. Williamson et al. (2005), for instance, inferred such a model from putatively neutral loci and then identified loci under selection as those for which an additional selection parameter is required.
Under coalescent theory, demographic models are parametrized by the effective sizes (N E ) of each population and the effective rates of migration (M E ) between them, which respectively describe the level of drift and gene flow within and between populations (Charlesworth, 2009;Petry, 1983). Importantly, both N E and M E may change through time. Different modes of selection and adaptive processes can be expected to alter these demographic parameters in different ways. In a single population, a selective sweep is expected to reduce N E at selected and linked sites while diversifying selection is expected to increase it (Galtier et al., 2000;Gossmann et al., 2011). In the case of two or more populations connected by gene flow, balancing selection and adaptive introgression are expected to increase M E at selected and linked sites, while divergent selection is expected to reduce M E at those sites (Charlesworth et al., 1997;Geraldes et al., 2006;Petry, 1983;Won et al., 2005).
Notably, locus-specific demographic parameters are not just informative about the strength (i.e., the magnitude of variation in N E or M E ) and mode of selection (i.e., a reduction or elevation of N E or M E ), but may also identify the population (or environment) in which selection acts (e.g., a reduction in M E in one but not the other direction). Directional selection modulates fitness in natural populations by purging maladaptive alleles via extrinsic barriers such as hybrid or immigrant inviability, or lower fecundity (Naisbit et al., 2001;Nosil et al., 2005;Rundle & Whitlock, 2001;Schluter, 2000). This effectively reduces M E at selected loci proportionally to the strength of selection (Petry, 1983). Under local adaptation, alternate alleles may confer higher fitness in their respective local environment but reduced fitness in the foreign environment (i.e., antagonistic pleiotropy [AP]), or an allele may confer higher fitness in its local environment but have no differential effect relative to the alternate allele in the foreign environment (i.e., conditional neutrality [CN]) (Anderson et al., 2013;Kawecki & Ebert, 2004;Savolainen et al., 2013). To date, such genetic trade-offs have only been characterized in a handful of cases, primarily through demanding experiments involving the transplanting of alternate genotypes in their reciprocal environments (Anderson et al., ,2013(Anderson et al., , , 2014Oakley et al., 2014;Troth et al., 2018). Characterizing the nature of such trade-offs directly from genomic data presents a promising complementary approach, applicable to natural populations, to investigate the genetic basis of local adaptation, a key concept in ecological genetics.
Inferring demographic parameters using coalescent theory is, however, computationally challenging, as the underlying but unknown genealogies must be integrated out numerically (Hey & Nielsen, 2007). As a result, there exists only a single likelihood implementation to infer locus-specific and global demographic parameters jointly: an Markov chain Monte Carlo (MCMC) sampler that attributes loci to different classes (e.g., selected and neutral) and jointly infers the demographic parameters of a two-population isolation-with-migration (IM) model for each group (Sousa et al., 2013). To extend this approach to more complex models, simulationbased techniques such as approximate Bayesian computation (ABC) (Beaumont et al., 2002;Marjoram & Tavaré, 2006;Sisson et al., 2018) may be employed. The use of ABC to infer genome-wide demographic parameters has a long tradition (Beaumont et al., 2002;Dussex et al., 2014;Sisson et al., 2018;Tavaré et al., 1997; and it may also be used to infer locus-specific parameters in a hierarchical setting, but it is computationally challenging. Indeed, the dimensionality of a genome-wide model is prohibitively large for any naïve Monte-Carlo scheme as the probability that a simulation matches the data at all loci is virtually zero. When inferring genome-wide parameters, loci are exchangeable and this problem is easily overcome by using summary statistics that are functions of all loci such as the (scaled) moments of the distribution of locus-specific statistics (Tavaré et al., 1997;Wegmann et al., 2009). To benefit from this in a hierarchical setting, Bazin et al. (2010) proposed a two-step algorithm in which hierarchical parameters are inferred based on moments of locus-specific summary statistics, and locus-specific parameters are then inferred with simulations of a single locus conducted with parameters drawn from the posterior distribution of the hierarchical parameters. As recently shown (Kousathanas et al., 2016), this approach can be generalized to arbitrary parameter dependencies when using an ABC-MCMC setting that also eliminates the need for a two-step approach. These approaches were successfully used to infer locus-specific (Bazin et al., 2010) and cluster-specific (Aeschbacher et al., 2013) migration rates, as well as locus-specific selection coefficients (Foll, Poh, et al., 2014;Kousathanas et al., 2016) for up to several hundred loci. Scaling such inference up to whole genomes, however, remains difficult due to the requirement to simulate all loci.
In this paper, we introduce LSD, a framework for identifying loci under selection via explicit demographic models that scales to genomic data. Similar to the approach by Bazin et al. (2010), LSD works in two steps. However, rather than inferring hierarchical parameters for all loci, LSD first obtains point estimates of demographic parameters for neutral loci, which is then compared against per-locus estimates to identify selected loci. This has the benefit of requiring only simulations of a single locus, which can be efficiently recycled (Thalmann et al., 2011). As a downside, the approach requires a priori knowledge on putative neutral sites. However, as we show with simulations, the approach is very robust to mis-specifications.
While LSD is flexible regarding the choice of demographic model and can in principle accommodate any discrete population model (including single population and stepping-stone models) as well as detect different modes of selection, we demonstrate here LSD's utility in studies of local adaptation by focusing on the detection of loci under divergent selection between populations under IM models.
We validate and assess the performance of LSD via extensive simulations, provide general insights into the properties of IM models in relation to the power of LSD and other widely applied genome scan methods, and demonstrate an application of the method to the detection of functionally validated loci underlying flower colour in two parapatric subspecies of Antirrhinum majus (common snapdragon) (Schwinn et al., 2006;Tavares et al., 2018).

| Model
We begin by outlining the conceptual framework underlying LSD. Consider a demographic model ℳ, parameterized by demographic parameters , that generates genetic data D. To quantify deviations from neutrality, LSD first estimates the demographic parameters ̂ from a collection of loci assumed to be neutral ( Figure S1). In a second step, LSD performs demographic inference on all loci and determines the posterior distribution l ( ) = ( | D l ) for each locus.
Finally, LSD assesses the concordance of ̂ with l ( ) by determining h l , the highest posterior density interval (HPDI) of l ( ) that contains ̂ , and uses p l = 1 − h l as a metric to identify locus l as incompatible with ̂ . For loci with parameters ̂ , p l follows a uniform distribution (the coverage property) and is interpreted as a p-value to reject ̂ for locus l . The joint posterior distribution l ( ) may further provide information on the magnitude and directionality of selection.
Given that the evaluation of the likelihood is nontrivial and may be intractable under more complex models, we resort to an approximate approach (Marjoram & Tavaré, 2006) (Figure 1). Under an ABC framework, the likelihood is approximated by simulations, the outcomes of which are compared with observed data in terms of summary statistics. That is, we find the set of parameters that minimize the distance between the observed data D and the simulated data D ′ . To efficiently evaluate this, we reduce the dimensionality of the data via summarizing them into a set of lower-dimensional summary statistics S and S ′ , which are selected to capture the relevant information in D and D ′ , respectively (Beaumont et al., 2002;Joyce & Marjoram, 2008;Sisson et al., 2018).
An appropriate model for generating simulated genetic data is provided by coalescent theory (Kingman, 1982;Wakeley, 2001), parametrized by population demographic parameters where N E refers to the vector of effective population sizes, M E to the vector of effective migration rates and to the mutation rate. We stress that population sizes and migration rates may vary through time.

| Implementation
We implemented the framework described above as shown in Figure 1 and detailed below.

| Simulations
While the framework is readily used for any type of locus, we consider here a locus to consist of a genomic window with a shared genealogy.
This effectively implies that recombination is free between loci and absent within. We simulate genealogies using msms (Ewing & Hermisson, 2010), under a user-defined demographic model. The processing, format and final output of observed genetic data will often differ from that of coalescent simulations, given that observed genetic data may be subject to various presequencing (e.g., pooling), sequencing (e.g., sequencing errors, stochastic sampling of reads) and postsequencing (e.g., filters) events that perturb and reformat the data from the original source. We thus implement two complementary programs that interface with coalescent simulators to replicate observed sequencing pipelines and generate simulated sequencing data: lsd-high can accommodate and simulate both individual and pooled data and assumes mid-to high coverage (>10×) data, while lsd-low accepts individual data and can additionally accommodate low coverage (>2×) data by utilizing genotype likelihoods via mstoglf and angsd (Korneliussen et al., 2014).
A suite of summary statistics is then calculated for the simulated and observed data via the same programs. Summary statistics currently implemented include the number of segregating sites (S), private S, nucleotide diversity (π), Watterson's estimator (θ W ), Tajima's D (θ D ), relative divergence (F ST ), absolute divergence (D XY ) and site frequencies. In principle any summary statistic can be included, contingent on the data and appropriate additions to the programs' scripts. To account for potential correlation between summary statistics and to retain only their informative components, we apply a partial least squares transformation ).

| ABC inference
The estimation of demographic parameters is performed with abctoolbox , via the ABC-GLM algorithm using the subset of n simulations closest to the observed summary statistics. In a first step, LSD infers demographic parameters of putatively neutral loci to obtain point estimates ̂ . We do so based on F I G U R E 1 Analytical framework to identify loci under selection via explicit demographic models (LSD). LSD identifies loci under selection by first estimating demographic parameters and then quantifying the departure of these parameters from neutral expectations. Our specific implementation of LSD employs approximate Bayesian computation (ABC) for parameter estimation, and is performed in a genome scan approach [Colour figure can be viewed at wileyonlinelibrary.com] simulations of a single locus. In contrast to classic ABC regression approaches, ABC-GLM can readily use such simulations to accurately infer posterior distributions from many loci as it approximates the likelihood function, rather than the posterior distribution (Thalmann et al., 2011). However, we propose a slightly different approach that does not characterize the full posterior distribution, but we found to result in point estimates ̂ that are more accurate (Text S4, Figures S1 and S2) and robust to misidentification of putatively neutral loci (Text S5, Figure S3). Specifically, we first infer locus-specific posterior distributions l ( ) for each putatively neutral locus with ABC-GLM, then calculate the product of these densities ( ) = ∏ l l ( ), and identify ̂ = argmax ( ).
In a second step, LSD infers locus-specific posterior distributions using ABC-GLM on all loci, either using the same set of simulations as in the first step, or from simulations of a single locus conducted under the parameters ̂ , except for the parameters affected by selection (e.g., M E ).

| Simulations
To test the performance of the LSD implementation, we simulated pseudo-observed genomes using the program msms under different demographic and selection parameter values, focusing on IM models relevant for the characterization of local adaptation. We assumed a diploid system, a common mutation rate µ = 5 × 10 −7 per bp per generation, and all loci to be biallelic with ancestral allele a and derived allele A. Each simulated pseudogenome represented a unique demographic-selection regime and comprised n n = 1000 neutral loci and n s = 50 selected loci of 5 kb length, for a total (pseudogenome) size of 5.25 Mb. We assume no within-locus recombination.

| Demography
We simulated four models representing different levels of complexity in terms of population structure and demographic history ( Figure 2; Text S1). In all models, selection is inferred from the reciprocal scaled migration rates between two contrasting environments (M 12 , M 21 ; where M E = Nm E ). We used neutral migration rates M 12 = M 21 = M = 0.5, 5 and 50 migrants per generation and inferred selection as deviations from these rates. We use model ℳ 1 to represent a simplified, generalized model of local adaptation, model ℳ 2 to represent a more complex case of local adaptation comprising multiple, structured populations, model ℳ 3 to reflect a scenario typical of glacial-induced divergence and secondary contact population dynamics and model ℳ 4 to represent a case of hierarchical divergence with complex demography. The specific parameter choices for all models are given in Text S1.

| Selection
To simulate genetic trade-offs, selection was simulated on alternate alleles in the contrasting environments, on top of the demographic F I G U R E 2 Models used in the simulations and case study. Model ℳ 1 represents a simple two-deme isolation-with-migration (IM) model with reciprocal migration. Model ℳ 2 represents a six-deme island-continent model where common differences between environments are modelled by connecting the sampled demes (i.e., islands) to respective meta-population continents via gene flow. Model ℳ 3 represents a two-deme divergence with bottleneck and exponential growth model. Model ℳ 4 represents a fourdeme hierarchical divergence model with sequential founder events, bottleneck and exponential growth. Red and blue demes reflect contrasting environments, while grey demes reflect neutral environments where no selection acts. In all models, selection is inferred from the deviation from neutrality of the reciprocal migration rates between the two contrasting environments ( model. Specifically, we assumed the beneficial alleles to be dominant such that the relative fitness was 1 + s 1 , 1, 1 and 1, 1, 1 + s 2 for the three genotypes AA, Aa and aa in the demes or metapopulations occupying the two environments, respectively. For the selection coefficients s 1 and s 2 , we used all combinations of 0, 0.001, 0.01 and 0.1 and thus included cases of CN, in which either s 1 > 0, s 2 = 0 or s 1 = 0, s 2 > 0, as well as cases of AP with s 1 > 0, s 2 > 0. CN regimes are by definition always asymmetric, while AP regimes can be either symmetric (s 1 = s 2 ) or asymmetric (s 1 ≠ s 2 ). We further var-

| Assessing accuracy
We inferred selection by contrasting the locus-specific migra- AUC and asymmetry are reported for each simulated pseudogenome, each representative of a unique demographic-selection regime.

| ABC parametrization
Migration rates were drawn from log 10 M 12 , log 10 M 21 ∼ U [−4,3] in all cases, while all other parameters were fixed to their true values ("fixed" parametrization). We do this to assess the sensitivity and accuracy of LSD under ideal conditions, and to avoid confounding with model and parameter mis-specifications. For ABC parameter estimation, we retained the 1% closest simulations of 250,000 total simulations.

| Comparison with other methods
We compared the performance of LSD against that of two widely employed genome scan methods, on the simulated pseudogenomes.
pcadapt (Luu et al., 2017) identifies candidate single nucleotide polymorphisms (SNPs) as outliers with respect to population structure, ascertained via PCA, while outflank (Whitlock & Lotterhos, 2015) infers candidate SNPs by testing against a null model inferred from a highly revised Lewontin-Krakauer model extended to account for nonindependent sampling of populations and sampling errors. Given that these approaches are SNP-based (whereas LSD is windowbased), we consider a specific genomic window to be an outlier if at least one SNP within that window is called significant. This may reduce false negatives at the expense of inflating false positives, but reflects typical usage of such genome scans. For both methods, the true number of simulated populations was specified and SNPs were thinned to accommodate the particular LD structure of our simulated pseudogenomes when computing PCs and calibrating the F ST null distribution, before running on the full SNP dataset.
For a more realistic comparison where model parameters may not be known with confidence, we also considered cases of ℳ 1 and ℳ 4 in which all demographic parameters were unknown ("free" parame- remained the same as before. We used the closest 10,000 out of 1,000,000 simulations both to infer ̂ and to identify outlier loci. For added realism, we inferred ̂ from a full pseudogenome of 1050 loci, of which 1000 were neutral but 50 were mis-specified and weakly selected with s 1 , s 2 = 0.001; T s = 4000, M 12 = M 21 = M.

| Case study
To evaluate the performance of LSD on real data, we applied it to the detection of loci underlying floral colour in two parapatric subspecies  (Whibley et al., 2006). Several genetic loci have been shown to control the differences in these floral patterns (Bradley et al., 2017;Schwinn et al., 2006), and recently, Tavares et al. (2018) Figure S4), using an estimate of µ = 1.7 × 10 −8 per bp per generation (Tavares et al., 2018)  We focused our analysis on chromosome 6 on which the ROS and EL loci lie. To acquire empirical estimates of neutral demographic parameters, we excluded all genomic regions present in the structural annotation plus 10-kb flanking regions to generate a subset of putatively neutral regions on that chromosome. M 12 and M 21 were then estimated via 10-kb windows from these neutral regions by retaining the closest 10,000 out of 1,000,000 simulations, as outlined above.
To identify selected loci in the second step, we used sliding windows of size 10 kb and a 1-kb step-size, and retained the closest 5000 out of the same 1,000,000 simulations. Window size was identical to that of Tavares et al. (2018) and was chosen to reflect a compromise between statistical power and resolution (genomic signatures of selection were previously found to be in the range of ~20-40 kb ;Tavares et al., 2018), as well as to be compatible with the system's LD, which is expected to decay rapidly in outcrossing populations of A. majus.

| RE SULTS
3.1 | Two-deme IM case (model 1 ) The power to detect selection increased with increasing selection coefficients when these were similar (s 1 ≈ s 2 , cells along diagonal of subpanels in Figure 4). In such cases, stronger selection coefficients on alternate alleles increasingly polarize and ultimately maintain larger allele frequency differences between the two environments. In tandem, the power to detect selection also generally increased with the time since the onset of selection T S . In contrast, when s 1 ≫ s 2 or s 1 ≪ s 2 , one of the two alleles may proceed to fixation, in which case the power to detect selection decays or is lost (e.g., grey cells in left-most column of subpanels when derived allele A is lost, and AUC values and inferred (a)symmetries tending toward 0.5 and 0 respectively in bottom row cells when ancestral allele a is lost; Figure 4). This is particularly evident when the onset of selection is more distant in the past.  Figure 5, the inferred (a)symmetry generally reflected the true (a)symmetry of the underlying selection coefficients well, particularly for regimes with high power to correctly identify selected loci ( Figure 4). In lower powered regimes, we observe some cases where the inferred asymmetry does not reflect the underlying asymmetry of the selection coefficients accurately (e.g., blue cells along diagonals in subpanels at T S = 4000; Figure 5; Figure S6B). We interpret these results further in the discussion.

| Standing variation vs. de novo
A lower initial frequency of the derived allele may be expected to affect LSD's power to identify selected loci and its power to capture the underlying (a)symmetry of selection coefficients. However, we find that results for simulations building on selection from the de novo and standing variation cases showed generally very similar patterns (Figures 4 and 5; Figure S6). One notable exception however was the inaccurate inference of (a)symmetry in a few regimes with high power (AUC > 0.8) in the de novo case (e.g., blue cells along diagonals in subpanels at T S = 4000 in Figure S6B). This we attribute to the lower initial frequency of the derived allele A and consequently longer time needed to approach drift-migrationselection equilibrium for the de novo cases. This is explored further in the discussion.

| More complex cases (models 2 , 3 and 4 )
A key feature of LSD is its potential to explicitly accommodate complex demographies, which can lead to an inflation in false positives when not properly accounted for (Foll & Gaggiotti, 2008;Lotterhos & Whitlock, 2014;De Villemereuil et al., 2014). Despite the added complexity of models ℳ 2 and ℳ 3 , results were generally very similar to that of model ℳ 1 , with high power to identify selected loci (AUC > 0.8) across a large range of migration-selection regimes, an optimal migration rate at an intermediate value (M = 5), a similar dependence of power to detect selection on s 1 , s 2 and T S , and inferences of (a)symmetry that reflected well the underlying (a)symmetry of selection coefficients (Figure 6; Figures S7 and S8). One notable difference between these models was that model ℳ 2 generally required a longer time to generate power to detect selection when compared to models ℳ 1 , ℳ 3 and ℳ 4 , which we attribute to the larger metapopulation N E in model ℳ 2 . For model ℳ 4 , selection coefficients needed to be slightly higher (s 1 , s 2 ≥ 0.01; Figure S8) to attain high power to identify selected loci, and to accurately identify asymmetry.

| Robustness to mis-specification of the neutral set
As shown for all models and a subset of the regimes (M = 5; T S = 40,000; s 1 = s 2 = 0.01, 0.1), LSD is highly robust to the inclusion of selected loci in the neutral set, with negligible reduction in power (AUC) at up to a 13% mis-specification for model ℳ 4 and up to 20% for models ℳ 1 , ℳ 2 and ℳ 3 (Text S5; Figure 7).

| Comparison to other methods
The performance of LSD is comparable to that of pcadapt and outflank under model ℳ 1 , with minimal difference between the methods across the range of migration-selection regimes (Figure 8; Figure   S9). For the complex model ℳ 4 , however, both pcadapt and outflank have little to no power to correctly identify selected loci (AUCs ~ 0.5; S8A-S10A), implying that LSD is robust to parameter specification.

| Case study results
We identified a region of reduced effective migration between 52.9 and 53.2 Mb on chromosome 6 (Figure 9), consistent with the location of the ROS and EL loci (Tavares et al., 2018). Under model ℳ 1 , this region is characterized by a set of smaller, multiple peaks (p l < 0.01) reflecting signatures identified by previous authors, with the left-most peaks corresponding to ROS1 and ROS2 (shaded in red) and the right peaks to EL (shaded in green; Figure 9b). The joint posterior probability distributions reveal symmetric selection acting on both regions, implying that selection acts with similar strength in the two populations. Under model ℳ 2 , we find fewer outliers in the ROS-EL region than in model ℳ 1 , with the left-most peak in this region corresponding to ROS2 and the right peaks consistent with EL. ROS1 appears to be less of an outlier than in model ℳ 1 (p l ≈ 0.02). In contrast to model ℳ 1 , the ROS2 and EL peaks in model ℳ 2 are char-

| Identifying selection
Using simulations, we demonstrate that LSD has high diagnostic power (AUC > 0.8) to identify selected loci across a large range of demographic-selection regimes. This power relies upon two fundamental aspects that contribute to generating observable patterns.
First, selection must effectively be realized (i.e., result in a frequency shift of the beneficial allele). This requires that the strength of selection and initial frequency of the beneficial allele be sufficient to both counter the homogenizing effect of migration (Felsenstein, 1976;Haldane, 1930;Lenormand, 2002;Olson-Manning et al., 2012;Slatkin, 1973;Yeaman, 2015) and the eroding effect of drift (Wright, 1931). Second, the genomic data must contain signatures of selection that can be detected. In the case of LSD, this requires that the signatures of selection are discernible from the underlying noise (drift and migration) that characterizes the system, which demands sufficient time for said signatures to be reflected in the A lack of power in LSD must be interpreted considering these two conceptually different aspects. Notably, the lack of discrimination power for high migration rates and low selection coefficients can be attributed to selection failing to realize as a consequence of local, beneficial alleles being swamped by immigrant, maladaptive alleles.
In contrast, the lack of signal under low migration rates constitutes a methodological limitation of our implemented model, as it becomes increasingly difficult to detect reductions in effective migration when neutral or genome-wide migration rates are already at a low level, even when selection is effectively being realized in the demes.
This is analogous in effect to the loss of power to detect selection in highly differentiated populations in F ST outlier tests (Hoban et al., 2016;Martin et al., 2013). Under the same principle, we argue that the converse expectation can be assumed to hold for loci underlying adaptive introgression or balancing selection. That is, we expect power to detect such loci to be low when populations are minimally differentiated and high in highly divergent systems, as the detection of candidate loci in these cases is informed by increased effective migration.
The power of LSD to correctly identify selected loci generally increases with stronger selection coefficients and longer time since the onset of selection, though with exceptions. Specifically, if selection is of similar or equal strength in both demes or metapopulations, we observe a strong correlation between the power to detect selection and the true underlying selection coefficients. This follows theory which states that the reduction in effective migration is proportional to the strength of selection (Petry, 1983). However, we defer from translating these changes to explicit selection coefficients because in addition to the strength of selection, changes in effective migration are also a function of the recombination rate between linked and selected loci (Cutter & Payseur, 2013;Lotterhos, 2019;Petry, 1983). If selection differs strongly between demes or metapopulations (s i ≫ s j ) or when the onset of selection is sufficiently distant in the past, however, one allele may become fixed in the system.
In such a case, the signal to detect selection rapidly decays (Huber et al., 2016;Przeworski, 2002). Moreover, we observed little power to detect very recent selection, intrinsically related to our choice of summary statistics (Hohenlohe et al., 2010). That said, extending LSD to include additional statistics sensitive to linkage disequilibrium such as EHH (Sabeti et al., 2002) or single density score (SDS) (Field et al., 2016) will increase the power to detect more recent selection. Finally, given that LSD relies on localized deviations in effective demographic parameters, we predict that it may miss signals of polygenic selection, as signals of selection become increasingly hard to distinguish from the genomic background the smaller the locus effect sizes are.
From our simulations, we find that the power to detect selection is similar between the de novo and standing genetic variation regimes (model ℳ 1 ; Figure 4; Figure S6A), likely as a result of only considering loci for which the derived allele was not lost. This particularly affected the de novo case, under which the derived allele was lost in most simulations. Indeed, this is in line with theoretical expectation (Olson-Manning et al., 2012) and supports the notion that most empirical cases of local adaptation attribute selection of advantageous alleles to arise from standing variation (Jones et al., 2012;Lai et al., 2019;Reid et al., 2016). To clearly distinguish between these regimes, additional information on the evolutionary history of the system such as allele age, mutation rate or supplementary phylogenetic information is required (Peter et al., 2012).

| Comparison to existing methods
LSD's power to detect selection is comparable with that of pcadapt and outflank under the simple model ℳ 1 , but strongly outperforms these alternatives under the more complex model ℳ 4 . These results appear to hold regardless of whether prior knowledge of model parameters is confidently known for LSD, as LSD exhibits similar performance under the ideal "fixed" and realistic "free" parametrizations. This suggests that the methods pcadapt and outflank utilize to address population structure, namely PCA and an implicitly modelled F ST null distribution, respectively, are sufficient to control for the effects of relatively simple demography, but insufficient to capture more complex demographies. By modelling complex demographies explicitly, LSD is less affected by model complexity, though this extra power comes with some costs.
First, a demographic model needs to be specified that appropriately describes the neutral genetic variation of the system, allows for inferences of selection through changes in demographic parameters (e.g., N E or M E ), and is sufficiently simple to remain computationally tractable. A preliminary analysis of model choice may therefore constitute a prerequisite to successfully recapitulate the signal of complex evolutionary histories in the simulated data. Importantly, the model should always be validated by demonstrating that the observed data can be accurately and sufficiently captured ( Figure S15).
Second, LSD remains computationally more demanding than both pcadapt and outflank. Inferring demographic parameters is generally computationally challenging as the underlying genealogies need to be integrated out numerically (Hey & Nielsen, 2007), which for complex models usually requires simulation-based approaches such as ABC. Existing ABC approaches to infer locus-specific parameters (Bazin et al., 2010;Kousathanas et al., 2016) are difficult to scale-up to genome-wide data as they require the simulation of prohibitively many loci. To circumvent this problem, LSD implements an efficient ABC approach that requires simulations of single loci only, which is possible because LSD neither attempts to infer the hierarchical distribution of locus-specific parameters nor to obtain posterior estimates on whether a locus is affected by selection. Instead, LSD identifies loci under selection by quantifying whether locusspecific estimates of demographic parameters are incompatible with those estimated from a set of putatively neutral loci.
The a priori identification of this neutral set constitutes the third requirement. Such a set may be informed by the particular structural or functional class the sites belong to (Williamson et al., 2005) and may for instance consist of genomic regions not linked to structural annotations. Alternatively, a more naïve strategy may rely on the whole genome or a random subset of the genome to reflect neutral diversity. Even if this assumption of neutrality is violated (Begun et al., 2007;Fay et al., 2002;Li & Stephan, 2006), we show that our method to estimate ̂ is robust to the misidentification of neutral loci, with minimal effect on power even at high percentages of up to 20% mis-specification. We attribute this to our method of estimating ̂ as the product of per-locus posterior densities, which amplifies the signal (density) according to majority rule.
We finally note that most widely applied genome scan methods (e.g., pcadapt and outflank) detect outliers at the SNP level, while our current implementation of LSD focuses on genomic windows.
Focusing on genomic windows offers a means to aggregate information across linked loci, thereby potentially increasing power and reducing false positives from spurious signals at individual SNPs (e.g., Galimberti et al., 2020). These benefits are conditional on a choice of window size compatible with the LD decay in the system, as this maximizes the power and accuracy to capture the signal generated by selection. The LSD framework is by no means limited to genomic windows, however, and can be extended to SNP data when using an appropriate simulator and summary statistics calculator.

| Revealing trade-offs underlying selection
LSD can shed light on the directionality of selection by inferring the (a)symmetry in deviations of migration rates between populations.
In our simulations, this inferred (a)symmetry accurately reflects the (a)symmetry in the underlying selection coefficients for older onsets of selection, but less so for more recent onsets (Figures 5 and 6B;Figures S6B and S7B). This is because inferred asymmetries in deviations of effective migration rates are also affected by asymmetries in allele frequencies of the beneficial allele. For instance, a new beneficial mutation that arises in one of two demes will initially be rare among migrants. Only as its frequency increases will selection start to act against immigrants in both demes ( Figure 10). Hence, the inference of asymmetry in LSD may include cases where selection is effectively asymmetric, but also those in which selection is effectively symmetric but prior to drift-migration-selection equilibrium.

| Real-world application
We demonstrate a real-world application of LSD by successfully isolating and characterizing the selection signal of loci underlying an extrinsic reproductive barrier in A. majus. Our results from contrasting a single population (model ℳ 1 ) and three populations (model ℳ 2 ) per subspecies both identified the ROS and EL loci which were previously reported to underlie differences in floral patterns between these subspecies (Tavares et al., 2018  However, whether selection on alternate alleles follows the same positive frequency-dependence across the broader scale, including more distant populations, is currently unknown. The difference in signal between local pairs at the contact zone (ℳ 1 ) and the global set (ℳ 2 ) may be generated by different frequency-dependent selection curves for the alternate alleles and potentially loss of AP away from the contact zone ( Figure S16).

| CON CLUS ION
Loci under selection are predicted to exhibit genealogies with demographic parameters divergent from those of neutral nonlinked regions, leading to heterogeneity in demography across the genome.
In this study, we condition the identification of candidate loci on divergent population parameters using explicit demographic models, Our power analysis using simulations shows that LSD, and our implementation of it, represents a powerful method for detecting selection that is robust to different and complex demographies.
Furthermore, given that certain demographic parameters (e.g., migration) are not inherently commutative, we show that the directionality or population-specificity in selection can be inferred. This can facilitate identifying in which environment selection acts and hence elucidate genetic trade-offs, bridging an analytical divide between experimental ecology and population genomics. Importantly, the proposed approach as well as our implementation is not limited to the demographic models investigated here, nor the explicit choice of simulation programs or summary statistics used. This flexibility and customizability of LSD can facilitate, for example, more realistic accommodation of recombination (via different coalescent simulators), improved detection of more recent selection (via linkage-informative statistics), and inference of other modes of selection (e.g., balancing selection) and adaptive introgression by conditioning the detection of selection on, for instance, an increase (rather than reduction) in M E or changes in N E relative to neutral expectations.

ACK N OWLED G EM ENTS
We thank Nicholas Barton for providing valuable comments on the manuscript and the Genetic Diversity Centre at ETH Zurich and in particular Niklaus Zemp for providing IT support. This work was supported by the Swiss National Science Foundation (SNSF) grants 31003A_160123 and 31003A_182675 awarded to A.W. and grant 31003A_173062 awarded to D.W.

DATA AVA I L A B I L I T Y S TAT E M E N T
We provide scripts to perform LSD genome scans at the GitHub repository: https://github.com/hirzi/ LSD.