Landscape permeability and individual variation in a dispersal‐linked gene jointly determine genetic structure in the Glanville fritillary butterfly

Abstract There is now clear evidence that species across a broad range of taxa harbor extensive heritable variation in dispersal. While studies suggest that this variation can facilitate demographic outcomes such as range expansion and invasions, few have considered the consequences of intraspecific variation in dispersal for the maintenance and distribution of genetic variation across fragmented landscapes. Here, we examine how landscape characteristics and individual variation in dispersal combine to predict genetic structure using genomic and spatial data from the Glanville fritillary butterfly. We used linear and latent factor mixed models to identify the landscape features that best predict spatial sorting of alleles in the dispersal‐related gene phosphoglucose isomerase (Pgi). We next used structural equation modeling to test if variation in Pgi mediated gene flow as measured by Fst at putatively neutral loci. In a year when the population was recovering following a large decline, individuals with a genotype associated with greater dispersal ability were found at significantly higher frequencies in populations isolated by water and forest, and these populations showed lower levels of genetic differentiation at neutral loci. These relationships disappeared in the next year when metapopulation density was high, suggesting that the effects of individual variation are context dependent. Together our results highlight that (1) more complex aspects of landscape structure beyond just the configuration of habitat can be important for maintaining spatial variation in dispersal traits and (2) that individual variation in dispersal plays a key role in maintaining genetic variation across fragmented landscapes.


Introduction
Dispersal is key to the maintenance of genetic variation and adaptive potential in fragmented landscapes. Differences in species ability to maintain genetic diversity in fragmented landscapes can, in part, be explained by interspecific differences in dispersal capacity (Steele et al. 2009;Peterman et al. 2015). However, there is now clear evidence that many species across a broad range of taxa harbor extensive heritable variation in dispersal  and that this variation can facilitate demographic outcomes such as range expansion (e.g., Duckworth and Badyaev 2007;Ochocki and Miller 2017) and invasions (e.g., Phillips et al. 2006;Elliott and Cornell 2012;).
In comparison, few studies have considered the consequences of intraspecific variation in dispersal for genetic outcomes such as the maintenance and distribution of genetic variation across fragmented landscapes (Cheptou et al. 2017). This is an important gap given that the genetic makeup of populations can drive the trajectories of both ecological and evolutionary processes (Rius and Darling 2014;Szucs et al. 2017;Wagner et al. 2017).
This gap comes partly from a lack of integration of intraspecific variation into fields like landscape and spatial genetics (Edelaar and Bolnick 2012;Pflueger and Balkenhol 2014). Studies in these fields emphasize that gene flow across fragmented landscapes is strongly constrained by population configuration, habitat quality, and matrix heterogeneity (i.e., the landscape features intervening populations), but typically assume that dispersers respond to the landscape in the same way (Manel et al. 2003;Holderegger and Wagner 2008). However, fragmentation itself can exert strong selective pressures on dispersers, in some cases leading to the maintenance of multiple dispersal strategies across the same landscape (Cheptou et al. 2017;; Legrand et al. 2017). This variation might change expectations of spatial genetic structure and mask landscape genetic relationships, yet empirical tests of this are lacking (but see McDevitt et al. 2013). For example, using simulations, Palmer et al. (2014) showed that distance-based connectivity metrics underestimated the number of migrants arriving into isolated populations when individuals were allowed to vary in their dispersal ability, with the effect most severe when dispersal was rare. This spatial sorting of good dispersers into more isolated populations is expected to also impact the distribution of genetic variation, for example, by increasing genetic neighborhoods beyond expectations drawn from mean dispersal distances (DiLeo et al. 2014). Because dispersal traits often co-evolve with other aspects of morphology, physiology, and behavior (Clobert et al. 2009;, individuals might also interact with, and respond to, the landscape matrix in different ways (Merckx and Van Dyck 2007;Delgado et al. 2010). This could mean that the effects of landscape on gene flow might be missed if intraspecific variation is ignored.
Intraspecific variation thus has the potential to play an important, but so far unexplored role in structuring species genetic response to landscape fragmentation. The maintenance of several dispersal strategies across a single landscape might allow widespread gene flow to be maintained under a broad set of ecological conditions. We test this in a model species, the Glanville fritillary butterfly (Melitaea cinxia) in theÅland Islands, Finland. Importantly, individuals vary in their dispersal ability: butterflies heterozygous or homozygous for the "c" allele in a SNP associated with the gene phosphoglucose isomerase (Pgi) have higher flight metabolic rate, which translates to substantially increased dispersal propensity and dispersal distance in the field, especially at cooler temperatures (reviewed in Niitepõld and Saastamoinen 2017). Evidence for this comes from laboratory experiments linking flight metabolic rate to Pgi genotype (Niitepõld 2010), and from a study linking dispersal distances of butterflies tracked in the field to Pgi genotype (Niitepõld et al. 2009). The butterfly persists in a highly dynamic metapopulation with frequent colonizations and extinctions, and we focus on a 2-year period representing extremes of fluctuations experience by the butterfly: the year 2011 when populations had extremely low connectivity because of a large decline in the previous year, and 2012 when populations had high connectivity following the recovery of populations in the year prior (Ojanen et al. 2013). Recent work suggests that Pgi plays a central role in metapopulation persistence by maintaining high recolonization rates despite drastic population fluctuations ). However, we do not yet know to what extent Pgi also contributes to the maintenance of neutral genetic variation, which is important given that genetic diversity can affect persistence independently of demographic processese.g., populations founded by genetically diverse individuals have been observed to have higher chances of successful establishment, population growth rates, and persistence, independent of founder population size (Ahlroth et al. 2003;Szucs et al. 2017). Specifically, we ask: (1) what landscape factors drive spatial sorting of genotypes that vary in their dispersal ability? And (2) does this dispersal variation mediate the genetic response to landscape structure and population fluctuations? While previous work has found that more isolated patches have higher frequencies of good dispersers (Haag et al. 2005, Hanski andSaccheri 2006;Zheng et al. 2009), we further predict that dispersers will respond differentially to heterogeneity in the landscape matrix. We also predict that the good dispersers will facilitate genetic admixture; population with higher frequencies of the Pgi-c allele should be less genetically differentiated than populations with low frequencies. Finally, we predict that the effects of Pgi will be context dependent. Modeling predicts that the dispersive genotype will have the largest advantage when there are many open patches to colonize (Zheng et al. 2009), and thus we expect the association between landscape structure, Pgi, and genetic structure to be highest in 2011.

STUDY SPECIES AND SAMPLING
In Finland, the Glanville fritillary butterfly is only found in the Aland Islands where it persists in a metapopulation encompassing over 4000 meadows (hereinafter 'patches') that contain one or both of its host plants, Plantago lanceolata and Veronica spicata. In late summer, females lay clutches of 150-200 eggs, which develop into larvae that live gregariously until the last larval instar in the following spring (Boggs and Nieminen 2004). Before winter diapause, larvae spin nests at the base of host plants, and every fall since 1993 the number of nests have been counted in all patches inÅland allowing the quantification of long-term population dynamics (see Ojanen et al. 2013 for survey methods). In the majority of cases nests are composed of single full-sib families, although multiple families on the same host plant can also occur (Fountain et al. 2018). In any given year, only about 20% of patches on average are occupied, with frequent local recolonizations and extinctions, and large variation in the number of total larval nests ( Fig. 1). Our study focused on the period 2010-2012, which is characterized by a large population decline in 2010 due to poor weather conditions, followed by a population recovery. The major recovery occurred in 2011 where a record number of re-colonizations were documented (Fig. 1B). In 2012, there were fewer colonizations but a large increase in population density, with a record number of larval nests (Fig 1A).
In 2011 and 2012, three larvae per nest were collected from patches acrossÅland and were genotyped at 40 neutral markers and five putatively function markers from candidate genes related to flight or dispersal traits including Pgi ( (Orsini et al. 2009;Kvist et al. 2013;Ahola et al. 2014;Kvist et al. 2015;Wong et al. 2016;Duplouy et al. 2017;Supporting Information Appendix A and B). Full details of genotyping and quality control are given in Supporting Information Appendix A. After excluding any SNP or individual with a call rate <95%, 34 neutral markers and all of the functional markers remained, and total sample sizes were 3365 larvae representing 1504 nests in 250 patches in 2011, and 8229 larvae representing 2999 nests in 322 patches in 2012. We only had a few samples from the extreme south of our study region in Lemland. As these samples were clear outliers with low connectivity and frequencies of Pgi-c (Figs. S1 and S2), we excluded them from downstream analysis. We further excluded 98 patches in 2011 and 129 in 2012 with only a single larval nest where the effects of genetic drift are expected to be especially strong, giving a sample size of 152 patches (74 old, 78 new) in 2011 and 193 patches (138 old, 55 new) in 2012.

HYPOTHESES
To test the effects of landscape on neutral and functional genetic variation, we developed a set of landscape resistance surfaces reflecting the permeability of the intervening landscape matrix to dispersing butterflies. We assigned each landscape feature a value representing its resistance to a dispersing M. cinxia individualeither a value of 1 (not resistant to movement) or 10 (restricts movement), generating a total of 20 surfaces with different combinations of resistant and nonresistant features (Table S1). Patches were always given a value of 1 as patches represent suitable habitat for the species, and continuous urban areas were always given a value of 10. For each of these surfaces, we calculated pairwise resistance distances between M. cinxia patches using the program CIRCUITSCAPE (McRae 2006). This program uses circuit theory to calculate effective distances among patches by taking into account the relative permeability of the intervening landscape. Further details of the landscape classifications are given in Supporting Information Appendix A.
For each of the landscape hypotheses, we calculated an index of connectivity for each occupied patch using the incidence function model (Hanski 1994): where d ij is either the geographic distance (in km) or resistance distance (from CIRCUITSCAPE) between focal patch i occupied in year t and source patch j occupied in the previous year (t -1), and N j is the number of M. cinxia nests found in source patch j in year t -1. The constant α scales the dispersal kernel and should be equal to 1/mean dispersal distance of the species, which has been estimated to be 1 km (Fountain et al. 2018). Because resistance distances are unit-less, mean dispersal distance for each landscape resistance hypothesis was chosen as the resistance value that roughly translated to 1 km in Euclidean distance units, which was then used to calculate α (Table S1). We did this by taking the predicted value of the resistance distance at a Euclidean distance value of 1 km using a simple linear model, after first linearizing relationships with log transformations. Many of the connectivity variables calculated from alternative landscape resistance surfaces were highly correlated, because some landscape features were only present in small proportions in our study region and thus did not have large effects on the calculations of resistance distances. For our main analyses we settled on five uncorrelated variables of landscape connectivity: Si metapop , which is scaled by Euclidean distance among patches but assumes that the intervening landscape does not restrict dispersal, Si water , Si forest , and Si agriculture , which assume that water, forest, and agriculture restrict dispersal, respectively, and Si roads , which assumes that roads facilitate dispersal. The five connectivity variables had pearson correlations below 0.6 and variance inflation factors in linear models under 2 (Table S2).

CONNECTIVITY AND Pgi
Our first objective was to test which landscape connectivity hypotheses best predicted the distribution of the Pgi-c allele. Based on previous work, we expected an interactive effect of patch age; good dispersers are expected to be at the highest frequency in newly colonized (hereinafter "new"), isolated populations and at the lowest in old, isolated populations because dispersive individuals also have high emigration rates (Zheng et al. 2009). We conducted model selection on mixed effect models including all connectivity variables and their interaction with patch age (i.e., new or old population) as fixed effects, and genetic cluster membership as a random effect. Frequency of the Pgi-c allele in each patch was used as the response variable. Genetic cluster membership represents groups of individuals or populations that share a common demographic history (i.e., share common historical population dynamics and/or ancestry), and was determined for each year separately using BAPS6 (see Supporting Information Appendix A; Fig. S1; Corander et al. 2008;Cheng et al. 2013). Mixed models were run for each year separately and we applied log transformations to Si metapop and to Si roads , and exponential transformations to Si water and Si agriculture to linearize relationships. Each variable was scaled and mean centered. Models were implemented in the lme4 library (Bates et al. 2015) and validated graphically by plotting residuals against fitted values and normality assumptions were checked with QQ plots. For model selection, we retained a candidate set of models with high support for further analysis and interpretation. A model was included in the candidate set based on two conditions: (1) the model was within AICc < 2 of the top model and (2) the model was not simply an embellishment of a higher ranked model (i.e., did not contain uninformative parameters; Arnold 2010). All predictors appearing in the resulting candidate model set were considered as potentially important in their effects on Pgi and were subject to downstream analysis (see below). We used this approach since alternatives such as model averaging and summing akaike weights can lead to flawed interpretation of effects when variables are even weakly collinear (Galipaud et al. 2014;Cade 2015).
The residuals of the top model for both 2011 and 2012 showed evidence of spatial autocorrelation. To test that this did not bias our results, we compared models with and without spatial random effects implemented in r-inla (Lindgren & Rue 2015; Supporting Information Appendix A). In one case, we found the sign of a weak main effect changed in the spatial compared to the nonspatial model (from 0.006 to -0.003; Table S3). However, the major effect of this variable manifested as an interaction with a strong negative association in new populations, and the direction and strength of this interaction did not change. We found very little difference in all other estimates between the spatial and nonspatial models, and thus did not pursue spatial models further (Table S3).
As a second step, we used gene-environment association analysis (Rellstab et al. 2015) using latent factor mixed models (lfmms; Frichot et al. 2013) implemented in the LEA library (Frichot and François 2015) to confirm associations identified as significant in the linear mixed effect models. While the linear mixed model approach controlled for potential effects of neutral genetic structure by including genetic cluster membership (calculated from the 34 neutral loci) as a random effect, the latent factor mixed models explicitly considers all loci (Pgi and neutral loci) simultaneously. This applies a stronger control of background genetic structure, particularly when structure is complex and hierarchical (de Villemereuil et al. 2014), and also allowed us to confirm that we only see associations between connectivity and allele frequencies for Pgi and not for other loci. A drawback is that it cannot incorporate additive or interactive effects, and thus we used it only to confirm results in the partitions of the data that were found to be significant in linear mixed effect models (e.g., new populations). In addition to testing the individual Si variables identified as important in the mixed models, we explored potential additive effects by testing for association between loci and Si variables that incorporated the resistance of two or more of the landscape features. For example, the additive effects of Si water and Si forest were included in the model as the predictor Si water+forest (Table S1). Supporting Information Appendix A describes how the latent factors were specified starting from knowledge of the number of k genetic clusters identified from BAPS6. Loci included in each analysis were Pgi, 34 neutral markers, and four additional markers previously identified as being outliers in fragmented vs. continuous landscapes or from flight induced gene expression differences (Kvist et al. 2015;Fountain et al. 2016).

STRUCTURE ON GENETIC STRUCTURE
Our second objective was to test if intraspecific variation at the Pgi locus mediates the effect of landscape on the distribution of neutral genetic variation. We hypothesized that landscape structure will influence neutral genetic structure either directly (i.e., by limiting dispersal of all individuals in the same way), or indirectly through its effect on the spatial distribution of individuals with contrasting Pgi genotypes.
Direct effects were tested using linear mixed effect models with neutral genetic differentiation as a response variable. A patch-specific measure of genetic differentiation was calculated by taking the average pairwise Weir and Cockerham unbiased F st (Weir and Cockerham 1984) for each patch following Pflueger and Balkenhol (2014). Prior to calculation of F st , the data were subset to include only a single individual per nest, and a maximum of 50 nests per patch as there was large variation in sample sizes. Although Weir and Cockerham's unbiased F st accounts for unequal population sizes, calculations are biased at very low sample sizes (Willing et al. 2012), and so we limited analyses to patches with more than two nests, excluding 43 patches in 2011 and 37 patches in 2012. Models were run for each year separately with the five connectivity metrics and population age as fixed predictors, and genetic cluster membership as a random factor. We did not find evidence for an interactive effect of population age and so only included it as a main effect. Predictor variables were transformed, centered, and scaled, and model selection was implemented as described above (see 'Association between landscape connectivity and Pgi').
We next tested for indirect effects of landscape structure on genetic differentiation in the partitions of data where we found strong association between connectivity and Pgi using structural equation modeling (SEM) implemented in the lavaan library (Rosseel 2012). Originally developed by Sewall Wright (1934), SEM allows the evaluation of complex a priori-defined relationships among variables that potentially involve direct and indirect effects. It is thus an ideal framework for testing the relative effects of connectivity vs. Pgi on genetic structure, while controlling for directed or residual relationships between them. We modeled two pathways: (1) connectivity directly affects genetic differentiation, and (2) connectivity indirectly affects genetic differentiation through its effects on the frequency of the Pgi-c allele. Connectivity was included in the model as a latent variable -i.e., a construct that is not easily measured directly but can be indicated by a number of observed, and potentially correlated, variables that have some level of measurement error (Grace et al. 2012). We included connectivity as a latent variable, with the five connectivity hypotheses Si metapop , Si water , Si forest , Si agriculture , and Si roads as indicators. We estimated model parameters using maximum likelihood and assessed model fit with chi-square tests, where a P-value > 0.05 indicates the model-implied covariance fits the observed covariance well (Grace et al. 2012). The relative effects of paths were evaluated from standardized path coefficients and individual significance of paths.

CONNECTIVITY AND Pgi
For data from 2011, selection on mixed effect models identified 10 models with AICc <2 (Table S4). All of these models included Si forest and Si water , and seven of the models were embellishments of higher ranked models-i.e., they were the same as a higher ranked model but included uninformative parameters with little effect on model fit (Arnold 2010). We thus retained three models for further analysis and interpretation: the top model including an interaction of Si forest and age and main effects of Si water and Si agriculture , a model including an interaction of Si forest and a main effect of Si water , and a third model including an interaction of Si water and main effects of Si forest and Si agriculture (Table 1). All variables showed negative associations with Pgi-c, with Si forest and Si water having the strongest effects in new populations (Table 1). The effect of water and forest in new populations remained significant in latent factor mixed models as evidenced by a significant association of Si water+forest with Pgi-c but no other loci (Fig. S3). In contrast Si water showed a significant association with a putatively neutral locus but not Pgi-c, Si forest showed a significant association with both Pgi-c and a different putatively neutral locus, and Si water+forest+agriculture and Si agriculture showed no significant associations with Pgi-c or any other locus (Supporting Information Appendix A; Fig. S3).
The negative association between Pgi and connectivity in new populations switched in linear mixed models in 2012, and the effect of Si forest disappeared ( Fig. 2; Table 1). Model selection identified six models with AICc <2 (Table S5). All of these models included an interaction of age and Si roads , and either a main or interactive effect of Si water . Four of the models were embellishments of higher ranked models. We thus retained two models for further analysis and interpretation: the top model containing interactions between of both Si roads and Si water with patch age, and a lower ranked model with an interaction between Si roads and age and a main effect of Si water (Table 1). Si roads showed a strong positive association with Pgi-c in new populations, whereas Si water showed a negative association with Pgi-c in old populations but only a weak or nonexistent relationship in new populations in the top model (Table 1; Fig. S4). Neither Si roads nor Si water were found to have significant associations with Pgi-c or any other loci in latent factor mixed models (Supporting Information Appendix A; Fig. S3). Details on the calibration of the latent factor mixed models, including reports of genomic inflation factors can be found in Supporting Information Appendix A and Table S6.

STRUCTURE ON GENETIC STRUCTURE
For 2011, selection on mixed effect models testing for direct effect of connectivity and age on F st identified a model containing only the random intercept as the most likely model (Table S7). We also tested the model after removing a single patch that had a very high F st value. This led to the selection of a model including Si water as the best, however it explained only 3% of the variation in F st , and the random intercept-alone model remained in the candidate set with AICc <2 (Table S8). We thus conclude that there is no, or a very weak, direct effect of connectivity on genetic differentiation in 2011. For 2012, eight models had a AICc <2, however five of them were embellishments of higher ranked models (Table S9). We thus retained three models: the top model including equal effects of Si metapop and Si water , a model with just Si metapop , and a model with equal effects of Si water and Si roads ( Table 2).
Pgi showed a significant association with connectivity only in new populations in 2011, and thus we only tested for indirect effects of connectivity on F st (mediated by Pgi) in this partition. The full structural equation model including all five landscape connectivity indicator variables showed poor fit as indicated by a significant deviation of the observed covariance from the    model-implied covariance (χ 2 = 37.4, df = 13, P < 0.001). Because Si metapop did not have strong effects in our earlier models, we removed it as an indicator variable. This model fit the data well as indicated by no significant deviation of the observed and modeled-implied covariance (χ 2 = 11.94, df = 8, P = 0.15). The removal of Si metapop did not affect the strength or significance of associations. Si water and Si forest were the strongest and only significant indicators of connectivity (Fig. 3). Our model supported a significant negative effect of connectivity on Pgi-c, and a significant negative effect of Pgi-c on genetic differentiation, but no significant direct effect of connectivity on genetic differentiation ( Fig. 3; Fig. S5).

Discussion
Here, we show that patch connectivity metrics incorporating landscape matrix best predicted the distribution of a dispersal polymorphism during a population recovery in the Glanville fritillary butterfly. In particular, newly colonized populations that were isolated by water and forest matrix had significantly higher frequencies of an allele associated with increased dispersal ability (Pgi-c allele). We further found that patch connectivity alone did not predict genetic differentiation at neutral markers, but rather the effect of landscape on genetic structure was mediated through individual variation in the Pgi locus; populations with higher frequencies of the Pgi-c allele had lower F st . In the following year when the density of populations increased, these relationships disappeared, suggesting that good dispersers only have an advantage when there are many empty patches to colonize. Together our results suggest that both individual variation in dispersal traits and landscape matrix heterogeneity are important for predicting spatial patterns of genetic variation.

VARIATION IN DISPERSAL
We found that spatial sorting of individuals based on their Pgi genotype was best explained by a connectivity metric that incorporated the effects of water and forest matrix in 2011. Importantly, the basic metapopulation model did not show a significant association with the frequency of Pgi-c.   (Haag et al. 2005;Hanski and Saccheri 2006;Zheng et al. 2009), our results suggest that more complex processes are at play. There are a number of potential reasons why previous work was able to capture spatial sorting of Pgi with Si metapop while we could not. We found evidence for a negative relationship between the frequency of Pgi-c and metapopulation connectivity for new populations, however, this relationship was "not" supported within individual genetic clusters (Fig. S2). In contrast, the association between Pgi and Si water+forest held true even within genetic clusters (Fig. S2). This suggests that Si metapop might, at least partly, be confounded with geography, whereas Si water+forest captures connectivity at scales more relevant to dispersal. Our sample size is also much larger and we were thus able to capture a larger amount of variation in landscape structure. In comparison, previous work selected population extremes (e.g., extremely low and high connectivity), and geographic distance might have been sufficient to capture patterns. It should also be noted that previous studies tested only a single model of patch connectivity, whereas we competed several models assuming different landscape structures.
Our results hence suggest that the Pgi dispersal polymorphism in the M. cinxia system in theÅland islands is not maintained by variation in patch configuration alone (i.e., the metapopulation model), but that the landscape matrix further influence dispersal patterns. This is an important finding given that studies investigating the drivers of dispersal evolution almost exclusively use simplified landscape models that assume a homogenous matrix (Bowler and Benton 2005;Henriques-Silva et al. 2015). Knowledge of the importance of the landscape matrix is thus lacking, and this is one avenue in which landscape genetic approaches can contribute to understanding how the matrix might modify predictions of dispersal evolution. Our results suggest that landscape features that intervene discrete habitat patches matter, and this is unsurprising given that dispersal traits are often correlated with other aspects of species biology . For example, Pgi shows a genotype-by-temperature interaction in M. cinxia, where heterozygotes have higher flight metabolic rate (and thus dispersal ability) at moderate and cool temperatures, but individuals without a c allele fly better at very warm temperatures (Niitepõld 2010). This might explain why forest is an important predictor of the spatial distribution of the dispersive allele, as it could provide a cooler environment for Pgi-c individuals. However, it is unclear if the association between forest and the frequency of Pgi-c is driven by individual differences in the use of the landscape matrix, or rather a distance effect-i.e., if Pgi-c individuals are able to fly further or faster through the forest.

GENETIC DIFFERENTIATION
In our test of direct effects of connectivity on genetic differentiation, we found no evidence that any of the connectivity metrics predicted F st in 2011-newly colonized patches with lower connectivity did not display higher genetic differentiation at neutral loci compared to highly connected (Table 2). Rather, patches with higher frequencies of Pgi-c had significantly lower F st (Fig. 3;  Fig. S5). Although Pgi explained only a small proportion of variation in F st , the effects of the landscape matrix on dispersal and gene flow would have been completely missed if individual variation at the Pgi locus was not included in our model. This highlights the importance for integration of intraspecific variation in dispersal traits in landscape genetic studies. While sex differences have recently been considered in landscape genetic models (e.g., Paquette et al. 2014;Bertrand et al. 2017), to our knowledge, only two papers have integrated non-sex related variation in dispersal traits: DiLeo et al. (2014) found that individual variation in the number of flowers on dogwood trees influenced spatial patterns of gene flow beyond the effects of Euclidean distance; and McDevitt et al. (2013) found that genetic admixture of weasels in Poland was likely mediated by the movement of medium-, rather than large-or small-bodied weasels. While we acknowledge that variation in dispersal traits are often hard to measure, the increasing accessibility of genomic data will facilitate the identification of candidate loci relevant to dispersal in non-model organisms (e.g., Swaegers et al. 2015;Dudaniec et al. 2018). Because dispersal is often controlled by multiple genes , applying landscape genomic methods that can capture the effects of polygenic adaptation will be important (e.g., redundancy and canonical correlation analysis; Rellstab et al. 2015 and reference therein). We further expect our approach to be relevant not only to systems where multiple dispersal strategies exist within a single landscape, but also the perhaps more common case of directional selection, where dispersal might be under positive or negative selection depending on broad patterns of landscape structure (Cheptou et al. 2017). An increasing number of studies have reported variability in landscape genetic relationships across replicate landscapes (e.g., Bull et al. 2011;Dudaniec et al. 2012;DiLeo et al. 2013;Balbi et al. 2018), and it will thus be interesting to see if divergent selection on dispersal genes might be a hidden source of variation contributing to these patterns (e.g., Peterson and Denno 1997).
Our study highlights the role of a single gene on the maintenance of gene flow across the landscape, and also joins a growing list of evidence that Pgi in particular has large effects on ecological processes in M. cinxia (reviewed in Niitepõld and Saastamoinen 2017). However, it is unlikely that Pgi is acting alone. For example, work by Wheat et al. (2011) showed that Pgi may epistatically interact with other genes, such as succinate dehydrogenase (Sdhd); an allelic combination at these two loci was associated with maximal metabolic endurance in M. cinxia. Linkage of Pgi to other functional loci have not been resolved, but the low frequency and fitness of individuals homozygous for the C allele suggest possible linkage to a deleterious mutation (Orsini et al. 2009), and that the variation in Pgi is maintained through balancing selection via a heterozygote advantage ). Further, females colonizing new populations have been found to be divergent in a suite of life-history traits, many but not all are associated with variation in Pgi (Hanski et al. 2006;Saastamoinen 2008;Kvist et al. 2013;Wheat et al. 2011). This suggest a more complex dispersal syndrome, the genomic architecture of which remains to be characterized. We further emphasize that Pgi explained only 10% of variation in F st , and clearly other processes are at play that may not have been captured by our model (e.g., temperature and condition-dependent dispersal). Interestingly, the four other candidate loci included in our study did not show significant associations with landscape structure, despite being found as outliers in previous work (Kvist et al. 2015;Fountain et al. 2016). In particular, Fountain et al. (2016) found that three of these loci changed more than expected under neutral expectations in contemporary vs. museum samples. Further, they found that allele frequencies changed in the same direction in comparisons of new, isolated vs. old continuous contemporaryÅland populations, and in fragmented vs. continuous replicate landscapes. In our study, these three loci (Mc1:1873:36910, Mc1:1124:71239, Mc1:1687 shifted in new, isolated vs. old, connected populations in the same direction as reported by Fountain et al. (2016), although the differences were subtle (Fig. S7). This suggest that these loci may indeed be under selection, but we lack the power to detect significance given that we have samples from very different timescales compared to Fountain et al. (2016).

CONTEXT MATTERS
Associations between patch connectivity and variation in Pgi in new populations in 2011 disappeared in 2012. Modeling studies on Pgi indicates that Pgi-c individuals should have the greatest selective advantage when there are many empty patches to colonize (Zheng et al. 2009;Hanski et al. 2011). Thus, it was expected that our results would be much stronger in 2011-a year that marked the largest number of re-colonizations recorded in Aland following a large population decline that left many empty patches. In comparison, the metapopulation experienced a large increase in population size in 2012 but relatively fewer colonization events; all patches in 2012 had high connectivity. This appears to be driven by the much higher number of potential source patches and nests in sources in 2012, and less by difference in distances between sources and targets (Fig. S6). Observations from mark-recapture suggest that M. cinxia exhibit negative density dependent dispersal (Kuussaari et al. 1996), suggesting that there should be fewer dispersal events in the high density year 2012 compared to 2011. Intriguingly, effects of connectivity on Pgi even appeared to switch in 2012 (Table 1; Fig. S4), although these associations were not significant in latent factor mixed models. This might be suggestive of a more complex interaction between individual variation and density-dependent dispersal. Modeling work predicts that Pgi-c should rise to higher frequencies at very high population densities where it gains an advantage by spreading genes over more patches (Zheng et al. 2009), however this has not been empirically demonstrated. Future work should seek to resolve the drivers of yearly differences in dispersal, focusing particularly on the effects of density and weather, which may influence dispersive and nondispersive genotypes differently.
Importantly, the association between Pgi and F st in 2011 suggests that the polymorphism plays a key role in maintaining genetic variation across the landscape following perturbation. This finding provides a more mechanistic understanding of population persistence in this highly dynamic system. Recent work showed that regions inÅland with higher long-term frequencies of Pgi-c maintained higher metapopulation sizes, presumably by increasing colonization rates . Our results suggest that regional persistence of the metapopulation might be further facilitated through Pgi-mediated genetic rescue.
While some strong associations emerged from our analysis, model selection suffered from uncertainty with several likely, and sometimes non-nested models appearing to have similar support. This is a common problem with variables derived from landscape measures, which are inherently correlated Prunier et al. 2015). Although our connectivity variables were well below typical collinearity thresholds (Dormann et al. 2013), it is likely that weak linear relationships still contributed to this uncertainty. This might also explain why connectivity variables with strong effects in 1 year did not emerge as important predictors in the next year, although part of this is also likely due to differences in spatial locations of populations sampled in the different years. Although it is hard to say definitively which landscape features restrict dispersal, our results make a strong case for water as it was an important predictor across years, and forest as it had strong effects across multiple methods and different partitions of the data. What is less clear are the effects of other variables that were found to be important for prediction but of weak effect, with inconsistent results across methods (e.g., Si agriculture in 2011). Future work would benefit from fine-tuning landscape resistance surfaces to better account for these potential small additive effects (e.g., using optimization; Peterman 2014), and from testing relationships under a broader set of conditions in carefully selected landscapes where the independent effects of landscape variables can be better teased apart.
Future work is also required to determine the effects of Pgi in small populations (1-2 nests), which were excluded from our analysis because estimating F st requires larger samples (Morin et al. 2009). Pgi-c individuals might be especially important to counteract drift in these small populations, and individual-based genetic approaches (Shirk et al. 2018) could be employed in the future to better quantify genetic structure including these patches. However, populations founded by a single female likely contribute little to overall population dynamics , as these populations would be prone to inbreeding depression in the following generation (Haikola et al. 2001;Nieminen et al. 2001), and small, inbred populations have high observed extinction rates in the field (Saccheri et al. 1998)

Conclusions
Our work adds to growing evidence that intraspecific variation plays a key role in driving diverse biological processes (Bolnick et al. 2011;Moran et al. 2016;Des Roches et al. 2018). We showed that heterogeneity in the landscape matrix is an important predictor of spatial variation in dispersal traits, and that this individual variation mediated the effects of landscape on genetic structure. Our results therefore highlight a need for better integration of studies on dispersal evolution and landscape genetics. While studies of dispersal evolution may need to consider more complex representations of landscape structure that captures heterogeneity in the landscape matrix, landscape geneticists should consider that key associations between landscape and genetic structure might be missed if intraspecific variation in dispersal is ignored.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Figure S1. Cluster membership estimated from BAPS for 2011 (A) and 2012 (B). Each point is a patch and the colour indicates membership to a particular cluster. Figure S2. Associations between the frequency of the Pgi c allele and two indices of connectivity (Si water+forest and log-transformed Si metapop ) for each genetic cluster in 2011 (A) and 2012 (B).  Figure S4. Scatterplots showing the relationship between the frequency of the Pgi-c allele and a patch connectivity metric incorporating roads as a facilitator to dispersal (A) and water as a barrier to dispersal (B) for the year 2012. Figure S5. Scatterplots showing relationships between population-specific genetic differentiation (F st ) and the frequency of the Pgi-c allele (A), F st and Si forest (B) and F st and Si water (C) for newly colonized populations in 2011. Figure S6. Sources of variation in patch connectivity per year. Yearly differences in connectivity for newly colonized populations, S imetapop , (A), geographic distance between source and target patches for newly colonized populations, d ij (B), number of nests in source patches, N j (C), and total number of source patches (D). Figure S7. Relationships between connectivity and minor allele frequencies (MAF) of the four other candidate loci in newly colonized and old populations. Table S1. Descriptions of landscape connectivity hypotheses. Table S2. Pearson correlation coefficients among the five landscape connectivity variables. Table S3. Posterior mean estimates, standard deviation, and quantiles of Bayesian nonspatial and spatial INLA models. Table S4. Results of model selection on linear mixed effect models testing for associations between the frequency of the Pgi-c allele, population age (old or new), and competing metrics of patch connectivity (Si) for the year 2011. Table S5. Results of model selection on linear mixed effect models testing for associations between the frequency of the Pgi-c allele, population age (old or new), and competing metrics of patch connectivity (Si) for the year 2012. Only the top 15 models are shown. Table S6. Genomic inflation factors of for each value of k and each predictor tested in latent factor mixed models. Table S7. Results of model selection on linear mixed effect models testing for associations between population genetic differentiation (F st ), population age (old or new), and competing metrics of patch connectivity (Si) for the year 2011. Table S8. Results of model selection on linear mixed effect models testing for associations between population genetic differentiation (F st ), population age (old or new), and competing metrics of patch connectivity (Si) for the year 2011 with a singly outlier with a high F st value removed. Only the top 15 models are shown. Table S9. Results of model selection on linear mixed effect models testing for associations between population genetic differentiation (F st ), population age (old or new), and competing metrics of patch connectivity (Si) for the year 2012. Appendix A -Supplementary methods and results Appendix B: Information about the 5 candidate and 40 neutral SNPs used.