Lack of genetic structure suggests high connectivity of Parnassius phoebus between nearby valleys in the Alps

The spatial scale of intraspecific genetic connectivity and population structure are important aspects of conservation genetics. However, for many species these properties are unknown. Here we used genomic data to assess the genetic structure of the small Apollo butterfly (Parnassius phoebus Fabricius, 1793; Lepidoptera: Papilionidae) across three nearby valleys in the Central Swiss Alps. One of the valleys is currently used for hydropower production with future plans to raise the existing dam wall further. We found no significant genetic structure, suggesting a currently high connectivity of this species in our studied region.


Introduction
The maintenance of genetic diversity is a key target of current conservation efforts because such diversity is thought to enable species to cope with changing environments (DeWoody et al. 2021). Among the factors that can reduce genetic diversity are habitat fragmentation and global climate change (Pauls et al. 2013;Schlaepfer et al. 2018). Alpine environments may especially be threatened by climate change (Engler et al. 2011), the latter often promoting the subdivision of locally adapted species (Jordan et al. 2016). The scale at which intraspecific gene flow occurs is thus an important property of a species with significant implications for conservation and management. However, the spatial scale of genetic connectivity is often unknown as its assessment either requires large-scale mark-recapture studies or genomic data (Gagnaire et al. 2015).
Here, we assessed the potential for intraspecific gene flow in an alpine butterfly -the small Apollo (Parnassius phoebus Fabricius, 1793; Lepidoptera: Papilionidae).
The species occurs locally in alpine environments from Alaska over Russia to the Alps (Todisco et al. 2012). Many of its allopatric populations have been described as distinct subspecies whose taxonomic status has though remained elusive (Weiss and Rigout 2005). For example, there is an ongoing debate if P. phoebus from the Alps should be named P. sacerdos (International Commission on Zoological Nomenclature 2017) or not (Bálint 2021), where P. sacerdos and Eurasian P. phoebus are polyphyletic based on mitochondrial haplotypes (Todisco et al. 2012). Given the unresolved taxonomy, we use P. phoebus here, which is consistent with the current Swiss red list for butterflies (Wermeille et al. 2014). P. phoebus subspecies differ often phenotypically from each other but intraspecific phenotypic variation also occurs at smaller scales. Indeed, a former study on alpine melanism, highlighted the adaptive value of increased melanism with increased elevation and latitude in P. phoebus (Guppy 1986). Males that were darker on their hindwings spent a greater proportion of time in flight at low air temperatures and showed increased movement (Guppy 1986). Importantly, the global diversity within P. phoebus is young, i.e., evolved over the last ~125'000 years, where geographically distant populations within a continent diverged as recently as 10'000-50'000 years ago (Todisco et al. 2012). Like for other species of this genus, P. phoebus is thought to have moderate dispersal capabilities, being able to fly from some hundred metres to few kilometres (Guppy 1986;Brommer and Fred 1999). However, natural barriers has been shown to limit intraspecific gene flow in other Parnassius species (Keyghobadi et al. 1999), but to which degree this is true for P. phoebus is not known.
Parnassius phoebus is a univoltine species and in the Alps can be found in humid, often flooded habitats with mostly extensive stands of Saxifraga aizoides, the primary larval food plant of this species (Lepidopterologen Arbeitsgruppe 1987). Habitats include relatively flat headwaters and riparian zones of small and large watercourses, often with alluvial plains in the subalpine and alpine and occasionally the montane zones. In the Bernese Alps, the species can be found from 1400 to 2300 m elevation, occurring both on limestone and silicate rock substrates. Imagoes feed on nectar from a range of plants, including thistles, Origanum and various cushion-forming plants, such as Saxifraga. Eggs are generally not directly laid on the host plant, but either on dried plants in its vicinity or directly on the soil substrate (Lepidopterologen Arbeitsgruppe 1987).
We used nuclear genomic data to assess the potential for gene flow among individuals collected from three nearby valleys in the Central Swiss Alps (Fig. 1). We focused on this region because the Trift valley experienced significant past and future anthropogenic alterations as a consequence of artificial damming for hydropower production (Haeberli et al. 2016;Guillén-Ludeña et al. 2018). This, together with the impact of climate change could thus render P. phoebus locally vulnerable, especially if current intraspecific gene flow would be limited (Condamine and Sperling 2018).

Sampling
We collected a total of 18 butterflies during summers 2015-2020. Sampling was conducted in three valleys in the Central Swiss Alps (Susten (N=6), Trift (N=8), Wenden (N=4), Fig. 1, Suppl. material 1: Table S1). We captured all individuals with hand nets and killed them with an overdose of ethyl acetate. Full bodies were dried for further genetic analyses.

Genetic data processing
We genotyped all individuals using single-end restriction-site associated DNA (RAD) sequencing with the restriction enzyme PstI. For all individuals we extracted the DNA from thorax tissue using the Qiagen DNeasy Blood and Tissue kit (Qiagen, Zug, Switzerland) following the manufacturer's protocol. Library preparation and sequencing on one Illumina HiSeq 4000 lane was outsourced to Floragenex (Portland, OR, USA). All genomic data is archived on NCBI (BioProject ID: PRJNA814465).
We filtered all obtained genomic data following (Lucek et al. 2020), i.e., we only retained reads with an intact PstI restriction site, followed by de-multiplexing and barcode-trimming with process_radtags from Stacks 1.48 (Catchen et al. 2013). Using the FASTX toolkit (http://hannonlab.cshl.edu/fastx_toolkit/), we then removed reads containing bases with a Phred quality score <10 or more than 5% of base pairs with quality <30. This approach yielded ~18.5 million high quality reads in total for our analysis. Given the lack of a Phoebus reference genome, we generated a de novo assembly of RAD-tags using all filtered reads for all individuals with ustacks 1.48 (Catchen et al. 2013) with the following settings: minimum stack size of 50 reads, a maximum of three base pairs of difference for stacks to be merged, excluding loci with unusually high coverage to avoid repetitive regions. The initial de novo assembly consisted of 11'004 contigs. To further identify and remove exogenous contigs from the assembly, we compared the assembly against the NCBI GenBank nucleotide collection with the blastn function from BLAST+ 2.7.1 (Camacho et al. 2009). A total of 40 or 0.4% of all contigs were of exogenous origin and we removed them from the initial assembly.
In a next step, we mapped the reads of each individual against our reference assembly with minimap2 2.2 (Li 2018) and genotyped all specimens with BCFtools 1.10.2 (Danecek and McCarthy 2017). We filtered the genotypes with VCFtools 0.1.16 (Danecek et al. 2011) to remove indels, to include only bi-allelic polymorphic sites with a minimal depth of six and a minimal genotype quality of 20, employing a minor allele frequency filter of 0.03 and allowing up to 50% of missing data per site. Due to high rates of missing data, two specimens were filtered out (Suppl. material 1: Table S1). The overall filtering resulted in 5157 SNP sites available for our downstream analyses.

Genetic analyses
To test for an individual based genetic structure, we first employed a phylogenomic analysis comprising all retained specimens. We used RAXML 8.2.11 (Stamatakis 2014) implementing a generalised time-reversible (GTR) model with optimised substitution rates and a gamma model of rate heterogeneity. We further applied an ascertainment bias correction to account for the fact that we only used polymorphic SNP positions with the ASC_GTRGAMMA function implemented in RAXML. Significance was assessed using 1000 bootstrap replicates followed by a thorough maximum likelihood search.
We next inferred population structure with Admixture 1.3.0, which implements a likelihood approach to estimate ancestry (Alexander et al. 2009). We ran ADMIXTURE by varying the number of assumed populations, i.e., K, from 1 to 5 and performed a cross-validation test to determine the optimal value of K. In a second step we used a principal component (PC) analysis as implemented in GenoDive 3.0.5 (Meirmans 2020) to visualize the genetic relationship among individuals.
Finally, we estimated the overall level of pairwise genetic differentiation (F ST ) among individuals from the three valleys (Susten, Trift, Wenden; see Suppl. material 1: Table S1) using GenoDive, with 1000 bootstrap iterations to estimate significance. Because genetic differentiation would only occur at few loci that experience direct or indirect selection in the case of recent divergence (Seehausen et al. 2014), we also performed locus-by-locus F ST in Genodive analyses between Trift individuals and individuals from Susten and Wenden combined.

Results
The bootstrap approach employed in our RAXML analysis found no significant node splits (i.e. >95% bootstrap support), suggesting the absence of a detectable differentiation among individuals. Similarly, no clustering occurred related to the three different valleys (Fig. 2a). The best number of genetic clusters as inferred by Admixture was likewise one (K=1), where the subsequent model assuming two genetic cluster showed no clustering by valleys (Fig. 2c). The two leading PC axes accounted for 9.3 and 8.4% of the total variation respectively and only here some individuals from the Trift valley seemed to be differentiated from the other individuals along PC1 (Fig. 2b).
The level of pairwise genetic differentiation among valleys was generally low and non-significant   Table S1 for details). Circle colour indicates the different valleys. For each individual the respective sample ID is given (see Table S1). Map source: Federal Office of Topography swisstopo; B. Example of Parnassius phoebus (individual K13); C-E. Habitat pictures for Wenden, Trift and Susten, respectively. same comparison identified only 11 SNPs with an F ST > 0.20 (Fig. 3), however, none of the associated contigs could be mapped to a known gene by BLAST.

Discussion
Using genomic data, we found a lack of genetic structure among individuals of the small Apollo Parnassius phoebus that could be attributed to the three valleys in close proximity, i.e., being 4-8 km apart, which we sampled in the central Swiss Alps (Figs 1, 2). Our results thus suggest a high connectivity of this species in our studied region. Consequently, the valleys, mountain ridges, glaciers or other potentially unsuitable habitat structures in our studied region (Fig. 1) do not present strong barriers to gene flow. This finding contrasts with observations in other Parnassius species where such geographic structures resulted in fine-scale population structure (Brommer and Fred 1999; Keyghobadi et al. 1999). While the absence of significant genetic differentiation, as estimated by F ST , may also highlight the statistical limitations given the sample size of our study, the individual-based analyses that we applied would allow to detect potential fine-scale structure (Rieder et al. 2019).
P. phoebus is an evolutionary young species that has moreover recolonized the studied area only after the last glaciation period (Todisco et al. 2012). Consequently, even if local adaptation would have occurred, the respective populations may not necessarily have had enough time to accumulate genetic differentiation beyond few genomic regions that experience selection (Nosil 2012;Seehausen et al. 2014). Indeed, our locus-based analysis identified Wenden Susten Trift  Table S1).
very few sites of accentuated differentiation (Fig. 3). Such genomic differentiation at only few target loci may be consistent with a potential very early stage of divergencewith-gene-flow, where further differentiation depends on the evolution of barriers to gene flow (Nosil 2012). However, the interpretation of such genomic regions has to be done with care, as they can also emerge through non-adaptive processes including genetic drift (Ravinet et al. 2017). Lastly, both the lack of significant genomic differentiation and the limited number of loci that showed accentuated differentiation could reflect a limited resolution given the restricted number of polymorphic SNPs available for our analyses and the absence of a reference genome. A high connectivity despite potential natural barriers may suggest that P. phoebus could be less affected by future anthropogenic modifications in the studied area (Haeberli et al. 2016;Guillén-Ludeña et al. 2018). However, such modifications will act combined with the effects of climate change, which is thought to be a main threat for species of the genus Parnassius (Condamine and Sperling 2018). Although P. phoebus can likely track its climatic niche by shifting its range up the mountains until they can go no higher, the species also depends on the availability of host plants, which can be equally affected by both factors (Condamine and Sperling 2018). Therefore, from a conservation perspective, it would be advisable to broaden the geographic scope of our study to identify the scale of potential population structure in P. phoebus across the Alps, ideally with denser genomic data. In addition, future anthropogenic habitat modifications, as it is planned for the Trift valley (Ehrbar et al. 2018), should be accompanied by a genetic monitoring for both P. phoebus and its host plant.