Delimiting continuity: Comparison of target enrichment and double digest restriction‐site associated DNA sequencing for delineating admixing parapatric Melitaea butterflies

Parapatrically distributed taxa pose a challenge for species delimitation due to the presence of gene flow and inherent arbitrariness of exactly defining the species boundaries in such systems. We tackled the problem of species delimitation in a parapatric species pair of Melitaea butterflies using two popular genomic methods—double digest restriction‐site associated DNA sequencing (ddRAD) and target enrichment. We compared newly generated target enrichment dataset with 1733 loci to the already available ddRAD data from a previous study on the same set of specimens using a suite of phylogenetic, population genetic, and species delimitation methods. We recovered consistent phylogenetic relationships across the datasets, both demonstrating the presence of a genetically distinct Balkan lineage and paraphyly of Melitaea athalia with respect to Melitaea celadussa. Population genetic STRUCTURE analyses supported the presence of two species when using ddRAD data, but three species when using target enrichment, while a Bayes factor delimitation analysis found both two and three species scenarios equally decisive in both datasets. As the results obtained from both methods were largely congruent, we discuss some practical considerations and benefits of target enrichment over RAD sequencing. We conclude that the choice of method of genomic data collection does not influence the results of phylogenetic analyses at alpha taxonomic level, given a sufficient number of loci. Finally, we recommend a solution for delineating species in parapatric scenarios by proposing that parapatric taxa be consistently classified as subspecies or complete species, but not both, to promote taxonomic stability.

pared newly generated target enrichment dataset with 1733 loci to the already available ddRAD data from a previous study on the same set of specimens using a suite of phylogenetic, population genetic, and species delimitation methods. We recovered consistent phylogenetic relationships across the datasets, both demonstrating the presence of a genetically distinct Balkan lineage and paraphyly of Melitaea athalia with respect to Melitaea celadussa. Population genetic STRUCTURE analyses supported the presence of two species when using ddRAD data, but three species when using target enrichment, while a Bayes factor delimitation analysis found both two and three species scenarios equally decisive in both datasets. As the results obtained from both methods were largely congruent, we discuss some practical considerations and benefits of target enrichment over RAD sequencing. We conclude that the choice of method of genomic data collection does not influence the results of phylogenetic analyses at alpha taxonomic level, given a sufficient number of loci. Finally, we recommend a solution for delineating species in parapatric scenarios by proposing that parapatric taxa be consistently classified as subspecies or complete species, but not both, to promote taxonomic stability.

K E Y W O R D S
admixture, ddRAD sequencing, genomics, hybrid zone, parapatry, species delimitation, target enrichment

INTRODUCTION
Most biologists would likely agree that, of all the taxonomic units, that of a species is fundamental (de Queiroz, 2005;Hohenegger, 2014).
The reality of species as a truly discrete biological entity is however unlikely to hold true universally as speciation is usually a long and gradual process (de Queiroz, 2007). The ability to categorize biological diversity into distinct species, that is, species delimitation, is a prerequisite for the identification of individual organisms. The latter, in turn, is important in most biological research, including conservation, but also in many other disciplines and areas of society, such as legislation and food industry (Tewksbury et al., 2014). Despite the global efforts to document and describe the bewildering diversity of life on earth, most species remain undescribed or even undiscovered (Mora et al., 2011). In addition to the complexities caused by the usually low differentiation of recently diverged species, traits such as phenotypic plasticity and morphological crypticity displayed by some species seriously complicate the process of species delimitation (Jarman & Elliott, 2000;Knowlton, 1993). Delimitation is further complicated by several other factors. For instance, many species have a large or patchy distribution and show extensive intraspecific variability in geographical space. Additionally, hybridization and introgression between species have been shown to be much more common than previously expected. It has been estimated that 10% of animal species and at least 25% of all plant species are involved in hybridization (Mallet, 2005). Finally, there is a variety of ways to define the term 'species'. Some species concepts have a focus on temporal changes (i.e., when speciation occurs in time) while others focus on spatial (when geographic isolation between diverging populations) or biological (reproductive isolation) aspects of variability (de Queiroz, 2007), rendering the definition of species even more elusive. Despite these difficulties, categorizing biodiversity into species remains a central commitment of taxonomy.
A wide geographical distribution or a geographically biased taxonomic sampling often complicates the process of species delimitation to a significant degree (Linck et al., 2019;Mutanen et al., 2012). When the distributions of species are large, extensive genetic differences among populations in different areas can be maintained, despite migration (Gavrilets et al., 2000). The same may be true on a smaller scale when dispersal is limited by species' own traits or geographic barriers. In European butterflies, the degree of population differentiation is positively correlated with range size and negatively with traits determining their mobility and level of generalism (Dapporto et al., 2019). Also, variation in environmental conditions can result in selection acting differently in different areas within species ranges (Gavrilets et al., 2000). The geographic relationship between species or populations has traditionally been categorized as either allopatric, parapatric, or sympatric (Hendry et al., 2009) although there is a full continuity between these distributional categories. Parapatry refers to the distributional pattern where the separate ranges of two species have a narrow overlap (Bull, 1991), typically within a hybridization zone (Hewitt, 1988). While such circumstances provide an opportunity to peek into the speciation process through reinforcement, parapatry, in the presence of gene flow, poses a particular challenge for species delimitation, as the end point of the speciation process is often vague and gene flow may be extensive (Hendry et al., 2009).
Indeed, the well-documented cases of parapatry demonstrate how utterly complicated species delimitation under such circumstances can be. A well-known example of such a system is that of carrion and hooded crows, Corvus corone (Linnaeus) and Corvus cornix (Linnaeus), in Europe (Poelstra et al., 2013;Saino & Villa, 1992). These two taxa are now regarded as two different species  despite high levels of interbreeding and introgression.
Development of high-throughput sequencing technologies has enabled access to whole genomes or large proportions of genomes that can potentially provide useful information to resolve the patterns of evolution from population to very deep phylogenetic levels Herrera & Shank, 2016). Two of the popular reduced representation approaches, double digest restriction-site associated DNA sequencing (ddRAD; Peterson et al., 2012) and target capture methods such as ultraconserved elements (UCEs;Faircloth et al., 2012), anchored hybrid enrichment (AHE; Lemmon et al., 2012) and other target enrichment approaches (Gnirke et al., 2009, see also Mayer et al., 2016), are now widely used in systematics research.
RADseq approaches have been used particularly to elucidate evolutionary patterns at relatively shallow phylogenetic levels (Wagner et al., 2013), while target capture-based approaches can be used for deep as well as shallow-level phylogenomics (Banker et al., 2020 (Van Oorschot & Coutsis, 2014;Higgins, 1955;Platania et al., 2020). The species are morphologically distinguished based on male genitalia, but intermediate male genital characters have been reported from the contact zone (Beuret, 1933;Van Oorschot & Coutsis, 2014). Genomic analyses done using a ddRAD dataset on this species pair revealed wide admixture in the contact zone and across a part of the range of M. athalia, but less so in M. celadussa, suggesting a mostly unidirectional introgression (Tahami et al., 2021). Additionally, this analysis revealed a genetically diverging and non-admixed, but morphologically indistinguishable clade in the Balkans that had not been detected previously. Using morphometrics of the male genitalia, the same study reported several specimens with intermediate genital characters in the contact zone.
As parapatric systems are likely to reveal a full spectrum of genetic admixture across different systems, taxonomic delimitation of parapatric taxa is bound to be almost inherently arbitrary. However, despite the conceptual nature of their delimitation, there is no consensus in the taxonomic community on the principles of how such taxa should be taxonomically ranked. We believe that genomic approaches could provide excellent means for higher levels of T A B L E 1 Comparison of number of raw reads from the target enrichment and ddRAD datasets standardization for delimiting parapatric taxa by enabling the evaluation of important biological properties, such as levels of introgression and genetic divergence, in a quantitative way and based on large amounts of standardized genetic markers (Dietz et al., 2019;Eberle et al., 2020). Hence, motivated by the low number of studies that use genomics to assess parapatric species systems, we tackled the question of their delimitation based on two powerful genomic approaches, ddRAD sequencing and target enrichment. We compared the two methods making use of phylogenetic, population genetic and species delimitation methods, because species delimitation essentially acts at the interface of phylogenetic and population genetic levels. Apart from comparison of the results and usefulness of ddRAD sequencing versus target enrichment, we provide insights into the conceptual aspects of the delimitation of parapatric taxa with admixture in order to advance the stability and consistency of their delimitation.

Sample preparation
Samples were collected from various locations across Europe and were representative for the range of the targeted taxa, including their contact zone (Table 1). Initially, a set of 81 specimens was used for ddRAD analyses (Tahami et al., 2021) and a subset of 60 specimens was selected for target enrichment analyses. These included specimens of M. athalia and M. celadussa, as well as one specimen putatively attributed to Melitaea caucasogenita Verity and one Melitaea britomartis Assmann as outgroup taxa. Because some specimens were females and it was not clear how well they can be identified based on genitalia, DNA barcodes were used for a priori identification and labelling of specimens. We also categorized specimens occurring in the supposed contact zone using the biodecrypt approach (Platania et al., 2020), which implements dedicated functions to identify areas of sympatry among cryptic taxa based on a subset of identified specimens (details are given in the Supporting information). As per this analysis, we inferred 11 specimens as belonging to the contact zone (Table S2). Genomic DNA was extracted from two thirds of the thorax from either ethanol preserved or dry specimens. DNA extraction was done using the QIAGEN DNeasy Blood and Tissue kit following the protocol provided by the manufacturer. DNA extracts were visualized on 1% agarose gels.

Target enrichment laboratory procedures
Target enrichment bait design followed Mayer et al. (2021) where the final probe kit targets 2953 CDS regions in 1753 nuclear genes. This kit was developed with BaitFisher version 1.2.8 (Mayer et al., 2016) and is referred to as the LepZFMK 1.0 kit. The DNA concentration of each sample was quantified using a Promega Quantus fluorometer, and the initial fragment lengths were measured with a Fragment Analyser (Agilent Technologies Inc.). About 100 ng absolute amount of DNA was taken for further processing as per standard Agilent protocol. The genomic DNA was subjected to random mechanical shearing to an average size between 250 and 300 bp followed by an end-repair reaction and ligation of adenine residue to the 3 0 end of the blunt fragments (A-tailing) to allow ligation of barcoded adaptors using the

Target enrichment bioinformatics
Adaptor sequences and low-quality regions were trimmed from demultiplexed data with fastq-mcf (Aronesty, 2011). The reads were mapped against the reference gene alignments of the bait regions generated during bait design with the BWA-MEM algorithm in bwa 0.7.17 (available from bio-bwa.sourceforge.net) with the minimum seed length set to 30. Reads for which the mapping was successful were then extracted from the resulting SAM file using a custom Perl script (Dietz et al., 2019) and mapped against a full coding sequence with BWA as described above. Diploid consensus sequences of the regions matching the reference were generated with samtools 1.6 (Li et al., 2009) and bcftools 1.6 (https://github.com/samtools/bcftools).
The consensus sequences were further converted into alignments by another custom script (Dietz et al., 2019) to generate the individual gene alignments. This was followed by a gap removal from these alignments. In order to minimize the amount of missing data, which can potentially introduce biases, we generated two datasets: one with the loci that are present in at least 50% of the specimens (TE30) and another with loci that are present in all the specimens (TE60). These alignments were then manually checked in Geneious version 6.1.8 to make sure that there were no misalignments. Finally, individual loci alignments were concatenated for the initial phylogenetic analyses using FASconCAT-G (Kück & Longo, 2014).

SNP calling
We extracted SNPs from the sorted BAM files generated during mapping in the TE30 dataset. SNPs were called using the samtools mpileup option piped together with the bcftools call option. Then filtering was done using bcftools, keeping one randomly chosen SNP per locus, filtering out indels and multiallelic SNPs. The resulting dataset contained a total of 3164 SNPs, which were used for further analysis.
For ddRAD data, 18,383 SNPs were generated after the assembly and filtering where only those SNP sites for which data was present for at least 20 samples (m20) were taken into consideration.

Phylogenetic analyses
For the target enrichment data, the best partitioning scheme for the concatenated dataset was found using IQTREE version 2.0.3 (Chernomor et al., 2016;Minh, Schmidt, et al., 2020b) with option-m TESTMERGEONLY to resemble PartitionFinder and rcluster algorithm (Lanfear et al., 2014), with the rcluster percentage set to 10, under the AICc criterion. The best partitioning scheme was then used as an input to set up a partitioned analysis in IQ-TREE. We used the ultrafast bootstrap approximation with 1000 replicates (Hoang et al., 2018). We also performed a SH-like approximate likelihood ratio test (Guindon et al., 2010) with 1000 bootstrap replicates using the -alrt option. To further reduce the risk of overestimating branch supports, the -bnni option was used. For ddRAD data, the TVM+F+I+G4 model of sequence evolution was used to reconstruct the ML tree in IQTREE using the ultrafast bootstrap approximation to compute 1000 replicates (Tahami et al., 2021). For each of the TE30, TE60, and ddRAD datasets, the NNI searches were performed using 20 best initial trees. The resulting phylogenetic trees were visualized using FigTree (https:// github.com/rambaut/figtree/releases) and rooted on M. britomartis.
We compared the resulting trees using the tipdiff function in the R package treespace (Jombart et al., 2017). This function finds the number of differences in ancestry of the tips of two different trees with the same tip labels.

Species tree analyses
Taking into consideration the gene tree-species tree discordance and incomplete lineage sorting that is common in datasets with multiple loci, we inferred a species tree using the summary-based coalescent method ASTRAL-III v. 5.7.3 (Zhang et al., 2018), which is statistically consistent under the multispecies coalescent framework. Model selection on each individual gene alignment from the TE60 dataset was performed using ModelFinder (Kalyaanamoorthy et al., 2017) and tree inference was done in IQTREE. The resulting output gene trees were used as an input for ASTRAL, which generated a species tree along with the quartet score. This score is the fraction of the induced quartet trees in the input set that occur in the species tree. The

Gene and site concordance factor
We calculated gene and site concordance factors (gCF and sCF) in IQTREE version 2.0.3 (Minh, Hahn, & Lanfear, 2020a) using the same set of gene trees used as an input for ASTRAL. gCF is calculated for each branch of a species tree as the fraction of decisive gene trees concordant with this branch and sCF as a fraction of decisive alignment sites supporting that branch (Minh, Hahn, & Lanfear, 2020a

Population genetics
To get an initial idea of genetic clusters of related species present in the dataset, a principal component analysis (PCA) was carried out on the target enrichment SNP dataset and on all genotypes as well as unlinked SNPs obtained from ddRAD dataset using the dudi.pca function from the R package adegenet (Jombart & Ahmed, 2011). We also performed a STRUCTURE analysis (Pritchard et al., 2000) to compare the admixture patterns inferred from ddRAD data and the target enrichment data. For this, our target enrichment SNP data in vcf format was converted to structure format (.str) using PGDSpider version 2.1.1.5 (Lischer & Excoffier, 2012). We tested five putative numbers of clusters, K = 1-5, with 10 iterations for each K. To determine the optimal number of genetic clusters (K), we used the ΔK method in STRUC-TURE HARVESTER (Earl & vonHoldt, 2012;Evanno et al., 2005) with 500,000 generations for the Markov chain and a value of 100,000 as burn-in. The same parameters were used for the target enrichment and the ddRAD dataset. We then aligned the cluster assignments of K = 2, K = 3, and K = 4 across all the 10 replicates in CLUMPP (Jakobsson & Rosenberg, 2007) and used DISTRUCT (Rosenberg, 2004) to visualize the patterns of admixture from aligned clusters for both the datasets.

Species delimitation
We performed a coalescent-based Bayes factor species delimitation implemented in SNAPP (Bryant et al., 2012) using the BFD* method (Leaché et al., 2014) on both target enrichment and ddRAD datasets.
For ddRAD data, we used unlinked SNPs to test the species delimitation scenarios in SNAPP. We generated a subset of 30 taxa (Table S1) and tested two alternate scenarios. In the first scenario, the Balkan line-  (Kass & Raftery, 1995) were calculated as the difference between current taxonomy and each of the alternate scenarios, which is then used to assess the strength of species delimitation models. As per the BF scale, 0 < BF <2 is not worth more than a bare mention, 2 < BF <6 indicates positive evidence, 6 < BF < 10 represents strong support, BF > 10 indicates decisive support. The calculation of Bayes Factors and ranking of different models followed the steps mentioned in Leaché and Bouckaert (2018).
In addition to the SNAPP analysis, we also tested the alternate species assignments using the tr2 program, a multilocus species delimitation method that finds the best delimitation based on a distribution model of rooted triplets (Fujisawa et al., 2016). We used rooted individual gene trees from the TE60 dataset as input. The two models we tested, which correspond to the two-species and three-species hypotheses respectively, were compared against a null model (which assumes the presence of a single species) based on -log (likelihood) scores.

Overview of the datasets
After the read filtering step, an average of 1.7 million reads was recovered across all the specimens for the target enrichment dataset ( with the average amount of missing data dropping to 0.60% (Table 2).
After removing the loci with zero or few variable sites, 1002 loci were retained. For the ddRAD dataset, the average number of loci retained after assembly was 5071 (Table 2), but the percentage of missing data was much higher than for the target enrichment dataset (average 78.9%, Table 2).

Target enrichment phylogenetics
Initial identification and labelling of our specimens followed the information provided by their mitochondrial COI barcodes. Two separate trees were generated for the two target enrichment datasets, TE30 and TE60. In both cases, six specimens of M. athalia originating from the Balkans, mainly the countries Albania, Bulgaria, Serbia and Greece (Balkan clade, atha14F407, atha14F660, atha14F666, atha14B773, atha14E859, atha14E853) were recovered as a distinct lineage that was sister to the rest with both ultrafast bootstrap and SH-aLRT support values of 100% (Figures 1 and 3a). However, this Balkan clade did not include all the specimens from the Balkan region that we T A B L E 2 Overview of number of loci and missing data for the target enrichment and ddRAD datasets

ASTRAL species tree and concordance factors
The general patterns found in the ASTRAL species tree were congru-

Comparisonphylogenetics
We compared a phylogenetic tree obtained from the TE60 dataset with a phylogenetic tree generated using ddRAD data, excluding the taxa that are not present in the target enrichment dataset (Figure 3a).
The general phylogenetic relationships were the same in both datasets.
Both methods suggested the presence of a distinct Balkan lineage, which in both cases included the same six samples (Figure 3a

Comparisonpopulation genetics
The PCA plots for both target enrichment and ddRAD showed a clade of M. athalia from the Balkan, that was found to be genetically distinct from all other studied specimens, including some other specimens from Balkan (Figure 3b). The clusters for the rest of the M. athalia and M. celadussa were found to overlap to some extent in PCA plots for both datasets (Figure 3b). The analysis of genomic admixture using  (Figure 4b).
F I G U R E 2 (a) ASTRAL species tree based on the target enrichment TE60 dataset and rooted with M. britomartis where numbers on the branches are quartet support scores. Contact zone specimens are indicated with the asterisks (b) sCF versus gCF plot for the TE60 dataset

Comparison -species delimitation
Based on MLE (Table 3), the three-species scenario achieved the highest rank among the two scenarios tested for both datasets (

DISCUSSION
Using a parapatric pair of Melitaea butterflies as a model system, we compared two genomic approaches for their utility in elucidating patterns of genetic structure and admixture, and investigated whether F I G U R E 3 (a) Comparison of maximum-likelihood phylogenetic trees obtained from the TE60 dataset (on the left) and ddRAD dataset (on the right). Both trees are rooted with Melitaea britomartis. Numbers on the branches indicate SH-aLRT/UFBoot support values calculated based on 1000 replicates. Contact zone specimens are indicated with the asterisks (b) PCA plots for target enrichment SNP dataset (on left, PC1 and PC2 with variances 6.93% and 4.57%, respectively) and ddRAD dataset (on right, PC1 and PC2 with variances 3.46% and 3.23%, respectively).
genome-wide datasets could provide means for well-informed and robust species delimitation under parapatry. Both sequence capture and RADseq have been independently explored for species delimitation before (Gueuning et al., 2020;Pante et al., 2015;Smith et al., 2014), but to the best of our knowledge, our study is among the first to focus on species delimitation in a parapatric system (Linck et al., 2019 being a rare exception). Additionally, only few studies have compared two genomic approaches in the same study system (Harvey et al., 2016;Leaché et al., 2015;Manthey et al., 2016), all of them utilizing UCEs as a primary sequence capture method. Consequently, there has been little discussion over the benefits and drawbacks of various genomics methods in species delimitation. In our study system, both the ML inference and ASTRAL species tree inference recovered the same set of major clades in both the target enrichment and ddRAD approaches, thus providing a consistent phylogenetic picture for the group. M. celadussa was long considered as a subspecies of M. athalia (Higgins, 1955) but was recently given a full species status (Leneveu et al., 2009;Wiemers et al., 2018). We  (Table 2). For the species delimitation analyses, both the three-species and two-species scenarios received a decisive support by target enrichment and ddRAD, suggesting that both datasets supported each of the two scenarios equally. However, testing of alternate species assignments using tr2 delimitation for target enrichment TE60 data favoured the three-species hypothesis over the twospecies one. Thus, no single hypothesis was consistently supported by all of the different methods.

Target enrichment versus ddRADbenefits and pitfalls
In target enrichment methods the sequencing efforts are concentrated on a pre-defined set of loci, and therefore high coverage and little missing data is expected . Different target enrichment methods mainly differ in the gene regions they target and, as researchers can select loci at desired levels of divergence (Banker et al., 2020), they have been shown to be generally efficient in elucidating affinities at both deep and shallow phylogenetic scales (Bagley et al., 2020;Espeland et al., 2018;Hamilton et al., 2016).
Using a recently developed target enrichment kit that targets about 2900 CDS regions , we were able to recover many gene loci across the individuals even after stringent filtering criteria (1002 gene loci for TE60) which shows the capture experiment to be successful. We found that the number of loci captured was sufficient to uncover the phylogenetic relationships of Melitaea irrespective of whether loci were concatenated and assumed to have the same genealogical history, or if they were allowed to have independent histories and analysed within a summary coalescent framework using ASTRAL. This demonstrates that the target enrichment loci contain sufficient phylogenetic information that could be useful to delimit closely related taxa. Also, the amount of missing data in both target enrichment datasets was very low compared to the ddRAD dataset.
RADseq methods are inexpensive and quick compared to target enrichment and do not need genomic resources to design probes.
However, they are most useful at shallow scales of divergence, as the number of homologous loci obtained decreases with phylogenetic distance (Lee et al., 2018;Wagner et al., 2013). Furthermore, the locus dropout effect also increases with divergence, leading to non-random patterns of missing data (Arnold et al., 2013). RADseq methods also require high quality material, that is, better preserved tissue to successfully extract sufficient amounts of DNA, while it is possible to recover genomic data from old museum material using target enrichment Mayer et al., 2021).
It has been suggested that the method of data collection should not influence the results of phylogenetic analyses at intermediate levels of divergence (Manthey et al., 2016). From our comparison of target enrichment and ddRAD, this should also be the case at an alpha taxonomic level. Therefore, the choice of method to tackle species delimitation at such levels could be entirely up to the researcher based on resources available and quality of the samples. Here, some practical aspects are worth taking into consideration: target enrichment is generally more expensive than RADseq because of the costs associated with the library preparation and purchasing enrichment probes (Harvey et al., 2016). The target enrichment laboratory work tends to be slower due to additional hybridization and enrichment steps although with the newer kits it is now significantly faster. However, as the loci obtained from RADseq are random and not known, different RAD datasets are hard to combine and are not comparable due to their unique nature. Different RAD datasets are also likely to be composed of loci with different levels of conservativeness (Lee et al., 2018). Target enrichment has an important advantage in terms of cross-compatibility, since the targeted loci remain fixed. As the loci being recovered are already known in target enrichment and usually have low levels of missing data, this could be an advantage in identifying a universal set of markers useful for species delimitation. It would also enable a comparison across different species groups, as well as combining different datasets, and hence provide means for standardized delimitation of, for example, allopatric populations and other settings where delimitation of species is inherently difficult.
Recently, the use of a standard set of nuclear markers for species delimitation and identification was proposed (Eberle et al., 2020), which can be studied in and compared across all animals and would allow disentangling recently diverged lineages. This idea was tested by Dietz et al. (2021) using empirical data from several metazoan lineages. They found that so-called Universal Single-Copy Orthologs (USCOs) performed better than DNA barcodes to delimit closely related species, irrespective of the assembly approach or tree reconstruction method used . A similar idea of having a unified set of loci for phylogenomic and population genetic studies has also been explored by Singhal et al. (2017), but exclusively for squamates. Basing species delimitation on a standard set of molecular markers would be an important step toward a more stabilized taxonomy, particularly under conditions where delimitation is bound to be arbitrary. This concerns the delimitation of allopatric and parapatric populations, as well as asexual strains, but would likely turn out to provide efficient means for a delimitation in all settings.

The conundrum of parapatric species delimitation
Finding universally applicable criteria for species delimitation is rendered extremely challenging particularly due to the complexity of biological systems (Eberle et al., 2020) and the semantics around the meaning of the term 'species ' (de Queiroz, 2007). We find reaching a consensus over the definition of species not foreseeable and hence do not discuss this further here, but we are slightly more optimistic that the disagreement over the epistemological aspects of species delimitation could be overcome by adopting efficient and standardizable approaches such as those presented by Eberle et al.
Parapatric taxa constitute an unusually complex case for species delimitation, because they typically show frequent hybridization in the contact zone (Bull, 1991;Hewitt, 1988) and genetic admixture that may extend far beyond the zone of contact (Johnson et al., 2015;Osada et al., 2010). Parapatric taxa tend to be morphologically and ecologically very similar, and morphologically intermediate individuals may occur (Guiller et al., 2017;Saino & Villa, 1992;Slender et al., 2017). From a temporal perspective, parapatric systems are presumably usually young-although initial differentiation may be much older (Ebdon et al., 2021)-and may have undergone either slow merging of populations or increased differentiation, the latter being promoted by disruptive selection and reinforcement (Barton & Hewitt, 1985). These complexities also characterize our study system, which we assume is resulting from a postglacial secondary contact of populations differentiated in two refugia during the last glaciation, and possibly in earlier ones (Tahami et al., 2021).
As the delineation of parapatric taxa is inherently arbitrary and largely subjective, delimiting admixing parapatric taxa in a consistent

CONCLUSIONS
The emergence of high-throughput genetic tools is revolutionizing systematic research at all phylogenetic levels. Many platforms and protocols to generate genomic-scale datasets are available, each with their specific strengths and shortcomings. This poses a question: which particular method provides the best means to address the research question at a given phylogenetic level? In this work, we focused on answering this question at the interface of population and phylogenetic levels, using a particularly challenging system of two parapatric butterfly species with frequent introgression and widespread admixture as a model. While both ddRAD sequencing and target enrichment methods provided largely congruent pictures of the evolutionary history of our study system, the latter has the benefit of providing higher levels of scalability. Compared to ddRAD datasets, sequence capture is characterized by a dramatically lower degree of missing data due to a smaller locus dropout effect and sequencing of a predefined set of loci at a high coverage. Presently, there is no consensus over the principles for delimitation of parapatric taxa. We argue that this is inherently based on arbitrary criteria and reaching a widely accepted consensus is desperately needed. As patterns of admixture like those observed here are more a rule than an exception in parapatric systems, we propose to regard the admixing parapatric taxa in a consistent way. Considering them regularly either as subspecies or full species would promote taxonomic stability, but both solutions would also be characterized by some shortcomings.