Large geographic distance versus small DNA barcode divergence: Insights from a comparison of European to South Siberian Lepidoptera

  • Peter Huemer ,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

    p.huemer@tiroler-landesmuseen.at

    Affiliation Naturwissenschaftliche Sammlungen, Tiroler Landesmuseen Betriebsges.m.b.H., Innsbruck, Austria

  • Paul D. N. Hebert,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada

  • Marko Mutanen,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Resources, Validation, Writing – original draft, Writing – review & editing

    Affiliation Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland

  • Christian Wieser,

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Resources, Validation, Writing – review & editing

    Affiliation Landesmuseum Kärnten, Klagenfurt, Austria

  • Benjamin Wiesmair,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Resources, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Naturwissenschaftliche Sammlungen, Tiroler Landesmuseen Betriebsges.m.b.H., Innsbruck, Austria

  • Axel Hausmann,

    Roles Conceptualization, Formal analysis, Investigation, Validation, Writing – original draft, Writing – review & editing

    Affiliation Section Lepidoptera, Bavarian State Collection of Zoology, Munich, Germany

  • Roman Yakovlev,

    Roles Conceptualization, Funding acquisition, Investigation, Project administration, Validation, Writing – original draft, Writing – review & editing

    Affiliations Ecology Department, Altai State University, Barnaul, Russia, Tomsk State University, Tomsk, Russia

  • Markus Möst,

    Roles Conceptualization, Data curation, Formal analysis, Resources, Software, Visualization, Writing – review & editing

    Affiliation Department of Ecology, University of Innsbruck, Innsbruck, Austria

  • Brigitte Gottsberger,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Validation, Visualization, Writing – review & editing

    Affiliation Department of Botany & Biodiversity Research, University of Vienna, Vienna, Austria

  • Patrick Strutzenberger,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Validation, Writing – review & editing

    Affiliation Department of Botany & Biodiversity Research, University of Vienna, Vienna, Austria

  • Konrad Fiedler

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Botany & Biodiversity Research, University of Vienna, Vienna, Austria

Abstract

Spanning nearly 13,000 km, the Palearctic region provides an opportunity to examine the level of geographic coverage required for a DNA barcode reference library to be effective in identifying species with broad ranges. This study examines barcode divergences between populations of 102 species of Lepidoptera from Europe and South Siberia, sites roughly 6,000 km apart. While three-quarters of these species showed divergence between their Asian and European populations, these divergence values ranged between 0–1%, distinctly less than the distance to the Nearest-Neighbor species in all but a few cases. Our results suggest that further taxonomic studies may be required for 16 species that showed either extremely low interspecific or high intraspecific variation. For example, seven species pairs showed low or no barcode divergence, but four of these cases are likely to reflect taxonomic over-splitting while the others involve species pairs that are either young or show evidence for introgression. Conversely, some of the nine species with deep intraspecific divergence at varied spatial levels may include overlooked species. Although these 16 cases require further investigation, our overall results indicate that barcode reference libraries based on records from one locality can be very effective in identifying specimens across an extensive geographic area.

Introduction

In many cases, DNA barcoding can be an effective tool for both specimen identification and species discovery. In animals, a 648 base pair segment of the mitochondrial cytochrome c oxidase subunit 1 (COI) gene has been adopted as the barcode region [1], [2]. Numerous researchers have added data to BOLD, the Barcode of Life Data Systems (www.boldsystems.org), which at present includes more than 6 million barcode records from about 550,000 operational taxonomic units (i.e. BINs–see [3]). Currently, more than 22,000 registered users are accessing these records. Despite varied coverage among taxonomic groups and regions, these data are increasingly useful to address diverse research questions in ecology and evolutionary biology.

One important issue that needs further investigation relates to the performance of barcode-based species identifications across large distances. In particular, since species’ distributions vary from narrow endemism to global occurrence, it needs to be assessed whether DNA barcodes from one site or region can be used to identify specimens of the same species from distant localities. This is especially important in the Palearctic region because its elongate axis spans more than 13,000 km and many species are thought to occur from the Atlantic Ocean in the west to the Pacific Ocean in the east. For the same reason, the Palearctic region is ideal to quantify the influence of geographic distance on intraspecific variation under relatively comparable conditions (similar ecotypes). In recent years, a few studies on Lepidoptera have examined the congruence of DNA barcodes across larger geographic distances including 1,000 species shared by Fennoscandia and Central Europe [4], butterflies from Central Asia [5], and 1,500 species of Noctuoidea in North America [6]. However, such studies remain the exception and, in contrast to the present paper, cover either a single taxonomic subgroup of Lepidoptera or a comparatively small geographic distance. Moreover, most prior work has examined patterns of sequence variation at a national or regional level [7], [8], [9].

Ideally, DNA barcodes from specimens collected at a single locality would enable the identification of conspecifics from the entire species distribution. This might not be the case if intraspecific sequence variation within widespread taxa is greater than interspecific differences. In other words, identification problems will arise whenever intraspecific variation blurs the ‘barcode gap’ which is critical to assign specimens to their correct species, either a Linnaean name or an Operational Taxonomic Unit (e.g. BIN). In such cases DNA barcodes fail to correctly identify species and additional diagnostic characters, particularly morphological traits and high density genetic markers, have to be considered to firmly identify species.

Our study is the first to examine patterns of DNA barcode variation across a very large geographic range for a broad set of Lepidoptera (102 species, 22 families) shared by Europe and South Siberia. Specifically, we ascertain levels of barcode divergence between putatively conspecific specimens from southern Siberia, i.e. Russian Altai, and Europe, particularly Northern Europe and the Alps. Although higher intraspecific variation within populations spanning Siberia and Europe compared to the respective populations from each region considered separately can be expected, the magnitude of this variation will determine whether an effective system for DNA barcode-based identifications can be based on a narrowly parameterized reference library. To examine this matter, we compared intraspecific divergences between populations of 102 species from Siberia and their divergences to the 5,016 species (41,583 specimens) in a carefully validated dataset of European Lepidoptera [10]. We also ascertained if intraspecific distances are lower in species with a near-continuous Euro-Siberian distribution than in those with a disjunct arctic-alpine or central-Asian-alpine distribution. Finally, we asked if patterns of isolation by geographic distance as measured by COI barcode sequences are influenced by overall sequence divergence or distribution type.

Material and methods

Taxon sampling strategy

This study examined two Palearctic sub-regions separated by a distance of about 6,000 km: central/northern Europe with a focus on the Alps and Finland, and South Siberia (Altai Republic, Russia), supplemented by a few reliably identified specimens from other areas (Fig 1).

Fig 1. Geographic origin of the voucher specimens for the 102 sequenced species of Eurasian Lepidoptera.

Map created with SimpleMappr (http://www.simplemappr.net).

https://doi.org/10.1371/journal.pone.0206668.g001

Whereas DNA barcode coverage for lepidopteran taxa is generally high for species from central and northern Europe, only a few records are available from South Siberia. We therefore sought to obtain specimens of >100 species shared by these regions. We focused on species with a disjunct arctic-alpine and South Siberian-alpine distribution based on the expectation that they would be likely to show higher intraspecific barcode variation.

Species identification was exclusively based on morphological traits.

In general, we analyzed three specimens from South Siberia for each of these species to estimate intraspecific divergence, but only two specimens were available for 18 species whereas for 15 species the number of voucher specimens ranged between 4 and 8. The average number of successfully sequenced specimens per species from Asia was 3.24. By comparison, the number of sequenced specimens was much higher for most European representatives of these species with 16.34 sequenced specimens per species on average. Existing specimens from museum collections were analyzed where possible and were supplemented with material from an expedition to the Russian Altai Mountains from late July to mid-August 2016 [11]. A permit was not required for the Altai specimens as no protected species were collected. Collections in other countries were made in compliance with current legislation. In Finland, permits were issued by the Finnish Centre for Economic Development, Transport, and the Environment to MM under permissions VARELY/441/07.01/2012 and LAPELY/275/07.01/2012, while collecting permits were not necessary for scientific research in Austria/Tyrol. The Nagoya protocol was not applicable because our European material was collected before October 12, 2014 and because the protocol has not been ratified by Russia.

Most sequences considered in this study derive from specimens held in the Tiroler Landesmuseum Ferdinandeum, Innsbruck, Austria; the University of Oulu, Oulu, Finland; the Bavarian State Collection of Zoology, Munich, Germany; and another 25 specimen depositories. Wherever possible, data were supplemented by publicly available sequences in BOLD ([12], see http://www.boldsystems.org).

DNA sequencing

For freshly collected specimens, a single leg was removed and placed in a 96-well lysis plate that was submitted for analysis to the CCDB (Canadian Centre for DNA Barcoding, University of Guelph, Canada) where DNA extraction, PCR amplification, and sequencing were performed following standard high-throughput protocols [13].

Altogether, 315 specimens of 102 South Siberian species that also occur in Europe were sequenced. Moreover, we examined 1,682 previously published sequences (>500 bp) [10] from specimens of the same species from sites in Europe including Finland (423), Austria (410), Germany (329), Russia (315), and 19 other countries (520) (Fig 1). Information regarding the institutions hosting each publicly available specimen, sample and process IDs and GenBank accession numbers are available in S1 Table. Further details on each specimen, including complete voucher data, and images are available on BOLD [12] in the public dataset “Lepidoptera of Altai Mountains (DS-LEPEUALT)” under the DOI: 10.5883/DS-LEPEUALT.

Data analysis

The extent of intraspecific sequence variation in the COI sequences for each species was estimated under the Kimura-2-parameter (K2P) model of nucleotide substitution using analytical tools on BOLD v4.0 (http://www.boldsystems.org) and MEGA v.6 [14]. The choice and justification of K2P and other distance measures used in barcoding analyses have been debated (e.g., [15]). However, the ‘best method’ depends on the dataset under consideration, and the effects of different distance measures and models on the distances and identification success are generally small (e.g., [16]). Model choice is therefore unlikely to affect the main results of this work, and we applied the K2P method as implemented in BOLD. For each species we obtained four estimates of intraspecific divergence by calculating the arithmetic mean for all pairwise distances (K2P) among conspecific individuals within the following spatial contexts: (a) ‘total intraspecific’ (mean distance for all data for each species); (b) ‘within Europe’ (mean distance for all European samples); (c) ‘within Asia’ (mean distance for all South Siberian-Central Asian samples); and (d) ‘inter Europe-Asia’ (mean distance within each species for all pairs of specimens from Europe vs. Asia).
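The four estimates described above can be illustrated with a minimal sketch. It is not the BOLD or MEGA implementation: it assumes pre-aligned sequences, skips sites with gaps or ambiguous bases pair by pair, and uses the hypothetical function names `k2p_distance` and `mean_divergences` and the hypothetical region labels "Europe" and "Asia". The K2P distance is computed as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def mean_divergences(records):
    """Mean pairwise K2P per spatial context for one species.

    records: list of (region, sequence); region must be 'Europe' or 'Asia'
    (hypothetical labels chosen for this sketch).
    """
    buckets = {
        "total": [],
        "within Europe": [],
        "within Asia": [],
        "inter Europe-Asia": [],
    }
    for (reg_a, seq_a), (reg_b, seq_b) in combinations(records, 2):
        d = k2p_distance(seq_a, seq_b)
        buckets["total"].append(d)
        if reg_a != reg_b:
            buckets["inter Europe-Asia"].append(d)
        else:
            buckets["within " + reg_a].append(d)
    # None marks a context with no specimen pairs.
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}
```

Distances are returned as proportions; multiplying by 100 gives the percentages reported in the text. A context with fewer than two specimens yields `None` rather than zero, so that missing data are not mistaken for identical sequences.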

Furthermore, we examined the potential impact of distribution type on intraspecific divergences. For this analysis, each species was assigned to one of two categories: (a) those with largely continuous distributions across Eurasia, i.e. with known gaps <500 km; and (b) those with highly disjunct distributions, i.e. with gaps between known populations >2,000 km. These two categories basically reflect what has been termed Euro-Siberian versus arctic-alpine and South Siberian-alpine distribution patterns in biogeographic studies [17].
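The two categories can be written as a simple rule on the largest known gap between populations. This is only a sketch of the thresholds stated above; the function name `classify_distribution` and the decision to return `None` for gaps between 500 and 2,000 km (a range the text does not assign to either category) are assumptions.

```python
def classify_distribution(largest_gap_km):
    """Assign a distribution type from the largest known range gap (km)."""
    if largest_gap_km < 500:
        return "continuous"  # Euro-Siberian pattern
    if largest_gap_km > 2000:
        return "disjunct"  # arctic-alpine or South Siberian-alpine pattern
    return None  # assumption: intermediate gaps are left unclassified
```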

We compared mean intraspecific sequence divergences across the three spatial levels (intra-Europe, intra-Asia, inter-Europe-Asia) using a non-parametric Friedman ANOVA of ranks because of uneven variance and sequence numbers for the 102 species. Total mean intraspecific barcode divergence between the two types of species distributions was compared using a Mann-Whitney U-test. In addition, we examined the strength of isolation by distance within every species. For this purpose, we calculated a Mantel correlation coefficient for the matrix of geographic distances between sampling localities and the K2P distance matrix for every species using the Geographic Distance Correlation tool in BOLD. These correlation coefficients were then tested for contingency upon distribution type or overall intraspecific sequence divergence using a Mann-Whitney test and a Spearman rank correlation, respectively. Statistical analyses were performed using Statistica 8.0 (StatSoft Inc.).
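For readers unfamiliar with the Mantel approach, the following sketch shows one common form of the test: a Pearson correlation between the off-diagonal entries of a geographic and a genetic distance matrix, with significance assessed by permuting specimens. It is not the code behind the Geographic Distance Correlation tool in BOLD; the function names, the haversine great-circle distance, the one-sided permutation test, and the default of 999 permutations are all assumptions made for illustration.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(geo, gen, permutations=999, seed=0):
    """Mantel r and one-sided permutation p-value.

    geo, gen: symmetric square distance matrices (lists of lists) with
    specimens in the same order.
    """
    n = len(geo)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo_vec = [geo[i][j] for i, j in pairs]
    observed = pearson(geo_vec, [gen[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Relabel specimens in the genetic matrix only (rows and columns together).
        permuted = [gen[order[i]][order[j]] for i, j in pairs]
        if pearson(geo_vec, permuted) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)
```

Permuting specimen labels, rather than individual matrix cells, preserves the dependence among distances that share a specimen, which is the reason a Mantel test is used instead of an ordinary correlation test. With only a handful of specimens per species the number of distinct permutations is small, so p-values from such a sketch would be coarse.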

Finally, we compared the mean and maximum intraspecific divergence for each of the 102 species with its Nearest-Neighbor (NN) distance, because a gap between intraspecific and interspecific variation is essential for DNA barcoding to be effective in specimen identification. For this purpose we used the DS-MARKALL dataset (dx.doi.org/10.5883/DS-MARKALL). It includes >500 bp sequence records for 41,583 specimens representing 5,016 species of Lepidoptera [10]. We limited comparisons to this dataset because it is both comprehensive and identifications are very reliable. Sequences from the present study and from DS-MARKALL were pooled, and a barcode gap analysis was then carried out on BOLD using the K2P model. This analysis estimated the minimum genetic divergence to the NN and both the mean and maximum intraspecific divergences for each species.
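The logic of a barcode gap analysis can be sketched as follows. This is an illustration, not the BOLD tool: it assumes a precomputed symmetric distance matrix (for example K2P values in percent) and a parallel list of species names, and the function name `barcode_gap` and the output keys are hypothetical. Species represented by a single specimen are given an intraspecific divergence of zero here, which is an assumption of the sketch.

```python
from itertools import combinations


def barcode_gap(species, dist):
    """Per-species intraspecific divergence and Nearest-Neighbor distance.

    species: list of species names, one per specimen.
    dist: symmetric matrix of pairwise distances in the same order.
    """
    intra = {}    # species -> list of conspecific pairwise distances
    nearest = {}  # species -> (minimum distance to another species, its name)
    for i, j in combinations(range(len(species)), 2):
        d = dist[i][j]
        sp_i, sp_j = species[i], species[j]
        if sp_i == sp_j:
            intra.setdefault(sp_i, []).append(d)
        else:
            for own, other in ((sp_i, sp_j), (sp_j, sp_i)):
                if own not in nearest or d < nearest[own][0]:
                    nearest[own] = (d, other)
    report = {}
    for sp in sorted(set(species)):
        values = intra.get(sp, [])
        max_intra = max(values) if values else 0.0
        mean_intra = sum(values) / len(values) if values else 0.0
        nn_dist, nn_name = nearest.get(sp, (None, None))
        report[sp] = {
            "mean_intra": mean_intra,
            "max_intra": max_intra,
            "nn_dist": nn_dist,
            "nn_species": nn_name,
            # A gap exists when the nearest other species is farther away
            # than the most divergent pair of conspecifics.
            "gap": nn_dist is not None and nn_dist > max_intra,
        }
    return report
```

Using the maximum rather than the mean intraspecific divergence is the stricter criterion, because a single divergent specimen is enough to close the gap.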

Results

Sequenced species

We collected 1,997 sequences >500 bp from the 102 species. Among them, 54 sequences were not barcode compliant according to the standards in BOLD, most likely due to partially degraded DNA. These standards, as set out by the Consortium for the Barcode of Life (CBOL), require a minimum sequence length of 500 bp, less than 1% ambiguous bases, the presence of two trace files, a minimum of low trace quality status, and the presence of a country specification in the record. Nevertheless, these 54 sequences were still considered in the analysis as they were correctly placed with their conspecifics in an initial NJ tree. The seven families with the largest numbers of sequences were Noctuidae (551), Geometridae (389), Erebidae (175), Tortricidae (157), Nymphalidae (146), Gelechiidae (144), and Lycaenidae (133).
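The compliance criteria listed above lend themselves to a simple record check. The sketch below is not BOLD's validation code: the `BarcodeRecord` fields, the quality labels, and the choice to count ambiguous bases towards sequence length are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical trace quality labels, ordered from worst to best.
QUALITY_RANK = {"failed": 0, "low": 1, "medium": 2, "high": 3}


@dataclass
class BarcodeRecord:
    sequence: str
    trace_files: int
    trace_quality: str
    country: str


def compliance_issues(record):
    """Return the list of criteria a record fails (empty if compliant)."""
    bases = [b for b in record.sequence.upper() if b != "-"]  # drop alignment gaps
    issues = []
    if len(bases) < 500:
        issues.append("sequence shorter than 500 bp")
    ambiguous = sum(1 for b in bases if b not in "ACGT")
    if bases and ambiguous / len(bases) >= 0.01:
        issues.append("1% or more ambiguous bases")
    if record.trace_files < 2:
        issues.append("fewer than two trace files")
    if QUALITY_RANK.get(record.trace_quality, 0) < QUALITY_RANK["low"]:
        issues.append("trace quality below 'low'")
    if not record.country.strip():
        issues.append("no country specified")
    return issues
```

Returning the failed criteria, rather than a single yes/no flag, makes it easy to see why a sequence from degraded material was flagged before deciding whether to retain it.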

Intraspecific barcode divergences

Intraspecific barcode divergence was generally <1% with a mean (± SD) of 0.68 ± 0.67% (median: 0.43%; range: 0.00 to 3.46%) for the 102 species. As expected, there were highly significant differences among the three regional comparisons (Friedman ANOVA: χ² = 77.82, 2 df; p<0.0001). Divergences were lowest within the Asiatic samples as expected because they originated from few collecting sites with low numbers of specimens, while divergences within Europe averaged higher, and those between the European and Asiatic samples were highest (Fig 2, Table 1). In post-hoc comparisons, all three pairwise comparisons were highly significant (Wilcoxon-tests, p<0.007).

Fig 2. Mean intraspecific sequence divergences for 102 Lepidoptera species in geographic comparisons.

Boxplots (median, interquartile range, total range) of mean intraspecific sequence divergences (Kimura-2-Parameter) for 102 Lepidoptera species: total intraspecific divergences (mean distance for all data for each species), and intraspecific divergences at three geographic levels: intra-Asia (mean distance for all South Siberian samples); intra-Europe (mean distance for all European samples); inter- Europe-Asia (mean distance within each species for all pairs of specimens from Europe vs. Asia).

https://doi.org/10.1371/journal.pone.0206668.g002

Table 1. Mean intraspecific barcode divergences (% Kimura-2P-distances) for 102 Lepidoptera species from Europe and South Siberia and for the geographic comparisons, and distribution type.

https://doi.org/10.1371/journal.pone.0206668.t001

Relationship between distribution type and intraspecific barcode divergences

Contrary to expectation, total intraspecific divergence values were only slightly larger in species with disjunct as opposed to those with continuous distributions (Mann-Whitney test: z = 2.09; p = 0.036; Fig 3). Species with continuous ranges (n = 83) had an average intraspecific sequence divergence of 0.63± 0.66% (median: 0.37%; range = 0.00–3.46%), while those with disjunct distributions (n = 19) showed a divergence of 0.89± 0.70% (median: 0.54%; range = 0.18–2.16%).

Fig 3. Mean intraspecific sequence divergences for 102 Lepidoptera species in different distribution types.

Boxplot (median, interquartile range, total range) of total mean intraspecific sequence divergences (Kimura-2-Parameter) for 102 Lepidoptera species from Europe and South Siberia, comparing species with continuous versus disjunct distributions.

https://doi.org/10.1371/journal.pone.0206668.g003

Factors affecting isolation by distance within species

As expected, the extent of sequence divergence between members of a species was often related to the distance between their sites of collection. However, the extent of this isolation-by-distance effect was highly variable among species. Sequence divergences in 56 of the 102 species showed no association with distance, while 13 species showed a weakly significant Mantel correlation (p<0.05) and 33 species showed a strong relationship (p<0.01). Evidence for isolation-by-distance was stronger in species with disjunct (mean Mantel r = 0.59±0.32) than continuous distributions (mean Mantel r = 0.28±0.27; Mann-Whitney test: z = 4.19, p<0.0001; Fig 4). In species with disjunct distributions, the extent of isolation-by-distance was only weakly and non-significantly related to overall sequence divergence (Spearman rank correlation: rS = 0.40, p = 0.087), and this relationship was even weaker and also non-significant for species with continuous ranges (rS = 0.20; p = 0.073). The strength of isolation-by-distance patterns within species did not co-vary with the maximum distance between sampling sites (rS = -0.005, p = 0.96), but it was negatively related to the number of sequences available for a taxon (rS = -0.27, p = 0.007).

Fig 4. Relationship of intraspecific sequence divergence and geographic distance.

Relationship between mean overall intraspecific sequence divergence and the extent of isolation by distance (as quantified by the Mantel correlation coefficient, r), with species partitioned according to their type of distribution. Species with disjunct distributions (blue circles) tended to show stronger isolation-by-distance (i.e. higher r values) than species with continuous distributions (orange triangles), and this pattern was marginally stronger in species with higher overall levels of intraspecific sequence divergence.

https://doi.org/10.1371/journal.pone.0206668.g004

Relationships between interspecific and intraspecific divergences

Nearest Neighbor distances (K2P) for the 102 species averaged 4.52%, but ranged from 0.00–12.98%. By comparison, maximum intraspecific divergence values averaged 1.69% (range = 0.00–7.32%) while mean intraspecific variation values averaged 0.68% (range = 0.00–3.46%). Therefore, the gap to the NN species averaged 2.73-fold the maximum intraspecific variation (Wilcoxon test: z = 7.22, p<0.0001), and 6.90-fold the mean intraspecific variation (z = 8.47, p<0.0001). While the barcode gap was clear in most cases, divergence to the NN was either absent or less than intraspecific variation in 12 cases (Figs 5 and 6, Table 2). The four cases (Table 2) which completely lacked interspecific divergence may reflect taxonomic over-splitting or introgression, as discussed in Mutanen et al. (2016) [10].

Fig 5. Mean intraspecific sequence divergences for 102 Lepidoptera species in relation to nearest neighbor.

Barcode sequence distances to the nearest neighbor species in relation to mean intraspecific distances for 102 species of Palearctic Lepidoptera. The straight line indicates where the distance to the nearest neighbor equals the respective intraspecific distance; a ‘barcode gap’ exists for species above this line.

https://doi.org/10.1371/journal.pone.0206668.g005

Fig 6. Maximum intraspecific sequence divergences for 102 Lepidoptera species in relation to nearest neighbor.

Barcode sequence distances to the nearest neighbor species in relation to maximum intraspecific distances for 102 species of Palearctic Lepidoptera. The straight line indicates where the distance to the nearest neighbor equals the respective intraspecific distance; a ‘barcode gap’ exists for species above this line.

https://doi.org/10.1371/journal.pone.0206668.g006

Table 2. Nearest-Neighbor distances (% K2P) for 102 species of Lepidoptera as well as the mean and maximum intraspecific divergences for the new records obtained in the present study and DS-MARKALL dataset.

https://doi.org/10.1371/journal.pone.0206668.t002

Discussion

Our analysis of DNA barcode sequences from a phylogenetically diverse group of Lepidoptera from Asia and Europe revealed that intraspecific divergences increased with sampling intensity and distance. However, intraspecific divergences in most species remained low with mean K2P divergences averaging 0.68% and exceeding 2.5% in 23 species of the complete sample. However, divergence was >2.5% in just 9 of the 102 species in one or more of the three spatial levels of our analysis. By comparison, the species with a higher divergence than 2.5% showed a mean sequence divergence of 4.62% to European populations of 5,016 species of Lepidoptera. This result corroborates patterns from earlier studies on North American [6] and European Lepidoptera [4], confirming that the barcode region of COI is an efficient tool for species identification, given that the databases are of high quality, even when the reference sequences used for species identification derive from sites far distant from the locality under study. Irrespective of their origin, most sequences could be unambiguously allocated to a taxonomically defined species although several cases of high intraspecific divergence may reflect overlooked species (as discussed later). Conversely, 4 of the 7 species pairs (Crambus perlella/monochromella, Crocallis elinguaria/albarracina, Epinotia trigonella/indecorana, Coenonympha tullia/rhodopensis) that either lacked or possessed very limited (<0.5%) divergence from their NN may indicate taxonomic over-splitting rather than the failure of DNA barcoding to discriminate valid species (see [10]). For three other species pairs (Setina irrorella/aurita, Boloria titania/chariclea and Perizoma hydrata/affinitata), the low NN values suggest a recent divergence of valid, morphologically well-defined species or recent mitochondrial introgression. For example, an earlier study suggested that the low NN divergence between P. hydrata and P. affinitata resulted from mitochondrial introgression from P. hydrata to P. affinitata [18].

Our comparisons of European and South Siberian populations revealed regional sequence divergence in the respective region in about half the species, but most values were well below 2%. In addition, regional barcode variation was similar in species with disjunct distributions and in those with continuous ranges, indicating substantial gene flow in both cases. In part, this may reflect the fact that current distributions of Euro-Siberian Lepidoptera largely result from range expansions in the brief interval since the last glacial maximum, i.e. within less than 15,000 years [19], [20]. However, when intraspecific sequence divergences were examined using an isolation-by-distance approach, they were slightly stronger in species with disjunct ranges.

Despite our limited sampling, some species (e.g. Elachista bedellella, Boloria napaea, B. titania and Plebejus orbitulus) showed clear divergence between South Siberian and European populations (see Table 1). In addition, populations of some species from northern Europe clustered with those from Asia rather than from central Europe (e.g. Xestia speciosa). This pattern likely indicates that formerly glaciated areas in northern Europe were sometimes recolonized by lineages from Asia. All these intraspecific patterns need to be examined in more detail by increased sampling effort in intermediate areas, and should be cross-checked using morphology and nuclear markers to clarify phylogeographic histories. Yet, for the purpose of species identification, we did not encounter any significant barriers, even in these taxa.

High intraspecific divergences–potential cryptic diversity

High intraspecific barcode divergences (> 2–3%) may indicate the existence of overlooked species of Lepidoptera, but may also be due to mitochondrial introgression from a sister species [21]. Therefore, all such cases should be analyzed in more detail by examining divergence patterns at nuclear loci and morphological characters. We detected high intraspecific divergences (> 2.5% max divergence) between European and Asian populations for 9 of the 102 species (Table 3). Six of these species have a disjunct distribution, suggesting the possible existence of cryptic species in South Siberia versus Europe. In three other species (e.g. Coscinia cribraria), barcode variation was high even within Europe without an obvious geographical pattern. The remainder of this section discusses these nine species in more detail. All of them group into two or more different BINs [3] (S1 Table), operational taxonomic units which in Lepidoptera are frequently but not always congruent with species boundaries (e.g. [7], [22]). Indeed, deep barcode splits may also be caused by pseudogenes, Wolbachia infection, or hybridization, among other factors [23], and these cases need to be analyzed using an integrative approach (e.g., [24]).

Table 3. Nine Euro-Siberian species of Lepidoptera with a max intra-specific K2P distance for COI >2.5% between Asia and Europe.

https://doi.org/10.1371/journal.pone.0206668.t003

1. Caryocolum pullatella (Tengström, 1848) (Gelechiidae).

C. pullatella is a Holarctic species that is widespread in northern Europe, but restricted to isolated localities in the Alps and Balkans [25]. As its Palearctic populations include two DNA barcode clusters with allopatric distributions (central/south-east Europe versus north Europe-South Siberia), this may indicate cryptic diversity. The situation becomes potentially more complex when North American specimens are considered, as they include additional BINs, and it requires further assessment.

2. Coscinia cribraria (Linnaeus, 1758) (Erebidae).

This morphologically variable species is widely distributed across the Palearctic. Numerous forms and subspecies have been described, including ssp. sibirica (Staudinger, 1892) from the Altai Mountains which was recently synonymized by Dubatolov (2010) [26]. However, Witt & Ronkay (2011) [27] suspected that sequence data would indicate the existence of a species complex. Current DNA barcode sequences are assigned to five clades; specimens from Altai belong to the same BIN as those from northern and central Europe. As the clusters within Europe do not show a clear phylogeographic pattern, sequence variation may indicate introgression or the impacts of Wolbachia infection [23].

3. Dicallomera fascelina (Linnaeus, 1758) (Erebidae).

D. fascelina is almost continuously distributed in temperate Eurasia, extending from northern Spain east to Korea, although absent from the Mediterranean region and the British Isles. Several subspecies have been recognized. Populations from the Altai region have been attributed to the nominotypical subspecies, but the clear differences in their external morphology and genitalia [28], coupled with their barcode divergence, suggest they represent a cryptic species.

4. Eana osseana (Scopoli, 1763) (Tortricidae).

E. osseana is a widespread Holarctic species, restricted to mountainous areas at the southern limits of its distribution. DNA barcodes indicate two divergent BINs, one from Europe, and a second from the Altai Mountains. As three additional BINs are known from North America, the species requires integrative revisionary work.

5. Eulithis prunata (Linnaeus, 1758) (Geometridae).

This species is almost continuously distributed in temperate Eurasia, but is restricted to mountainous areas in the southern parts of its range. Hausmann & Viidalepp (2012) [29] found high COI sequence divergence in E. prunata, with distances reaching 5.9% and at least six divergent haplotypes in Europe and Turkey. South Siberian populations have been assigned to the ssp. leucoptera (Djakonov, 1929), but it may represent a distinct species given its deep barcode divergence from other populations.

6. Gazoryctra ganna (Hübner, 1804) (Hepialidae).

G. ganna is an arctic-alpine species with a disjunct distribution. It occurs in the Alps and High Tatra Mountains, northern Finland, and European Russia, as well as at isolated localities to the Far East [30]. Moderate sequence divergence exists between northern and central European populations [8] while those from the Altai Mountains show high sequence divergence from both European clusters. Because of their differing flight times (late afternoon in Asia versus early morning in Europe) and slightly different phenotypes, the Asian specimens likely represent an overlooked species.

7. Ochsenheimeria urella (Fischer von Röslerstamm, 1842) (Ypsolophidae).

O. urella is widely although locally distributed in central and northern Europe, including European Russia. A previously doubtful record from the Far East [30], together with our record from Altai [11], indicates a much wider distribution in Asia. Members of this species are placed in two BINs, one shared by the Alps and Finland, and the other by Finland and the Altai Mountains.

8. Pontia callidice (Hübner, 1800) (Pieridae).

P. callidice shows a disjunct distribution in the high mountains of Eurasia from the Pyrenees to the Himalayas, and in the subarctic tundra from the Ural Mountains to the Far East. Linked to their geographic isolation, populations show considerable variation in wing patterns and have been assigned to several subspecies. The nominotypical subspecies occurs in the high mountains of Europe (Pyrenees and Alps). Della Bruna et al. (2004) [31] assigned populations from the Altai to ssp. hinducucica Verity, 1911 (type locality Hindu Kush), whereas Tshikolovets et al. (2009) [32] listed ssp. kalora (Moore, 1865) from Altai (type locality NW Himalaya). Korb & Bolshakov (2016) [33] listed ssp. halasia Huang et Murayama, 1992 from SW Altai (described from Halasi, [Chinese] Altai). Despite this nomenclatural uncertainty, the DNA barcode results indicate that specimens from the Alps belong to a barcode cluster very distinct from that of specimens from Russia (Altai), Kyrgyzstan and Tajikistan.

9. Scrobipalpula diffluella (Frey, 1870) (Gelechiidae).

In the Palearctic, the genus Scrobipalpula includes a complex of closely related species with disputed taxonomy [25]. S. diffluella shows a typical boreo-montane distribution with most records from northern and central Europe, extending to the southern Urals. Specimens of the newly detected population from the Altai show close morphological similarity with European material, but clear barcode divergence, suggesting cryptic diversity.
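The accounts above repeatedly refer to sequence divergence between barcode clusters (for example, distances reaching 5.9% in E. prunata). As a minimal sketch of what such a percentage means, and not the analysis pipeline used in this study, the following Python function computes the uncorrected pairwise distance (p-distance) between two aligned sequences. The function name `p_distance` and the short toy sequences are hypothetical illustrations; real COI barcodes are much longer, and published distances may use a substitution model rather than the raw proportion of differences.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Return the uncorrected pairwise distance (in percent) between two
    aligned nucleotide sequences.

    Only sites where both sequences carry an unambiguous base (A, C, G, T)
    are compared; gaps and ambiguity codes are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")

    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                mismatches += 1

    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Hypothetical toy fragments, not real barcode data.
    europe = "ACGTACGTAC"
    altai = "ACGTACGTAT"
    print(f"p-distance: {p_distance(europe, altai):.1f}%")
```

Skipping gapped and ambiguous sites keeps incomplete sequences from inflating the distance, at the cost of comparing fewer positions.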

Conclusions

This study of a phylogenetically diverse sample of Lepidoptera across a wide geographic range within the Palearctic region corroborates the utility of DNA barcode data for both species identification and species discovery. For most species, unequivocal identifications could be established for samples from a widely distant region (the Russian Altai Mountains), even though the available reference data derived largely from regions in north and central Europe. On the other hand, in a few ‘species’ taxonomically known since Linnaean times, patterns of sequence divergence suggest the possibility of unrecognized cryptic species diversity and demand further assessment using an integrative taxonomic approach. Hence, this study exemplifies the usefulness of well-curated DNA barcode libraries, whose power and versatility will expand as more sequence data are collated under strict quality standards.

Supporting information

S1 Table. Accession numbers and BINs.

List of species names, sample-IDs, process-IDs (from BOLD database), GenBank Accession numbers, BINs, and Institution/collection storing vouchers.

https://doi.org/10.1371/journal.pone.0206668.s001

(PDF)

Acknowledgments

We are grateful to the entire team at the Canadian Centre for DNA Barcoding (CCDB, Guelph, Canada) for carrying out the sequence analyses. Barbara Reischl kindly assisted with part of the graphs. The authors thank the Department of Innovation, Research and University of the Autonomous Province of Bozen/Bolzano for covering the Open Access publication costs.

References

  1. Hebert PDN, Cywinska A, Ball SL, deWaard JR. Biological identifications through DNA barcodes. Proceedings of the Royal Society London B. 2003;270:313–321.
  2. Hebert PDN, Hollingsworth PM, Hajibabaei M. From writing to reading the encyclopedia of life. Philosophical Transactions of the Royal Society B. 2016;371:20150321.
  3. Ratnasingham S, Hebert PDN. A DNA-based registry for all animal species: The Barcode Index Number (BIN) System. PLoS ONE. 2013;8(7):e66213. pmid:23861743
  4. Huemer P, Mutanen M, Sefc KM, Hebert PDN. Testing DNA barcode performance in 1000 species of European Lepidoptera: Large geographic distances have small genetic impacts. PLoS ONE. 2014;9(12):e115774. pmid:25541991
  5. Lukhtanov VA, Sourakov A, Zakharov EV, Hebert PDN. DNA barcoding Central Asian butterflies: increasing geographical dimension does not significantly reduce the success of species identification. Molecular Ecology Resources. 2009;9:1302–1310. pmid:21564901
  6. Zahiri R, Lafontaine JD, Schmidt BC, deWaard JR, Zakharov EV, Hebert PDN. Probing planetary biodiversity with DNA barcodes: The Noctuoidea of North America. PLoS ONE. 2017;12(6):e0178548. pmid:28570635
  7. Hausmann A, Godfray HCJ, Huemer P, Mutanen M, Rougerie R, van Nieukerken EJ, et al. Genetic patterns in European geometrid moths revealed by the Barcode Index Number (BIN) System. PLoS ONE. 2013;8(12):e84518. pmid:24358363
  8. Mutanen M, Hausmann A, Hebert PDN, Landry J-F, deWaard JR, Huemer P. Allopatry as a Gordian knot for taxonomists: Patterns of DNA barcode divergence in Arctic-Alpine Lepidoptera. PLoS ONE. 2012;7(10):e47214. pmid:23071761
  9. Zahiri R, Lafontaine JD, Schmidt BC, deWaard JR, Zakharov EV, Hebert PDN. A transcontinental challenge–A test of DNA barcode performance for 1541 species of Canadian Noctuoidea (Lepidoptera). PLoS ONE. 2014;9(3):e92797. pmid:24667847
  10. Mutanen M, Kivelä SM, Vos RA, Doorenweerd C, Ratnasingham S, Hausmann A, et al. Species-level para- and polyphyly in DNA barcode gene trees: Strong operational bias in European Lepidoptera. Systematic Biology. 2016;65:1024–1040. pmid:27288478
  11. Huemer P, Wieser C, Wiesmair B, Sinev SY, Wieser C, Yakovlev RV. Schmetterlinge (Lepidoptera) des Altai-Gebirges (Südsibirien, Russland)–Eindrücke einer internationalen Expedition im Spätsommer 2016. Carinthia II. 2017;207(127):527–564.
  12. Ratnasingham S, Hebert PDN. BOLD: The Barcode of Life Data System (http://www.barcodinglife.org). Molecular Ecology Notes. 2007;7:355–364. pmid:18784790
  13. deWaard JR, Ivanova NV, Hajibabaei M, Hebert PDN. Assembling DNA Barcodes: Analytical Protocols. In: Cristofre M, editor, Methods in Molecular Biology: Environmental Genetics. Totowa: Humana Press Inc.; 2008.
  14. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Molecular Biology and Evolution. 2013;30:2725–2729. pmid:24132122
  15. Srivathsan A, Meier R. On the inappropriate use of Kimura‐2‐parameter (K2P) divergences in the DNA‐barcoding literature. Cladistics. 2012;28(2):190–194.
  16. Collins RA, Boykin LM, Cruickshank RH, Armstrong KF. Barcoding's next top model: an evaluation of nucleotide substitution models for specimen identification. Methods in Ecology and Evolution. 2012;3:457–465.
  17. Schmitt T. Molecular biogeography of Europe: Pleistocene cycles and postglacial trends. Frontiers in Zoology. 2007;4(1):11.
  18. Hausmann A, Haszprunar G, Hebert PDN. DNA barcoding the geometrid fauna of Bavaria (Lepidoptera): Successes, surprises, and questions. PLoS ONE. 2011;6(2):e17134. pmid:21423340
  19. Hewitt GM. Post‐glacial re‐colonization of European biota. Biological Journal of the Linnean Society. 1999;68:87–112.
  20. Varga Z. Post-glacial dispersal strategies of Orthoptera and Lepidoptera in Europe and in the Carpathian basin. In: Reemer M, van Heksdingen PJ, Kleukers RMJC, editors. Changes in ranges: invertebrates on the move: Proceedings of the 13th International Colloquium of the European Invertebrate Survey; 2001 Sep 2–5; Leiden, Netherlands. Leiden: European Invertebrate Survey–The Netherlands; 2003. p. 93–105.
  21. Hebert PDN, deWaard JR, Landry JF. DNA barcodes for 1/1000 of the animal kingdom. Biology Letters. 2009;6:359–362. pmid:20015856
  22. Ortiz AS, Rubio RM, Guerrero JJ, Garre MJ, Serrano J, Hebert PDN, Hausmann A. Close congruence between Barcode Index Numbers (BINs) and species boundaries in the Erebidae (Lepidoptera: Noctuoidea) of the Iberian Peninsula. Biodiversity Data Journal. 2017;5:e19840. pmid:28852323
  23. Werren JH, Baldo L, Clark ME. Wolbachia: master manipulators of invertebrate biology. Nature Reviews Microbiology. 2008;6:741–751. pmid:18794912
  24. Mally R, Huemer P, Nuss M. Deep intraspecific DNA barcode splits and hybridisation in the Udea alpinalis group (Insecta, Lepidoptera, Crambidae)–an integrative revision. ZooKeys. 2018;746:51–90.
  25. Huemer P, Karsholt O. Gelechiidae II. In: Microlepidoptera of Europe 6. Stenstrup: Apollo Books; 2010.
  26. Dubatolov VV. Tiger-moths of Eurasia (Lepidoptera, Arctiinae). Neue Entomologische Nachrichten. 2010;65:1–106.
  27. Witt TJ, Ronkay L, editors. Lymantriinae–Arctiinae, including phylogeny and check list of the quadrifid Noctuoidea of Europe. In: Noctuidae Europaeae 13. Sorø: Entomological Press; 2011.
  28. Trofimova TA. Systematic notes on Dasorgyia Staudinger, 1881, Dicallomera Butler, 1881, and Lachana Moore, 1888 (Lymantriidae). Nota Lepidopterologica. 2008;31(2):273–291.
  29. Hausmann A, Viidalepp J. Larentiinae I. In: The Geometrid Moths of Europe 3. Vester Skerninge: Apollo Books; 2012.
  30. Sinev SY, editor. Catalogue of the Lepidoptera of Russia. St. Petersburg: KMK Scientific Press; 2008.
  31. Della Bruna C, Gallo E, Sbordoni V. Pieridae part I. Guide to the butterflies of the Palearctic region. Milano: Omnes Artes; 2004.
  32. Tshikolovets VV, Yakovlev RV, Kosterin OE. The Butterflies of Altai, Sayans and Tuva (Southern Siberia). In: The butterflies of the Palaearctic Asia 7. Pardubice-Kiew: Vadim V. Tshikolovets; 2009.
  33. Korb SK, Bolshakov LV. A systematic catalogue of butterflies of the former Soviet Union (Armenia, Azerbaijan, Belarus, Estonia, Georgia, Kyrgyzstan, Kazakhstan, Latvia, Lituania, Moldova, Russia, Tajikistan, Turkmenistan, Ukraine, Uzbekistan) with special account to their type specimens (Lepidoptera: Hesperioidea, Papilionoidea). Zootaxa. 2016;4160:1–324. pmid:27615908