GAPeDNA: Assessing and mapping global species gaps in genetic databases for eDNA metabarcoding

Environmental DNA metabarcoding has recently emerged as a non‐invasive tool for aquatic biodiversity inventories, frequently surpassing traditional methods for detecting a wide range of taxa in most habitats. The major limitation currently impairing the large‐scale application of eDNA‐based inventories is the lack of species sequences available in public genetic databases. Unfortunately, these gaps are still unknown spatially and taxonomically, hindering targeted future sequencing efforts.


| INTRODUCTION
Aquatic ecosystems are increasingly impacted by human activities, threatening their biodiversity and causing major disruptions in their functioning (Cinner et al., 2016; Link & Watson, 2019; Reid et al., 2019).
Marine systems are undergoing severe defaunation, with numerous local species extinctions (McCauley et al., 2015), and are also experiencing the highest rates of biodiversity change under the combined effects of climate change and direct human impacts (Blowes et al., 2019). Freshwater ecosystems are even more at risk, with fishes being among the most threatened vertebrates due to habitat degradation or exotic species introduction (Collen et al., 2014). In this context, efficient non-invasive methods are urgently needed to accurately monitor aquatic biodiversity, including rare, highly mobile and elusive species, in order to set appropriate conservation management.
Among the many ways to survey aquatic biodiversity, eDNA metabarcoding has recently emerged as a promising approach, frequently surpassing traditional inventory methods in detectability potential (Boussarie et al., 2018; Carraro, Hartikainen, Jokela, Bertuzzo, & Rinaldo, 2018; Stat et al., 2019; Valentini et al., 2016). Exogenous DNA released by animals in the environment, through shed skin, mucus or faeces, can be retrieved by filtering water and amplified via polymerase chain reaction (PCR) using universal primers (Ficetola, Miaud, Pompanon, & Taberlet, 2008). High-throughput sequencing of the amplified DNA fragments provides a list of sequences to which corresponding species can be assigned by comparison with available genetic databases like the European Nucleotide Archive (ENA) (Dickie et al., 2018; Kanz et al., 2005).
However, the major limitation currently impairing the large-scale application of eDNA inventories is the incompleteness of species sequences available in public genetic databases, considerably reducing the breadth of detected biodiversity. Historically, eDNA studies have primarily focused on well-known species-poor freshwater systems (Jerde, Wilson, & Dressler, 2019), but recently, eDNA biodiversity inventories have spread all over the globe, across a wide range of ecosystems encompassing less studied and more diverse taxa and habitats (Cilleros et al., 2019; Jerde et al., 2019; Yamamoto et al., 2017). A recent study on European aquatic systems shows that genetic coverage varies widely among taxonomic groups, databases and the level of monitoring (Weigand et al., 2019) with, for example, European freshwater fish lacking genetic coverage on the 12S mitochondrial marker for 64% of the 627 species.
Teleostean fishes represent the largest group of vertebrates with more than 32,000 species (www.fishbase.org) and a total biomass estimated at 0.7 Gt (Bar-On, Phillips, & Milo, 2018). They represent the most extensively studied taxonomic group using eDNA with up to 60% of the publications on vertebrates (Tsuji, Takahara, Doi, Shibata, & Yamanaka, 2019) and play a significant role in carbon cycling (Wilson et al., 2009) and food security (Hicks et al., 2019).
Despite their cultural, commercial and ecological importance, fish populations are increasingly depleted or threatened due to overfishing (Anticamara, Watson, Gelchu, & Pauly, 2011) and habitat alterations (Collen et al., 2014). Surprisingly, the extent to which genetic reference databases cover fish biodiversity for the most widely used metabarcoding primers is unknown, although it ultimately determines the number and the composition of species potentially revealed by eDNA surveys. This kind of information is currently available, albeit scattered across different databases, but we still lack a tool facilitating the assessment and visualization of genetic species coverage for a given region, a given taxon and a given primer pair.
Here, we filled this gap by developing a user-friendly, flexible and interactive web interface linking reference genetic databases to regional species lists. Using regional freshwater and marine fish checklists, we assessed geographical variations in species diversity coverage versus gap for different metabarcoding primer pairs. Then, we highlighted the geographical bias in genetic coverage and disparities according to the native and conservation status of species (IUCN), providing valuable recommendations for future eDNA investigations at global scale.

| Interactive web interface: GAPeDNA
To facilitate the global assessment and visualization of regional gaps in genetic databases for environmental DNA metabarcoding, we developed a user-friendly interactive web interface called GAPeDNA (https://shiny.cefe.cnrs.fr/GAPeDNA/, Figure 1), using the shiny R package (Chang, Cheng, Allaire, Xie, & McPherson, 2019). This interface allows researchers and stakeholders to easily locate gaps in the reference genetic databases at global scale for a selection of fish metabarcoding primers. A virtual PCR using the selected primers is performed on a selected online genetic database. The list of amplified species is then compared to a spatialized checklist to generate the percentage of species referenced in each spatial unit or area (e.g. basins and ecoregions for freshwater and marine fishes, respectively) (Figure 1a). This percentage is then displayed on an interactive global map in GAPeDNA. The interface is flexible: it can display results for several primer pairs per taxon and several spatial units, and lets the user choose between several options (Figure 1b). We present the application for fish, but users are encouraged to suggest new taxa, which requires (a) at least one primer pair targeting the taxon using metabarcoding and (b) globally georeferenced species checklists. The interface also lets users see, by clicking on an area of interest, which species are actually sequenced for a given primer and the conservation status (i.e. IUCN category) of these species, and export this information as a comma-separated values (CSV) file. Users can thus quickly grasp information regarding sequencing priorities depending on their research interest.
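The comparison between amplified species and a spatialized checklist can be illustrated with a minimal Python sketch. It is not the GAPeDNA code (which is written in R); the function name `coverage_by_unit`, the unit labels and the toy species lists are all assumptions made for illustration.

```python
def coverage_by_unit(checklists, amplified):
    """Return {unit: percentage of checklist species present in `amplified`}.

    checklists: dict mapping a spatial unit to the set of species recorded there.
    amplified:  set of species recovered by the virtual PCR for one primer pair.
    """
    result = {}
    for unit, species in checklists.items():
        if not species:
            continue  # skip units with an empty checklist
        covered = species & amplified  # species both listed and amplified
        result[unit] = 100.0 * len(covered) / len(species)
    return result


# Hypothetical toy data: two basins and one primer pair.
checklists = {
    "basin_A": {"Salmo trutta", "Esox lucius", "Perca fluviatilis", "Anguilla anguilla"},
    "basin_B": {"Salmo trutta", "Cyprinus carpio"},
}
amplified = {"Salmo trutta", "Esox lucius", "Cyprinus carpio"}

coverage = coverage_by_unit(checklists, amplified)
print(coverage)
```

Using sets keeps the per-unit comparison a single intersection, so the same function can be reused for any primer pair by swapping the `amplified` set.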

| Genetic sequence database and genetic coverage by markers
To illustrate the distribution of species coverage, we used the European Nucleotide Archive (ENA) (Kanz et al., 2005) (release 138, downloaded in January 2019) as the genetic reference database for fish species. This database was formatted using obiconvert from the OBITools toolkit (Boyer et al., 2016) to run in silico PCRs (i.e. virtual PCR based on primer affinity to sequences). Yet, primer sequences need to be present within the sequence fragment deposited online to be detectable using this in silico approach.

FIGURE 1 Illustration of the process for generating map and data in the GAPeDNA web application (a) and details on the interface (b). User's spatial choices are in blue and green, genetic choices are in green, and visual displays are in red

An extensive literature search was conducted to identify the most commonly used primer pairs targeting fish for metabarcoding on ISI Web of Science with the following keywords: "fish" AND "metabarcoding" AND "primer" AND "environmental DNA". We discarded primer pairs not primarily targeting fish, only targeting a restricted group of fish or containing errors. Following this filtering, we retained 23 primer pairs from 18 papers (Table S1), from five regions in the mitochondrial genome (hereafter referred to as markers), namely 12S, 16S, 18S, COI and CytB. All primer pairs were used individually to run in silico PCRs using ecoPCR from OBITools (Boyer et al., 2016), with three mismatches allowed. All species amplified by each primer pair were compared to the regional fish checklists of both marine and freshwater environments, to obtain the percentage of species coverage by spatial unit and by primer.
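The idea of an in silico PCR with a mismatch tolerance can be sketched in a few lines of Python. This is a simplified illustration and not the ecoPCR algorithm: it assumes plain A/C/G/T sequences (no degenerate IUPAC bases), scans only the given strand, and applies no amplicon length limits. The function names and the toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G and T."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_site(seq, primer, max_mismatch, start=0):
    """Index of the first window at or after `start` that matches `primer`
    with at most `max_mismatch` mismatches, or -1 if there is none."""
    k = len(primer)
    for i in range(start, len(seq) - k + 1):
        if mismatches(seq[i:i + k], primer) <= max_mismatch:
            return i
    return -1


def in_silico_pcr(seq, forward, reverse, max_mismatch=3):
    """Return the amplified fragment between the primers, or None."""
    fwd_at = find_site(seq, forward, max_mismatch)
    if fwd_at < 0:
        return None
    insert_start = fwd_at + len(forward)
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is searched downstream of the forward primer site.
    rev_at = find_site(seq, reverse_complement(reverse), max_mismatch, insert_start)
    if rev_at < 0:
        return None
    return seq[insert_start:rev_at]


# Toy template: 2 flanking bases, forward site, insert, reverse site, 2 flanking bases.
template = "TTACGTACGATTACACCAAGGAA"
mutated = "TTACGAACGATTACACCAAGGAA"  # one substitution inside the forward site

print(in_silico_pcr(template, "ACGTAC", "CCTTGG", max_mismatch=0))
print(in_silico_pcr(mutated, "ACGTAC", "CCTTGG", max_mismatch=0))
print(in_silico_pcr(mutated, "ACGTAC", "CCTTGG", max_mismatch=1))
```

The toy primers are only six bases long, so the example uses a tolerance of zero or one mismatch; with real primers of around 20 bases, a tolerance of three mismatches as described above is far less permissive in relative terms.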
Fish names obtained from GenBank were checked and updated using FishBase as the sole reference. We further discarded four primer pairs with low performance (global fish coverage < 0.05%) to avoid bias when comparing markers (Table S2), so we proceeded with a total of 19 primer pairs on 4 markers, as the only primer pair located on the 18S rDNA marker was discarded. The successful virtual amplification of a species by a primer pair is conditional on (a) the presence of the species in the public genetic database and (b) the ability of the primer pair to amplify the sequence. Hence, primer pairs lacking universality for fish sequence amplification show an overall low coverage, even if located on a genetic marker with a larger sequence coverage in online databases, as they are unable to amplify those sequences due to primer specificity.

| Global species checklists and status
The checklist for freshwater fish was extracted from a global-scale database of fish diversity at the basin scale (Tedesco et al., 2017).
The authors reviewed a large body of information from 1,436 distinct sources over 3,119 drainage basins, covering more than 80% of the Earth's surface and comprising 14,953 fish species, that is, 90% of all freshwater fishes recorded in FishBase (www.fishbase.org).
Although all biogeographic realms are well represented, some regional gaps remain in the database due to the scarcity of information or the probable low number of freshwater taxonomists in some regions like South-East Asia. The global diversity of marine fishes was assembled using OBIS (Ocean Biogeographic Information System, n.d.) and regional checklists (Albouy et al., 2019; Pellissier, Heine, Rosauer, & Albouy, 2018), including manual verification to remove taxonomic classification errors. It contains available occurrence data for all marine teleost and agnathan fishes, that is, a total of 14,202 species representing 82% of all marine fish species recorded in FishBase. The original spatial resolution was a 1° grid for all marine environments. For visualization and interpretation purposes, this grid was then aggregated to two additional biogeographic spatial scales according to Marine Ecoregions (Spalding et al., 2007): (a) the province scale, with 62 distinct units, and (b) the ecoregion scale, with 232 distinct units. Latitudes and longitudes were computed as the centroid of each polygon at the finest resolution for both environments using the R package sf (Pebesma, 2016), and land areas were removed using polygons from Natural Earth Data (https://www.naturalearthdata.com/). Areas were calculated using the Mollweide equal-area projection and presented in figures using the Robinson projection.
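Moving from 1° grid cells to coarser units such as ecoregions or provinces can be pictured as taking the union of the species lists of all cells assigned to each unit. The Python sketch below assumes that interpretation; the cell keys, region labels, species names and the function `aggregate_cells` are hypothetical, and the actual spatial overlay in the study was done with polygon data in R.

```python
from collections import defaultdict


def aggregate_cells(cell_species, cell_to_region):
    """Pool species sets of grid cells into the region each cell belongs to.

    cell_species:   dict mapping a grid cell key to a set of species.
    cell_to_region: dict mapping a grid cell key to a region label.
    Cells without a region (e.g. masked land cells) are ignored.
    """
    regions = defaultdict(set)
    for cell, species in cell_species.items():
        region = cell_to_region.get(cell)
        if region is None:
            continue
        regions[region] |= species  # union keeps each species once per region
    return dict(regions)


# Hypothetical cells keyed by the (latitude, longitude) of their corner.
cell_species = {
    (10, 120): {"sp_a", "sp_b"},
    (11, 120): {"sp_b", "sp_c"},
    (50, -5): {"sp_d"},
    (0, 0): {"sp_e"},  # not assigned to any region below
}
cell_to_region = {(10, 120): "region_1", (11, 120): "region_1", (50, -5): "region_2"}

regions = aggregate_cells(cell_species, cell_to_region)
print({name: sorted(species) for name, species in regions.items()})
```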
For freshwater environments, a species is considered as non-indigenous in a given basin only if this species is able to complete its entire life cycle and harbours self-sustaining populations in that basin (Tedesco et al., 2017). A species is considered as indigenous when never occurring as non-indigenous in any basin following the original data (Tedesco et al., 2017). We acknowledge that some of the species classified as indigenous may have been introduced in another basin but have not yet been identified, detected or recorded as such in global databases. However, our dataset currently represents the most recent and precise data on non-indigenous freshwater species at the global scale (Tedesco et al., 2017). For marine systems, we used the information supplied in FishBase and only considered species flagged as "introduced," excluding species categorized as "questionable" or "non-settled." Regarding the conservation status of species, we retrieved data with the rredlist R package (Chamberlain, n.d.) to assign each species from both freshwater and marine environments to an IUCN Red List category. The abbreviation "DD" represents Data Deficient, "LC" represents Least Concern, and all Threatened or Near-Threatened categories were grouped under the "Threatened & NT" status. We excluded species identified as "EX" for Extinct and "EW" for Extinct in the Wild. Where no data were available, we assigned the value "NA."

FIGURE 2 Global and latitudinal distributions of freshwater fish species richness on log scale (a, b), coverage by online genetic database for the Miya primer pair targeting the 12S mitochondrial rDNA region (c, d), the Kocher primer targeting the cytochrome B mitochondrial rDNA region (e, f), the DiBattista primer targeting the 16S rDNA region (g, h) and the Ward f2 primer targeting the COI mitochondrial region (i, j). The number of species along latitude (b) is log10-scaled and obtained from the finest resolution, here by basin. Global latitudinal patterns of all primer pairs are given in Figures S5 and S6, and the global distribution maps are reproducible and interactive using the web application (https://shiny.cefe.cnrs.fr/GAPeDNA/). Primers were chosen to represent the most used primer pair for each genetic marker in fish eDNA studies (Tsuji et al., 2019)
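The grouping of IUCN categories described in this section can be written as a small lookup. The sketch below is an illustration in Python, not the code used in the study; the function name `group_status` and the convention of returning `None` for excluded species are assumptions.

```python
THREATENED = {"VU", "EN", "CR", "NT"}  # grouped under "Threatened & NT"
EXCLUDED = {"EX", "EW"}                # extinct species are dropped


def group_status(code):
    """Map a raw IUCN code to the grouping used in the text.

    Returns None for species to exclude and "NA" when no assessment exists.
    """
    if code is None:
        return "NA"
    if code in EXCLUDED:
        return None
    if code in THREATENED:
        return "Threatened & NT"
    return code  # "LC" and "DD" are kept as they are


raw = ["LC", "VU", "NT", None, "EX", "DD", "CR"]
grouped = [group_status(code) for code in raw]
print(grouped)
```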

| Global distribution of genetic database completeness and gaps
The 3,119 freshwater drainage basins, located across all continents (except the poles), varied widely in surface area, from 2 km² to 5,888,417 km² (Amazon) with a mean of 31,996 km² (SD = 209,732 km²). Their species richness ranged from 1 to 2,273 with a mean of 33 species (SD = 71), with an increasing number of species towards the equator following the classical latitudinal gradient (Figure 2a,b). Across the 232 marine ecoregions, species richness also greatly increased towards the equator, from 14 species (East Antarctic) to 3,937 species (South China Sea Oceanic Islands; Figure 3a,b). Marine ecoregion area varied from 19,000 km² (Puget Trough, Northern America) to 2,647,573 km² (Hawaii) with a mean of 588,862 km² (SD = 460,459 km²), and no correlation between area and fish species richness was observed (Figure S1).
Global coverage of fish species in GenBank varied widely according to both the marker position along the mitochondrial genome and among primers for a given position (Figures 2 and 3), with a global coverage for freshwater species ranging between 7% for COI Ward and 26% for 16S McInnes, and a coverage for marine species between 4% for Thomsen Cytb cb and 30% for Shaw 16S (Table S2). For a given primer pair, species coverage also greatly varied along the latitudinal gradient, with a U-shaped relationship peaking in high absolute latitudes for most of the primers in freshwater systems. For example, the 16S McInnes primer pair had a mean coverage of 89% between 48° and 52° latitude (84 basins) and only 40% between −2° and 2° latitude (54 basins). This contrast was also marked for primers targeting the 12S mitochondrial rDNA region. For example, the 12S Miya primer pair covered 83% of the fish checklist in high latitudes (between 48° and 52°), but only 23% close to the equator (between −2° and 2° latitude, Figure 2d). The Cytb from Thomsen 2cbl and 2deg (Figures S5 and S6) covered, respectively, 13% and 18% of the fish checklists, but showed no geographical gradient.
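The mean coverage within a latitudinal band, as reported above for the 48° to 52° and −2° to 2° bands, amounts to averaging the per-unit coverage of all units whose centroid falls in the band. A minimal Python sketch follows; the function name and the numbers are invented for illustration and are not the study's data.

```python
def mean_coverage_in_band(units, lat_min, lat_max):
    """Average coverage (%) of units whose centroid latitude lies in the band.

    units: iterable of (centroid_latitude, coverage_percent) pairs.
    Returns None when no unit falls inside the band.
    """
    values = [cov for lat, cov in units if lat_min <= lat <= lat_max]
    if not values:
        return None
    return sum(values) / len(values)


# Hypothetical basins: (centroid latitude, coverage in %).
basins = [(50.1, 90.0), (49.0, 80.0), (0.5, 40.0), (-1.0, 20.0), (30.0, 60.0)]

high_latitude = mean_coverage_in_band(basins, 48, 52)
equator = mean_coverage_in_band(basins, -2, 2)
print(high_latitude, equator)
```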
In marine ecosystems, the latitudinal gradient in species coverage was less pronounced with several primer pairs showing a steady decrease in coverage with decreasing latitude (Figure 3).
Tropical fish assemblages along the equator were less sequenced than northern temperate assemblages, but were generally more sequenced than in negative latitude ecoregions towards the south pole, as opposed to freshwater systems.Only the 12S Bylemans primer pair, covering 13% of marine fishes, showed no geographical pattern (Figure S6).

| Genetic coverage of native versus non-indigenous species
Environmental DNA can be used to track non-indigenous species in ecosystems. However, only the primers located on the 12S and 16S had a mean species coverage above 50% for all 605 identified non-indigenous freshwater fishes (Figure 4a). For the primers on the COI and Cytb, less than half of all non-indigenous fishes were amplified and sequenced. Only two primers, both on the 16S, covered more than 60% of non-indigenous species, while none of the 12S primers had a coverage above 57%.
However, these species still had an overall larger coverage in databases compared to native species, the maximum for native species being 31% for a 16S marker and 15% or 19% for 12S and Cytb markers, respectively.
For the marine fishes, we identified 196 species as non-indigenous in at least one region of the marine realm, about a third of the 605 species identified in freshwater. However, global patterns of coverage were similar (Figure 4b), albeit with a wider coverage of marine non-indigenous species compared to their freshwater counterparts (maximum 12S coverage of 69% versus 57%). Overall, for both categories, non-indigenous species were more sequenced than indigenous species, but 20% to 80% of fish species remain to be sequenced depending on the genetic marker.

| Genetic coverage of fish species with different IUCN conservation status
Most freshwater fish species were not evaluated (NA, 45.9% of total) or Least Concern (LC, 33.2%). However, 1,758 species (11.7%) were classified as threatened, here including the Vulnerable (VU), Endangered (EN), Critically Endangered (CR) and Near Threatened (NT) categories of the IUCN Red List (www.iucnredlist.org, Figure S2). The genetic database coverage of fish species according to their IUCN status showed consistent patterns for all markers (Figure 5a and 5c). Species classified as Least Concern (LC) were always more represented in genetic databases compared to non-evaluated (NA), Data Deficient (DD) (Figure S3) or threatened species (T & NT). Freshwater basins where the most threatened species remain to be sequenced using the 12S Miya primer pair were mainly located around the equator, with 79 species in the Congo Basin and 63 species in the Mekong Basin, or in the Northern Hemisphere, with a maximum of 72 species in the Mississippi Basin (Figure 5b). These basins also host the highest number of threatened species, independent of reference filling (Figure S4).
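Counting the threatened species that remain to be sequenced in each basin combines three inputs: the basin checklist, the grouped IUCN status of each species, and the set of species amplified by a primer pair. The Python sketch below illustrates this under assumed names (`threatened_gap`, placeholder species and basin labels); it is not the code behind Figure 5.

```python
def threatened_gap(checklists, status, amplified):
    """Return {unit: sorted list of threatened species not yet sequenced}.

    checklists: dict mapping a spatial unit to a set of species.
    status:     dict mapping a species to its grouped IUCN status.
    amplified:  set of species recovered by the virtual PCR for one primer pair.
    """
    return {
        unit: sorted(
            sp for sp in species
            if status.get(sp) == "Threatened & NT" and sp not in amplified
        )
        for unit, species in checklists.items()
    }


# Hypothetical toy data with placeholder species names.
checklists = {"basin_A": {"sp1", "sp2", "sp3"}, "basin_B": {"sp2", "sp4"}}
status = {"sp1": "Threatened & NT", "sp2": "Threatened & NT", "sp3": "LC", "sp4": "DD"}
amplified = {"sp2", "sp3"}

gap = threatened_gap(checklists, status, amplified)
print({unit: len(missing) for unit, missing in gap.items()})
```

Returning the species names rather than only the counts mirrors the kind of per-area species list that can be exported from the interface to set sequencing priorities.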

FIGURE 3 Global and latitudinal distributions of marine fish species richness (a, b), coverage of online genetic database for the Miya primer pair targeting the 12S mitochondrial rDNA region (c, d), the Kocher primer targeting the cytochrome B mitochondrial rDNA region (e, f), the DiBattista primer targeting the 16S rDNA region (g, h) and the Ward f2 primer targeting the COI mitochondrial region (i, j). The number of species along latitude (b) is log-scaled and obtained from the finest resolution, here by a 1° grid. Global latitudinal patterns of all primer pairs are given in Figures S5 and S6, and the global distribution maps are reproducible and interactive using the web application (https://shiny.cefe.cnrs.fr/GAPeDNA/). Primers were chosen to represent the most used primer pair for each genetic marker in fish eDNA studies (Tsuji et al., 2019)

In marine environments, 3.5% of all species were classified as threatened or Near Threatened on the IUCN Red List compared to 11.7% in freshwater systems (Figure S2), and around the same proportion of fishes were unevaluated or Data Deficient (49% versus 55% for freshwater). Genetic coverage was systematically higher for threatened species compared to Least Concern (LC) species, albeit never exceeding 50% for any primer or ecoregion (Figure 5). Species listed as LC consistently had a higher coverage than unevaluated or Data Deficient species (Figure S3). Marine ecoregions hosting the most threatened species remaining to be sequenced using the 12S Miya primers were also located around the equator, particularly in the Caribbean, with a maximum of 42 species in the south-western Caribbean ecoregion, or on the eastern coast of Africa, with a maximum of 32 species in the Delagoa ecoregion (Figure 5d).

| Genetic markers and primer selection
eDNA is currently limited by the scarcity of species sequences available in online public genetic databases. We provide here a spatialized global assessment of fish sequence coverage and gaps in databases, using published eDNA primers, and displayed on an online, semi-automated and flexible application called GAPeDNA.
Our study considers all existing markers and most primers capable of theoretically amplifying fish species by in silico PCR, regardless of their performance, avoiding a bias in the choice of a genetic marker or primer pair. The marker and primer selection must be motivated by their efficiency to detect the targeted taxa owing to their specificity and sensitivity. A general consensus is emerging in fish eDNA studies towards the use of 12S primers (Collins et al., 2019; Weigand et al., 2019). Primers located on the 12S mitochondrial region have been recognized as the best to specifically amplify fishes, unlike COI primers which lack specificity, resulting in low fish detectability (Valentini et al., 2016). Unfortunately, we show that the 12S still has a very low species completeness in genetic databases, with strong spatial disparities. To sequence as many species as possible, it is crucial to reach a consensus on the genetic marker and to join efforts towards a globally coordinated sampling strategy for this marker. Once species gaps in the 12S sequences are largely filled, the way will be paved for eDNA metabarcoding to become a robust and standard monitoring and inventory tool, capable of fish identification to the species level in every location.

| Mapping species coverage gaps to improve eDNA monitoring
The global diversity of both freshwater and marine fishes is not well covered in public genetic databases. Globally, we show a higher coverage around high latitudes in the Northern Hemisphere, consistent across the genetic markers and primers, while tropical areas, which host more species, have more species gaps in public sequence databases (Figures 2 and 3). For freshwater fishes, the genetic species coverage exhibits a clear U-shaped pattern for almost all markers along the latitudinal diversity gradient (Hillebrand, 2004) (Figure 2), with a minimum percentage of sequenced species around the equator. For marine fishes, species coverage declines with declining latitude, and the minimum percentage of species sequenced is around the low latitudes of the Southern Hemisphere where marine fish diversity is the lowest.
The location of the gaps may drive the future sampling efforts required to fill them. Tropical environments are under-represented in public sequence databases and will require costly, time-consuming and globally coordinated efforts to both describe and sequence the numerous species left to be discovered, as well as sequence the numerous species already described (Juhel et al., 2020; Pinheiro, Moreau, Daly, & Rocha, 2019). Environmental DNA is becoming established as an efficient inventory tool that can overcome hurdles encountered when sampling in tropical ecosystems. In many large water bodies, such as the Mekong or the Amazon, water turbidity prevents visual census, leaving eDNA as the only non-invasive monitoring method (Cilleros et al., 2019; Yamamoto et al., 2017). The need to fill species gaps is urgent in these environments as they are experiencing major turnover in species identities with unknown consequences on ecosystem functioning and resilience (Magurran et al., 2018).
Tropical marine ecosystems are biodiversity hotspots, particularly the Coral Triangle (Barlow et al., 2018; Myers, Mittermeier, Mittermeier, Da Fonseca, & Kent, 2000). Tropical countries also tend to have a high dependency on fish resources (Andrello et al., 2017; Barange et al., 2014), stressing the importance of securing a sustainable exploitation of fish, which requires monitoring assessments and correct evaluations of biodiversity as both aspects are intimately linked (Duffy, Lefcheck, Stuart-Smith, Navarrete, & Edgar, 2016; Lefcheck et al., 2019). For instance, crypto-benthic fishes (<5 cm) have been recently shown to contribute massively to coral reef functioning (Brandl et al., 2019), particularly by feeding fish consumed by humans, but they are still poorly inventoried.
Tropical countries are also projected to undergo some of the most severe environmental impacts related to human population expansion and climate change (Barlow et al., 2018), highlighting the importance of conducting ecological studies and setting appropriate conservation programs. For instance, mesophotic reefs (30-150 metres depth) are still poorly known while they potentially host very different species assemblages that can also be affected by climate change (Lesser, Slattery, Laverick, Macartney, & Bridge, 2019; Rocha et al., 2018). Their exploration will require new eDNA-based protocols (fish sampling for reference database and water filtering) that must complement visual surveys that remain limited at this depth.
Yet, there is a clear publication bias with the most diverse ecosystems being the least studied in ecology (Hickisch et al., 2019).So, the efforts to achieve genetic database completeness are massive but necessary in such highly diverse environments in order to tackle major conservation challenges like the protection of vulnerable but still poorly described biodiversity.

| Environmental DNA metabarcoding to monitor non-indigenous species
Among the numerous threats that all aquatic environments are currently facing are non-indigenous species, which have the potential to disrupt entire ecosystems when declared as invasive (Albins & Hixon, 2013; Bax, Williamson, Aguero, Gonzalez, & Geeves, 2003; Clavero & García-Berthou, 2005). For example, the Nile perch (Lates niloticus), introduced in the 1950s into Lake Victoria, drove around half of the hundreds of native cichlid fish species to extinction through predation and competition (McGee et al., 2015; Witte et al., 1992). As traditional methods struggle to detect those species at an early stage of establishment, eDNA offers an important potential for early detection below the traditional detection threshold (Dougherty et al., 2016; Hunter et al., 2015). Yet, a successful detection of species introduction relies on database completeness for those species. We show that, even among fish species identified as non-indigenous in freshwater ecosystems, up to 30% are currently missing in the best curated 16S database (Figure 4). For the genetic marker 12S, a maximum of 55% of non-indigenous species are sequenced per basin, twice as much as native species. It was expected that more non-indigenous species would be genetically referenced compared to native ones, since referencing species occurrence outside their native range necessarily assumes their observation and a large proportion of introductions are intentional for recreational fishing (Leprieur, Beauchard, Blanchet, Oberdorff, & Brosse, 2008), making tissue for genetic sequencing easily available.
We highlight here that despite a higher coverage for non-indigenous species (Figure 4), the potential of eDNA to detect invasion events and provide early warning signals is still limited while crucial for mitigating deleterious effects (Vander Zanden, Hansen, Higgins, & Kornis, 2010).

| Sequencing threatened species to support their monitoring
Environmental DNA has a great potential in biodiversity conservation, addressing the constraints of detecting elusive or low-abundance species missed by traditional surveys. The proportions of threatened species estimated by the IUCN Red List (11% of freshwater and 3% of marine fishes) are likely underestimated as 48% of fish species are unevaluated while 7 to 9% are Data Deficient (Figure S2). Although the fate of Data Deficient species remains largely unexplored, they form the category with the least coverage in public genetic databases and are estimated to hide a large proportion of already threatened species (Bland, Collen, Orme, & Bielby, 2015). Even among threatened species, less than 50% have referenced sequences across all genetic markers, and surprisingly, their coverage is lower than that of Least Concern species for freshwater fishes. This can be due to the high number of threatened freshwater fishes, mainly located in hard-to-explore tropical regions (Collen et al., 2014).
Most threatened freshwater fishes live in large tropical basins such as the Congo, the Mekong or the Amazon (Figure 5).However, the Mississippi Basin, although located in a well-developed and science-leading country, the United States, where conservation measures and monitoring programs are well established, hosts 72 threatened species that are not sequenced for a 12S primer pair.So, efforts to complement genetic reference databases must be widespread and are not only related to the level of species richness or economic development, as often assumed.

| Interactive online application to support eDNA metabarcoding studies
We developed the user-friendly web app interface GAPeDNA to synthesize this large amount of information and make it easily accessible, even without any coding skills. It allows users to select a taxonomic group (at the moment, only freshwater and marine fish are available), the spatial unit or area, the genetic markers of interest and the corresponding primers to evaluate their global spatialized species coverage in public genetic databases, and to access the corresponding list of species per spatial unit and status (IUCN). This permits the assessment of species remaining to be sequenced for a given spatial zone and sets priorities for sequencing. Although this study is focused on fish as an example, any new taxon can be added to GAPeDNA, provided that the necessary information is supplied: (1) primers suited for metabarcoding and (2) global spatialized species checklists.
This can thus expand the reach and potential of this tool within the metabarcoding scientific community and managers using eDNA for ecological surveys.
As the adoption of eDNA metabarcoding as a standard and robust monitoring approach worldwide depends on its ability to identify organisms at the species level, we hope that our tool and its potential as demonstrated by the fish example included here will encourage researchers, managers, foundations and institutions to work towards a joint effort for a global sequencing effort targeting taxa of interest to enhance eDNA metabarcoding inventories.

FIGURE 4 Percentage of species coverage (a) in marine systems for non-indigenous (196) and native species (12,290) and (b) in freshwater for non-indigenous (605) and native species (14,348) depending on the marker position. Each triangle represents a primer pair

FIGURE 5 Percentage of coverage according to two IUCN categories: Least Concern (LC) or Threatened and Near Threatened (NT) for all primer pairs and global gap in threatened species not sequenced illustrated for the Miya 12S primers in (a, b) marine systems and (c, d) freshwater systems. Each dot represents one primer pair. The threatened category includes the categories Vulnerable (VU), Endangered (EN) and Critically Endangered (CR). The categories Not Evaluated (NA) and Data Deficient (DD) were not represented. All the categories are displayed in Figure S3