Metabarcoding on planktonic larval stages: an efficient approach for detecting and investigating life cycle dynamics of benthic aliens

High-throughput sequencing (HTS) technologies offer new promise to support surveillance programs targeting marine non-indigenous species (NIS). Metabarcoding might surpass traditional monitoring methods, for example through its ability to detect rare species, a key feature in early detection of NIS. Another interest of this approach is the identification of organisms difficult to identify based on morphology only (e.g., early developmental stages), making it relevant in the context of management programs. Because many marine benthic NIS have a biphasic bentho-pelagic life cycle, targeting their pelagic larval stages in zooplankton may allow early detection and assessment of their establishment and potential spread. We illustrate this approach with an analysis of bulk-DNA retrieved from a time-series of zooplankton samples collected over 22 months in one bay in Brittany (France). Using HTS of amplicons obtained with two markers (COI and 18S) and a metabarcoding approach, 12 NIS were identified and their temporal larval dynamics were monitored. Importantly, we chose to focus on a closed list of species, from four metazoan classes encompassing 52 NIS reported within the study area or nearby seas, with molecular references available or obtained locally for 42 of them. The use of a custom-designed database allowed the detection of three NIS otherwise not detected when using public databases. Interestingly, NIS known to have a short-lived larval stage were detected (e.g., the bryozoan Bugula neritina or the tunicate Corella eumyota ). For two molluscs Ruditapes philippinarum and Crepidula fornicata , metabarcoding results were compared to those obtained using traditional methods (i.e., barcoding of individual larvae and morphology, respectively) to show the reliability of the approach in detecting and assessing the extent of their reproductive periods. Our results also revealed that the Pacific oyster Crassostrea gigas , a notorious invasive species, failed to reproduce in the study bay, showing that metabarcoding on larval stages also provides information regarding the establishment success (or failure) of NIS. While metabarcoding has its limitations and biases, this study demonstrates its effectiveness for surveillance of targeted NIS, notably to support management strategies like the European Marine Strategy Framework Directive (MSFD).


Introduction
The number of marine non-indigenous species (NIS) has been increasing globally since the beginning of the 20 th century. This trend is an outcome of increasing maritime traffic and trade, and is expected to last (Sardain et al. 2019;Seebens et al. 2017). NIS can cause a wide variety of ecological (e.g., biodiversity loss, changes in ecosystem dynamics) and economical (e.g., infrastructure maintenance, aquaculture losses) damages (Molnar et al. 2008), which entail a wide range of actions, from prevention to longterm management (Simberloff et al. 2013). Monitoring NIS is therefore crucial in order to set-up handling strategies adapted to the different phases of the invasion sequence (Blackburn et al. 2011). Early detection will promote action at the earliest stage, during which NIS control is likely to be the most efficient, particularly in the marine environment (Ojaveer et al. 2015). On the other hand, monitoring NIS establishment and spread will allow long-term management and evaluation (e.g., reinvasion after eradication; Simberloff et al. 2013).
In coastal areas, shipping (commercial trade and leisure boating) and aquaculture are the most important introduction pathways (Molnar et al. 2008;Nuñez et al. 2014). Consequently, ports and aquaculture facilities, with their numerous artificial substrates, are points-of-entry for NIS and promote the settlement of new species (Bishop et al. 2015b;Connell 2001;Glasby et al. 2007), especially encrusting and sessile fauna (Firth et al. 2016). These infrastructures and facilities act as bridgeheads for the escape of newly introduced species into nearby natural habitats (Airoldi et al. 2015;Firth et al. 2016), where most ecological damage is observed. The lag phase between the primary introduction (arrival) of new NIS into artificial habitats and their escape into the wild is variable across species, imposing the need for regular temporal surveys in nearby natural environments. Such surveys could be achieved by applying NIS detection tools to existing long-term monitoring programmes (Ojaveer et al. 2015).
One particular feature of many marine coastal invertebrate species is the existence of a biphasic, bentho-pelagic life cycle, during which the benthic adult stage alternates with a pelagic larval phase (Mileikovsky 1971), living in the plankton for hours, weeks, sometimes months (Shanks 2009). In marine benthic NIS such a pelagic larval stage plays a major role at all steps of the invasion process (i.e., introduction, establishment and spread, sensu Blackburn et al. 2011). Given their small size, larvae can be transported in ballast water (Carlton and Geller 1993) or they can be released from adults within hull fouling communities (e.g., species brooding their embryos before releasing swimming larvae like some barnacle species) and thus can be major actors of primary introductions. Larvae may also facilitate the long-term establishment of introduced species by promoting the demographic reinforcement of their local populations through reproduction and recruitment (Dunstan and Bax 2007). Finally, they are the main vector for natural dispersal (Cowen et al. 2007), thus playing a major role in secondary spread and expansion of NIS in novel introduction areas. This is illustrated by notorious invasive species with long-lived larval stages, such as the green crab Carcinus maenas (Linnaeus, 1758) (Pringle et al. 2011;Tepolt et al. 2009).
Targeting larvae in monitoring programs may provide key information about NIS introduction status. When sampling in close vicinity of entry points such as harbours, the identification of larvae belonging to a formerly unreported species delivers early detection of a newly arrived NIS. Likewise, the presence of larvae assigned to an already reported NIS will prove its reproductive ability in the novel environment, its potential for spread, and will suggest that this species is now established. Moreover, monitoring the larvae of a target NIS over time may allow a better understanding of its reproductive biology in the introduced area. In particular, it may shed light on the period and environmental conditions that favour its reproduction, as well as its reproductive effort (abundance of larvae). The collected data could support predictive models, such as ecological niche models. Finally, because NIS larvae are also non-indigenous within the local plankton, observing and counting them may allow to better assess their potential impact within the pelagic community, a largely understudied topic.
Monitoring programs targeting larvae are quite rare and regular programs monitoring zooplankton usually neglect invertebrate larvae, or consider them as broad taxonomic groups, like "lamellibranch larvae" as a whole (e.g., Southward et al. 2005). Identifying species at the larval stage is indeed challenging, especially with traditional methods based on larval sorting and identification with morphological criteria, a time-consuming task which requires well-trained taxonomists. Difficulties may also arise from a lack of description for the larval stages of some taxa, but even when they exist, many species are indistinguishable from each other based on simple morphological criteria. This is especially true for groups like bivalves (Garland and Zimmer 2002), or in taxa comprising cryptic species which are numerous among marine invertebrates (Appeltans et al. 2012). To overcome these issues, several single-species DNA-based tools have been used for the identification of NIS larvae (e.g., Darling and Tepolt 2008;Harvey et al. 2009;Le Goff-Vitry et al. 2007;Sánchez et al. 2015). However, these approaches allow the identification (and sometimes quantification) of only one or a few target species. The DNA metabarcoding approach consists of the high-throughput sequencing (HTS) of selected barcodes obtained from environmental or bulk-DNA, and their taxonomic assignment based on a reference sequence database. It can identify simultaneously a large number of species from many specimens, thus being a promising method to study NIS larvae in the plankton Cristescu 2014;Elbrecht and Leese 2015;Valentini et al. 2016;Viard and Comtet 2015;Zaiko et al. 2015Zaiko et al. , 2018. Moreover, DNA metabarcoding is more sensitive than traditional approaches when detecting rare organisms, a key advantage when investigating newly arrived species at low environmental concentrations. When dealing with larvae, metabarcoding has proven efficient for detecting very few individuals, down to a single larva, in plankton and sediment samples containing a wide array of eukaryotes (Pochon et al. 2013;Sun et al. 2015;Zhan et al. 2013).
Several studies have shown the power of bulk-DNA metabarcoding from plankton samples to assess zooplankton biodiversity and describe the structure of pelagic communities (Abad et al. 2016(Abad et al. , 2017Bucklin et al. 2016;Chain et al. 2016;Deagle et al. 2017;Harvey et al. 2017;Lindeque et al. 2013;Lopez-Escardo et al. 2018). Some of them highlighted the complementarity between morphological identification and molecular HTS approaches at various taxonomic levels (e.g., Harvey et al. 2017;Lindeque et al. 2013), but most of them did not focus on larvae of benthic species. Nonetheless, they conceded that such methods are useful for the identification of larval stages from both pelagic and benthic species, when not possible with other approaches (e.g., Mohrbeck et al. 2015, and references above). To our knowledge, only one study clearly focused on bivalve larvae (Jung et al. 2018), and a few others further demonstrated the interest of plankton DNA metabarcoding for NIS identification at the larval stage, from both pelagic (Abad et al. 2016) and benthic species Brown et al. 2016;Zaiko et al. 2015). In these papers, the main objectives were the early detection of NIS in areas where they had not yet been reported, and the evaluation of metabarcoding as an efficient tool to detect NIS in areas where they had already been reported.
In this study, we focused entirely on the larval stages of benthic NIS (1) to evaluate the use of plankton DNA metabarcoding for the detection of new introductions and (2) to assess the establishment, expansion and reproductive features of already identified benthic NIS. For this purpose we used a metabarcoding approach on a 22-month plankton time series, to identify larvae of potential and known NIS, targeting four taxonomic groups.

Sampling and DNA extraction
The samples used in this study are part of a zooplankton time-series survey, which started in 2004, and is conducted in the bay of Morlaix (48°40′11.1″N; 3°53′9.7″W), Brittany, France. Zooplankton samples are collected bi-monthly around time of high tide (± 30 min) during neap tides. Sampling is performed using a vertical haul from the bottom to the surface with a modified WP2 plankton net (UNESCO 1968) with a mesh size of 63 μm and a mouth area of 0.25 m². A flow meter (KC Denmark A/S, model 23091) is attached at the centre of the net aperture to determine the volume of water filtered. Samples are preserved in 96% ethanol and stored at room temperature. For this study, to get insights about seasonal patterns of larval presence, and taking into account the large diversity of marine invertebrates reproductive modes, we used samples collected every two weeks from March 2012 to September 2012, and once a month until December 2013, for a total of 29 sampling dates.
Total DNA was extracted from each plankton sample using the PowerWater® DNA Isolation Kit (MoBio). The manufacturer's protocol was slightly modified by adding a drying step after filtration of the samples in order to evaporate all residual ethanol, and all volumes of reagents used were doubled. All equipment was either autoclaved or placed under UV light for 15 minutes before use. One blank was produced following the exact same extraction protocol on ultrapure water (18 MΩ, 0.22 μm) to ensure the absence of cross-contamination when processing the samples. Extracted DNA was stored at −20 °C.

Molecular analyses
For each sample, and the extraction blank, amplicons were generated in triplicates to minimize PCR biases. We used the primers SSU_FO4 and SSU_R22 from Fonseca et al. (2010) targeting a ca. 400-bp portion of the V1-V2 region of the small subunit rRNA (18S) coding gene, and the primers mlCOIintF (Leray et al. 2013) and jgHCO2198 (Geller et al. 2013), following Leray et al. (2013), amplifying a 313-bp portion of the Cytochrome Oxidase I (COI) coding gene (mitochondrial DNA). In addition to the 90 PCR replicates from extracted samples, three PCR blanks were performed to check for any contamination at this step. Every PCR replicate of every sample and every blank was individually tagged using eightnucleotide sequences (tags) added at the 5'-end of both forward and reverse primers. Each replicate was thus identified by the unique combination of its forward and reverse tag. For 18S, each reaction volume (25 μL) contained 0.5 U Phusion® High-Fidelity DNA polymerase (New England BioLabs), 1X reaction buffer, 200 μM dNTPs, 1 μM of each tagged primer and 2 ng DNA template. Amplification involved an initial denaturation step at 98 °C for 30 s, followed by 30 cycles at 98 °C for 10 s, 57 °C for 30 s and 72 °C for 30 s, and followed by a final extension step at 72 °C for 10 min. For COI, each reaction volume (25 μL) contained 1X Multiplex PCR Master Mix (QIAGEN), 1 μM of each tagged primer and 2 ng DNA template. Amplification involved an initial denaturation step at 95 °C for 5 min, followed by 35 cycles at 94 °C for 50 s, 57 °C for 90 s and 72 °C for 30 s, and followed by a final extension step at 68 °C for 10 min. PCR products were then purified using NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel) and their concentration was measured using fluorescence by picogreen™. All PCR products of a same marker were then pooled at equimolar concentrations. Paired-end sequencing of the amplicons was performed by the company FASTERIS (Switzerland) using MiSeq Illumina technology (2x300 bp).

Data processing
After sequencing, reads were processed using the OBITools v1.2.11 pipeline . Briefly, paired reads were assembled and then grouped by replicate sample (i.e., demultiplexed) according to their unique tag pair. The primers and tags were then removed and all sequences outside of a specified size range were discarded (350-450 bp for 18S and 300-320 bp for COI). Finally, singletons (sequences present only once and in only one of the PCR replicates) were discarded. PCR and sequencing errors were detected using obiclean with a value of 2 for the parameter -d (maximum number of differences allowed for two reads to be considered deriving from one another) and a value of 0.025 for the parameter -r (threshold ratio between counts of two reads under which the less abundant is classified as an error deriving from the more abundant). These values were determined following in silico tests on an artificial dataset (see Supplementary material Figure S1 for details). The end point of these processing steps is a set of unique variants (i.e., sequences all different from one another and expected to be the "true" sequences that were present in the extracted samples).

Taxonomic assignment
Given the occurrence of numerous errors and missing data in public databases (Briski et al. 2016;Harris 2003), a custom-designed database was used for taxonomic assignment. This database was specific to the study area and curated with caution. A list of 670 species across the four target taxonomic groups (ascidians, bryozoans, bivalves and gastropods) was considered for inclusion in our custom-designed database. These species were native, non-indigenous or cryptogenic. The native species list came from "Fauna inventories of the Station Biologique de Roscoff" (available at http://www.sb-roscoff.fr/fr/observation/biodiversite/especes/inventaires/ inventaires-de-la-faune-et-de-la-flore-marines) and was completed with new data from local surveys and inventories. The non-indigenous and cryptogenic species list was composed of species that had already been reported in the study bay or were known to be present in the surrounding areas, based on both reports made by French authorities for the European Marine Strategy Framework Directive reporting, and local surveys and inventories (e.g., Bishop et al. 2015b). All available 18S and COI reference sequences associated with these taxa were retrieved from public databases (GenBank: Benson et al. 2013;BOLD: Ratnasingham and Hebert 2007;SILVA: Quast et al. 2013;PR2: Guillou et al. 2013) and manually checked. Only good quality sequences, with no more than one ambiguous base, were included. In order to increase taxonomic coverage, incomplete sequences were also added, where a length of at least 75% of the complete sequence was present. Reference sequences produced locally were also included, in Table 1. Number of taxa (with the number of NIS among them indicated in bold and in parentheses), number of unique variants and number of reads identified at different taxonomic levels (species, genus, family or class) for each marker (18S and COI), within the four taxonomic groups of interest (Gymnolaemata, Gastropoda, Bivalvia, Ascidiacea) and using a BLAST ® approach against a custom-designed database. The number of taxa (at the species, genus and family levels) with at least one reference sequence in the custom-designed database is given in the "Taxa with available references column", with the number of reference for NIS in bold and in parentheses.

18S
COI Taxa  particular for taxa with missing data in the above cited public databases (see Data availability section). For a better discrimination of known or expected NIS, reference sequences for species belonging to the same genus were also included. Finally, all species names were checked with the World Register of Marine Species (WoRMS; WoRMS Editorial Board 2019) to ensure that all names used were valid. In the case of the Pacific oyster, both Magallana gigas and Crassostrea gigas are accepted names, so we chose to use Crassostrea gigas (Thunberg, 1793) as advised by Bayne et al. (2017). In the end, our custom-designed reference database was composed of 408 and 3131 sequences covering 314 and 410 species, for 18S and COI, respectively (Table 1). Our dataset included 42 different NIS, 35 with reference sequences for both markers; the remaining seven lacked a reference for 18S (Table 1,  Table S1). Unique variants resulting from the OBITools pipeline were then compared to sequences from our custom-designed reference database using BLAST® (Altschul et al. 1990). Only unique variants found in at least two PCR replicates of the same sample were considered for taxonomic assignment. Only alignments covering at least 99% of the reference sequence were considered. Variants were then assigned to the species whose reference sequence had the highest identity percentage. If two or more alignments had the same identity percentage, the variant was assigned to the lowest common taxonomic level. Identity thresholds were defined for each marker in order to consider an assignment to the species level as valid. To select an appropriate threshold, we explored the barcoding gap of our target taxa for the two markers. To that end, each reference sequence was aligned with all the references in the database using the EMBOSS needle global alignment tool (version: 6.5.7.0; Rice et al. 2000). For each alignment, the identity percentage was calculated using R 3.4.4 (R Core Team 2018) and the values of intraspecific and interspecific identity were plotted as a density curve using the R package ggplot2 (Wickham 2016). For COI a clear gap between intra-and interspecific identity was observed at 92% ( Figure S2B), a threshold thus selected for species assignment. For 18S, the distribution of the intraspecific identity is aggregated towards 100% with a clear gap visible at 98% ( Figure S2A). However, to avoid false positive, we chose to apply a more conservative threshold of 99% for this marker. Note that the number of NIS detected using 18S was the same with both thresholds (not shown).
To assess the value of using a custom-designed database for NIS detection, we also performed a taxonomic assignment with two other methods frequently used in the literature and based on public databases, either with a BLAST® approach or by using the ecotag command implemented in the OBITools pipeline. For the first method, unique variants were aligned against the GenBank nucleotide database (nt accessed 11.01.19) using BLAST® (v.2.7.1+). Only alignments with at least 99% (18S) or 92% (COI) identity and 99% query cover were considered. Each unique variant was then assigned to the taxon name of the reference sequence with the highest identity percentage. If several reference sequences had identical identity percentages, the unique variant was assigned to the lowest common taxonomic level. For the second approach, we used the ecotag command of the OBITools package ) with a reference database created using the ecoPCR tool available in the same package by retrieving all sequences from GenBank, BOLD, Silva or PR2, and by adding all reference sequences produced locally. In the ecotag approach, unique variants are aligned with references using a global alignment algorithm, thus requiring only full-length reference sequences to be included in the database. Then, unique variants are assigned to the taxonomic level corresponding to the lowest common ancestor between all reference sequences that are closer to each other than to the selected unique variant . As no minimum identity is required, all unique variants are assigned to a taxon (although sometimes at a high taxonomic level, such as family or class). To summarize, three assignment approaches were used: (1) BLAST® against our custom-designed database, (2) BLAST® against GenBank nt, (3) ecotag against its specific database containing only full-length sequences available.

Tests for amplification failure
Following preliminary results and discrepancies observed between the two markers (see results), and to better interpret the results, we investigated potential taxonomic amplification failures. These tests were performed on the 24 target NIS for which we had access to tissue or DNA. PCR was carried out with the primers mlCOIintF and jgHCO2198 for the COI marker and the primers SSU_FO4 and SSU_R22 for the 18S marker, following the same protocols (see above). Amplicons were observed on a 1.5% agarose gel.

Cross-validation with traditional methods
In order to compare the results from the metabarcoding approach to more traditional ones, two additional datasets were used. They consisted in the parallel sampling, identification and quantification of larvae associated to the bivalve Ruditapes philippinarum (Adams and Reeve, 1850) and the gastropod Crepidula fornicata (Linnaeus, 1758) during the year 2012.
For R. philippinarum larvae, a traditional barcoding approach was used. Briefly, an additional monthly plankton sample was collected at the study site, from March to November 2012, using the same sampling protocol. In the laboratory, each sample was adjusted to a volume of 150 mL with 96% ethanol. Bivalve larvae were then sorted in three 5 mL subsamples (10% of the sample) under a dissecting microscope aiming to randomly sort at least 100 bivalve larvae, when possible, depending on their overall abundance. Finally, 64-200 bivalve larvae were obtained for each sample to be identified through individual barcoding. The DNA of single larvae was extracted following the method of Lasota et al. (2013). A 550 bp portion of the 5'-end of the 18S coding gene was amplified by PCR using the primers Myt18S-F (Espiñeira et al. 2009) and 18S-571-R (5'-CACCAGACTTGCCCTCCA-3'; C. Roby and T. Comtet, unpublished). Each reaction volume (25 μL) contained 1 U Thermoprime Plus DNA polymerase (Abgene), 1X reaction buffer, 200 μM dNTPs, 3.5 mM MgCl 2 , 0.4 μM of each primer, 0.01 mg mL -1 bovine serum albumin, and 2 μL DNA template. Amplification involved an initial denaturation step at 94 °C for 4 min, followed by a 6cycle touch-down at 94 °C for 40 s, 62-57 °C for 40 s, 72 °C for 1 min, followed by 30 cycles at 94 °C for 40 s, 57 °C for 40 s, 72 °C for 1 min, and followed by a final extension step at 72 °C for 10 min. PCR products were then sequenced in both directions by Sanger sequencing. Overall we obtained an amplification-sequencing success of 64% of all sorted larvae. Obtained sequences were then compared using BLAST® (Altschul et al. 1990) to reference sequences from GenBank and sequences produced locally for local species. Only alignments with 100% cover were considered. A larval sequence was assigned to a species if it differed by less than 2 base pairs (bp) (99.8% identity) from reference sequence(s) of a single species.
In all other cases (difference of 2 bp or more and/or several possible species), then the larval sequence was assigned to a family. This 2-bp threshold was based on Blaxter et al. (2005) and empirical observations. In this study, we only focused on larvae assigned to R. philippinarum, and results were expressed as their relative abundance compared to all bivalve larvae.
For C. fornicata, a traditional morphology-based approach was used. To estimate larvae abundances, additional mesozooplankton samples were collected using a WP2 plankton net with a 200-μm mesh size (UNESCO 1968) towed vertically from the bottom to the surface at each sampling date in 2012. Immediately after collection, samples were preserved in 96% ethanol. In the laboratory, larvae of C. fornicata were identified and sorted using a dissecting microscope, based on morphology, following early descriptions by Werner (1955), Thiriot-Quiévreux and Scheltema (1982), and using laboratory-reared reference larvae obtained during previous works (e.g., Taris et al. 2010). Our ability to correctly identify C. fornicata larvae was validated in a previous study with specific microsatellite loci (Riquet et al. 2017). Since larvae abundances for this species are usually low in the study bay, we were able to process the whole sample at each date. Pearson correlation coefficients between larvae concentrations and metabarcoding results for all 2012 samples were calculated using the stats package implemented in R 3.4.4.

Overall taxonomic assignment
After sequencing, 6,686,364 and 4,865,347 pairs of raw reads were obtained for 18S and COI, respectively. A total of 5,608,992 and 3,571,274 successfully passed the pairing, demultiplexing and primer removal steps, for 18S and COI, respectively. Only 30,639 (18S) and 1,032 (COI) reads were removed because they did not satisfy the size requirements. Finally, 1,670,807 and 1,022,459 reads corresponding to singletons were discarded, for 18S and COI, respectively. The few reads (n = 20) assigned to the 6 blank samples were discarded when checking for singletons. At the end of the processing steps 2,503,893 (37% of the initial number of reads) and 1,450,532 (30%) reads were retained for 18S and COI, respectively. They were composed of 48,037 (18S) and 10,662 (COI) unique variants that were used for taxonomic assignment.
Out of these unique variants, 556 and 2,144 were assigned to an accepted species, genus or family within the four taxonomic groups of interest, when compared to the reference sequences in our custom-designed database, for 18S and COI, respectively (Table 1). These accounted for a total of 317,070 and 395,533 processed reads meaning that ca. 13% and 27% of the 18S and COI datasets, respectively, were assigned to one of the target taxa ( Figure 1A). With our method, all COI unique variants were assigned to the species level Figure 1. A. Proportion of reads assigned to each of the target classes with the three tested assignment methods, namely BLAST ® against a custom-designed database (Bc), BLAST ® against the GenBank nt database (Bnt), and the ecotag tool from the OBITools suite (E). The number of reads assigned to Ascidiacea and Gymnolaemata are too low to be noticeable. B. Number of species identified with each method for the four target classes (Ascidiacea: As, Bivalvia: Bi, Gastropoda: Ga, Gymnolaemata: Gy). The proportion of NIS (dark colour) and native species (light colour) is indicated.
whereas some 18S unique variants were only assigned to higher levels (genus, family or class; Table 1). Overall, within the four taxonomic groups of interest, 86 and 79 taxa were identified at the species level, with 18S and COI respectively (Table 1). Most identified species belonged to Bivalvia (18S) and Gastropoda (COI) ( Figure 1A, B), which are the two classes with the highest number of species with a reference sequence in our database (Table 1).
Using alternative methods for taxonomic assignment (i.e., BLAST® against the GenBank nt database or assignment with the OBITools suite) did not change the proportion of reads assigned to the four target classes for COI (ca. 25% of reads assigned) ( Figure 1A). For 18S, alternative methods allowed an increase in the proportion of reads assigned to a taxon (up to 40% with ecotag) ( Figure 1A). However, the number of species identified with 18S was similar across the methods, although the distribution Table 2. Non-indigenous species (NIS) in the four target classes detected by 18S, COI or both in at least one plankton sample collected in the bay of Morlaix. For each marker, the number of unique variants is given, with the number of reads in parentheses. For each NIS, the type of dispersal mode (Dispersal) is indicated with short and long disperser describing species with a life cycle including a pelagic larval stage lasting less or more than 2 days, respectively. 'Reported' indicates if the species has previously been reported in the study bay. For each marker, the total number (in bold) of reference sequences, retrieved from public databases or produced locally (in parenthesis), available in the custom-designed database is specified (N ref ). Individual DNA was tested for amplification failure with the COI primers (see Figure S3 for amplification results). In case of COI amplification failure, N/A was added to the COI detection column.

Class
Species 18S  a To avoid false positives, we only kept taxa identified in more than one PCR replicate; however it is noteworthy that three reads (from two samples) were assigned to Botrylloides spp. and four reads (from two samples) to C. gigas in only one PCR replicate. b Although not reported in the study area fauna inventories, R. philippinarum is known from nearby bays and breeding trials were conducted in the 70's near our sampling location.
of assigned species per class and/or status species varied ( Figure 1B). With COI, the number of species identified was a bit higher due to the number of identified gastropod species, the other class and/or status being similar ( Figure 1B).

Identification of non-indigenous species
Of the 42 NIS of interest for which reference sequences were available, 12 were detected with at least one marker using our custom-designed database ( Table 2). Most of them belonged to ascidians (4) and bivalves (4), whereas two were gastropods and two bryozoans. Most species exhibit a pelagic phase with either long-lived (5) or short-lived (6) larvae. The only identified species with no pelagic larval stage is Crepipatella dilatata (Lamarck, 1822). Among these 12 NIS, nine were already reported in the study area and two had no previous record. In addition, Ruditapes philippinarum was not present in the Roscoff fauna inventories but has been observed recently. The 12 NIS were identified by at least one unique variant with 100% identity (99% cover) to a reference sequence, except three with 99% identity with an 18S reference sequence, namely Crassostrea gigas, Mercenaria mercenaria (Linnaeus, 1758) and Crepipatella dilatata. Ten out of the 12 NIS were identified based on 18S. Among these ten species, six were detected by 18S only. Three of them (the ascidians Asterocarpa humilis (Heller, 1878) and Corella eumyota Traustedt, 1882, and the gastropod Crepipatella dilatata) were unable to be amplified by the COI primer pair (Table 2, Figure S3). Botrylloides violaceus Oka, 1927 andB. diegensis Ritter andForsyth, 1917 were both detected by COI but could not be identified to the species level with 18S. However, three reads corresponding to three unique variants were assigned to the genus Botrylloides. These Botrylloides sequences were observed in only one replicate of each of two sampling dates . Those of February 2013 were congruent with the observation of COI reads assigned to B. diegensis, whereas those of November 2013 were not. Conversely, no 18S reads assigned to Botrylloides were identified in March and May 2012, when COI reads were assigned to B. violaceus. Six species were identified with COI, four of which also identified with 18S.
The Manila clam Ruditapes philippinarum was the most represented NIS with 32,677 and 20,936 assigned reads for 18S and COI, respectively (Table 2). It was also the most abundant among all bivalves (native and non-indigenous) representing 13% (18S) and 63% of reads assigned to this class. A large number of COI reads (10,757) were also assigned to the slipper limpet Crepidula fornicata, whereas only 203 reads were assigned to this species with 18S. Except for the Pacific oyster Crassostrea gigas, the species with long-lived larvae (spending on average 2-5 weeks in the water column) were associated with a large number of reads, ranging from 319 to 32,677 over all samples. Conversely, the six species with short-lived larvae and the species with direct development were represented by a small number of reads (less than 130 per species over all samples; Table 2).
With the two alternative assignment methods, the number of detected NIS was 25% to 50% lower than the one achieved with our custom-designed database ( Figure 1B), and no extra NIS were identified. The use of ecotag brought the lowest number of NIS (6) with five species assigned with 18S (A. humilis, C. eumyota, C. fornicata, Mya arenaria Linnaeus, 1758, and R. philippinarum) and four with COI (Bugula neritina (Linnaeus, 1758), C. fornicata, M. arenaria, and R. philippinarum). As compared to ecotag, the BLAST® approach against the GenBank nt database brought the same six NIS as well as C. gigas and Mercenaria mercenaria with 18S and B. violaceus with COI.

Temporal variations
For the soft-shell clam M. arenaria, the Manila clam R. philippinarum and the slipper limpet C. fornicata, the temporal window over which the species were detected was the same for 18S and COI (Figure 2). In addition, for these taxa as well as the hard-shell clam M. mercenaria, identified by 18S only, the read distribution exhibited a seasonal pattern. They occurred in the plankton mainly during summer and autumn (e.g., July to October for R. philippinarum) and were almost absent in winter except for C. fornicata which was detected from March to October each year. These results were in contrast to those obtained from direct developers and short dispersers which were observed in a single or few samples (e.g., February 2013 for the ascidians A. humilis and B. diegensis, Figure 2). The only long-disperser species with no clear seasonal pattern was the Pacific oyster C. gigas, which was identified with only twelve 18S reads in a single sample in September 2012 ( Figure 3A), and no COI reads. Note however that 628 18S reads assigned to the genus Crassostrea, which could presumably be assigned to C. gigas, were observed from July to October 2012, and in September 2013 ( Figure 3B). As a comparison, its native counterpart, the European flat oyster Ostrea edulis Linnaeus, 1758, was identified with thousands of reads (18S: 5,751; COI: 2,316) from June to October each year ( Figure 3C).

Comparison of the metabarcoding results with other datasets
The metabarcoding results were compared with data obtained through either barcoding of individual larvae (Manila clam R. philippinarum) or morphological identification of larvae (slipper limpet C. fornicata). Whatever the method considered, R. philippinarum was observed in three samples only, namely August, September and October 2012 (Figure 4). The 18S reads   assigned to the Manila clam R. philippinarum accounted for between 18% and 62% of the number of reads assigned to bivalves in the samples where the species occurred ( Figure 4A). This proportion was even higher when considering COI, ranging from 67% to 95% across samples ( Figure 4B). These temporal variations matched well those displayed by the number of larvae identified using traditional barcoding, even if in this case the proportion of clam larvae was lower, ranging from 16% to 30% ( Figure 4C).
In the metabarcoding datasets C. fornicata was identified in all samples from March to October 2012 with both markers (except the 28.08.12 and 10.10.12 with 18S; Figure 5A, B), which was congruent with the observation of larvae from this species based on morphological identification over the same year ( Figure 5C). No correlation between larval counts and the number of reads was observed, neither for 18S (r = −0.07, p = 0.81) nor COI (r = 0.29, p = 0.29).

Discussion
DNA metabarcoding is predicted as a promising approach to identify NIS at all steps of the introduction process (e.g., Comtet et al. 2015;Darling et al. 2017;Zaiko et al. 2018), and this study provides evidence supporting this statement. We examined zooplankton samples collected over 22 consecutive months, targeting larvae of benthic NIS. By using two DNA markers, 18S (nuclear DNA) and COI (mitochondrial DNA), and a custom-designed database, we identified 12 NIS among the four targeted taxonomic groups (Table 2). Our method proved to be efficient in detecting both long and short dispersers. Moreover, three of the identified species had never been recorded in the study area and could be novel introductions, previously unnoticed ancient introductions, or false positives. Finally, seasonal variations inferred from read distributions of several species were consistent with their known reproductive periods.

Zooplankton metabarcoding: an efficient tool to detect non-indigenous benthic species
Among the four classes studied, 42 NIS were targeted, present either in the study bay or in neighbouring areas, notably along the French and English Western English Channel coasts. Twelve of them (29%) were identified using our approach. However, only 17 species out of 42 were previously reported in the study bay, and one of them (Ciona robusta Hoshino and Tokioka, 1967) was observed only once in 2012 and never observed since then despite regular surveys (Bouchemousse et al. 2017). Thus, 75% of these 16 expected NIS were actually detected. Zooplankton metabarcoding thus appears as a powerful tool for NIS detection provided that multiple markers and a custom-designed database are used.
Only four of the 12 NIS were identified with both COI and 18S. This result agrees with previous studies which highlighted the usefulness of combining multiple markers in metabarcoding approaches to detect NIS (Borrell et al. 2017;Zhang et al. 2018). The detection success was higher with 18S (10 species) as compared to COI (6 species). COI is commonly used for (meta)barcoding studies (Bucklin et al. 2011;Comtet et al. 2015), and has been suggested as the barcode of choice for metazoans (Andújar et al. 2018;Hebert et al. 2003). This is in part due to its high polymorphism allowing the identification at the species level, but also to the availability of a large number of reference sequences, representing species from all over the world (Porter and Hajibabaei 2018). However, the lack of conserved primer binding sites can cause amplification failure and taxonomic bias that may prevent its use for broad biodiversity surveys (Clarke et al. 2014;Deagle et al. 2014). Out of the 24 NIS tested for amplification success, one fourth could not be amplified (n = 3) or had a weak signal (n = 3; Figure S3), which might have prevented their detection. Furthermore, PCR failure did not apply consistently across all species of a family and did not appear to be correlated with phylogeny (Table S1, Figure S3). This makes predicting the lack of detection of a given taxon very challenging. Moreover, additional PCR biases can arise when DNAs from several species are competing with one another, so obtaining amplicons from individual DNA does not ensure the correct detection of a species in a mixture. Proposed ways of preventing these issues are the design of specific primers, targeting a particular taxonomic group, which could be used in multiplexes to increase taxonomic coverage (Kelly et al. 2017;Zhang et al. 2018), or the combination of COI with more conserved markers. The 18S gene has been suggested as another barcode for high-throughput sequencing-based assessment of biodiversity (e.g., Abad et al. 2017;Zhan et al. 2014). It is highly conserved across species, allowing the design of universal sets of primers targeting different regions and a broad range of taxa, as illustrated in Figure S3. However, this lack of variability may impair taxonomic resolution. In our study, several closely-related species could not be differentiated with this marker (e.g., species from the genus Botrylloides). Again, taxonomic groups are not affected in the same way. Out of the six NIS identified solely with 18S (Table 2), three (Corella eumyota, Asterocarpa humilis and Watersipora subatra (Ortmann, 1890)) are efficiently discriminated by this marker (Figures S4 and S5). As they have all been reported in the study area, we assumed that they were actually present in our samples. Thus 18S can be a useful complement to COI, but it is critical to verify its discriminating power to avoid drawing erroneous conclusions about the presence of a NIS.
Our ability to identify a species not only depends on the amplification efficiency and discrimination power of the chosen markers, but also on the quality of the reference sequences. Building a manually-checked customdesigned reference database (with good quality sequences and reliable identification) comprising hundreds of species is time-consuming, and one might wonder if the reliability of the results obtained is worth the effort. Taxonomic assignment is commonly based on public databases, like the nucleotide collection of GenBank (nt), which are recognized to include mistakes (e.g., Harris 2003), with risks of false assignments or failure to detect target NIS. One common example of such risks is the description of a new species within a species complex. This is the case of the tunicates Ciona intestinalis (Linnaeus, 1767) and C. robusta, which were reported in our study area (Bouchemousse et al. 2016b). Formerly known as C. intestinalis type B and type A, respectively , they were recently reclassified as C. intestinalis and C. robusta based on molecular and morphological evidences (Gissi et al. 2017, and references therein). No reliable assignment to the species level would have been possible since some references obtained before the taxonomic revision are still falsely attributed to C. intestinalis (e.g., AK173399.1). More specialized and curated databases (e.g., BOLD, Silva) must be preferred for species identification, if references are available for the targeted taxa. Besides, the completion of the reference database is a key condition for an accurate detection of NIS (Briski et al. 2016;Comtet et al. 2015) and producing new reference sequences for known or potential NIS is worth the effort. In our results, three NIS were not identified when using BLAST® with GenBank nt (Crepipatella dilatata, Watersipora subtorquata (d'Orbigny 1852), and Botrylloides diegensis). No public reference was available for our targeted markers, and they were identified based on our locally-produced references. Another species, Bugula neritina, was also not detected with the 18S marker when using the nt public database. Divergent lineages often exist within accepted species (Pante et al. 2015), and species complexes have been uncovered in many invasive species, such as Bugula neritina (Fehlauer-Ale et al. 2014) or Watersipora spp. (Mackie et al. 2012). Obtaining local references is therefore important to ensure correct assignment, particularly when using a threshold value. For example, the sequences we produced locally for B. neritina diverged by 3% from the one available in GenBank (AF499749.1), which most likely originated from a Chinese population. This difference, higher than the threshold chosen for assignment with 18S, prevented the identification of this NIS when using only public references. Finally, the alignment method used for taxonomic assignment matters, as shown here by using two alignment tools (BLAST® versus ecotag command from OBITools). ecotag required building a composite reference database composed solely of full-length sequences, which lead to a reduction in the number of available references and the number of species with references. By using this approach, six NIS were lost as compared to the BLAST® procedure against the custom-designed database. Although particularly adapted to biodiversity studies as it allows assignment at higher taxonomic levels (e.g., family) and not only at the species level, ecotag thus appeared less efficient here because the available references (including some of our locally-produced reference sequences) comprised incomplete (i.e. shorter) sequences.

An efficient method to detect both long and short dispersers
Zooplankton metabarcoding has proved efficient in identifying species with both a long and a short pelagic larval duration. Amongst the 16 expected NIS, only three (Crepidula fornicata, Crassostrea gigas, and Mya arenaria) release long-lived planktotrophic larvae. These species are expected to be more easily detected in low-frequency sampling strategies because of their extended pelagic duration. All three were detected. The 13 other NIS with a previous record either have short-lived lecithotrophic larvae (larvae that do not spend more than 48h in the water column), or are directdevelopers (one species, Tritia neritea (Linnaeus, 1758)). Six of these, all short dispersers, were identified (two bryozoans and four ascidians; Table 2). This demonstrated the efficiency of the method to detect species releasing short-lived larvae, as also observed by Stefanni et al. (2018). Note however that their detection does not necessarily imply the presence of larvae in our samples, but may result from remains of benthic individuals (e.g., fragments of branching bryozoans), rafting colonies (e.g., Worcester 1994 for colonial ascidians), or post-settlement life stages resuspended from the bottom (e.g., Hamel et al. 2019;Valanko et al. 2010).
Besides amplification failures ( Figure S3), our inability to detect the remaining seven already reported short-dispersive or direct-developing NIS might, in part, be explained by the mismatch between their biological features and our sampling strategy. Most of the undetected NIS are commonly reported and/or preferentially found on artificial substrates of the nearby marinas or on ropes and cages in aquaculture facilities (e.g., Airoldi et al. 2015;Bishop et al. 2015a, b;Bouchemousse et al. 2016a;Glasby et al. 2007;Simkanin et al. 2012), including in the study bay (Authors, personal observations; Laurent Lévêque, Station Biologique of Roscoff, personal communication). The scarcity of this type of artificial substrate at our sampling site might explain their absence in this part of the bay. Moreover, short-lived larvae of these species might not disperse far enough to reach the sampling site. An alternative explanation is provided by the example of the tunicate Styela clava Herdman, 1881. We expected to detect this species since it has often been observed in natural rocky reefs nearby the sampling site. Its weak amplification for COI might have precluded its detection with this marker but some sequences were expected to be assigned to Styela clava with 18S. One possible explanation could be the discrepancy between sampling time and the reproductive behaviour of the species, since it has been shown to spawn at the end of the day, with a maximum abundance of tadpole larvae in the middle of the following day (Bourque et al. 2007). This suggests that adapting the sampling time to existing knowledge of the target NIS reproductive features, especially for those with short-lived larvae, and periodic and short spawning events, should be considered to improve the detection capacity of the approach.

Detection of previously unreported species
Three species, two with long-lived larvae and one direct-developer, with no previous local record were identified in this study. Among them, the Manila clam Ruditapes philippinarum was detected in 12 samples with either one of the markers. Since data from individual barcoding confirmed the presence of its larvae in additional samples collected at the same time as our metabarcoding data (August to October 2012; Figure 4), we are confident that the detection of this species is not the result of a false positive. Reads assigned to R. philippinarum were numerous, and dominated those assigned to bivalves (60-80%, depending on the marker; Figure 4), which questioned the origin of these larvae. This species was imported to France for aquaculture purposes (Flassch and Leborgne 1994), but is not cultivated in the study bay, although some trials have been conducted in the 1970's (Flassch 1988). Some individuals have been observed but no substantial population has been reported until now (authors' personal observations). The larvae could have originated from an unknown farmed population in the bay or from an overlooked local established population. They could also come from neighbouring populations (e.g., Caill-Milly et al. 2014) or from a more distant source. The study bay has a ferry terminal with regular connections with ports from northern Spain and southern England, which may favor larval transport and release via ballast water. In particular the Spanish ports of Bilbao and Santander, connected to Roscoff in the bay, are known to harbour R. philippinarum (Bidegain et al. 2015;Zorita et al. 2013).
For the two other unreported species that have been detected (Crepipatella dilatata and Mercenaria mercenaria), doubt may be raised. M. mercenaria was only detected with 18S despite the lack of PCR failure with COI ( Figure S3). M. mercenaria is a bivalve of commercial interest, imported from North America to South Brittany in the 20 th century for aquaculture purposes (Marteil 1956). It has then been reported in several parts of Europe, the closest to our study bay being in Southampton (Ansell 1963) and in the Gulf of Morbihan where it appears to have successfully established (Goulletquer et al. 2002). It may also have reached the nearer bay of Brest (C. Paillard, University of Brest, personal communication). It is thus possible that this species has arrived in the bay of Morlaix, at least as dispersing larvae, especially since the periods of detection are in agreement with its known reproductive period (Ansell et al. 1964;Ansell and Lander 1967). However, the phylogenetic tree of the family Veneridae ( Figure S6) showed that the 18S marker used does not allow reliable discrimination between M. mercenaria and closely-related species (e.g., Dosinia corrugata (Reeve, 1850) or Clausinella fasciata (da Costa, 1778)), some being present in the bay of Morlaix. Focusing on the direct-developer Crepipatella dilatata, the results are even more challenging. Only three reads corresponding to the same unique variant were assigned to this species with 99.45% identity to the closest 367 bp 18S reference sequence (produced locally). However, the identity percentage with Crepidula fornicata, a NIS already reported and particularly abundant in the study bay (Rigal et al. 2010) reached 98.9%, a high value just below our selected threshold for 18S (99%). In addition, C. dilatata is native to the SE Pacific and has been reported as an introduced species only along the Atlantic coasts of northern Spain (Collin et al. 2009), which makes the presence of C. dilatata in our samples unlikely. Additional markers would be needed to ascertain the presence vs. absence of the two NIS cited above. A candidate for further study is the mitochondrial gene 16S, which seems to provide a good balance between taxonomic resolution and detection breadth for the study of marine invertebrates (Kelly et al. 2016(Kelly et al. , 2017.

Read counts as a proxy for larval abundance and NIS reproductive success
Detecting reads of a specific benthic NIS in a zooplankton metabarcoding dataset most likely indicates the presence of its larvae, although an alternative origin (shed cells, mucus, see above) could not be discarded. When resulting from the presence of larvae, read occurrence would be indicative of the reproductive status of populations that have most likely established in the vicinity of the sampling area. In such cases, the temporal distribution of reads can further allow to investigate the species reproductive dynamics. In the three benthic NIS with long-lived larvae and large number of reads (i.e., M. arenaria, R. phillipinarum, C. fornicata), such distribution suggested a seasonal pattern congruent with their known reproductive cycle. For example, our results, with both markers, suggested that larvae of the soft-shell clam Mya arenaria occurred in the plankton from May to October 2012, with two distinct periods, one in May-June, and the other in July-September ( Figure 2). This result is in agreement with its known reproductive periods in various parts of its distribution range (either native or introduced; e.g., Brousseau 1987;Cross et al. 2012;Warwick and Price 2009;Winther and Gray 1985). Results from 2013 showed a similar pattern, with lower read numbers. This might reflect interannual variability in the reproduction of Mya arenaria, as observed in other populations (e.g., Strasser and Günther 2001).
The match between the reproductive periods inferred from our metabarcoding data and those known for the targeted species is ascertained when looking at the well-studied slipper limpet (Crepidula fornicata). This species has been established at our sampling point for decades (Dupont et al. 2003;Le Cam and Viard 2011;Rigal et al. 2010). Mean concentrations of C. fornicata larvae at our sampling location averaged monthly between 2005 and 2011 indicated that larvae occurred from March to October ( Figure S7; Leroy 2011;Rigal 2009). These observations are congruent with our metabarcoding results ( Figure 5; Figure S7). Further, the mean distribution of COI reads associated with C. fornicata was similar to the mean distribution of larval concentrations ( Figure S7). This is in agreement with the results obtained in the meta-analysis of Lamb et al. (2019) who found that read counts loosely correspond to relative occurrence of species in the samples. However, some discrepancies are visible. In both datasets an abundance peak is observed in May, whereas a second peak is observed in September or in August, based on the metabarcoding and morphological datasets, respectively. In addition, when comparing larval abundances of the slipper limpet C. fornicata based on morphological identification to the number of reads assigned to this species in 2012 ( Figure 5), no correlation was found with either marker or even between both markers. This result is congruent with what have been found in other studies (e.g., Elbrecht and Leese 2015;Pochon et al. 2013;Sun et al. 2015) and compels us to nuance our data interpretation. Many elements can explain why we did not observe a correlation between read and larval counts for C. fornicata. For instance, the comparison between read abundances and individual counts can be arguable since the size of individuals will influence the quantity of DNA available after extraction, and thus their relative abundance in metabarcoding assays (Elbrecht and Leese 2015;Elbrecht et al. 2017;Harvey et al. 2017). Also, the presence of potential primer biases in the DNA mixture would add even more discrepancies in the read count vs. species abundance correlation, as demonstrated by Elbrecht and Leese (2015) or Piñol et al. (2019). Since 18S did not exhibit any amplification failure ( Figure S3), it is expected to display a better correlation with biomass as demonstrated by Clarke et al. (2017) in zooplankton samples. However, our results did not show a better signal with this marker.
The mechanisms cited above might also explain the lack of correlation between data collected for both 18S and COI in the case of C. fornicata, or the discrepancies observed between metabarcoding and single-larva barcoding for the Manila clam R. philippinarum ( Figure 4). Our data illustrated a clear seasonal distribution (Figure 2) in the range of the known reproductive periods over this species' introduced distribution range (Laruelle et al. 1994). However, single-larva barcoding showed that 30% of bivalve larvae were assigned to this species, whereas its relative proportion with metabarcoding reached 60-80%. The experimental design might also contribute to this lack of correlation, such as different extraction efficiencies, the normalisation of DNA concentrations prior to sequencing, or the type of sequencing platform. These points should be taken into account in further studies (Kelly et al. 2014;Lamb et al. 2019).
It is important to point that all the above statements rely on the assumption that traditional methods are devoid of any biases and give accurate estimates of species presence and abundance, which metabarcoding should match with. However, they have their own limitations and biases that may contribute to the discrepancies reported (e.g., Kelly et al. 2017). For instance, in the case of C. fornicata, the samples used for the two approaches were collected with different nets with a different mesh size. Because of the size of C. fornicata larvae (from 400 μm; Pechenik and Lima 1984), we did not expect a major effect of the mesh size on larval abundance, although we cannot reject this hypothesis to explain the observed discrepancy. In addition, because samples with the two nets were usually collected with a fifteen-minute interval, we cannot discard the possibility of short-term variability in larval abundance between the samples.
As the only long disperser for which no seasonal pattern was recovered, the case of the Pacific oyster Crassostrea gigas is interesting. This species was identified by a low number of reads, in September 2012 (Figure 3). This result was confirmed by the single-larva barcoding approach (only five larvae of C. gigas were observed, one in August, and four in October; data not shown), which suggested that almost no larvae were present. This finding was unexpected since numerous oyster farms are located in the bay of Morlaix, producing more than 5,000 tons of C. gigas per year (Comité Régional Conchylicole -Bretagne Nord, personal communication). Furthermore, a previous work reported the presence of Pacific oyster larvae at a station located 6 km from our sampling site in August and September 2007 (Philippart et al. 2012). Although C. gigas has become an invader in natural habitats in many places along the North and South Brittany coasts, it is absent from natural habitats in the Bay of Morlaix (Le Berre et al. 2009). A likely explanation is that the temperatures prevailing in the bay prevent either spawning or larval survival for this species. Several studies have shown that the proliferation of C. gigas occurred when summer temperatures were high (Diederich et al. 2005;Dutertre et al. 2010), and that spawning requires a threshold temperature of 18-20 °C (Dutertre et al. 2009(Dutertre et al. , 2010Enríquez-Díaz et al. 2008). Further, Rico-Villa et al. (2009) showed that larval development is optimal above 22 °C and is slowed at 17 °C, even if larvae can survive at 15 °C (His et al. 1989). The temperatures observed in the bay of Morlaix at the time of sampling did not exceed 18 °C (Figure 3), and could have limited the spawning of farmed Pacific oysters and the survival and development of produced larvae. Another possible explanation is that the larvae are exported off the bay, a hypothesis suggested to explain the low proliferation of C. fornicata at this same location (Rigal et al. 2010). Finally, the increase in the proportion of infertile triploid Pacific oyster in French production might have contributed to the low number of larvae produced by local farming. In contrast, the native flat oyster (Ostrea edulis), which only accounts for a low percentage of the produced oysters in the bay, has been identified with a high number of reads in both datasets ( Figure 3). The seasonal pattern displayed by metabarcoding results is in agreement with the known reproductive period of this species (Eagling et al. 2018), which is known to spawn at temperatures lower than those required by C. gigas (Mann 1979). Compared to its Pacific counterpart, the flat oyster (Ostrea edulis) seems to be able to reproduce in the bay, and the larvae might either be coming from nearby oyster farms or from a local established natural population.

Conclusion
DNA metabarcoding from bulk zooplankton samples is effective to detect benthic NIS. Overall our results suggested that the use of a customdesigned database combined to a BLAST®-based alignment was the more efficient approach for NIS detection and to avoid false assignments. This pinpoints the need for NIS-dedicated databases as advocated by Darling et al. (2017) or Dias et al. (2017), especially since using metabarcoding as a tool to detect NIS is becoming more popular. Focusing on the larval stage allowed us to go beyond the simple presence of a given NIS and helped us to, firstly, evaluate the reproductive success of the identified species and, secondly, to show that one notorious and conspicuous NIS (C. gigas) apparently do not produce larvae in the study area. In addition, for the long dispersers where enough reads were obtained, a reproductive window could be defined. Interestingly it was always consistent with the known breeding period of the species concerned. In this context, the use of DNA metabarcoding on zooplankton surveys would be a valuable tool to support surveillance programmes. Despite these findings, our study also highlighted important points to take into account in future studies, including sampling frequency (to increase the likelihood of detecting short dispersers) or the importance of using multiple and complementary markers.

Data availability
Reference sequences locally produced were deposited in GenBank (accession nos. MN066416 -MN066544 and MN073504-MN073507 for 18S and accession nos. MN064582-MN064634 for COI). In addition, the custom-designed database (fasta format), a file providing the species list and their Genbank accession number and the metabarcoding pipeline are available on Dryad Digital Repository at https://doi.org/10.5061/dryad. hj53jd8.

Supplementary material
The following supplementary material is available for this article: Table S1. Details concerning the 42 target non-indigenous species. Figure S1. Use of an in-silico mock community to determine the values for -d and -r parameters used in the obiclean tool. Figure S2. Frequency distribution of pairwise sequence identity between and within species 18S and COI markers. Figure S3. Agarose gel illustrating the amplification success of 24 NIS with both 18S and COI primers. Figure S4. Phylogenetic tree of the Styelidae family inferred from 18S sequences. Figure S5. Phylogenetic tree of the Bugulidae family inferred from 18S sequences. Figure S6. Phylogenetic tree of the Veneridae family inferred from 18S sequences. Figure S7. Monthly variations in abundance of slipper limpet (Crepidula fornicata) larvae based on morphological identification, and of reads assigned to C. fornicata based on metabarcoding data using the COI marker.