A 3- year plankton DNA metabarcoding survey reveals marine biodiversity patterns in Australian coastal waters

Aim: To use a long- term collection of bulk plankton


| INTRODUC TI ON
Human use of terrestrial ecosystems has accelerated rates of animal extinction and habitat loss throughout the Holocene (Di Marco et al., 2018;Jackson et al., 2001;McCauley et al., 2015). However, until recently, the inaccessibility of many marine environments has limited the exploitation of many marine faunas-particularly in deep seas and oceanic regions (see McCauley et al. (2015)). This protection ended with industrialization and the advent of commercial fisheries (Lotze et al., 2006;McCauley et al., 2015). Anthropogenic climate change also threatens marine species, and the ongoing transformation of marine environments in response to rising ocean temperatures is expected to have profound ecological (Edwards & Richardson, 2004;Gattuso et al., 2015;Griffiths et al., 2017) and economic and societal consequences (Beaugrand et al., 2002;Wernberg et al., 2016). Ocean acidification, caused by increased atmospheric CO 2 concentrations, further compounds these effects (Gattuso et al., 2015;Richardson, 2009). Taken together, these human impacts demonstrate the pressing need for effective marine management, including monitoring the state and trends of ecosystems and protecting of marine life for future generations (Edwards et al., 2010;Gattuso et al., 2015).
For practical reasons, many marine monitoring programmes concentrate on larger organisms, such as fish, seabirds and marine mammals (Lenanton et al., 2017;Richardson, 2009;Richardson et al., 2012). Yet these animals, for the most part, have long lifespans and comprise the higher end of the marine food chain, meaning they respond relatively slowly to ecosystem change. By contrast, zooplankton communities respond rapidly to changing conditions and are orders of magnitude more abundant than higher trophic levels, and make a more significant contribution to ocean nutrient cycling (Richardson, 2009). Zooplankton communities comprise mostly microscopic and short-lived animals that drift in ocean currents. They are a critical trophic link between the marine autotrophs (phytoplankton) and higher trophic levels that are typically harvested or valued by humans. Because of the essential role zooplankton have in marine ecosystems, their sensitivity to environmental change and relatively easy collection, zooplankton surveys have been used in marine monitoring for almost 100 years (Batten et al., 2019;Edwards et al., 2010;Richardson, 2009).
Through necessity, morphology has historically been the chief method for zooplankton species-level identification, but the process is time-consuming, prone to error and is difficult for juvenile life stages and damaged specimens Lindeque et al., 2013) and requires considerable taxonomic expertise.
New automated methods such as ZooScan (Gorsky et al., 2010;Rohner et al., 2015) and Laser Optical Plankton Counters (Herman et al., 2004) have been developed, but neither of these methods can identify zooplankton to a species level (Everett et al., 2017). DNAbased species identification has been used for many years as a tool to identify individual taxa. This barcoding process is accurate, relatively rapid and negates the need for taxonomic expertise. However, this approach is not feasible for taxa as small and varied as zooplankton.
The development of metabarcode sequencing, advanced computer technologies and reference sequence databases, together with progressive analytics, allows for the identification of a multitude of taxa from just one sample of DNA extracted from an environmental substrate. This metabarcoding approach is much more suitable for targeting zooplankton taxa and can be used for endangered, cryptic, reclusive or invasive taxa.
As a highly diverse complex mixture, genomic DNA extracted from a bulk sample (such as plankton) is but one type of environmental DNA (Taberlet et al., 2018). While the majority of the DNA extracted from a bulk plankton sample is organismal, much of it is extra-organismal. This is especially the case where the plankton is collected by filtering a large amount of water through a drop net or filtering across a transect, a process which has since been described as large-volume eDNA sampling (Suter et al., 2020). Ongoing research using eDNA metabarcoding technology has demonstrated the ability to obtain useful genetic data from a range of marine environmental substrates such as seawater (Alexander et al., 2019;Andruszkiewicz et al., 2017;Stat et al., 2017), scat Casper et al., 2007;Deagle et al., 2005;Hardy et al., 2017) and sediment (Guardiola et al., 2015;Morard et al., 2017). However, only a relatively small number of metabarcoding studies of biodiversity use plankton as a substrate (Alberti et al., 2017;Berry et al., 2019;Deagle et al., 2017;Gimmler et al., 2016;Lindeque et al., 2013;Richardson et al., 2019;Suter et al., 2020). This is, in part, because many existing barcode markers lack the balance between the taxonomic coverage and the resolution required to thoroughly characterize this highly phylogenetically diverse substrate (Brown et al., 2015;Clarke et al., 2017;Lindeque et al., 2013); zooplankton samples commonly include specimens from a dozen phyla and high species diversity .
In 2019, Berry et al. demonstrated that a multi-assay approach to metabarcoding could be used to map seasonal changes and heatwave responses across a 5-year period using collections of zooplankton from a single site in Western Australia. Here, we extend the spatial extent of analysis and take six metabarcoding assays (Table S1) and apply them to 90 plankton samples taken from nine Australian National Reference Stations (NRS) over a 2-year period.
The NRS network has been maintained by the Integrated Marine Observing System (IMOS) since 2008, as an integral part of a national programme of environmental monitoring (Lynch et al., 2014).
The locations of the NRS range from tropical to temperate areas, and encompass the six marine biogeographical regions that surround K E Y W O R D S bulk plankton, environmental DNA, marine biodiversity, metabarcoding, zooplankton Australia ( Figure 1). The NRS range also includes Australia's two principal poleward-flowing currents; the East Australian Current and the Leeuwin Current, on the western side of the country (Lynch et al., 2014). Our aim is to further test the capacity of DNA (using a multi-assay approach) to characterize the spatial and seasonal patterns found within the zooplankton communities across Australia and investigate links with concurrent abiotic data collected as part of the IMOS programme. In doing so, we seek to test the efficacy of the method for use in the ongoing monitoring of zooplankton communities.

| Sampling details and subsequent DNA extraction
Samples were collected from nine IMOS National Reference Stations ( Figure 1), with at least one site to represent each Australian marine biogeographical area. Samples were collected and stored with the forethought that they would be used in the future for genetic testing. Three sites are situated north of the Tropic of Capricorn and six are south. The marine environments sampled vary from tropical/ equatorial (Darwin-mean SST 28.6°C) to temperate (Maria Islandmean SST 15.3°C- Figure 1). Since mid-2010, sampling has taken place monthly at most of these sites and seasonally at Ningaloo and Esperance (Lynch et al., 2014). We selected 90 seasonal samples from the IMOS collection from 2012 to 2014, averaging 6-12 samples per site (a maximum of three samples from each season per site).
Plankton samples were collected using a 0.6-m-wide, 3-m-long drop net (Eriksen et al., 2019) with a 100 μm mesh, which falls at 1 ms −1 . The seabed depth at each site varied between 20 and 100 m (Lynch et al., 2014) and the drop net samples to a depth of 5 m above the seabed. Plankton is collected on the downward drop only-the net is closed as it is recovered to the surface.
Seawater was used to wash the samples into the cod end of the drop net for transfer to the sample jar. After collection, samples were kept on ice until filtered to remove excess water and stored at −80°C immediately upon return to the lab. After subsampling, they were kept at −20°C. Between sampling, the nets were cleaned in freshwater, before drying and storage.
Thawed plankton samples were homogenized using a handheld blender (OMNI Tip™ Homogenizer) and a hard tissue probe. About 40 μL of each sample was then digested and extracted with the F I G U R E 1 The Australian Integrated Marine Observing System (IMOS) sampling sites. The map includes adapted Köppen climate classifications, marine biogeographical areas and, for each site, the sampling years (in brackets) and mean sea surface temperatures (Av SST). tissue protocol from the DNAeasy Blood and Tissue kit (Qiagen).
Modifications included twofold quantities of the reagents used before the washing stage and a double elution of AE buffer (200 μL total). Four extraction controls were created during the extraction stages to be used throughout data production. DNA extracts were kept at −20°C. All samples were extracted and processed within laboratories designed for trace and environmental DNA.

| Amplification of barcodes, library build and sequencing
Six metabarcoding assays (Table S1) were applied to 1 in 10 dilutions of each DNA extract. For each assay, DNA extracts were assigned fusion-tagged primers incorporating assay-specific primers, Illumina adaptor sequences and unique combinations of six to eight base MID (Multiplex IDentifier) tags-over 500 unique combinations. The MID tags identify sequences to a particular assay and sample. To avoid cross-contamination, the MID tag combinations had not previously been used for marine samples. Fusion tags were used to limit the risk of tag jumping by the use of only one round of PCR (Schnell et al., 2015). and 1 U of Taq polymerase Gold (ABI, USA), 2 μL of DNA and made up to 25 μL with PCR grade water. PCR master mixes with tagged primers were prepared within an ultraclean environment to prevent contamination. DNA was added in a pre-PCR laboratory and the PCR reactions occurred in a third dedicated laboratory. Non-template and extraction controls were included on each PCR plate, only those controls which amplified were sequenced. PCR reactions were duplicated (to allow for stochasticity) and cycled at 95°C for 5 min followed by 50 cycles of 95°C for 30 s; the primer-specific Ta (annealing temperature; Table S1) for 30 s and 72°C for 45 s; and a final extension of 72°C for 10 min.

| Production of operational taxonomic units (OTUs)
The process of creating OTUs was repeated for each assay.
Sequences were separated into their sample of origin by their MID tags using Geneious R8 (Kearse et al., 2012). Initial filtering was conducted by ensuring the MID tags, gene-specific primers and sequencing adapters were all present in each sorted sequence without error. Sequences failing this step were removed from future analyses. Primers, adaptors and MID tags were detached from the remaining sequences. Each sequence was renamed to reflect its sample of origin. The filtered, trimmed and renamed sequences were then combined into one large assay-specific file for further quality filtering and clustering into OTUs.
USEARCH v8 (Edgar, 2010) was used for quality filtering and the clustering of OTUs. Sequences were filtered based on the quality (Q) scores inherent within fastq sequences, using a fastq filter-E_max > 0.5-and formed into groups of identical sequences, removing those with an abundance of <0.1% of the total number of unique sequences across all samples. OTUs were clustered for each assay using a 97% similarity. USEARCH's cluster-otus command also identifies chimeric sequences, enabling them to be located and removed from the data. Low-abundance filtered sequences were then mapped back onto existing OTUs to ensure the inclusion of all relevant data.
Any low-abundance unmapped amplicons were discarded from further analysis. This process, while potentially eliminating some scarce taxa, ensures the removal of possible erroneous amplicons. OTUs were then assigned to samples and converted to presence/absence data ready for potential identification and statistical analysis. The quality filtered sequence data are available on Data Dryad (https:// doi.org/10.5061/dryad.931zc rjns).  [Altschul et al., 1990]) with the parameters: num_descriptions 50-num_alignments 50-dust no and a reward of 1 (for aligned pairs of identical nucleotides). Output files from the GenBank search were imported into MEGAN v5 (MEtaGenome Analyser; [Huson et al., 2011]), restricting possible matches to those with a Lowest Common Ancestor (LCA) minimum score of 100. Reports were limited to the best 5% matches of sequences to the assigned taxon. MEGAN allows the visualization of taxonomic assignments and the option to inspect each match before an assignment is made. OTUs were assigned only where the reference sequence covered the whole of the OTU query sequence. The taxa identified by each assay were combined into one table to allow for comparisons between assays and further analysis. However, where taxa were identified more than once in a sample (by OTUs from more than one assay), these particular taxa were included in the table only once. Some of the lineages for identified taxa were erroneous, this was particularly evident where the same species appeared twice with differing lineages. It was decided to be consistent with assignments and use WoRMS (world register of marine species [Horton et al., 2018]) as the authority for this. Contaminant (non-marine) OTUs were removed prior to analysis. Taxa were reported to the most informative taxonomic level using the following match parameters: species (97 < 100%); genus (95 < 97%); family (93 < 95%); order (91 < 93%); and class (89 < 91%). Where a sequence matches more than one taxon equally, the assignment was dropped to the lowest common taxonomic level.

| Statistical analysis of OTUs
All statistical analyses were performed on the presence/absence of OTU data for sequences obtained for each assay on all 90 samples. For each sample, two measures were calculated: richness, a count of the number of OTUs; and assemblage, the presence/ absence composition of the OTUs in the sample. Statistical analyses were performed using PERMANOVA+ (Anderson et al., 2008) add-on for Primer 7 (Clarke & Gorley, 2015) and R packages labdsv (Roberts, 2016) and vegan (Oksanen et al., 2016).

| Univariate analyses
To explore the univariate relationships between the OTUs from each assay and abiotic variables, we used richness as the response variable in generalized linear models (GLMs). We used a negative binomial error structure using the function glm.nb in the MASS package in R (Venables & Ripley, 2002) with a log-link function (Zuur et al., 2009).
Abiotic variables in the initial model included site, sea surface temperature (SST), salinity, silicate, nitrate and phosphate. Model residuals were visually inspected for homogeneity of variance and normality.
The resulting model was checked with an analysis of variance (ANOVA) and visualized with R package visreg (Breheny & Burchett, 2017).

| Multivariate analyses
We used PERMANOVA (Anderson, 2001) to investigate site, annual and seasonal effects on richness and assemblage measures. To F I G U R E 2 Total Animalia detected in DNA extracted from 90 bulk plankton samples using multi-assay metabarcoding. Each pie slice represents, as a proportion, the number of families detected within an individual phylum (the inner nodes). The terminal nodes represent the orders found and the mid-inner nodes the classes. The rim provides an estimate of the number of families found within each phylum.
illustrate site-based patterns, we used non-metric multi-dimensional scaling (nMDS) in the R package vegan. To further test the site-based patterns found, we applied a cross-validation test (leave-one-out allocation of observations to groups [canonical analysis of principal coordinates (CAP): Anderson et al., 2008]) among sites to the OTU assemblages from each assay. The indicator species that were characteristic (i.e., present in the majority of the samples from one site but mostly absent in the others) of each site were identified using the function indval (indicator value analysis) in the R package labdsv (Roberts, 2016)-this analysis was applied to the presence/absence of taxa identified. We also used PERMANOVA (Anderson, 2001) to explore seasonal changes in the assemblage of OTUs. This analysis used a repeated-measures design with factor season (four seasons) nested within site (nine sites) with sampling years as replicates. Multivariate analyses were performed using distance-based linear models (DistLM) in PERMANOVA+ using the same abiotic explanatory variables as in the univariate analysis. Bray-Curtis similarity matrices were created from the presence/absence of OTU data. The model that best explained the variation in the OTU assemblage for each assay was selected by the adjusted R 2 .

| Power to detect seasonal change
To provide an indication of the average contribution of a sample to each site and each site to the OTUs produced by an assay, the number of OTUs found in each sample was converted to a percentage of the total number of OTUs found at each site, and the number of OTUs found at each site was converted to a percentage of the total number of OTUs created using each assay.

| Overall biodiversity in Australian zooplankton communities
The six metabarcoding assays applied to 90 samples generated >25 million sequences and these produced >2000 individual OTUs. Of these, ~64% could be assigned to an appropriate taxonomic level (Table S2), and over half of those assigned were to a genus or species level (Tables S3-S9). The average number of sequences per sample varied between assays. The 16S Universal assay produced an average of 3200 filtered sequences per sample which, across all samples, never the less produced 370 OTUS. In contrast, the Fish Assay produced an average of close to 50,000 filtered sequences per sample which formed only 202 OTUs across all samples (Table S2). However, the exploratory sequence number and OTU accumulation curves tested for each sample and assay all reached asymptote.

| Families present
In total, >1000 OTUs were assigned to >500 distinct taxa in Kingdom Animalia including almost 300 unique families ( Figure 2).
In Arthropoda alone, we identified over 60 families, primarily belonging to Hexanauplia and Malacostraca. In Chordata, we identified over 70 families, with 90% belonging to Actinopterygii. Over 40 Mollusca families were identified; of these, >80% were Gastropoda.
In Cnidaria, two-thirds of the 17 families identified belong to Hydrozoa. In Echinodermata, 10 families were identified.

| Species identified
Using our multi-assay metabarcoding approach, we detected a broad range of taxa, with almost 250 assigned at the species level. While many assigned species (e.g., Parvocanalus crassirostris; a copepod detected in Darwin, and Xiphonectes tenuipes; a crab detected in Yongala) were found only at one site, many species were common to almost all sites, such as Clausocalanus furcatus (a copepod, detected at all sites except Darwin) and Lensia subtiloides (a Siphonophorae Hydrozoan, detected at all sites except Maria and Kangaroo Islands).
Other species detections were restricted to a region. For example, the predatory sea snail Epitonium replicatum was found north of the tropic of Capricorn, and the genus of anchovy Engraulis was found only to the south of this latitude.
The indicator species analysis (using indval, Table S10) showed that the species most characteristic of particular IMOS stations belonged to many phyla, but Arthropoda and Mollusca featured strongly. For example, the top indicator species for Ningaloo was Metalpheus paragracilis, a small shrimp-like decapod; for Maria Island, it was Maoricolpus roseus, an invasive mollusc (see Table 4); for Port Hacking, it was the offshore copepod Haloptilus longicornis, suggesting an oceanic influence in the area; and for Kangaroo Island, it was Membranipora membranacea, a bryozoan and another invasive species (although this assignment may be contentious; see Table 4).

| Contamination
Overall, some non-marine contamination was noted in some of the samples (Tables S3 and S4)  Note: For the taxa detected by each assay and based on OTUs, DISTLM was used to relate the zooplankton assemblage (Anderson et al., 2008) to environmental variables, and nbGLM to relate the zooplankton richness (Venables & Ripley, 2002) to the same environmental variables. Only those factors included in the best model have probability values.
(21 samples). The source of these could be from the samples (i.e., terrestrial input from the sea surface), contamination of gear or sample handling. Regardless of source, identified non-marine OTUs were removed from any downstream analysis. Controls that amplified during the PCR stage were sequenced but contained no sequences.

| Spatial variation in zooplankton richness
Overall, the nested PERMANOVA showed significant differences among sites in the number of individual OTUs present in each sample (richness-univariate measurement) for all assays.
However, the pairwise PERMANOVA analysis illustrated that not all sites differed, and predominantly the differences in richness were between the southern coldest sites of Kangaroo and Maria Islands and the more northern and warmer sites ( Figure 3 and Table S11).

| Spatial variation in zooplankton assemblages
Every metabarcoding assay showed significant differences among sites in the overall analysis of the OTU composition (assemblagemultivariate measurement; PERMANOVA). The pairwise analysis of assemblage within each sample also showed many significant pairwise differences among all sites (PERMANOVA; p ≤ .001- Table S11) and across almost all assays. The exception to this was the Fish perature. An illustration of the assemblage separation between sites for the other assays may be found in Figure S1.
The cross-validation test supported the assemblage differences between sites for each assay. Here, the OTU data from a single sample could consistently be allocated-using composition alone-to the correct site 80%-90% of the time. The exception was the Fish assay, which was allocated to the correct site 56% of the time (Table S12).

| Seasonal variation in zooplankton richness and assemblage
Seasonal changes in richness were observed in data from all assays, except the Crust and 16S universal assays (PERMANOVA). However, there was little significant seasonal change in richness within each site (Table S13).
While all assays (except Fish) showed overall significant differences in assemblage for seasonality, the analysis of seasonality within each site showed little difference. There were some F I G U R E 5 Power curves from each assay indicating the number of samples required to detect seasonal changes in richness. Thirteen samples per season sampled from Rottnest Island  were used to create the curves (created in R -R-Core-Team, 2015), and effect sizes F calculated from PERMANOVA results. Power levels of 0.8 are indicated with the corresponding number of samples per season required to achieve this. significant seasonal differences, particularly for the taxa detected by the Copepod 3, Mollusca and Cnidaria assays, but the remainder of the assays showed little to no significant response (Table S13).

| Richness: GLM
The general linear modelling (nbGLM) revealed that site was most strongly related to biotic richness across all assays. Sea surface temperature (SST) had a significant impact on the 16S Universal and Fish assays, and for Copepod 3 and Mollusca, silicate and phosphate (respectively) were significant drivers of richness (Table 1 and Figure S2). The variation explained by the best model varied from 26% for the 16S Universal assay-detecting meroplankton-to 69% for the Copepod 3 assay-detecting primarily holoplankton.

| Assemblages: DISTLM
Distance-based linear modelling showed that site and nitrate were the most important predictors of assemblage (Table 1)
Cnidarian assays. The best models used combinations of the measured abiotic variables to explain between 10% and 49% of the variation in the measured assemblages; 10% for the Fish assay (detecting only fish) and 49% for the Copepod 3 assay (detecting primarily holoplankton).

| Power analysis of the number of samples needed to detect spatial differences in richness
The power analysis based on data from Berry et al. (2019) taken over 5 years showed that for most assays, the effect size for seasonal changes was very low, so substantially more sampling than was conducted in this study would be required to detect seasonal changes in richness; the exception to this was the Fish assay. As a result of larger seasonal effects in fish numbers, sufficient power could be achieved to detect seasonal change using the Fish assay with just five samples from each season ( Figure 5). By contrast, Crustacea required 16 samples per season and the remaining assays ranged from 29 to 43 samples per season.

| Assemblage analysis of the number of samples needed to detect spatial differences
Serial analysis of the Rottnest Island Assemblage  showed an increase in significant pairwise seasonal differences with the number of samples used (PERMANOVA).
While five samples per season provided some significant results (second row;

| Contribution of each sample to OTU diversity
The contribution of each sample to the total OTU diversity detected by each assay at each site is shown in Table 3, as well as the site contribution to the total OTU diversity detected by the assay. Copepod 3 produced the most reproducible results from sample to sample, with an average sample contribution to each site from 33% to 45%.
The least reproducible was the Fish assay, where the average sample contribution ranged from 13% to 27%.
On average, each sample contained between 24% and 34% of the OTUs detected at each site depending on the assay. However, the contribution of each site to the total number of OTUs detected by an assay was between 12% and 29%.

| Biodiversity of Australian zooplankton communities
A high diversity of taxa were detected within this temporal and spatial study. Given the size of the dataset, it was not practical to check the biological credibility of each assignment, an issue with large datasets that have been recognized previously . However, an examination using the Atlas of Living Australia (ALA, 2017) revealed that ~76% of the assignments were detected within their expected habitats, ~8% were detected within both known and unexpected areas and ~90% of all assignments were known to inhabit Australian waters, whether or not they were detected in their usual locale. Despite this, there were many animals we did not detect.
Ctenophores, salps, larvaceans and ostracods should all be present TA B L E 3 The total OTU diversity detected by each assay for each site and each sample within the site.

Site contribution to total OTUs (%)
Ningaloo (7)  are not routinely identified morphologically in plankton samples.
One of the more curious taxa was humpback whale (Megaptera novaeangliae). The whale was found in six samples and four sites by two different assays on both the east and west coasts of Australia.
While this baleen whale clearly does not have a planktonic stage to its lifecycle, it does feed on krill, small fish and plankton (Johnson & Wolman, 1984), and the DNA detected could originate either from particulate matter such as faeces or shed cells, or from secondary predation where detritovores, such as krill, feed upon whale waste. M.
novaeangliae has a known near-coastal migration pattern that passes near the sampling sites where it was found and all detections occurred during the whales' regular spring migration time, and there were no detections outside of these times. We noted that one of the sites was at Rottnest Island. In a previous study, Berry et al. (2019) used the same Rottnest Island samples and the Copepod 3 assay but did not report this taxon. We re-examined the unfiltered Copepod 3 sequences from the Rottnest samples amplified in both studies to find that the whale sequences were present in Berry et al. (2019), but at an abundance too low to pass the (stringent) filtering process. Finding whale DNA in 6 of the 90 samples, at time when this taxon is known to be in the area, provides evidence for the use of plankton samples as an option to detect species much larger than zooplankton in the water column.
Another interesting result was the three-spot swimming crab (Portunus sanguinolentus). This commercially harvested sand crab was the top indicator species at North Stradbroke Island. This finding may be a reflection of the crab fisheries in nearby Moreton Bay.
However, while ALA (2017) indicates that the crab's range is across northern half of Australia, North Stradbroke was the only site we detected it. A number of other taxa were found that are of economic, conservational and ecological interest (see Table 4 for a selection).

| Richness
While there were few significant differences in richness among sites, there were some noted among the colder sites of Maria Island and Kangaroo Island, and the more northern sites. On average, Darwin was 15°C warmer than Maria Island. Given that some previous studies have identified SST as a main driver of species richness (Benedetti et al., 2021), it was expected that the differences noted in richness were reflective of this disparity in mean sea surface temperature.
However, our analysis of the relationship between richness and abiotic factors found that site was the most influential abiotic factor (Table 1). While this may be the result of correlation between site and SST, the finding is in accordance with Chaudhary et al. (2021).
The authors in that study undertook the task of reviewing research data as far back as 1955, from over 48,000 marine animal species to determine (among other findings) that, while SST is an important variable, latitude explains more variation in species richness. Here, SST was significant only for the Fish and 16S Universal assays; both of which detect predominately meroplankton.
In this study, we used a maximum of 12 samples taken across 3 years. These resulted in finding little or no seasonal changes in richness across this time. However, samples from a previous study , which used 12 samples per year from one site for 5 years, were used to determine how much sampling would be required to detect seasonality in richness for each assay ( Figure 5).
As expected, the most responsive assay was the Fish assay (5 samples from each season), this was followed by the Crustacea assay (16 samples), both of which detect predominately meroplankton.
This indicates that meroplankton are more likely to show seasonal changes in OTU numbers, as they quite often emerge in summer and only spend a portion of their life within the zooplankton community, usually as juveniles or eggs. While we acknowledge that seasonal patterns will vary in intensity depending on the sites, a similar pattern was noted off the coast of Plymouth by Highfield et al. (2010), where a morphological study examined weekly samples from 1988 to 2007 to show that the occurrence of meroplankton peaked in spring and then again in late summer.

| Assemblage
Given the climatic and spatial distinctions between the sites, it was unsurprising that we found highly significant differences in OTU assemblages between all sites and across all assays (except the Fish assay). The spatial differences noted are consistent with findings from previous zooplankton studies (e.g., Chain et al., 2016). The incorporation of the abiotic factors showed both site and SST to be important drivers for the assemblage of OTUs. However, it was the assays detecting primarily holoplankton (Copepod 3, Mollusca and Cnidaria) that produced the models most responsive to the abiotic factors. This pattern suggests that obligate zooplankton are most affected by environmental factors, at least in the short term. While the assemblages from the Fish assay did not differ between all sites, the pairwise analysis showed significant differences between the Southern and Northern sites. This is a likely consequence of fishes being larger and more mobile than species detected by the other five assays.
There were few significant seasonal differences noted in OTU assemblage across all assays. Our study used a maximum of three samples per season across 3 years (one sample per season per year).
Post hoc analysis demonstrated that this experimental design was not adequate sampling to show seasonal differences for any of the metabarcoding assays ( Table 2). Once again, we examined extended data from the Rottnest Island site  to determine how much sampling might be needed to detect seasonal changes in assemblage using each assay. In that study, assemblage differences between summer and winter in all assays could be reliably detected This study used six independent transects from each port over two seasons (May and December). This result is analogous to our data (Table 2), which shows that five samples per season will produce significant differences between summer and winter. Our data add to the growing evidence and provide important recommendations as to how future environmentally based DNA surveys might be designed and implemented into long-term monitoring programmes.
While the DISTLM results (Table 1), particularly with respect to site and SST, are not unexpected, there are limited studies linking abiotic factors to a wide variety of zooplankton (Mackas et al., 2012) and fewer still that employ a multi-assay metabarcoding approach.
Almost all studies that have related zooplankton to environmental conditions use taxonomy based on microscopy (e.g., Everaert et al., 2018;Labuce et al., 2020;Shi et al., 2020). Here, we confirm that DNA data from bulk zooplankton samples can also be used as a measure of the zooplankton community.

| Sampling design
This Australia-wide audit has implications for the design of future large-volume eDNA studies, whether they be spatial or temporal in nature. The question is how much sampling is necessary to provide an indication of zooplankton diversity at a particular location. The answer is not straightforward and varies depending on whether the aim is to examine assemblage or richness or both ( Figure 5 and Table 2). While it might be easy to advocate for many samples and replicates, the costs and logistics of sampling at sea make a more considered approach necessary. Batten et al. (2019) advocate the globally coordinated use of continuous plankton recorders to monitor plankton populations and provide data for the sustainable management of our oceans. Our position is that an environmental metagenomic approach using bulk plankton samples could provide much of this information without the need for taxonomic expertise.
Further, a valuable synergy would be to use DNA analysis of continuous plankton recorder samples, which has been shown to be an efficient, cost-efficient approach to identifying biodiversity in the epipelagic zone .
The examination of the proportion of the contribution of individual samples and sites provides a glimpse of the percentage of OTUs in a single sample that contributes to the OTU diversity found within a site (Table 3). While these results could be partly attributed to the disparity in sample number within the sites, Ningaloo had only 7 samples and was the second largest contributor of OTU diversity (28%), and Maria Island had 11 samples and was the third smallest contributor (16%). There is also an overlap of OTUs between samples, preventing simple overall conclusions.
The difference between the sample contributions of the Copepod 3 assay (33%-45%) and the Fish assay (13%-27%) also supports the results of the power analysis ( Figure 5), which showed fish needed the least number of samples and Copepod 3 the most to exhibit seasonality in the richness data. Given the results provided by this, and the post hoc analysis using the extended Rottnest data ( Figure 5 and Table 2), it is evident that continuous temporal and spatial sampling is needed to produce a thorough understanding of zooplankton communities. The variation in community composition over time at each of these sites demonstrates the limits of single-point sampling.
TA B L E 4 A selection of commercially and environmentally important taxa detected within the plankton samples. This was the top indicator species for Kangaroo Island. It is a colonial bryozoan that grows on the surface of kelp blades causing them to become brittle and break (Saunders & Metaxas, 2008). It causes defoliation and reduces survival of the host kelp, which then allows the infiltration of other opportunistic species (Saunders & Metaxas, 2008). While the sequences were robust matches to the reference sequences across two genes, many bryozoan species are not found on the GenBank database (Benson et al. (2014)-although ALA (2017)  Potentially threatened An IUCN report on this species states there is insufficient data to determine its conservation status (Pollom, 2016). It is commonly caught as by-catch from trawl fishing and sold for use in traditional Chinese medicine to replace the increasingly costly and rarer seahorse, and is thought to be susceptible to overexploitation and has a planktonic stage in its lifecycle (Vincent Amanda, 2003) Portunus sanguinolentus (three-spot swimming crab) Nth Stradbroke Is 5 of 12 Crustacea 16S Universal

Commercial
The top indicator species for North Stradbroke Island. This is a commercially harvested edible crab that is found across the Indo-Pacific region (Sumpton et al., 1989)

| CON CLUS IONS
Long-term biomonitoring of marine environments is essential to the understanding of the biotic networks they support. It is well recognized that an ecosystem-wide approach to this issue is needed, however, to attempt this using traditional morphological approach is not realistic-multi-assay DNA metabarcoding of bulk plankton samples makes this sizeable task more attainable.
The data attained here highlight the breadth of knowledge that can be achieved from extended spatial and temporal studies and illustrate the ability of multi-gene metabarcoding to reveal clear spatial differences among highly complex zooplankton communities. While the sampling regime was not conducive to detecting significant seasonal changes, the combined data from Rottnest Island  provide a benchmark for experimental design for future DNA-based monitoring programmes using plankton as a substrate.
This study provides insight into the importance of collecting and maintaining accessible data and sample archives; both biotic and abiotic data. Time-stamped data on biotic composition provide a unique resource to reveal community composition patterns and allow changes in them to be measured in light of associated abiotic factors. Broad-scale spatial and temporal studies have become critical to provide much needed evidence for the management and mitigation of the anthropogenic impacts experienced by our oceans.
Further, the increasing speed (automation) and portability of DNA sequencers make the incorporation of metagenomic methods into the marine monitoring toolkit a matter of when, not if.

ACK N O WLE D G E M ENTS
None. Linkage projects (LP160100839 and LP160101508) to explore marine metabarcoding applications.

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare that they have no competing interests.

DATA AVA I L A B I L I T Y S TAT E M E N T
The quality filtered sequence data are available on Data Dryad