Environmental DNA metabarcoding of wild flowers reveals diverse communities of terrestrial arthropods

Abstract Terrestrial arthropods comprise the most species‐rich communities on Earth, and grassland flowers provide resources for hundreds of thousands of arthropod species. Diverse grassland ecosystems worldwide are threatened by various types of environmental change, which has led to decline in arthropod diversity. At the same time, monitoring grassland arthropod diversity is time‐consuming and strictly dependent on declining taxonomic expertise. Environmental DNA (eDNA) metabarcoding of complex samples has demonstrated that information on species compositions can be efficiently and non‐invasively obtained. Here, we test the potential of wild flowers as a novel source of arthropod eDNA. We performed eDNA metabarcoding of flowers from several different plant species using two sets of generic primers, targeting the mitochondrial genes 16S rRNA and COI. Our results show that terrestrial arthropod species leave traces of DNA on the flowers that they interact with. We obtained eDNA from at least 135 arthropod species in 67 families and 14 orders, together representing diverse ecological groups including pollinators, parasitoids, gall inducers, predators, and phytophagous species. Arthropod communities clustered together according to plant species. Our data also indicate that this experiment was not exhaustive, and that an even higher arthropod richness could be obtained using this eDNA approach. Overall, our results demonstrate that it is possible to obtain information on diverse communities of insects and other terrestrial arthropods from eDNA metabarcoding of wild flowers. This novel source of eDNA represents a vast potential for addressing fundamental research questions in ecology, obtaining data on cryptic and unknown species of plant‐associated arthropods, as well as applied research on pest management or conservation of endangered species such as wild pollinators.

a novel source of arthropod eDNA. We performed eDNA metabarcoding of flowers from several different plant species using two sets of generic primers, targeting the mitochondrial genes 16S rRNA and COI. Our results show that terrestrial arthropod species leave traces of DNA on the flowers that they interact with. We obtained eDNA from at least 135 arthropod species in 67 families and 14 orders, together representing diverse ecological groups including pollinators, parasitoids, gall inducers, predators, and phytophagous species. Arthropod communities clustered together according to plant species. Our data also indicate that this experiment was not exhaustive, and that an even higher arthropod richness could be obtained using this eDNA approach. Overall, our results demonstrate that it is possible to obtain information on diverse communities of insects and other terrestrial arthropods from eDNA metabarcoding of wild flowers. This novel source of eDNA represents a vast potential for addressing fundamental research questions in ecology, obtaining data on cryptic and unknown species of plant-associated arthropods, as well as applied research on pest management or conservation of endangered species such as wild pollinators.

K E Y W O R D S
arthropods, eDNA, environmental DNA metabarcoding, flowers, grassland, insects, pollinators

| INTRODUC TI ON
Terrestrial arthropods are experiencing massive decline in Europe as well as globally (Collen, Böhm, Kemp, & Baillie, 2012;Dirzo et al., 2014;Nieto et al., 2014;van Swaay et al., 2010), although only a fraction of the species have been assessed and the majority of insects are still undescribed to science (Stork, 2018). As one example, grassland ecosystems are home to diverse taxonomic and functional groups of terrestrial arthropods, such as pollinators, phytophagous insects, and predators, that use nectar and pollen for food sources, and stem and leaf tissue for food and development. These communities harbor endangered species, since many habitats have disappeared or are under significant threat (Habel et al., 2013;Joern & Laws, 2013). Therefore, extensive efforts are being conducted in order to restore European grassland ecosystems and conserve biodiversity (Silva et al., 2008). For instance, pollinators like bees and butterflies represent an important ecological group that has undergone severe decline in Europe, indicating a dramatic loss of grassland biodiversity (Biesmeijer et al., 2006;Goulson, Nicholls, Botías, & Rotheray, 2015;Potts et al., 2010;van Swaay et al., 2013). The vast majority of flowering plants are pollinated by insects and other animals both in temperate regions and the tropics (Ollerton, Winfree, & Tarrant, 2011).
The majority of insect species are herbivores feeding on different parts of plants, and most of these are specialists, relying on one or a few plant species as their main food resource (Price, Denno, Eubanks, Finke, & Kaplan, 2011). However, given the gap in knowledge on existing insect species, and the fact that most species are still undescribed, it is clear that for the majority of plant species in the world, we have only a vague idea about the arthropod communities that they harbor and interact with.
Given the recent success of eDNA metabarcoding for several different complex sample types, we argue that DNA traces may be more frequent in the environment than one would immediately imagine. Here, we propose the hypothesis that arthropods leave DNA traces on flowers after interaction, and we test the extent to which this source of arthropod eDNA can provide useful information on local species occurrences and communities ( Figure 1).  Table S1). The large majority of samples were F I G U R E 1 The longhorn beetle Leptura quadrifasciata-an example of a flower-visiting insect found in this study. We show that eDNA from arthropods are deposited on flowers after interactions. Photo: Ole Martin collected at Vestamager. This locality is old seabed contained in the 1940s and consists mainly of beach meadows, grassland, and young woodland composed of deciduous trees-mainly birch and willow.

| Study sites
The area is approximately 2,000 ha and is the southwestern part of the island Amager east of Zealand ( Figure 2). Flowers of Solidago canadensis were collected at Kristiansminde. This site is a grassland with occurrences of deciduous trees, and surrounded by patches of forest and farmland.  Table S1) were collected. For Centaurea and Daucus, samples were collected both in discrete areas as well as with the two species interspersed in a transect with 10 m distance between each flower (Figure 2c). Before collection, the flowers were thoroughly inspected to ensure that they did not contain any visible animals. Flowers were collected in sterile plastic tubes (5, 15 or 50 ml) using single-use sterile nitrile gloves. All flowers were kept in a box with ice blocks immediately after sampling and stored at −20°C after return from the field (max. 5 hr after sampling). They were kept at −20°C until DNA extraction.

| DNA extraction
DNA extractions were performed in the laboratories at Centre for GeoGenetics, University of Copenhagen, which are dedicated labs for working with samples of low DNA concentration. Regular decontamination routines are in place, including UV-light, and preand post-PCR work is separated. DNA was extracted using Qiagen DNeasy® Blood & Tissue Kit. Lysis was performed in the plastic tubes containing the flowers by adding 540, 900 or 1,800 μL ATL lysis buffer and 60, 100 or 200 μL proteinase K, respectively, depending on the size of the sample (Table S1). Samples were lysed at 56°C with agitation in a rotor for 3 hr. After lysis, samples were mixed on a vortexer for 10 s and a total of 500, 800, or 1,500 μL lysis mixture were retrieved, respectively. Equal amounts of AL buffer and absolute ethanol, corresponding to the volume of retrieved lysis mixture, were added to the tubes and vortexed thoroughly before the samples were applied to spin columns and spun through the membrane filters over several rounds (700 μL per round). Columns were washed by adding first 600 μL AW1 and then 600 μL AW2 buffers. Finally, DNA was eluted in 2 × 60 μL AE buffer, each time with a 15-min incubation step at 37°C before spinning. All spinning steps were performed at 10,000 g. DNA extracts were stored at −20°C.

| PCR amplification
For DNA metabarcoding, we used two different primer sets tar-  Figure S1). Primers were uniquely tagged. Tags were designed using the OligoTag program (Coissac, 2012), and consisted of six nucleotides with a distance of at least three bases between any two tags. Tags were preceded by two or three random bases; NNN or NN (De Barba et al., 2014), and identical tags were used on the forward and reverse primers for each sample.
PCR reactions were carried out in four replicates per sample, Annealing temperature are according to Alberdi, Aizpurua, Gilbert, and Bohmann (2018), and we performed 55 cycles with the initial expectation that arthropod eDNA concentration was low in flower samples.
Fragment sizes were verified on 2% agarose gel stained with GelRed TM . Approximately half of the PCR products were verified.
For each of the two primer sets, PCR products were mixed in four pools each containing one PCR replicate of each sample (5 µl per replicate), such that the same tag was added only once to each pool.
The pools were purified using Qiagen's MinElute PCR purification kit.

| Library building and nextgeneration sequencing
Library building was performed on the purified pools of PCR products using the TruSeq DNA PCR-free LT Sample Prep kit (Illumina).
A total of eight libraries (corresponding to the four PCR replicates from each sample for each of the two primer sets) were constructed (Supporting Information Figure S1). Each pool thus included one replicate of every sample, four DNA extraction blanks, two PCR blanks, and two mock sample replicates.
The manufacturer's protocol was followed with the exception that samples were incubated with the elution buffer over two

| High-throughput sequencing data analyses
Supporting Information Figure S1 shows an overview of the workflow. After de-multiplexing with a custom python script, Illumina sequences were analyzed using DADA2 (Callahan et al., 2016), in order to clean the data from errors generated during PCR and se-

| Accumulation analyses
Species accumulation curves for replicate flower samples and PCR replicates were performed using the function specaccum from the R package vegan v. 2.4-6 (Oksanen et al., 2018). The "exact" species accumulation method was used, which finds the mean species richness across sites/replicates.

| Rarefaction analyses
Rarefaction curves for the four individual PCR replicates of each sample based on number of taxa as a function of sequencing depth were performed using the function rarecurve from vegan v. 2.4-6.

| Differentiation analyses
In order to investigate how well the arthropod communities were differentiated according to plant species, we performed a number of analyses. A redundancy analysis (RDA) of arthropod communities in the different flower samples was performed, using the rda function in vegan, with latitude and longitude of the sample sites as conditioning variables. Additionally, we made a bipartite diagram showing the links between plants and arthropods found in this study, using the R package bipartite (Dormann, Gruber, & Fruend, 2008). Finally, heatmap cluster analyses of the arthropod communities in the different plant species were performed using the R package pheatmaps (Kolde, 2018). Clustering was set to the average-linkage method and was done using Raup-Crick distances calculated with the vegdist function of the vegan R package (Oksanen et al., 2018). Distances were transformed with cube transformation (n 1/3 ) to obtain an appropriate scale for the figure.

| Faunistic data
We investigated how well the species obtained using eDNA cor- analyses due to a rich record of occurrence data at this site. Only species for which there was at least one record in Denmark were included in the analyses.  (Table S5).
No PCR or extraction blanks gave visible bands on the initial gel images. They were sequenced nonetheless, and yielded 1694 and 445,277 reads (COI and 16S, respectively) from the extraction controls in the final data. These reads were all from human (16S) and Cecidomyiidae spp. (COI). The latter sequence was also found in the eDNA COI data (Cecidomyiidae sp.5), but given the comparatively low read number in the blank and since the sequence only occurred in one of four extraction blanks, this was considered an accidental and rare carryover in the extraction process, and not a general contamination. The PCR blanks gave no reads in the final data.

| Arthropod eDNA diversity recovered
Overall, our study uncovered eDNA from several taxonomic and functional groups of arthropods.

| Pollinators
The red-tailed bumblebee (Bombus lapidarius) is common in the study area, where it frequently visits flowers. The species was observed on Centaurea flowers, which was also the species from where

| Predators
The ground beetles Amara spp. are often known to visit flowers for feeding.  (Koch, 2003), were detected on the same sample as eDNA from Aphis sp. were detected (Poll_10). We also detected eDNA from a number of spiders in the families Linyphiidae, Miturgidae, and Anyphaenidae.

| Gall inducers
Gall midges (Cecidomyiidae) were very abundant in the eDNA data.
This family is very diverse, and several species could only be identified to Cecidomyiidae spp., possibly due to incomplete coverage in the database, and the fact that this family may be extraordinarily diverse based on molecular data from the DNA barcoding region (Hebert et al., 2016).

F I G U R E 3
Photos of arthropod families found with eDNA on wild flowers in this study. A representative for each family is shown except the aquatic families Veliidae, Asellidae, and Polyphemidae. *The taxon found in the study is different from the one in the example photo, see Supporting Information Table S2

| Saturation and rarefaction analyses
Accumulation curves for flower sample replicates of individual plant species indicated that greater sampling effort would probably increase recovered diversity (Figure 4). Similarly, there was generally a stepwise increase in number of taxa with the four PCR replicates (Supporting Information Figures S4-S5), indicating that more replicates would increase recovered richness. Rarefaction curves showed that the individual PCR replicates were sufficiently sequenced (Supporting Information Figures S6-S7).

| Differences between the 16S and COI gene
Higher diversity was obtained for the COI gene than for the 16S gene (Supporting Information Table S2, Figures S2-S3). In fact, only seven unique families were obtained with 16S, and only 11 of the 67 families were recovered with both genes (Supporting Information Figure   S2). Nevertheless, the two genes together covered a greater part of the arthropod diversity in the sampled flowers than each of them did in isolation. As an example, bees (Apidae) were only detected with 16S. In the mock sample, two of five species were recovered in COI, while were four of five were recovered for 16S (Table S5). Due to the incomplete 16S dataset, the subsequent analyses of the arthropod communities were performed on the COI dataset.

| Differentiation of arthropod communities by plant species
Results from the cluster analyses and RDA show that the communities of arthropods obtained from eDNA were somewhat, although not perfectly, segregated by plant species (Figure 5, Supporting Information Figure S8). The bipartite plot indicates that flower samples with large surface area such as Apiaceae umbels (Angelica and Daucus) contained the highest diversity, whereas samples collected as single flowers (Echium) had the lowest diversity ( Figure 6). The arthropod families found in the most plants seemed to be abundant groups such as aphids, thrips, plant bugs, gall midges, and sap beetles (Nitidulidae).

| Comparison of eDNA and faunistic records
Of the 135 taxa recovered from eDNA metabarcoding (COI and 16S combined), 93 taxa fulfilled the criteria that they (a) were obtained from Vestamager, (b) could be resolved to species (and in a few cases Genus) level, and (c) had at least one record in Naturbasen. Of these, 55 taxa (59%) have been recorded at the same sampling locality, and 68 taxa (73%) have been recorded in the island of Amager (Table   S2). Two species obtained by eDNA are not previously recorded in Denmark, and for these, it was checked that reference sequences for all known Danish species in the particular genera were found in BOLD.

| D ISCUSS I ON
Terrestrial arthropods represent the majority of life on Earth (Mayhew, 2007;Stork, 2018), and many of them form complex associations with plants (Price et al., 2011). Numerous global research projects are providing new knowledge on these relationships (Bruce, 2015), and even the relatively species-poor communities of Europe harbor thousands of species. The monitoring of species compositions and associations in a wild grassland habitat can thus be very demanding to study, and the challenge is exacerbated by the decline in taxonomic expertise. This necessitates the search for new methods to gain insights to the biological associations between arthropods and plants in terrestrial habitats.
The current study demonstrates that sequencing of eDNA from flowers can be a useful supplement to scientific experiments on terrestrial arthropods using traditional trapping methods, as well as for general biodiversity assessments. Our study uncovered eDNA from arthropods across many different taxonomic and functional groups. We obtained eDNA from pollinators (e.g., bees, butterflies, and hoverflies), predators (e.g., spiders and harvestmen), gall inducers (e.g., gall midges), parasitoids (e.g., braconid and ichneumonid wasps), and several phytophagous insects (e.g., weevils, true bugs, thrips etc.). Additionally, some species were most likely infrequent visitors (e.g., the mayfly Cloeon dipterum, the isopod Philoscia muscorum, and the earwig Forficula auricularia, although the latter is often found in flowers and on leaves). Such a non-invasive approach could become useful for better estimation of species compositions and distributions, long-term changes in abundance (Hallmann et al., 2017;Shortall et al., 2009), monitoring of endangered or invasive species, and for studies of insect fauna under environmental change (e.g., . Also, the approach could be beneficial for documenting currently unknown plant-insect interactions for rare, cryptic or even undescribed insect species and for agricultural pest management.
In the following, we discuss our results with increased focus on the limitations and future improvements of the current approach, which are essential to consider before it can be implemented to reach the above-mentioned perspectives.

| Differentiated arthropod communities
The arthropod communities recovered from eDNA clustered somewhat according to the plant species on which they were obtained.
However, the separation was far from perfect as no plant species came out as a single monophyletic group of samples (Supporting Information Figure S8). The best clustering was seen for Tanacetum, Echium and Angelica, where all but one sample clustered together.
Results from the RDA also showed some separation of communities according to the plant species, but again there were areas of overlap ( Figure 5).

| Comparison of eDNA metabarcoding with faunistic records
Generally, we found very good concordance between faunistic records of occurrence and eDNA metabarcoding results. In total, 59% of the species obtained from eDNA in Vestamager are known to occur at the site. Considering that public citizen science data are far from complete, this is rather impressive. Thus, it is most likely that the incongruence between the two sources of occurrence data is partly due to incomplete investigations in the study area. Intriguingly, we found eDNA from the braconid wasp Praon sp. (100% match to both P. longicorne and P. volucre). Praon spp. are parasitoids on aphids, which were also found by eDNA in the study.
Environmental DNA from Praon sp. was obtained in large read numbers from sample Poll_49, which also yielded many eDNA reads from the aphid Hyalopterus pruni-a known host species of P. volucre (Kavallieratos et al., 2005). This indicates that the current approach can potentially also be used to infer links between insects and their unknown host species.

| Perspectives for pollination studies
The majority of insect pollination studies focus on bees, butterflies, and hoverflies (Ollerton, 2017). However, moths and flies are likely very underrepresented in pollination analyses. For example, moths are, by a large margin, the most diverse group of pollinators due to their specialized mouthparts (Ollerton, 2017;Wardhaugh, 2015), and the importance of non-syrphid flies as pollinators is generally neglected (Orford, Vaughan, & Memmott, 2015). In crop pollination, the importance of non-bees has been demonstrated and might even be more robust to changes in land use (Rader et al., 2016). A notable finding in our study is the diversity of families obtained from various insect groups ( Figure 3, Table 1). In fact, the highest diversity (in families and species) was obtained from Diptera and been shown that DNA can be obtained from empty leaf mines, although these must be assumed to contain little DNA (Derocles et al., 2015). However, these cases are considered the exception, and at least for the larger species, the eDNA must be assumed to originate from sloughed cells or fecal pellets left on the flowers. For spiders, eDNA may have originated from webs (Blake et al., 2016;Xu et al., 2015). Finally, arthropods visiting the flowers could potentially be carrying eDNA from other arthropod species originating from previous flower visits, which could be deposited on the flowers during subsequent visits. We assume that this is very infrequent compared to the source of eDNA deposited directly on the sampled flowers by the visiting species.

| Choice of primers and database coverage in metabarcoding studies
The generic primers used in this study have been designed for metabarcoding of degraded DNA, and tested previously (Elbrecht et al., 2016;Zeale et al., 2011). Given the short amplicon size, they perform comparatively well by resolving most taxa to species level (Tables   S2-S3). However, an inadequacy of the current approach is that the targeted region (for 16S) cannot resolve more groups to species level due to its low interspecific variation and incompleteness of the reference database compared with COI. Although new probabilistic methods for taxonomic assignments using 16S could improve the approach in future studies (Somervuo, Koskela, Pennanen, Henrik Nilsson, & Ovaskainen, 2016;Somervuo et al., 2017), identification to the species level is generally necessary for inferring relevant biological information on plant associations. Nevertheless, an impressive diversity of arthropods was still detected in the study. We chose two sets of primers in this study for increased taxonomic coverage (Alberdi et al., 2018) and because the two genes offer different advantages. Ribosomal genes are superior in metabarcoding studies due to the unbiased amplification of taxa within the target group (Deagle, Jarman, Coissac, Pompanon, & Taberlet, 2014), while increased taxonomic resolution is possible for the COI gene due to the extensive reference database. Meanwhile, the COI gene will most likely provide a biased representation of an eDNA sample, which was evident from the results of the mock sample that gave a much poorer representation of the actual sample content (Table S5). Despite the more comprehensive coverage of arthropod diversity with the COI gene in this study, it is evident from accumulation analyses that even for COI, higher diversity can be obtained from flower samples than what we recovered here ( Figure 4). Higher sequencing depth might saturate the number of taxa recovered, but also more PCR replicates on the same samples would likely increase the recovered diversity (Supporting Information Figures S4-S5). Some aquatic eDNA metabarcoding studies suggests running 12 PCR replicates of each sample in order to capture the rare sequences , while a metabarcoding study on soil fungi argues that higher sequencing depth is more important for describing diversity than PCR replication (Smith & Peay, 2014).
The lack of database coverage for arthropods is a serious impediment for biodiversity studies using eDNA. The taxonomic identification of taxa is no better than the reference database used. Similar to other metabarcoding studies, the imperfect taxonomic identification of this study is thus due to the fact that (a) only an estimated 10% of arthropod species are described by science (less relevant for the study area in Denmark compared to other parts of the world), (b) not all described species have a DNA reference sequence for any of the two particular target fragments, and finally (c) the primers used here cannot positively discriminate all species, as mentioned above.
The database issue is however continuously being abated as more reference sequences are generated.

| Future focus and validation
While the current flower-arthropod eDNA approach provides great perspectives for both fundamental and applied research on plant-associated arthropod communities, a range of uncertainties should be addressed to fully validate the perspectives. In the following, we thus suggest several experiments for future studies in order to obtain better insights into the nature of arthropod eDNA on flowers. Firstly, a comparative study of actual trapping experiments in parallel with eDNA sampling would further validate the correspondence between eDNA reads and insect richness and abundance from traps. In the current study, we indirectly validated our results through Danish citizen science occurrence data (naturbasen.dk), but these data are incomplete and do not reflect the actual occurrence or abundance of arthropods in the particular time of sampling. Especially, the quantitative aspect of abundance would be relevant to investigate, as eDNA from other sources such as soil and water samples suggest that relative abundance estimates are to some degree possible (Andersen et al., 2012;Thomsen et al., 2012;Thomsen, Møller, et al., 2016). Also, the degradation of eDNA on the flowers should be investigated. The seasonal and diurnal variation of eDNA on flowers, and how this relates to actual visitation by insects, which is indirectly related to the degradation time, should also be explored. Seasonal signals have been observed in studies of aquatic eDNA from nonbiting midges (Chironomidae) (Bista et al., 2017), as well as from marine fishes (Sigsgaard et al., 2017;Stoeckle, Soboleva, & Charlop-Powers, 2017). Another question to test through experiments would be to investigate to what extent certain insects leave more eDNA on the flowers than others, and whether this is associated with more frequent and/or longer visits to the flowers. Additionally, the origin of arthropod DNA on the flowers would be very relevant to explore. For example, do insects carry eDNA from other insects with them between flowers (as mentioned above)? If this is the case, it would lead to eDNA detection of insects that had no contact with the flower, and could lead to false positive results on insect-plant associations and occurrences. Furthermore, a general improvement of methodological approaches would be relevant. It has been shown that complete mitochondrial genomes can be extracted from water samples (Deiner et al., 2017)-such an approach would greatly improve the taxonomic identification and subsequent ecological inferences in studies such as this one (Crampton-Platt, Yu, Zhou, & Vogler, 2016). We believe the DNA extraction procedure on arthropod eDNA from flowers could also be made more efficient through comparative experiments, which has been made for eDNA in water samples (Deiner, Walser, Mächler, & Altermatt, 2015).
Finally, we encourage a general replication of the approach, which should also include other types of habitats and other families of plants, in order to test the reproducibility of the method.

ACK N OWLED G M ENTS
We thank Tobias G. Frøslev for developing bioinformatic analyses tools based on the DADA2 pipeline and for use of tagged primers.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
PFT conceived the idea and designed the methodology; PFT collected the samples; PFT and EES performed the laboratory work; PFT and EES analyzed the data; PFT and EES wrote the manuscript.

DATA ACCE SS I B I LIT Y
Illumina NextSeq raw sequence data are available from the Dryad Digital Repository (https://doi.org/10.5061/dryad.2j151bd).