Assessing the feasibility of fly-based surveillance of wildlife infectious diseases

Monitoring wildlife infectious agents requires acquiring samples suitable for analyses, which is often logistically demanding. A possible alternative to invasive or non-invasive sampling of wild-living vertebrates is the use of vertebrate material contained in invertebrates feeding on them, their feces, or their remains. Carrion flies have been shown to contain vertebrate DNA; here we investigate whether they might also be suitable for wildlife pathogen detection. We collected 498 flies in Taï National Park, Côte d’Ivoire, a tropical rainforest, and examined them for adenoviruses (family Adenoviridae), whose DNA is frequently shed in feces of local mammals. Adenoviral DNA was detected in 6/142 mammal-positive flies. Phylogenetic analyses revealed that five of these sequences were closely related to sequences obtained from local non-human primates, while the sixth sequence was closely related to a murine adenovirus. Next-generation sequencing-based DNA profiling of the meals of the respective flies identified putative hosts that were a good fit to those suggested by adenoviral sequence affinities. We conclude that, while characterizing the genetic diversity of wildlife infectious agents through fly-based monitoring may not be cost-efficient, this method could probably be used to detect the genetic material of wildlife infectious agents causing wildlife mass mortality in pristine areas.

Grubaugh and colleagues went a step further and proposed to use blood meal analysis as a tool to survey human pathogens in remote tropical locales, which they refer to as xenosurveillance 15 . Blood-sucking arthropods, however, often exhibit strong host preferences, which may be suboptimal when the objective is to survey infectious agent diversity in complex ecosystems with high biodiversity. Non-blood-sucking invertebrates feeding on vertebrate fecal matter and/or carrion, such as blow flies and flesh flies (here referred to simply as flies), might also be suitable for the surveillance of wildlife infectious agents. Flies are abundant and ubiquitous, have little host preference and are easy to trap 16,17 . We also recently showed that flies often contain DNA fragments of their mammalian hosts 16,17 . Finally, the genetic material of a number of hitchhiking microorganisms has already been detected in flies, including food-borne bacteria, e.g. Salmonella spp., and enteric viruses [18][19][20][21] . For example, Newcastle disease virus (NDV) RNA was detected in, and even virions were isolated from, flies collected in the vicinity of infected chickens. Similarly, H5N1 RNA was found in flies collected in the surroundings of a poultry farm with infected birds 22,23 .
These studies, however, focused on flies caught near high-density vertebrate populations, which raises questions about the broad applicability of this method. In this study, we investigate whether flies are suitable for the surveillance of vertebrate-infecting microorganisms in complex ecosystems with high species richness. We analyzed flies collected in a remote tropical rainforest, Taï National Park (TNP), Côte d'Ivoire, and focused on an a priori favorable target: adenoviruses (AdV; family Adenoviridae). AdV are shed massively in feces, are usually host-specific and have already been detected in many vertebrates in TNP 24,25 .

Results
Out of 498 flies, 156 (31%) contained mammalian 16S mtDNA. We considered all mammal-positive flies as suitable for AdV screening, but due to a shortage of material we could only test 142 of these flies. AdV DNA could be amplified and sequenced once from eight flies, and six of these AdV sequences were confirmed in a second amplification. A BLAST search revealed that four of the six sequences were ≥ 98% identical to a simian AdV sequence determined from a mona monkey (Cercopithecus campbelli, KP274048) in Côte d'Ivoire (Fly 92, Fly 101, Fly 740, Fly 1355) 26 . Of the remaining two sequences, that of Fly 381 showed 100% identity with simian AdV sequences obtained from captive chimpanzees in the US (FJ025905, FJ025926) and from a wild chimpanzee in TNP (JN163974), and that of Fly 1375 showed 98% identity with a murine AdV 2 sequence (NC014899) (Table 1).
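The proportions reported above follow directly from the raw counts. The short Python sketch below recomputes them and, purely as an illustration, attaches a Wilson score interval to each. The interval is an assumption added here and was not part of the original analysis; the function names are hypothetical.

```python
import math


def proportion(successes, total):
    """Plain proportion of successes among total trials."""
    return successes / total


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval. This interval is an
    illustrative addition, not a statistic reported in the study.
    """
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Counts taken from the Results: 156 of 498 flies were mammal positive,
# and 6 of the 142 mammal-positive flies tested were confirmed AdV positive.
mammal_positive = proportion(156, 498)   # about 0.313
adv_positive = proportion(6, 142)        # about 0.042

for label, k, n in [("mammal positive", 156, 498), ("AdV positive", 6, 142)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {100 * k / n:.1f}% (Wilson interval {100 * low:.1f} to {100 * high:.1f}%)")
```

With only six positives, any interval around the AdV detection rate is necessarily wide, which is worth keeping in mind when comparing it with detection rates from fecal samples.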
We also performed phylogenetic analyses in both maximum likelihood and Bayesian frameworks to better determine the position of fly-derived AdV sequences within the AdV family tree (Fig. 1). In line with the BLAST search, sequences from Fly 92, Fly 101, Fly 740 and Fly 1355 formed a well-supported clade with the mona monkey AdV sequence (aLRT 0.99, pp 1; Fig. 2C). The sequence of Fly 381 clustered with AdV sequences from captive and wild chimpanzees (FJ025905, FJ025926, JN163974, FJ025906, FJ025904, FJ0295899; aLRT 0.98, pp 1; Fig. 2B) 24,27 . These sequences nested within the clade corresponding to species Human mastadenovirus C, albeit with a much lower statistical support (HAdV-C; aLRT 0.93, pp 0.85). The sequence of Fly 1375 was most closely related to the murine AdV B (Murine AdV 2) sequence (aLRT 0.99, pp 1; Fig. 2A).
Meal analyses based on Sanger sequencing identified plausible hosts in 3 of the 6 AdV-positive flies (Fly 92, Fly 101, Fly 1355; Table 1). To further investigate potential hosts, we performed an in-depth fly meal analysis of all six AdV-positive flies using a metabarcoding approach. After quality trimming, 65,552 reads (8,683 to 12,278 reads per fly) were used for taxonomic assignment. Overall, we identified hosts from 9 mammal families and 10 genera/species. Five flies contained DNA from multiple hosts (Fly 92, Fly 101, Fly 381, Fly 1355, Fly 1375; Table 1). The 3 plausible hosts identified by Sanger sequencing were confirmed by this approach. Fly 740, which harbored one of the simian AdV sequences, only contained rodent DNA fragments. For the last two flies, the metabarcoding approach revealed the presence of DNA fragments belonging to plausible hosts, i.e. rodents in Fly 1375 and a hominid in Fly 381. As the hominid family contains the two closely related genera Homo and Pan, we manually checked the corresponding sequences and were able to refine the assignment to Pan troglodytes.

Discussion
We investigated the feasibility of using DNA derived from flies for the surveillance of wildlife infectious diseases. We were able to detect short AdV sequences in 6 flies, that is 4.2% of the 142 mammal-positive flies tested. We used these sequences for phylogenetic analyses and found that most represented AdVs known to infect monkeys and great apes in the region 24,25 . The close relationship of four sequences with an AdV sequence obtained from a single mona monkey supports the notion that this AdV may be relatively abundant in the region 26 . The fifth fly-derived simian AdV sequence clustered with HAdV-C sequences and clearly belonged to the chimpanzee clade. HAdV-C viruses are very host-specific and seem to have co-diverged with their hominid hosts 25 . We also detected what is likely a new rodent AdV, thereby underlining the potential of flies to also monitor small-bodied species. Finally, our high-throughput fly meal analyses identified multiple hosts, including plausible ones, in 5 of 6 AdV-positive flies. These results demonstrate that fly-based analyses allow for the simultaneous characterization of microorganism genetic diversity and their distribution in local mammalian hosts.
In comparison with detection rates in fecal samples (11 to 58%), the AdV detection rate in flies appears low 25,28 . This might result from the extreme dilution of vertebrate-infecting microorganisms in carrion flies, which itself results from the interplay of meal quantity, quality, frequency and the speed of digestive processes 29 . Given this low detection rate, systematic screenings would probably only make sense where fly collections established for other purposes, e.g. mammal diversity assessment, are available. Sample pooling combined with deep sequencing of PCR products may help decrease the workload and costs of such a screening approach.
Further investigations are needed to determine the extent to which the approach described here is applicable to other microorganisms. The low detection rate of AdV sequences in flies suggests that surveillance of non-enteric microorganisms might be complicated. However, in the case of outbreaks with massive production of microorganisms, e.g. Ebola virus outbreaks 30 , there might be a good chance that pathogen nucleic acids are detectable in flies. Of course, the detection probability will depend on the biology of the microorganism of interest. Here also, high throughput sequencing approaches (including shotgun sequencing) could open up new perspectives, as recently shown with mosquitoes 13,14 . It was recently demonstrated that portable sequencing devices such as the MinION (Oxford Nanopore Technologies, Oxford, United Kingdom) can be used for on-site sequencing in outbreak situations 31,32 . These technologies only require a basic molecular laboratory in the field. Such laboratories currently allow users to perform sequencing, though the high error rate and relatively low throughput of MinIONs currently limit them to amplicon sequencing based approaches. The limited needs in the present study suggest it could be feasible to conduct fly/amplicon-based wildlife surveillance during major outbreaks, including in resource-poor countries.
Scientific Reports | 6:37952 | DOI: 10.1038/srep37952
Invertebrates other than carrion flies and blood-sucking arthropods might also constitute a valuable source of information on vertebrate-infecting microorganisms. For example, leeches can ingest several times their weight in host blood in a single meal and could therefore be seen as long-term blood tanks. Most recently, a number of viruses (with DNA or RNA genomes) were shown to persist up to four months in experimentally fed aquatic leeches, with bovine parvovirus being detectable for up to six months 33 . Terrestrial leeches, whose lifestyle might be more compatible with broad, undirected wildlife molecular epidemiology, were also recently shown to allow retrieval of their host's DNA 34 . Both aquatic and terrestrial leeches warrant a careful examination of their potential as tools for wildlife microorganism sampling.
Finally, an alternative to microorganism nucleic acid detection might be the detection of antibodies reactive to these microorganisms. If feasible, this would make it possible to examine wildlife exposure to microorganisms. Detection of trypanosome-reactive antibodies from a number of haematophagous dipterans was reported as early as 1962 35 . This potential tool was then largely forgotten until its recent rediscovery by Barbazan and colleagues, who showed that blood-fed mosquitoes contain detectable levels of various virus-reactive antibodies 36 . Determining whether invertebrate-based serological surveys can be conducted in the wild promises to be an exciting area of future research.

Material and Methods
Sample collection. Sample collection was performed with the permission of the Ivorian national parks authorities (OIPR) and the ministry of research of Côte d'Ivoire. Flies used in this study were captured in Taï National Park, Côte d'Ivoire, a tropical rainforest with remarkable mammal biodiversity. Overall, 498 flies were captured using customized fly traps consisting of a pyramidal mosquito net over a plastic bowl containing a commercial bait (Unkonventionelle Produkte Feldner, Waldsee, Germany) or a piece of meat 16 . After collection, flies were either placed in Cryotubes (Thermo Fisher, Waltham, MA, USA) and stored in liquid nitrogen tanks, or placed in 50 ml Falcon tubes (Carl Roth, Karlsruhe, Germany) containing silica and stored either at ambient temperature or at 4 °C.  39 . Sequences were assigned to species or higher taxa using BLAST 40 and following the rationale described in Calvignac-Spencer and colleagues' study 16 . Most of these assignments were made in the course of the study of Schubert & Stockhausen et al. 2014. Flies that produced a band of the expected size but did not yield interpretable sequences were also used for AdV screening.

Nucleic acid extraction.
Adenovirus screening. We implemented various countermeasures to minimize contamination. To avoid cross-contamination with native AdV DNA, DNA extraction was never performed simultaneously with that of other sample types (fecal and tissue samples). To minimize contamination with PCR products, PCR setup and post-PCR analysis steps were performed in separate, dedicated rooms. In addition, a glovebox exclusively dedicated to fly analysis was used to set up all PCRs performed for this study. We also used dUTP instead of dTTP and treated our confirmatory reactions with uracil N-glycosylase (UNG) to further reduce the likelihood of carry-over contamination with PCR products (see below). It should also be noted that before this study, AdV sequences had never been amplified in the laboratory where AdV screening was performed. A semi-nested PCR system described by Pauly et al. was used for detection of adenoviruses 26 . Primers had been designed for the generic detection of mastadenoviruses and targeted a short 160 bp fragment of the hexon gene (6500s 5′-CgCAgTggKCNTWCATgCACAT-3′, 6500s 5′-ACCCACgAYgTSACNACNgA-3′, 6500as 5′-gTgCCggTgTANggYTTRAA-3′). All PCRs were carried out in a 25 μL mix containing 0.2 mM dNTP (with dUTP replacing dTTP), 4 mM MgCl2, 0.2 μM of each primer, 1.25 U Platinum® Taq Polymerase (Invitrogen) and 2.5 μL 10x PCR Buffer (Invitrogen). Reactions of the first round were seeded with 200 ng DNA extract, or with 5 μL if the DNA concentration was below 40 ng/μL, and those of the second round with 1.5 μL of 1:40 diluted PCR product of the first round. Cycling conditions were: 95 °C 5 min, 40 cycles [95 °C 30 sec, 56 °C 30 sec, 72 °C 60 sec], and 72 °C 10 min. PCR products were Sanger sequenced as described above. Flies apparently containing amplifiable AdV nucleic acids were confirmed using the same assay but including 0.3 U AmpErase® UNG (Invitrogen) in the first round reaction so as to minimize the risk of contamination with PCR products. Again, PCR products that yielded a band were sequenced using the Sanger method. All chromatograms were evaluated using Geneious Pro v9.1.3 39 and the respective sequences were confirmed to be adenoviral by a BLAST search 40 .
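The template input rule for the first PCR round (200 ng of DNA, capped at 5 μL for extracts below 40 ng/μL) can be written out as a small calculation. The Python sketch below is only an illustration of that rule; the function name and its default parameters are hypothetical and simply restate the values given in the text.

```python
def template_volume_ul(conc_ng_per_ul, target_ng=200.0, max_volume_ul=5.0):
    """Volume of DNA extract (in microlitres) to seed a first-round reaction.

    The rule stated in the text is: use 200 ng of DNA, or 5 uL of extract if
    the concentration is below 40 ng/uL. The 40 ng/uL threshold is exactly
    target_ng / max_volume_ul (200 / 5), so it is derived rather than hard-coded.
    """
    if conc_ng_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    threshold = target_ng / max_volume_ul  # 40 ng/uL with the default values
    if conc_ng_per_ul < threshold:
        # Dilute extract: cap the input at the maximum volume (less than 200 ng).
        return max_volume_ul
    return target_ng / conc_ng_per_ul


# Hypothetical extract concentrations, for illustration only.
for conc in (10.0, 40.0, 100.0):
    vol = template_volume_ul(conc)
    print(f"{conc:6.1f} ng/uL -> {vol:.2f} uL ({conc * vol:.0f} ng input)")
```

The two branches meet at 40 ng/μL, where 5 μL contains exactly 200 ng, so the rule is continuous across the threshold.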
Phylogenetic analysis. The dataset used for phylogenetic analysis comprised AdV sequences generated in this study (n = 6) and hexon gene sequences extracted from all available complete genomes of the genera Mastadenovirus and Atadenovirus (n = 506). This set of sequences was reduced to only contain unique sequences using FaBox v1.41 41 . All remaining 363 sequences were aligned at the nucleotide level in SeaView v4 42 , using the MUSCLE algorithm 43 . Conserved blocks were selected using Gblocks 44 (as implemented in SeaView v4), resulting in an alignment of 370 positions. After block selection, sequences were de-replicated again using FaBox v1.41 (218 unique sequences). The best-fit model of nucleotide substitution was selected using jModelTest v2.14 and the Bayesian information criterion 45 (SYM + I + G). Maximum likelihood (ML) as well as Bayesian frameworks were used for tree reconstruction. The ML tree was reconstructed using PhyML v3.0 46 . Branch support was estimated using SH-like approximate likelihood-ratio tests. The Bayesian phylogeny was estimated using BEAST v1. 8 Carlo analyses were run; convergence and effective sample sizes were checked in Tracer v1.6 (combined effective sample size was > 200). Tree files of all runs were combined using LogCombiner v1.8.2 and the maximum clade credibility tree was extracted using TreeAnnotator v1.8.2.
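The de-replication step mentioned above (reducing a sequence set to unique sequences) amounts to grouping records by identical sequence. The Python sketch below illustrates the idea only; it is not FaBox's implementation, and the function name and example records are hypothetical.

```python
def dereplicate(records):
    """Collapse identical sequences.

    records: iterable of (name, sequence) pairs.
    Returns a dict mapping each unique sequence (upper-cased) to the list of
    record names sharing it, in order of first appearance.
    Assumption: two sequences count as identical only if they match
    character for character after upper-casing (gaps included).
    """
    unique = {}
    for name, seq in records:
        key = seq.upper()
        unique.setdefault(key, []).append(name)
    return unique


# Hypothetical toy records, for illustration only.
toy = [
    ("seq_a", "ACGTAC"),
    ("seq_b", "acgtac"),   # same sequence as seq_a, different case
    ("seq_c", "ACGTAT"),
]
groups = dereplicate(toy)
print(len(groups), "unique sequences")
for seq, names in groups.items():
    print(seq, names)
```

Running de-replication a second time after block selection, as described above, makes sense because trimming an alignment to conserved blocks can make previously distinct sequences identical.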
Identification of fly meals. We performed an in-depth meal analysis of the 6 AdV-positive flies using a metabarcoding approach. Primary 16S amplicons were generated with the same primers and under the same conditions as mentioned above 16,37 . Preparation of the generated amplicons for the Illumina MiSeq (San Diego, CA, US) sequencing platform included a first PCR in which Illumina-specific overhang adapters were added to the fragment and a second PCR in which sequencing adapters and sample-specific indexes were added. The first PCR reaction contained 25 μL of 1:50 diluted PCR product, 0. . Raw reads were analyzed using a custom bioinformatic pipeline: paired-end reads were first merged with the program illuminapairedend of the software package OBITools v1.1.18, setting the minimum alignment score to 40. Primer sequences were then removed using the program Cutadapt v1.2.1 48,49 before quality trimming was conducted with the program Trimmomatic v0.35 50 , setting the quality score to 30 over a sliding window of four bases. We then de-replicated identical sequences and filtered out those that occurred fewer than 10 times using the obiuniq and obigrep commands of OBITools. For taxonomic assignment, a reference database was built by performing an in silico PCR on all mammalian and vertebrate sequences available at GenBank (http://www.ncbi.nlm.nih.gov/genbank/) using the program ecoPCR v0.2 51,52 . This database contained not only the reference sequences themselves, but also a unique taxid that links each sequence to a taxonomy database where the taxonomic information was stored. For the assignment itself we used the ecotag command of OBITools. Ecotag uses the Needleman-Wunsch global alignment algorithm to find the sequence in the reference database that is most similar to the query sequence, with a minimum identity level of 0.95 (primary reference sequence). The query sequence is then assigned to the most recent common ancestor of the primary reference sequence and the reference sequence most similar to the primary reference (secondary reference sequence).
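The final assignment step described above, taking the most recent common ancestor of two reference sequences, can be illustrated with lineages written as root-to-tip lists. The Python sketch below is a simplified illustration and not ecotag's actual code; the function name and the two example lineages are hypothetical.

```python
def most_recent_common_ancestor(lineage_a, lineage_b):
    """Return the deepest taxon shared by two root-to-tip lineages.

    Each lineage is a list of taxon names ordered from the root to the most
    specific rank. Returns None if the lineages share no taxon at all.
    """
    common = None
    for taxon_a, taxon_b in zip(lineage_a, lineage_b):
        if taxon_a != taxon_b:
            break
        common = taxon_a
    return common


# Hypothetical lineages for a primary and a secondary reference sequence.
primary = ["Mammalia", "Primates", "Hominidae", "Pan", "Pan troglodytes"]
secondary = ["Mammalia", "Primates", "Hominidae", "Homo", "Homo sapiens"]

print(most_recent_common_ancestor(primary, secondary))
```

In this toy case the assignment stops at the family level, which mirrors why the hominid reads from Fly 381 had to be checked manually before they could be attributed to Pan troglodytes.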