Non‐destructive DNA extractions from fly larvae (Diptera: Muscidae) enable molecular identification of species and enhance morphological features

Insects preserved as reference specimens are important in a wide range of fields, including health, pest management and forensics. The aim of the present study was to test a non‐destructive DNA extraction method on samples of soft‐bodied insects, fly larvae, which are otherwise hard to identify morphologically. This not only provides DNA enabling molecular identification but also retains morphological reference specimens for samples belonging to collections and museums that cannot be destroyed. In this work, fly larvae identified as belonging to the family Muscidae were non‐destructively processed. DNA barcoding molecular identification allowed most of these specimens to be assigned to species. Furthermore, the visibility of seven important larval morphological characters – anterior and posterior spiracles, mouth hooks, cephalopharyngeal skeleton, locomotory welts, segmentation and colour – was assessed pre‐ and post‐DNA extraction. It was found that the morphology generally did not deteriorate post‐DNA extraction but actually improved through increased visibility of internal features. Therefore, this non‐destructive DNA extraction method not only allowed COI barcode sequences to be obtained, but also enabled a better morphological identification of the fly larvae retaining physical reference voucher specimens and avoiding the need for dissections.


INTRODUCTION
Accurate species identification is critical to many fields of entomology, including health, pest management and forensics. Preservation of insect reference specimens is critical as they provide evidence for pest distributions (White & Elson-Harris 1992) or inform forensic and/or health investigations (e.g. Byrd & Castner 2001; Hall et al. 2001). Flies (Insecta: Diptera) are one of the most important insect groups for economic and health reasons, with fly larval specimens generally preserved in ethanol to maintain the morphological features used for identification (White & Elson-Harris 1992). However, Diptera is one of the most diverse insect groups, with the larvae of many species currently not well characterised. In an animal and plant health pest surveillance context, fly larvae are usually initially identified to family level, which can often provide adequate information to assess health and economic concerns (Ferrar 1987; CSIRO 1991). Further identification of specimens, e.g. to genus or species, often requires much greater processing time and taxonomic knowledge (depending on the insect group) and can be limited by the sparse morphological characters available for larvae (White & Elson-Harris 1992).
The morphological characters available for the identification of immature flies are greatly reduced compared with those of adult flies, with characters sometimes also varying between larval instars (White & Elson-Harris 1992). The key general characters used for morphological identification of Diptera (suborder Brachycera) larvae are as follows: the external general body form, segmentation, tubercles and processes, sensory organs of the head, anterior and posterior spiracles, anal plate, colour, size and internal cephalopharyngeal (head) skeleton, mouthhooks and associated sclerites (Ferrar 1987). Of these, the most important primary characters used for identification are the general body form, spiracles, anal plate and mouthparts (e.g. Skidmore 1987; White & Elson-Harris 1992). Certain additional morphological features can also be very important for species identification, e.g. the external oral ridges in Tephritidae (White & Elson-Harris 1992) and locomotory creeping welts in Calliphoridae (Byrd & Castner 2001), which are used to separate closely related species.
Molecular identification methods, such as DNA barcoding (Hebert et al. 2003), can be particularly valuable as an alternative method to accurately identify fly larvae to species (e.g. Blacket et al. 2012; Blacket et al. 2015). However, DNA barcoding is generally a destructive procedure, resulting in the loss of a physical reference specimen (Floyd et al. 2010). In the last decade, a number of DNA extraction methods have been published, which allow preservation, at least partially, of the morphological features of the specimen (Favret 2005; Gilbert et al. 2007; Rowley et al. 2007; Castalanelli et al. 2010; Porco et al. 2010; Bahder et al. 2015). However, these protocols may be expensive and inefficient. For example, in some cases, the DNA is extracted during the process of softening specimens to be mounted on a microscope slide, where the integrity of the whole insect is not required (e.g. Gullan 2000; Favret 2005; Porco et al. 2010; Malausa et al. 2011). Moreover, the reagents used in these DNA extraction protocols are generally optimised for dealing with the chitinised cuticle of adult specimens (Castalanelli et al. 2010; Bahder et al. 2015). When applied to immature stages, this can result in significant disfiguration of the morphology of the larvae, as shown by Castalanelli et al. (2010).
The primary aim of the present study was to examine the effect of a non-destructive method for extraction of DNA, modified from Bahder et al. (2015), on the morphology of fragile soft-bodied fly larvae, in order to retain a voucher morphological specimen post-DNA extraction. Flies of the family Muscidae, which includes house flies, were chosen from an entomological collection based on their relative importance to animal and plant health. Furthermore, being samples preserved in a reference collection and not freshly collected, they offered a good example of DNA preserved for a few years, in a similar condition to that of other museum specimens, for which this protocol was originally designed. Muscidae is a large family containing more than 3800 species (Skidmore 1987), with larvae of some species known to be beneficial insect predators or plant pests (i.e. Atherigona spp.), while adults of other species can be important biting or nuisance animal pests (Ferrar 1987; Skidmore 1987). Muscidae specimens were examined pre- and post-DNA extraction to assess the visibility of morphological characters used for identification to species and genus level. This was conducted to understand whether morphology-based identification of fly larvae could be performed even after DNA extraction and whether the methodology, being considered non-destructive, did indeed preserve all the characters intact.

MATERIALS AND METHODS

Specimens
A total of 31 Muscidae larvae were used for this study (Table S1). Additionally, a single whole adult fly (later identified as Helina sp.) and single adult leg and pupa (both later identified as Atherigona orientalis Schiner, 1868) were also tested as controls for the DNA extractions and amplifications. Samples included all Muscidae larval specimens identified and databased at the Victorian Agricultural Insect Collection (VAIC), collected during a 3-year period between 2009 and 2012. Fly larvae were collected through routine fruit fly surveillance activities in the state of Victoria, Australia (Dominiak & Daniels 2012). All larval specimens were preserved initially in absolute ethanol (100%) immediately after collection and then stored at −20°C at the Victorian Agricultural Insect Tissue Collection (VAITC), located at AgriBio, Bundoora, Australia, within the VAIC (https://collections.ala.org.au).
The specimens used in this study included larvae of various instar stages, identified initially to family, or genus (using Ferrar 1987;White & Elson-Harris 1992), from a wide range of host fruit (Table S1). Therefore, these specimens represented a great example of the diversity collected during pest management operations, including various sizes (between 4.3 and 14.1 mm) and durations of preservation (at the time of the experiment in 2016, between 4 and 7 years), both factors likely to affect the quantity and quality of the DNA extracted.

Morphological characters
Standard morphological characters which are used to separate families, genera and species were examined in the current study (Ferrar 1987; Skidmore 1987; White & Elson-Harris 1992). In addition to overall shape, they included the posterior spiracle, anterior spiracle, mouth hooks, cephalopharyngeal skeleton, locomotory welts, segmentation and larval colour (Fig. 1).
The diagram of a generalised fly larva presented in Figure 1 was made using Inkscape v.0.92.3. High-resolution automontage images of larvae in ethanol before and after DNA extraction were obtained using the Leica Application Suite software (version 4.5.0), from five to 20 stacked images obtained using a Leica stereo microscope M205C with a DFC450 camera. Larval length (Table S1) was measured from images using the Leica 'Segment Line Tool', measuring every segment along the dorsal edge of each larva from the maxillary antennae to the posterior spiracles (Fig. 1). The visibility of morphological characters pre- and post-extraction was assessed from the automontage images. Characters were scored as being completely visible (V), partially visible (P; if even partially un-assessable) or not visible (N).
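Measuring a curved larva segment by segment amounts to summing the lengths of a polyline traced along the dorsal edge. The sketch below illustrates that arithmetic only; the function name and the landmark coordinates are hypothetical and are not taken from the Leica software or from the study's measurements.

```python
import math


def polyline_length(points):
    """Sum the straight-line lengths of consecutive segments along a traced path."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical dorsal-edge landmarks in mm (illustrative, not study data).
landmarks_mm = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
larval_length_mm = polyline_length(landmarks_mm)
```

Adding more landmarks along a strongly curved specimen brings the summed length closer to the true dorsal outline, which is presumably why the measurement was taken segment by segment.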

Non-destructive DNA extraction protocol
DNA extractions were performed using Proteinase K and ATL extraction buffer from a DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany), in a protocol modified from Bahder et al. (2015). Each specimen was completely immersed in 180 μl of ATL buffer plus 20 μl of Proteinase K in a 1.5 ml Eppendorf tube. Due to the elongated shape of larvae, however, for six of the larger samples (length > 1 cm), an additional 100 μl of ATL buffer, without any additional Proteinase K, was added in order to ensure the whole specimen was in contact with the extraction buffer. Genomic DNA was eluted using 50 μl of elution (AE) buffer and stored at −20°C. A blank extraction was also processed as a control. Specimens were gently mixed (not vortexed) and then incubated at 56°C overnight for 17 h.
After incubation, each specimen was removed from the extraction buffer using a single-use plastic pipette tip bent into a hook shape, and the larvae were transferred into 70% ethanol, as VAIC morphological voucher specimens.
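The lysis volumes described above can be written as a small helper, shown here as a sketch. The function name is hypothetical, and applying the extra buffer strictly by a length cut-off is a simplifying assumption: in the study, the additional buffer was used for six of the larger samples to keep the whole specimen immersed.

```python
def lysis_volumes_ul(length_mm, large_threshold_mm=10.0):
    """Return (ATL buffer, Proteinase K) volumes in microlitres for one specimen."""
    atl_ul = 180.0
    proteinase_k_ul = 20.0
    if length_mm > large_threshold_mm:
        # Extra buffer only; the Proteinase K volume is left unchanged.
        atl_ul += 100.0
    return atl_ul, proteinase_k_ul


# Smallest and largest larval lengths reported for the specimens (mm).
small = lysis_volumes_ul(4.3)
large = lysis_volumes_ul(14.1)
```

Because only buffer is added, the Proteinase K is more dilute in the larger tubes (20 μl in 300 μl total rather than in 200 μl).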
DNA from all samples was quantified both by eye, after a 30 min electrophoresis run on agarose gel (1.5%) at 100 V, and by using a NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific, Massachusetts, USA) and Qubit 2.0 Fluorometer (Thermo Fisher Scientific, Massachusetts, USA) (Fig. S1; Table S1).
PCR amplification was confirmed on 2% agarose gels. PCRs were repeated for five samples that did not initially amplify, following the same protocol and using the same primers but with 5 μl of DNA template. All amplified PCR products were purified and sequenced in both directions commercially by Macrogen Inc. (Seoul, Korea).
Alignment of the resulting 621 bp long sequences was performed using the software MEGA X (Kumar et al. 2018) and the ClustalW algorithm. The BOLD database (Ratnasingham & Hebert 2007) was used to query each sequence to provide molecular identification of specimens (Table 3). Representatives of the closest matches (<10% divergent) on BOLD and GenBank (National Centre for Biotechnology Information, NCBI) to 'Muscidae sp. A' (see Results below) were used to generate the barcode COI gene tree presented in Figure 4. We used the software MEGA X to identify the best model of nucleotide substitution based on the Bayesian information criterion (BIC). This reported the GTR + I model of nucleotide substitution (gamma distribution with five rate categories) as the best model. The tree was then generated using a maximum likelihood (ML) algorithm, with 10 000 replicates (in MEGA X). All sequences have been submitted to GenBank, with accession numbers MK265327-MK265352 (Table S1).
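To make the similarity and divergence figures concrete, the sketch below computes percent identity between two aligned sequences and applies a species-level cut-off. It is an illustration only: BOLD's identification engine uses its own matching procedure, the function names are hypothetical, the example sequences are invented, and treating the 99% threshold as "greater than or equal to" is an assumption.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical sites between two aligned sequences, skipping gaps and Ns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # ignore alignment gaps and ambiguous bases
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * matches / compared


def classify(identity, species_threshold=99.0):
    """Label a query against the assumed species-level similarity threshold."""
    if identity >= species_threshold:
        return "species-level match"
    return "no species-level match"


# Invented 10 bp fragments that differ at one site (not real COI data).
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
divergence = 100.0 - identity
label = classify(identity)
```

Under this simple measure, a query that is 9-10% divergent from its closest database match, as reported for 'Muscidae sp. A', falls well short of a species-level assignment.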
Averages, ranges, correlation (function Correlation) and regression (function Regression) analyses between the different measurements were conducted using the Analysis ToolPak in

RESULTS
The concentration of DNA obtained from each specimen was assessed both on the gel (Fig. S1A) and using Nanodrop and Qubit, with an average yield of DNA measured at 141.6 and 13.92 ng/μl using the two machines, respectively (Table S1). The amount of DNA extracted was weakly positively correlated with larval length, with multiple R values of 0.516 and 0.417 for the Nanodrop and Qubit quantifications, respectively, while the linear regression R² values were 0.25 using the Nanodrop quantifications and 0.17 for the Qubit ones. The time of specimen storage, up to 7 years in 100% ethanol at −20°C, did not appear to affect DNA yields. There was no apparent correlation between the age of the specimens and the concentration of DNA obtained, with the DNA concentration obtained being similar for all years of collection (Table S1).
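For readers who want to reproduce this kind of summary outside a spreadsheet, the sketch below computes Pearson's correlation coefficient and the corresponding R² for a simple linear regression with one predictor, where R² is the square of r (for example, 0.417² ≈ 0.17, in line with the Qubit figures above). The function name and the paired values are hypothetical and are not the study's measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical larval lengths (mm) and DNA concentrations (ng/ul), not study data.
lengths_mm = [4.0, 6.0, 8.0, 10.0, 12.0]
dna_ng_per_ul = [10.0, 14.0, 12.0, 20.0, 18.0]

r = pearson_r(lengths_mm, dna_ng_per_ul)
r_squared = r ** 2  # share of variance explained in a one-predictor linear regression
```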
The Proteinase K incubation for the non-destructive DNA extraction method did not alter the visibility of most of the diagnostic Muscidae larval characters (segmentation, posterior spiracles, anterior spiracles and locomotory welts) that could be previously seen externally on the body (Table 1). Importantly, in the case of the mouth hooks and cephalopharyngeal skeleton, the extraction process actually improved their visibility (Table 2). In fact, visibility of mouth hooks (V + P, Table 1) improved from 29% to 87% and visibility of cephalopharyngeal skeleton from 13% to 77% (V + P, Table 1). The colouration of the specimens appeared to fade slightly, with a 26% decrease from completely visible to partially visible (19%) and not visible (7%), where colouration was completely lost resulting in the exoskeleton appearing transparent (Table 1, Fig. 2). However, such loss of colour was non-significant after a McNemar test, due to the fact that colour was generally just faded and not completely lost (Table 2). Additionally, larvae were found to have reduced in size on average by 4.94% of their body length (Fig. 3). No other change in the appearance of the samples was noted. Despite the apparent low yield of DNA (Fig. S1A), the PCR results (Fig. S1B) allowed sequencing of 26 (84%) specimens (MK265327-MK265352). Most of the specimens which were previously only identified to family were assigned to species based on a COI barcode sequence similarity of 99% (Hebert et al. 2003), a threshold given to match the same species (Ratnasingham & Hebert 2007) (Table 3). Based on this threshold, four larval samples from market intercepted tomato and capsicum fruits (not originating from Victoria), initially identified as Atherigona, were subsequently identified as Atherigona orientalis; another three larval samples, identified as Muscidae, were found to be Musca domestica Linnaeus, 1758, Muscina stabulans (Fallén, 1817) and Stomoxys calcitrans (Linnaeus, 1758) (Table 3).
DNA extraction and amplification generated viable sequences also from the control specimens tested here, with the pupa and the adult leg identified as A. orientalis and the whole adult fly being a Helina sp.
However, one taxon named here 'Muscidae sp. A' (Table 3, Fig. 4a,b) could not be assigned to a species using DNA barcoding, due to it currently not being represented by reference DNA sequences on BOLD or GenBank. The sequence divergence between 'Muscidae sp. A' and the most similar sequences held in public DNA databases was found to be 9-10% (Table 3), and these closest sequences included a broad range of taxa (Fig. 4a). Two undetermined Muscidae species from Western Australia (WA) and the Australian Capital Territory (ACT) were the closest matches to this taxon. However, only the specimens from ACT could be included in Figure 4 since sequences from the three specimens from WA were not publicly available (Fig. 4a). The next closest matches belonged to four different genera: the floricolous species Colocasiomyia xenalocasiae (Okada, 1980) (Diptera: Drosophilidae), two predatory genera Hydrotaea and Metopia (Diptera: Muscidae) and Eumesembrinella randa Walker, 1849 (Diptera: Muscidae) (Fig. 4a).

DISCUSSION

Morphological assessment
Initial morphological tests were carried out on 31 Muscidae larvae (Table S1), collected through plant pest fruit fly surveillance activities (Dominiak & Daniels 2012), which had been previously identified either as Muscidae, or further to Atherigona (Couri & De Araujo 1992), the only genus in the family known to be a plant pest (Ferrar 1987). This initial examination involved assessing the visibility of seven key larval morphological characters (Fig. 1, Table 1). However, this process was challenging due to one of the main characters, the cephalopharyngeal skeleton, being internal to the insect body and, therefore, not assessable. The Proteinase K incubation for the non-destructive DNA extraction method did not alter the visibility of most of the diagnostic Muscidae larval characters (segmentation, posterior spiracles, anterior spiracles and locomotory welts) that could be previously seen externally on the body. Most importantly, this protocol allowed visual morphological assessment of cephalopharyngeal skeleton and mouth hooks that would have required a dissection to be examined, as traditional morphology-based identification usually requires dissection of the head of the larvae to expose the skeleton and mouth hooks (Ferrar 1987; Skidmore 1987). Here, such visibility improvement was due to a general clearing of specimens that became more transparent after the incubation period. This clearing altered the colour from opaque to clear or semi-transparent, improving the overall visibility (Table S2) and exposing these characters for further examination (Fig. 2). To verify if larval size was affected by the incubation period, we measured the length of 22 larvae pre- and post-incubation (Table S1). While larvae were found to have reduced in size on average by <5% of their body length (Fig. 3), this did not alter the overall larval shape or the morphology of the seven diagnostic characters in the specimens (Table 1).
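The size comparison above reduces to a percentage change between paired length measurements. The sketch below shows one way to compute it, assuming the average is taken over per-specimen percentages (the text does not state whether this or a comparison of mean lengths was used); the function name and the example lengths are hypothetical.

```python
def mean_percent_shrinkage(pre_mm, post_mm):
    """Average per-specimen length reduction, as a percentage of pre-extraction length."""
    if len(pre_mm) != len(post_mm) or not pre_mm:
        raise ValueError("need paired, non-empty length measurements")
    reductions = [100.0 * (pre - post) / pre for pre, post in zip(pre_mm, post_mm)]
    return sum(reductions) / len(reductions)


# Hypothetical paired lengths in mm (not the measurements in Table S1).
pre_lengths = [10.0, 20.0]
post_lengths = [9.0, 19.0]
shrinkage = mean_percent_shrinkage(pre_lengths, post_lengths)
```

A positive value indicates shrinkage; a specimen that lengthened after incubation would contribute a negative percentage.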
Ultimately, the discolouration of the specimens appeared to remain mostly localised in the area of the head and was found to be non-significant.
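The McNemar test reported in the Results compares paired before-and-after scores for the same specimens and depends only on the specimens whose score changed. The sketch below is a minimal version under stated assumptions: V and P are pooled as "visible", no continuity correction is applied (the text does not say which variant was used), and the example scores are invented.

```python
import math

# Assumption: V (visible) and P (partially visible) are pooled; N is "not visible".
VISIBLE = {"V", "P"}


def mcnemar(pre_scores, post_scores):
    """Return (chi-square, P value) for paired V/P/N scores, with 1 degree of freedom."""
    pre = [score in VISIBLE for score in pre_scores]
    post = [score in VISIBLE for score in post_scores]
    lost = sum(1 for a, b in zip(pre, post) if a and not b)    # visible -> not visible
    gained = sum(1 for a, b in zip(pre, post) if b and not a)  # not visible -> visible
    if lost + gained == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of change
    chi2 = (lost - gained) ** 2 / (lost + gained)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function for df = 1
    return chi2, p_value


# Invented scores for one character across six specimens (not the study's data).
pre = ["N", "N", "P", "N", "V", "N"]
post = ["V", "P", "V", "N", "V", "V"]
chi2, p_value = mcnemar(pre, post)
```

With few discordant pairs, as in this toy example, an exact binomial version of the test would usually be preferred over the chi-square approximation.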

COI barcoding identification
A total of 31 Muscidae larvae (Table S1) were used to test a non-destructive DNA extraction protocol, modified from Bahder et al. (2015), outlined above. COI barcoding identification using DNA sequences allowed identification to species level of all the specimens which had DNA sequences present on both BOLD and GenBank. This allowed us to identify Musca domestica (N = 1), Muscina stabulans (N = 1) and Stomoxys calcitrans (N = 1). These three cosmopolitan species are all known to utilise decomposing organic matter, including rotting vegetable matter (Permkam & Hancock 1995); they are also known to be potential nuisance or disease vector pests, potential agents of myiasis or biting blood-feeding pests, respectively (Ferrar 1987). Among the controls tested, it is important to highlight the result obtained from the single adult leg, which generated a viable sequence allowing it to be identified as Atherigona orientalis. This suggests that the non-destructive method adopted can be used on adults and does not need to be applied to the whole body. Sixteen larval samples (Muscidae sp. A) could not be identified to the species level due to the lack of matching sequences on any public database. The morphology of the mouthparts observed in 'Muscidae sp. A' included curved slender sharply pointed mouthhooks with lower barbed oral sclerites, which became visible after the DNA extraction process (Fig. 2b; Fig. 4b), features that make it likely that this is a predatory species (Skidmore 1987; Permkam & Hancock 1995). The lack of a matching reference DNA sequence in public databases highlights the large number of currently undescribed and/or unidentified Diptera with DNA sequences. In fact, only approximately 20% of recognised Muscidae species are currently represented on BOLD (accessed December 2018). This highlights the need to deposit sequences for correctly identified specimens (Hodgetts et al. 2016) and, at the same time, confirms how detrimental missing data or incorrect annotation of barcode sequences deposited in such databases can be (e.g. Bengtsson-Palme et al. 2016; Mioduchowska et al. 2018).

[Table footnote] For the characters where changes were observed, the χ² value (with degrees of freedom) is reported together with the P value and significance.
The method presented here enables better morphological examination of the samples, contributing to the identification of specimens submitted to barcode reference public databases, which could be linked to high-quality images and metadata available for future re-examinations of the samples (Batovska et al. 2016).

Implications for taxonomy, pest management and forensic entomology
The application of a non-destructive DNA extraction protocol brought the advantage of increased visibility of morphological characters used for species identification and, consequently, the opportunity to retain a voucher specimen, which is a key element of collections and museums. Increased visibility of morphological characters may allow rapid species identification, avoiding the morphologically destructive and laborious requirement for dissections. Additionally, for undescribed species, this can enable formal taxonomic description of the larvae while retaining intact specimens. Similarly, unidentified larvae can be linked to a DNA sequence that can be compared to that of adult insects, with the potential to fill the gaps in insect identification at different life stages. Therefore, adoption of this technique at a larger scale will enable a better understanding of the morphological characters of larvae, linking them not only to a DNA sequence but also, through the implementation of a database, to their adult form. The method presented here improves morphological examination and therefore the identification of fly larvae, a difficult group to work on, particularly when early instars are under examination. This method may be particularly useful in animal and plant health surveys that generate large numbers of samples, which need accurate identification, as seen in mosquito surveys (Batovska et al. 2016) and Queensland fruit fly (QFF) surveys (Blacket & Malipatil 2010).
In a pest management context, it is fundamental to preserve insects as reference specimens, as they provide evidence for pest distributions and a valuable tool for future morphological comparisons and identifications (White & Elson-Harris 1992). At the same time, being able to generate and associate DNA sequences from a morphological specimen can allow data to travel rapidly, significantly reducing the time that would be required to physically compare morphological specimens deposited in geographically distant collections. The application of this non-destructive method in a pest management context could therefore lead to reduced diagnostic times and a quicker pest management response between detection and identification, allowing for more effective biosecurity operations (Plant Health Australia 2018).
In a forensics context, species-level identification of entomological samples is fundamental for post-mortem interval (PMI) estimation (Grzywacz et al. 2017). Furthermore, the preservation of a physical specimen is critical, as it can be used as evidence in court cases (Grzywacz et al. 2017). Muscidae, which are well known to breed in carrion and decomposing human bodies (e.g. Sukontason et al. 2004; Szpila et al. 2010; Velásquez et al. 2010), are used as an indicator of time of death and other important forensic parameters (Grzywacz et al. 2017). Where larval measurements are required, this protocol could be used if the measurements are taken pre-DNA extraction, in order to avoid the shrinkage reported here. Ultimately, our results suggest that non-destructive DNA extraction of Muscidae larvae not only yielded DNA sequences for accurate species identification but also retained the specimen intact for evidence, should this be needed.

[Table caption fragment] ... Table S1), seven morphological characters have been examined before and after DNA extraction. These are the anterior and posterior spiracles, mouth hooks, cephalopharyngeal skeleton, locomotory welts, segmentation and colour. A visibility score was attributed depending on the character being visible (V), partially visible (P) or not visible (N). Red Xs highlight a change in visibility.

Figure S1
Photos of electrophoretic gel runs under UV light. The agarose gel (1.5%) (A) shows the results of DNA extractions for all the samples (N = 34) and one blank extraction (B); agarose gel (B) shows COI DNA amplification attempted for all DNA extractions (N = 34), one blank extraction (B) and one PCR control (C).