Protein‐encoding ultraconserved elements provide a new phylogenomic perspective of Oestroidea flies (Diptera: Calyptratae)

The diverse superfamily Oestroidea with more than 15 000 known species includes among others blow flies, flesh flies, bot flies and the diverse tachinid flies. Oestroidea exhibit strikingly divergent morphological and ecological traits, but even with a variety of data sources and inferences there is no consensus on the relationships among major Oestroidea lineages. Phylogenomic inferences derived from targeted enrichment of ultraconserved elements or UCEs have emerged as a promising method for resolving difficult phylogenetic problems at varying timescales. To reconstruct phylogenetic relationships among families of Oestroidea, we obtained UCE loci exclusively derived from the transcribed portion of the genome, making them suitable for larger and more integrative phylogenomic studies using other genomic and transcriptomic resources. We analysed datasets containing 37–2077 UCE loci from 98 representatives of all oestroid families (except Ulurumyiidae and Mystacinobiidae) and seven calyptrate outgroups, with a total concatenated aligned length between 10 and 550 Mb. About 35% of the sampled taxa consisted of museum specimens (2–92 years old), of which 85% resulted in successful UCE enrichment. Our maximum likelihood and coalescent‐based analyses produced well‐resolved and highly supported topologies. With the exception of Calliphoridae and Oestridae all included families were recovered as monophyletic with the following conclusions: Oestroidea is monophyletic with Mesembrinellidae as sister to the remaining oestroid families; Oestridae is paraphyletic with respect to Sarcophagidae; Polleniidae is sister to Tachinidae; Rhinophoridae sister to (Luciliinae (Toxotarsinae (Melanomyinae + Calliphorinae))); Phumosiinae is sister to Chrysomyinae and Bengaliinae is sister to Rhiniidae. These results support the ranking of most calliphorid subfamilies as separate families.


Introduction
Flies have had immense lasting effects on the earth and human civilization. They have been more than an annoying buzzing in the Earth's history, as they have colonized almost (Rhinophoridae), flesh flies (Sarcophagidae), tachinid flies (Tachinidae) and the enigmatic McAlpine's fly (Ulurumyiidae) (Fig. 1). In urban environments, blow flies and flesh flies are mechanical vectors of important diseases, as they feed on garbage and corpses of animals and humans (Buenaventura et al., 2009;Byrd & Castner, 2010). Whereas these flies are critical elements for public health, they are also useful as indicators of time and place of death in forensic investigations (Wells et al., 2001;Mulieri et al., 2012;Meiklejohn et al., 2013;Vairo et al., 2017). Carrion flies provide ecosystem services, such as nutrient recycling and pollination, which are essential for the sustainability and management of urban and wild ecosystems. In addition, some Oestroidea lineages develop in a close association with a broad range of hosts as parasitoids or kleptoparasitoids (Piwczyński et al., 2017), which are hypothesized to have evolved independently across lineages. Polleniids develop on earthworms, ameniines and melanomyines (Calliphoridae) and some sarcophagids on terrestrial gastropods, rhinophorids on woodlice, tachinids on terrestrial arthropods (mostly insects) and miltogrammines (Sarcophagidae) as kleptoparasitoids on soil-nesting wasps, bees and ants (Eggleton & Belshaw, 1992;Feener Jr. & Brown, 1997). Tachinidae includes mostly parasitoids playing significant roles in regulating their host populations, and hence being relevant for biological control programmes (Stireman et al., 2006). During the last decade, several studies have explored the phylogenetic relationships of Oestroidea (Marinho et al., 2012;Singh & Wells, 2013;Zhang et al., 2016a;Cerretti et al., 2017;Michelsen & Pape, 2017;Buenaventura et al., 2019;Cerretti et al., 2019;Kutty et al., 2019); however, our knowledge about their evolution and diversity patterns remains limited, especially with regard to Calliphoridae. The most limiting factor has been the identification and screening of orthologous loci across an evolutionarily distant set of taxa, which when compared in phylogenetic analyses, recover poorly supported relationships of the important major lineages. In addition, studies using molecular (Kutty et al., 2010;Marinho et al., 2012) or morphological data (Pape, 1992;Rognes, 1997;Cerretti et al., 2017) have suffered from equivocal phylogenetic results due to limited taxon sampling, limited data or both.
Massively parallel DNA sequencing technologies are used to generate vast amounts of molecular data (Metzker, 2010;Shendure et al., 2017). These technologies, also known as high-throughput sequencing or next-generation sequencing, outperform Sanger sequencing in efficiency to recover genomic data (Blaimer et al., 2015;Matos-Maraví et al., 2019) at a cost that is becoming more accessible. Approaches such as targeted enrichment (e.g. anchored hybrid enrichment, ultraconserved elements), random reduced-representation of the genome (e.g. transcriptomics, RAD-seq, MIG-seq) and whole-genome sequencing are being used to understand biological diversification in time and space.
Targeted enrichment has gained popularity in phylogenetic studies because it can recover DNA loci with a particular rate of evolution (fast and slow) or under different selective pressures (Lemmon & Lemmon, 2013). Targeted enrichment can also produce large amounts of molecular data from sub-optimally preserved samples and/or highly fragmented DNA, such as those coming from museum specimens Blaimer et al., 2016b) whereas the use of transcriptomics and RAD-seq is generally limited to fresh or properly preserved tissue (e.g. tissues stored in liquid nitrogen or RNAlater). Given the practical limitations of collecting rare species, or species whose habitats are inaccessible or have completely disappeared, museum collections present an invaluable source of biological tissues for targeted-enrichment molecular studies. However, DNA derived from museum tissues and legacy preserved DNA aliquots from previous studies, is often highly degraded and low in concentration. DNA aliquots may face the same constraints of using suboptimally preserved samples like museum specimens. To date, only a few studies using anchored hybrid enrichment (AHE) or ultraconserved elements (UCE) have capitalized on museum specimens, but not on legacy DNA aliquots McCormack et al., 2016;Blaimer et al., 2016b;Baca et al., 2017;Wood et al., 2018;Derkarabetian et al., 2019), although with few exceptions (Buenaventura et al., 2019).
Ultraconserved elements and AHE methods target ultraconserved or highly conserved genome regions flanked by regions of greater nucleotide variability, which make them useful to reconstruct phylogenies at wide evolutionary scales (McCormack et al., 2013b;da Fonseca et al., 2016). Differences between these two methods seem to be related to targeting highly conserved noncoding and coding regions of the genome with UCE  or coding regions with AHE (Lemmon et al., 2012). However, studies on arthropods reported approximately 61% or more of the UCE loci overlapping with some coding genomic regions (Branstetter et al., 2017;Hedin et al., 2019;Kieran et al., 2019). Even though invertebrate UCEs have been characterized as predominantly coding sequences (Bossert & Danforth, 2018), UCE probes have usually been designed without much consideration on whether they capture loci from the transcribed portion of the genome and without refining them using available transcriptome resources. Thus, protein-coding portions are generally discovered during UCE data processing, but not at the probe design stage. Previous probe designs did not take into consideration the development of probes targeting loci only found within coding regions McCormack et al., 2013a;Smith et al., 2014;Crawford et al., 2015;Baca et al., 2017;Branstetter et al., 2017;Faircloth, 2017;Wood et al., 2018;Oliveros et al., 2019).
The design of taxon-specific UCE probes is a growing field, which is rapidly developing a body of knowledge (Branstetter et al., 2017;Gustafson et al., 2019a;Gustafson et al., 2019b;Kulkarni et al., 2020). Recent literature justified the use of tailored UCE probes as they frequently outperform general (i.e. universal) probes and should aid in locus recovery (Branstetter et al., 2017;Gustafson et al., 2019a;Gustafson et al., 2019b;Kulkarni et al., 2020). Previous molecular studies on Oestroidea (Kutty et al., 2010;Buenaventura et al., 2017;Buenaventura & Pape, 2017a;Buenaventura et al., 2019) and similar radiations predict that reconstructing a strong phylogeny would require a large increase in molecular data (Dell'Ampio et al., 2014; Giarla & Esselstyn, 2015) and the combination of information from multiple genome regions to attempt to produce accurate species tree estimates (Degnan & Rosenberg, 2009). A unique tailored UCE probe set for Calyptratae and Oestroidea flies could increase locus recovery, whereas a generalized UCE probe set designed for the entire order like the Diptera-wide UCE probe set (Faircloth, 2017) would capture a smaller and perhaps an insufficient number of loci to reconstruct this rapid radiation. Comparative analyses on loci recovery between generalized and taxon-specific UCE probes for Diptera lineages should be considered in future studies.
Possibly due to the availability of the probe sets (Faircloth, 2017), lab protocols (see www.ultraconserved.org) and bioinformatics tools such as PHYLUCE (Faircloth, 2016), phylogenetic reconstructions using a UCE approach are becoming more frequent. It has been used to resolve the phylogeny of various vertebrate groups McCormack et al., 2012;Faircloth et al., 2013;McCormack et al., 2013a;Sun et al., 2014;Crawford et al., 2015) and arthropod taxa (Blaimer et al., 2015;Faircloth et al., 2015;Starrett et al., 2016;Blaimer et al., 2016a;Blaimer et al., 2016b;Baca et al., 2017;Branstetter et al., 2017;Wood et al., 2018;Forthman et al., 2019;Kieran et al., 2019;Gustafson et al., 2019b). The use of UCEs to resolve relationships within dipterans has not been explored, and as flies are one of the four super-radiations of insects (along with beetles, wasps and moths) that account for the majority of animal life on Earth (Wiegmann et al., 2011), they constitute an interesting taxon to be studied with the combined use of the UCE targeted enrichment method and massively parallel DNA sequencing technologies. Thus, we aimed to design a taxon-specific probe set to capture only protein-encoding UCEs to reconstruct the phylogenetic relationships of one of the largest biotic radiations on Earth: the Oestroidea flies.

UCE probe design
Our UCE probe design followed the standard UCE workflow  to identify ultraconserved genomic regions and design enrichment probes targeting those regions. This workflow performs synteny-based genome-genome alignment to identify UCEs. Details of the workflow are found in Faircloth et al. (2012) and step-by-step procedures are also available online (https://phyluce.readthedocs .io/en/latest/tutorial-four.html) and unless otherwise noted scripts used below are contained within the PHYLUCE package Faircloth, 2016). Three FASTA format genome assemblies of Glossina fuscipes Newstead (Glossinidae, Hippoboscoidea) (GCA_000671735.1), Lucilia cuprina (Wiedemann) (Calliphoridae, Oestroidea) (GCA_000699065.2) and Musca domestica Linnaeus (Muscidae, Muscoidea) (GCA_000371365.1) were downloaded from GenBank and converted to 2bit format. The FASTA format genome and transcriptome assemblies of Sarcophaga (Liopygia) crassipalpis Macquart (Sarcophagidae, Oestroidea), which were sequenced in a parallel project (Buenaventura et al. unpublished data), were also converted to 2bit format and included. We simulated reads (without sequencing error) from the genomes and the transcriptome, and aligned them to the base genome (= S. crassipalpis) using art (Huang et al., 2012) with the built-in feature off. This genome assembly was used as the base genome due to our interest in having specific probes for Sarcophagidae, and also because of the contiguity, assembly quality and level of annotation of this genome (Buenaventura et al. unpublished data). We then aligned simulated reads to the base genome using stampy (Lunter & Goodson, 2011), which is efficient in aligning sequences to a divergent reference sequence. The resulting output was then converted to BAM format. Thus, by mapping simulated sequence data from exemplar genomes to the base genome sequence, we identified putatively orthologous loci shared between the exemplar taxa and the base taxon. These conserved regions had a sequence divergence of <5%. We then converted BAM files to BED format, order each line in the BED file by chromosome/scaffold/contig and merge together putative regions of conservation that are proximate using bedtools (Quinlan, 2014). This large number of conserved regions were filtered to remove repetitive regions using the script 'phyluce_probe_strip_masked_loci_from_set' . The total number of merged, putatively conserved regions in each exemplar taxon that was shared with the base genome of S. crassipalpis after repetitive regions were removed, are found in Table 1. We then used the script 'phyluce_probe_get_multi_merge_table' to put the identified conserved loci across taxa into an SQlite database. The database was then queried to identify how many loci were shared between taxa using 'phyluce_probe_query_multi_merge_table'. We selected the total loci shared between our base taxon (genome of S. crassipalpis) and all other exemplar taxa (25 166 loci) to be sure these loci are found in most/all of our taxa to be enriched ( Table 2).
The validation of conserved loci started with extracting FASTA sequences from the base genome using the script 'phyluce_probe_get_genome_sequences_from_bed'. We then designed a temporary probe set for the selected loci (25 166 loci) from the base taxon using the script 'phyluce_probe_ get_tiled_probes', selecting two probes per locus with 3× tiling that overlap the middle of the targeted locus and removing potentially problematic probes with >25% repeat content and GC content outside of the range of 30-70%  (30% > GC > 70%). We aligned the probes to themselves using the script 'phyluce_probe_easy_lastz', removed duplicates from the temporary probe set using the script 'phy-luce_probe_remove_duplicate_hits_from_probes_using_lastz' and kept 25 004 loci and 38 755 probes. At this point, the genome assembly of Drosophila melanogaster Meigen (GCA_000001215.4) in FASTA and 2bit format, was used to represent an outgroup for Calyptratae and Oestroidea and to bridge the divergence between outgroups and ingroups. Then, we aligned the temporary probe set to each genome and the transcriptome using the script 'phyluce_probe_ run_multiple_lastzs_sqlite' with 50% of the minimum sequence identity to accept a match. We extracted FASTA data from each of the exemplar sequences using the loci alignments created in the previous step and buffering each locus to 180 bp with the script 'phyluce_probe_slice_sequence_from_genomes' and used the script 'phyluce_probe_get_multi_fasta_table' to find, which loci we detect consistently (Table 2). Then, following a conservative approach, we only extracted 9562 conserved loci consistently detected in five out of six genetic resources (Table 3) used here (i.e. genomes of G. fuscipes, L. cuprina, M. domestica, D. melanogaster and the genome and transcriptome of S. crassipalpis) to design our temporary probes using the script 'phyluce_probe_query_multi_ fasta_table'. The design of the final probe set used all genetic resources and not only the base genome, which ensures a heterogeneous probe mix containing probes designed from each exemplar but targeting the same locus, which should produce a more universal probe set. Similar to the temporary probe set, we designed our final probe set for the selected loci (9562 loci) from all genetic resources using the script 'phyluce_probe_get_tiled_probe_from_multiple_inputs'. Thus, we selected two probes per locus with 3× tiling that overlap the middle of the targeted locus and removed potentially problematic probes with >25% repeat content and GC content outside of the range of 30-70%. We then aligned the probes to themselves using the script 'phyluce_probe_easy_lastz', removed duplicates from the temporary probe set using the script 'phy-luce_probe_remove_duplicate_hits_from_probes_using_lastz' and kept 9562 shared loci and 95 231 probes. The number of total conserved loci or UCEs detected in each exemplar taxon were 9479 for L. cuprina, 9459 for M. domestica, 9323 for G. fuscipes, 9270 for the genome of S. crassipalpis, 9138 for D. melanogaster and 2650 for the transcriptome of S. crassipalpis, for a total of total 9562 shared loci.
The final probe set was parsed to produce the master probe set, which included only those probes matching the transcriptome of S. crassipalpis. Thus, our design of UCE probes used available transcriptome resources, produced in a parallel project (Buenaventura et al. unpublished data) as a strategy to refine our probe set and capture only protein-coding regions. This design resulted in 5117 probes targeting 2581 conserved loci or UCEs. These probes were titled 'calyps-v1-master-probe-list-SARCTRA' for clarity.
This study used eight existing DNA aliquots extracted during 2011 and 2012 for previous molecular studies (Zhang et al., 2016b;Buenaventura et al., 2017;Buenaventura & Pape, 2017a) and stored in a −20 ∘ C freezer. We extracted DNA from 37 pinned specimens from the USNM and JOSC collections, 47 specimens preserved in 96% Ethanol (some of which were stored at below freezing upon completion of the field expedition) and 13 specimens collected and placed directly in empty vials stored in Liquid Nitrogen in the NMNH Biorepository, as indicated in Supplementary Table S1 and Figure S1, where specimen identity, preservation method, targeted tissue for extraction, collection data and corresponding repositories at natural history museums of all specimens is provided. Sterilized forceps helped in manipulating and removing dust, pollen and other forms of accumulated debris on pinned specimens. Tissues from thorax, abdomen or legs were used for DNA extractions as specified in Table S1. DNA was nondestructively extracted from the thorax of pinned specimens, whereas it was destructively extracted from specimens preserved in 96% ethanol and liquid nitrogen by grinding the thoracic or leg tissue with a sterile pestle. We used a DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA, U.S.A.) and followed the manufacturer's protocol, but to maximize DNA yield the Proteinase K digestion ran for 48 h at 56 ∘ C and DNA was eluted in 100 L. To estimate the size of the genomic DNA, 10 L of each extract were run for 40 min at 100 volts on 1.5% agarose SB (sodium borate) gels.

Library preparation, target enrichment and sequencing of UCEs
We quantified DNA for each sample using a Qubit fluorometer (High sensitivity kit, Life Technologies, Inc.) and sheared 0.5-81.3 ng (27.9 ng mean) DNA to a target size of approximately 500-600 bp by sonication (Q800, Qsonica LLC.), and used this sheared DNA as input for a modified genomic DNA library preparation protocol following Blaimer et al. (2016a) and Faircloth et al. (2015). Each pool was enriched using our set of 5117 custom-designed probes (Arbor Biosciences, Inc.) targeting 2581 UCE loci in Calyptratae. Our library enrichment followed procedures for the MYcroarray MYBaits kit (Blumenstiel et al., 2010), except we used a 0.1× concentration of the standard MYBaits concentration, and added 0.7 L of 500 L custom blocking oligos designed against our custom sequence tags. We used the with-bead approach for PCR recovery of enriched libraries as described in Faircloth et al. (2015), with pool hybridization with UCE probes over a 24-h incubation period and 18 cycles in the post capture PCR reaction. Following post enrichment PCR, we purified resulting reactions using 1.0× speedbeads and rehydrated the enriched pools in 22 L TLE.
Post enrichment library concentration was quantified via qPCR using an SYBR ® FAST qPCR kit (Kapa Biosystems) on a ViiA™ 7 (Life Technologies), and based on the size-adjusted concentrations estimated by qPCR, we pooled libraries at equimolar concentrations and size-selected for 250-800 with a BluePippin (SageScience) (1.5% agarose, 250 bp -1.5 kb), and the pool-of-pools was quality checked on an Agilent 2200 TapeStation. The pooled libraries were sequenced using two lanes of a 125-bp paired-end Illumina Hi-Seq 2500 run (University of Utah Genomics Core Facility).

Processing and alignment of UCE data
Illumiprocessor (Faircloth, 2013), based on the package Trimmomatic (Bolger et al., 2014), was used to trim the demultiplexed FASTQ data output for adapter contamination and low-quality bases. All further data processing relied on the PHYLUCE package Faircloth, 2016) with Python scripts designed by the Smithsonian Institution Bioinformatics Group (available at www .github.com/SmithsonianWorkshops/Targeted_Enrichment/ blob/master/phyluce.md). We computed summary statistics on the data and assembled the cleaned reads using Trinity (Grabherr et al., 2011). To identify contigs representing enriched UCE loci from each species, species-specific contig assemblies were aligned to a FASTA file of all enrichment baits (min_coverage = 70, min_identity = 80), and sequence coverage statistics (avg, min, max) for contigs containing UCE loci were calculated. We created FASTA files for each UCE locus containing sequence data for taxa present at that particular locus and aligned these using MAFFT (Katoh & Standley, 2013) (min-length = 20, no-trim). Our alignments were trimmed using Gblocks (Castresana, 2000) with relaxed settings (b1 = 0.5, b2 = 0.5, b3 = 12, b4 = 7). We selected a subset of UCE alignments for phylogenetic analyses having varying taxon completeness (10-100%) for each UCE locus (Table 4). These ten different matrixes were designed to evaluate the relative contribution of varying amounts of UCE loci to the construction of the phylogenetic tree.

Phylogenetic inference
Each of the ten datasets of 37-2077 concatenated UCE loci having varying taxon completeness (10-80%) for each UCE locus (Table 4) was analysed with Maximum Likelihood (ML) best tree and bootstrap searches (N = 100) in RAxML v8.2.7 (Stamatakis, 2014). For the concatenated analyses, we calculated the global bootstrap support (GBS) by averaging all bootstrap support values on a given tree. We also partitioned the data by individual UCE loci using the Sliding-Window Site Characteristics approach and site characteristics, such as entropy implemented in the SWSC-EN algorithm, which generates partitions that account for heterogeneity in rates and patterns of molecular evolution within each UCE (Tagliacollo & Lanfear, 2018). We selected a partitioning scheme from the by-locus character sets with PartitionFinder2 (Lanfear et al., 2017). Then, we sequentially ran an ML analysis for the best tree and 1000 replicates of ultrafast bootstrap on each locus for our gene trees estimations using IQ-TREE (Nguyen et al., 2015). A multi-coalescent species tree analysis was carried out in ASTRAL-III (Zhang et al., 2018) using gene trees (one tree search per gene) estimated by 100 ML searches conducted in RAxML v8.2.7 (Stamatakis, 2014). Statistical supports by ASTRAL-III are local posterior probabilities (LPP), which are branch support values that measure the support for a quadripartition, not a bipartition.
All of the above phylogenetic analyses were performed on the Smithsonian Institution High-Performance Cluster Hydra (SI/HPC) using Python scripts designed by MWL (some modified by Bonnie Blaimer and EB) and the Smithsonian Institution Bioinformatics Group (available at www .github.com/SmithsonianWorkshops/Targeted_Enrichment/ blob/master/phyluce.md). Tips of final trees were renamed using a Perl script designed by MWL (available at www.github .com/MikeWLloyd/Tree-Tip-Replacer).

Data availability
The data that support the findings of this study and the Calyptratae/Oestroidea-specific UCE probes are openly available in Dryad data at https://doi.org/10.5061/dryad.3bk3j9kg1 (Buenaventura et al., 2020).

UCE probes and capture results
Summary results of empirically generated UCE data processed with in silico data are presented in Supplementary Table S1. Summary numbers for parsimony and invariant sites for each data matrix are reported in Supplementary Figure S3. We recovered more UCE loci than the average for two of the dried specimens used in this study ( Table S1). All DNA extractions from specimens preserved in 96% ethanol (= 50) and liquid nitrogen (= 12) succeeded. Similarly, all existing DNA aliquots had ample well-preserved DNA (above 1 ng/ L) for the present study. Only two of 36 DNA extractions from pinned museum specimens failed, meaning there was too little DNA (less than 0.05 ng/ L) to be detected by the Qubit Fluorometer. The unsuccessful DNA extractions were museum specimens of P. regina (Calliphoridae: Chrysomyinae) collected in the year 2016 and Rhinoestrus purpureus (Brauer) (Oestridae: Oestrinae) collected in 1935, which, however, were included in the library preparation, enrichment and sequencing. 105 DNA extractions were enriched using the UCE probes and sequenced on an Illumina Hi-Seq lane. See Supplementary Table S1 and Figure S2 to compare specimen age versus total DNA extracted and UCEs captured coded for preservation method.
We obtained an average of 2 264 424 raw paired-end reads per sample. Trinity assembled reads into 334-75 722 contigs with an average of 9246 contigs assembled per sample (Table S1). These contigs had average lengths of 246.9-450.4 bp. From the total assembled contigs, we recovered a total of 2077 UCE loci across all taxa with an average of 836 UCE loci per sample and average lengths ranging from 221.5 to 619.6 bp. UCE loci were produced for all samples except for four pinned museum specimens: Bengalia varicolor (Fabricius) (Calliphoridae: Bengaliinae) collected in 1980, Phumosia abdominalis (Robineau-Desvoidy) (Calliphoridae: Phumosiinae) collected in 1986, Gasterophilus intestinalis (de Geer) (Oestridae: Gasterophilinae) collected in 1926 and Hypoderma lineatum (Viller) (Oestridae: Hypodermatinae) collected in 1976. These four specimens were included in the library preparation, enrichment and sequencing, but were excluded from the phylogenetic analysis due to low UCE recovery. For two specimens, P. regina and R. purpureus, which did not have detectable levels of DNA in the extraction, we recovered 265 and 77 UCE loci, respectively. However, R. purpureus was excluded from the phylogenetic analysis because the recovered UCE loci for this sample were dropped in the PHYLUCE validation step where multiple contigs that map to a single locus are removed. Excluding the five species with low UCE recovery or no UCE recovery, the number of UCE loci recovered per sample ranged from 14 to 1713, with a total of 2077 UCE loci recovered out of 2581 UCE targets ( Table 4). 35% of the sampled taxa consisted of pinned museum specimens (2-92 years old), of which 85% resulted in successful UCE enrichment. The age of the sampled taxa, excluding the five species with low UCE recovery or no UCE recovery, ranged from 2 to 54 years. The average capture efficiency was 42.22%.
The number of minimum and maximum UCE loci captured per family and subfamily varied considerably, but in general, the probes captured UCEs for all Oestroidea lineages, including the Calyptratae outgroups (Fig. 2). Broadly speaking, the three subfamilies of Sarcophagidae showed the highest minimum, maximum and average UCE loci capture, followed by Calliphoridae subfamilies Luciliinae, Chrysomyinae, Bengaliinae and the Tachinidae subfamily Dexiinae. The amount of input DNA did not follow a pattern, and it did not predict the number of UCE loci captured ( Fig. 2A). However, when looking at the histogram of UCE loci captured per lineage, there was a trend with the tail of the histogram populated mostly with specimens of Gasterophilinae and Hypodermatinae (Oestridae), Melanomyinae and Toxotarsinae (Calliphoridae) (Fig. 2B), which were generally older than specimens of other lineages (Table S1). Unsurprisingly, the number of homologous UCEs dropped with increasing phylogenetic distance from Sarcophagidae, but even a species of Anthomyiidae had 945 homologous UCEs (Table S1).

Data completeness
All datasets were successfully analysed, except for datasets with 90% and 100% data completeness (= 90% and 100% of missing taxa), which were data insufficient (Table 4). In other words, since we captured a maximum of 80% of UCE loci per taxa, datasets with 90% and 100% data completeness for taxa were UCE data insufficient. We analysed datasets containing 37-2077 UCE loci from 98 representatives of the oestroid families and seven calyptrate outgroups with a total concatenated aligned length ranging between ten and 550 Kb ( Table 4).
The relation between data completeness and percentage of parsimony-informative sites, and data completeness and GBS per dataset were visualized (Table 4, Figures. S3-S5). We found an average of 34.73% parsimony-informative sites between UCE alignments (10-90% data completeness). There was a slight increase in the percentage of parsimony-informative sites with a decrease of the maximum percentage of missing taxa at each UCE locus, which was observed for the datasets having a maximum of 10-30% of missing taxa at each UCE locus ( Figure S4). Datasets with a maximum of 40-60% of missing taxa at each UCE locus had a similar percentage of parsimony-informative sites, whereas a more drastic decrease in parsimony-informative sites was observed for more incomplete datasets (= 80-90% incomplete) ( Figure S3). Higher values of GBS were not associated with datasets having a smaller maximum percentage of missing taxa at each UCE locus ( Figure S5) since there were only slight fluctuations in GBS among datasets with varying levels of data completeness. A notably low value of GBS was only observed for the dataset having the largest maximum percentage of missing taxa at each UCE locus ( Figure S5).
For illustration purposes, we depicted our best trees from the ML searches on the concatenated UCE dataset in Fig. 4 (topology #1) and coalescent-based phylogeny in Fig. 5 (topology #2). We recovered a highly resolved ML phylogeny with 99 nodes (Fig. 4), most of them displaying 100% bootstrap support (BS). Sixteen nodes had branch support values lower than BS = 95, 12 of them involving intraspecific relationships. These low-supported nodes involved five species for which less than 100 UCE loci were analysed (Table S1) Bellardia viarum (Robineau-Desvoidy) (Calliphoridae: Calliphorinae). Our coalescent-based phylogeny was better supported with only ten nodes having values lower than LPP = 0.70 (Fig. 5), five of them involved intraspecific relationships with the above-mentioned five species.
The superfamily Oestroidea was recovered as monophyletic in all our analyses. Most families were also recovered as monophyletic, the exceptions being Calliphoridae and Oestridae. At the subfamily level, the calliphorid subfamilies Calliphorinae and Melanomyinae and the rhiniid subfamily Cosmininae were not recovered as monophyletic. Mesembrinellidae was recovered as sister to the remaining families with BS = 100 and LPP = 1.0. Oestridae was paraphyletic with respect to Sarcophagidae, with the subfamily Oestrinae as sister to all sarcophagids. Within Sarcophagidae, the subfamily Sarcophaginae was strongly supported as the sister-group to the clade (Miltogramminae + Paramacronychiinae). Pollenia rudis (Fabricius) (Polleniidae) was strongly supported as the sister-group to Tachinidae. Within the clade of the tachinids, Cholomyia inaequipes Bigot (Tachininae, Myiophasiini) received high BS and LPP as sister to the clade (Dexiinae + Phasiinae). The clade of the Tachinidae subfamilies (Dexiinae + Phasiinae) was strongly supported as sister to (Exoristinae + Tachininae). Phumosia sp. (Phumosiinae) was strongly supported as sister to Chrysomyinae (Calliphoridae), and strong support was also obtained for a clade consisting of the sister-group relationship between Bengalia sp. (Bengaliinae) and Rhiniidae. A weaker supported relationship (BS = 85, LPP = 0.72) was recovered between Rhinophoridae and the clade (Luciliinae (Toxotarsinae (Melanomyinae + Calliphorinae))). Similarly, some of the relationships within the clade (Melanomyinae + Calliphorinae) were weakly supported and quite unstable throughout our analyses.

Discussion
Oestroid flies represent a massively diverse and taxonomically rich lineage. They play important ecological roles as nutrient recyclers, invertebrate parasitoids and as pollinators in various ecosystems. This group is also important to humans due to their vital roles in decomposing organic materials, as mechanical vectors of diseases and as indicators of time of death in forensic investigations. To explore the evolutionary history of this group of flies we designed UCE probes for Calyptratae flies, taking a subset of taxon-specific loci that are found in coding regions derived from a transcriptome. Our probe set resolved the clades (Sarcophagidae + Oestridae), (Polleniidae + Tachinidae), (Rhinophoridae + (Luciliinae (Toxotarsinae (Melanomyinae + Calliphorinae)))), (Phumosiinae + Chrysomyinae) and (Bengaliinae + Rhiniidae), some of which were unresolved, inconclusively resolved or weakly supported in previously published studies (Pape, 1992;Kutty et al., 2010;Marinho et al., 2012;Singh & Wells, 2013;Junqueira et al., 2016;Cerretti et al., 2017;Cerretti et al., 2019;Kutty et al., 2019;Stireman et al., 2019). In addition, we were successful in including traditional, pinned museum specimens in this study, which presents a significant resource for future phylogenetic studies in an age of museomics.

Age-and preservation-related factors of UCE sequence capture
Similar to most UCE phylogenetic studies using museum samples as a source of tissues (Blaimer et al., 2016b;Van Dam et al., 2017;Wood et al., 2018;Hedin et al., 2019;Kieran et al., 2019), we successfully enriched several UCE loci from museum fly specimens. DNA extracted from these samples was generally low quality (i.e. highly fragmented, low in concentration), which in pinned museum specimens may be related to long initial drying time and causes DNA degradation due to enzymatic decomposition (Lindahl, 1993;Gilbert et al., 2007;Van Dam et al., 2017;Derkarabetian et al., 2019). Specimen age is also usually an obstacle affecting DNA yielded after the extraction; however, we were able to recover enough DNA to enrich a relatively large number of loci from 50+ year old, pinned museum specimens. We observed a gradual decline in the UCE recovery rate with specimen age (Figure S2), as found in previous studies (Blaimer et al., 2016b;Van Dam et al., 2017). This suggests that obtaining enough DNA from older, pinned specimens is possible in studies using targeted enrichment in the future, but the maximum of 20 years should be considered (Blaimer et al., 2016b). Successful UCE enrichment of our taxon sample was mostly from dry pinned specimens and samples preserved in 96% ethanol, which highlights this approach as advantageous compared with other approaches requiring specimens preserved in RNA-later or in liquid nitrogen. Several museums have large, generally poorly studied wet collections, including unsorted Malaise trap samples, which might be an important source of fragmented DNA to be enriched for UCEs. Our results also show that well-preserved DNA-rich samples, such as DNA aliquots are also suitable for UCE enrichment.

Calyptratae/Oestroidea-specific UCE probes
General UCE probe sets designed to work across larger taxonomic groups have proven successful at resolving phylogeny in various groups (Faircloth et al., 2015;Starrett et al., 2016;Van Dam et al., 2017), but there is growing evidence for improved locus recovery through the use of probe sets tailored to focal taxa (Baca et al., 2017;Branstetter et al., 2017;Van Dam et al., 2019;Gustafson et al., 2019a;   Gustafson et al., 2019b;Kulkarni et al., 2020). An increase in locus recovery is especially important in the study of rapid radiations (Dell'Ampio et al., 2014;Giarla & Esselstyn, 2015), such as that of Oestroidea. The Diptera-wide UCE probe set by Faircloth (2017) was designed with a genome selection of fly families belonging to the early divergences within Diptera. This genome selection is most likely the result of the limited genomic resources available for Diptera at that time (Vicoso & Bachtrog, 2015;Dikow et al., 2017), as it includes mostly the species of medical importance of Culicidae and model organisms of Drosophilidae. In fact, 37 out of 45 genomic resources used in the design by Faircloth (2017) belong to these two families. This Diptera-wide UCE probe set should be very useful to resolve phylogenetic relationships within Culicidae, Drosophilidae and among other closely related dipteran lineages. However, a ∼200 Ma (Wiegmann et al., 2011) evolutionary distance between Culicidae and Calyptratae (i.e. the clade were Oestroidea is nested) is possibly too large to capture enough UCEs to efficiently reconstruct the Oestroidea radiation. Studies on other arthropod lineages have shown that more generalized UCE probe sets capture decreasing numbers of loci with increasing phylogenetic distance from the focal taxon (Faircloth et al., 2015;Gustafson et al., 2019a). Also, this Diptera-wide UCE probe set used a member of Culicidae (Aedes aegypti (L.)) as the base genome. The base genome is a decisive taxon for the UCE probe design, which should be nested or related to the focal group, and it is against, which data from other sampled exemplar taxa within the focal group will be aligned during the probe design process (Faircloth, 2017). Our probe design used S. crassipalpis as the base genome, based on the high contiguity and level of annotation as measures of assembly quality of this genome, and also because this sarcophagid species is nested within the focal group. We observed that taxa for which we recovered the most UCE loci were frequently those with the closest relationship to the base genome and other genomes used for probe design, that is species of the families Sarcophagidae and Calliphoridae. It is notable that at least one taxon of each family/subfamily in our taxon sampling produced 800+ UCE loci, which indicates that these probes can efficiently recover UCE loci from all lineages within Oestroidea and that recovery rates would mostly depend on the DNA quality of each sample. Furthermore, our probe design also used the genetic resources of Muscidae and Glossinidae and could be used for phylogenetic studies of the entire Calyptratae clade. In addition, as we filtered our UCE probes with a transcriptome to capture only protein-encoding loci, our UCE dataset is suitable for larger and more integrative phylogenomic studies using both genomic and transcriptomic resources. It should be noted that most lineages within Calyptratae and the superfamily Oestroidea do not have available genomic resources (Vicoso & Bachtrog, 2015;Dikow et al., 2017) to be used in phylogenomic studies in general, or in the design of UCE probe sets in particular. This highlights the need for a larger and more representative set of genomic resources for insect phylogenomics.

Oestroidea phylogeny
Our concatenated ML and coalescent-based analyses recovered a fully resolved and mostly well-supported phylogeny for Oestroidea flies. These analyses estimated generally identical topologies on the family, subfamily and species levels . Uncertainty is confined only to a few nodes in the phylogeny with BS < 95 and LPP < 0.70. We found a few important differences, such as the position of the clade C or A + B as sister to D. These alternative resolutions of the tree are associated with relatively low branch support. One of our concatenated ML analyses has C + D with BS = 61 while coalescent-based analysis and another of our concatenated ML analyses have D+(A + B) with LPP = 94 and BS = 68. These differences in support and resolution are discussed below.

Oestroidea monophyly and radiation
The monophyly of Oestroidea has been corroborated by morphological (Griffiths, 1972;Pape, 1992;Lambkin et al., 2013) and molecular characters (Kutty et al., 2010;Wiegmann et al., 2011;Marinho et al., 2012;Caravas & Friedrich, 2013;Lambkin et al., 2013;Cerretti et al., 2017;Kutty et al., 2019). Here we add support from UCE molecular data and test the monophyly of the oestroid families and subfamilies. The problem of deciphering large radiations is resolved here by a genome-scale alignment of UCE data, which yields a phylogenetic tree with high statistical support. Does this mean that with our genome-wide alignments, we now have reconstructed the true tree for Oestroidea? Answering this question requires a cross-validation experiment to assess whether the statistically significant inferences of two or more types of phylogenomic data are consistent. Indeed, the vast majority of our UCE-based phylogenetic relationships are largely consistent with transcriptome-based analyses (Kutty et al., 2019). For example, the problematic nodes involving Rhinophoridae and Calliphoridae lineages Luciliinae, Toxotarsinae, Melanomyinae, Calliphorinae (i.e. clade A + B) and Chrysomyinae and Phumosiinae (i.e. clade C) were also problematic using transcriptomes (Kutty et al., 2019: Fig. 1). Similarly, well-supported sister-group relationships of Bengaliinae and Rhiniidae, Chrysomyinae and Phumosiinae, Polleniidae and Tachinidae and Oestridae (nonmonophyletic) and Sarcophagidae are also supported by transcriptome-based analyses (Kutty et al., 2019).
The Calyptratae radiation has been suggested as multiple episodes of rapid diversification in the early Tertiary (Kutty et al., 2010) possibly associated with the opening of new niches following the Cretaceous-Palaeogene mass extinction event (Cerretti et al., 2017). Our results are concordant with studies showing lineages within Sarcophagidae (Piwczyński et al., 2014;Piwczyński et al., 2017;Buenaventura & Pape, 2017a;Buenaventura et al., 2019) and Tachinidae (Cerretti et al., 2014b;Stireman et al., 2019) as the dominant fast-evolving groups of Oestroidea. Furthermore, our results support a super-radiation within the genus Sarcophaga Meigen (Sarcophagidae), as recent studies suggest (Piwczyński et al., 2014;Buenaventura et al., 2017;Buenaventura & Pape, 2017a;Buenaventura et al., 2019). This super-radiation can be recognized by the presence of many short branches at the level of the subgeneric diversification, which is accompanied by lower BS and LPP in comparison with the rest of the tree nodes (Figs. 4, 5). Even larger radiations may have involved specific lineages within Tachinidae, which need further study using a larger taxon sampling within this family. The short internal branches within Sarcophagidae represent short intervals between speciation events, where it is likely that gene trees and species trees can conflict with the underlying species branching history (Degnan & Rosenberg, 2006, hampering phylogenetic inference. Similar gene tree discordance using UCEs in the study of rapid diversifications has been reported (Guillory et al., 2020). However, combining many independently segregated loci sampled from multiple species per lineage and applying a coalescent-based approach should improve phylogenetic inference by reducing sampling error and allowing one to distinguish evolutionary signal from a methodological artefact (McCormack et al., 2009;Kumar et al., 2012;Giarla & Esselstyn, 2015). Then, in these recent super-radiations within Oestroidea, poor support for a series of splits within Sarcophagidae, and possibly Tachinidae, should be attributed to its explosive evolutionary history limiting phylogenetic signal, rather than by methodological artefacts.

Monophyly and relationships of Oestroidea families
Mesembrinellidae. Our results are consistent with morphological and molecular characters supporting Mesembrinellidae as a separate, monophyletic family and not as a subfamily of Calliphoridae (Guimarāes, 1977;Kutty et al., 2010;Marinho et al., 2012;Singh & Wells, 2013;Cerretti et al., 2017;Marinho et al., 2017;Kutty et al., 2019). Other molecular studies found Mesembrinellidae as nested within (Kutty et al., 2010) or sister to Tachinidae (Marinho et al., 2012;Junqueira et al., 2016), or sister to Ulurumyiidae (Kutty et al., 2019). We recovered the mesembrinellids as sister to the remaining Oestroidea, which is partially supported by transcriptome data (Kutty et al., 2019), four nuclear loci  and by morphological characters (Michelsen & Pape, 2017). The monophyly of Mesembrinellidae is supported by characters, such as larvae hatch within the uterus and nourished by secretions of the female for an extended period (Guimarāes, 1977;Ferrar, 1978;Rognes, 1986;Pape, 1992;Meier et al., 1999). The phylogenetic relationships within Mesembrinellidae are still controversial. Marinho et al. (2017) treated Eumesembrinella Townsend, Giovanella Bonatto, Huascaromusca Townsend, Laneella Mello and Mesembrinella Giglio-Tos as separate genera and found support for the monophyly of Eumesembrinella and Laneella only. These authors also suggested Eumesembrinella as a junior synonym of Mesembrinella and Giovanella as a junior synonym of Huascaromusca. These suggestions were incorporated by Cerretti et al. (2017) who treated all species as one single genus Mesembrinella. Whitworth & Yusseff-Vanegas (2019) revised the family and arranged species into genera Laneella, Mesembrinella and Souzalopesiella Guimarães, which was in agreement with Marinho et al. (2017). Our UCE data, although limited in taxon sampling, supports the monophyly of Mesembrinella, which is consistent with other molecular data (Cerretti et al., 2017;Marinho et al., 2017) and with the classification proposed by Whitworth & Yusseff-Vanegas (2019). Mesembrinellidae is limited to Neotropical rainforests  and its possible close phylogenetic relationship to the dung breeders of the Australian Ulurumyiidae (Kutty et al., 2019); Michelsen & Pape (2017) suggest an interesting potential biogeographic connection through Antarctica.
Oestridae. The monophyly of the bot flies is strongly supported by morphological characters (Wood, 1986;Pape, 1992;Pape, 2001;Pape, 2006), but molecular characters produce different phylogenetic hypotheses for this lineage of mammal parasites. A well-supported, monophyletic Oestridae is supported by studies using COI (Otranto et al., 2003), mitochondrial genomes (Junqueira et al., 2016) and multi-locus datasets (Cerretti et al., 2017). However, also using mitochondrial genomes, Zhang et al. (2016a) found that the monophyly of Oestridae depended on which outgroups are included. Our UCE phylogeny supports Oestridae rendered paraphyletic by Sarcophagidae, a close relationship that is consistent with other phylogenetic studies combining molecular and morphological data (Cerretti et al., 2017) or using several molecular markers (Wiegmann et al., 2011;Stireman et al., 2019). However, other molecular datasets do not support the sister grouping of Oestridae with Sarcophagidae; mitochondrial genomes (Junqueira et al., 2016) suggest Oestridae is sister to Tachinidae + Mesembrinellidae, whereas transcriptomes (Kutty et al., 2019) point to Mystacinobiidae as the sister lineage to Oestridae.
Within Oestridae, our ML concatenated analysis reconstructs the three subfamilies as paraphyletic with respect to Sarcophagidae, with Hypodermatinae sister to the clade (Cuterebrinae + (Oestrinae + Sarcophagidae)), whereas our coalescent-based analysis has Cuterebrinae sister to the clade (Hypodermatinae + (Oestrinae + Sarcophagidae)). These analyses additionally support the monophyly of the subfamilies Cuterebrinae and Oestrinae. Oestrinae emerges as the sister taxon to Sarcophagidae in all analyses. This sister-group relationship is also supported by the presence of a bilobed uterine pouch for embryonating eggs (Pape, 1992), although there seem to be morphological differences in shape and tracheation between the uterine pouches of Sarcophagidae and Oestridae, and further study of these characters is needed. The configuration of openings of posterior spiracles in second and third instar larvae also deserves attention in future studies. A configuration of these openings as a porous plate is shared by Hypodermatinae and Oestrinae (Zumpt, 1965;Ferrar, 1987;Pape, 1992), whereas these openings are composed of three slits in Dermatobia Brauer, Ruttenia Rodhain and Neocuterebra Grünberg (Ferrar, 1987); these were coded as such in Cuterebrinae and Gasterophilinae in a morphology-based phylogenetic analysis of Oestroidea (Pape, 1992). Thus, morphological evidence would conflict with our ML concatenated UCE phylogeny, but supports our coalescent-based analysis. Finally, more DNA data is needed for subfamilies Gasterophilinae and Hypodermatinae to further test the monophyly and relationships of these lineages as well as the monophyly of Oestridae.
Within the flesh flies, our results largely agree with phylogenetic relationships for Sarcophaginae based on morphological characters (Giroux et al., 2010;Buenaventura & Pape, 2017b). The clade (Oxysarcodexia Townsend + Ravinia Robineau-Desvoidy) is an example of a sister-group relationship that is well corroborated by our UCE data and morphological characters. The 'Blaesoxipha clade' of Buenaventura & Pape (2017b) is represented here by Blaesoxipha Loew, Comasarcophaga Hall, Fletcherimyia Townsend and Spirobolomyia Townsend and are recovered as a well-supported clade. The internal relationships support those previously recovered by Giroux et al. (2010) with Blaesoxipha as sister to Spirobolomyia and Comasarcophaga sister to Fletcherimyia. Our results also support the 'Sarcophaga clade' of Buenaventura & Pape (2017b) with Chrysagria Townsend + (Helicobia Coquillett + (Lipoptilocnema Townsend + Sarcophaga)) (although Peckia Robineau-Desvoidy is not included here). Within the largest radiation of Sarcophaginae, phylogenetic relationships in the hyperdiverse genus Sarcophaga closely match those recovered in previous molecular phylogenies, with the Nearctic subgenus Neobellieria Blanchard as one of the earliest divergences (Buenaventura et al., 2017;Buenaventura & Pape, 2017a) and the subgenus Stackelbergeola Rohdendorf as sister to subgenus Rohdendorfisca Grunin (Zhang et al., 2016b).
Within Tachinidae, our analyses recovered (Phasiinae + Dexiinae) as sister to (Exoristinae + Tachininae), which is in agreement with previous phylogenetic studies (Cerretti et al., 2014b;Winkler et al., 2015;Stireman et al., 2019). Within the subfamily Tachininae, the sister-group relationship between the tribe Tachinini (represented by genus Epalpus Rondani) and the remaining Tachininae is congruent with a recent phylogeny for tachinids . However, genus Panzeria Robineau-Desvoidy (Ernestiini s.l.) was nested inside tribe Polideini, represented by the genera Spilochaetosoma Smith and Nigrilypha O'Hara. In addition, we recovered the enigmatic species Cholomyia inaequipes Bigot, as the sister taxon to the clade (Phasiinae + Dexiinae) (topology #1) or as sister to the rest of Tachinidae (topologies #2 and #3). Cholomyia Bigot was previously placed in the subfamily Dexiinae (Townsend, 1936;Guimarães, 1971) but later transferred to Myiophasiini (Tachininae) (O' Hara & Wood, 2004). Recent molecular analyses based on four nuclear loci recovered a clade of the tachinine tribes Macquartiini + Myiophasiini (including C. inaequipes) as a sister to the rest of the family , as reconstructed in some of our topologies (topologies #2 and #3). Thus, as suggested by O'Hara et al. (2019), some rearrangements may be necessary within the family with regard to the clade Macquartiini + Myiophasiini, which might constitute a separate subfamily (O'Hara & Henderson, 2020). Differences between phylogenetic reconstructions for C. inaequipes within Tachinidae found here and in a previous analysis  may be attributed to limited taxon sampling here.
The morphological diversity of Tachinidae has historically led to difficulties to define monophyletic groups and produce a stable classification scheme (e.g. Polideini, O'Hara, 2002). Only Dexiinae (excluding Eugherini) is supported by a hinged membranous dorsal connection between the basi-and distiphallus (Cerretti et al., 2014b) and Phasiinae by the elongated medial plate of the hypandrium (Blaschke et al., 2018). Other subfamilies are simply defined by the presence of combinations of characters, with several exceptions . At a higher taxonomic level, the eggs being laid embryonated is a potential character state uniting some Tachinidae and Rhinophoridae, but the significance should be treated with caution. Also, some phylogenetic analyses of Tachinidae suggest multiple, independent shifts in reproductive strategy, including reversals (i.e. ovipary to ovolarvipary and/or vice versa), with different conclusions depending on the inference method used (e.g. Stireman et al., 2019).
Rhiniidae. This lineage includes almost 400 described species belonging to 30 genera and two subfamilies (Pape et al., 2011;Rognes, 2013). Rhiniidae consistently emerges as a monophyletic group in our analyses, which is in agreement with published molecular (Kutty et al., 2010;Marinho et al., 2012;Singh & Wells, 2013;Cerretti et al., 2017;Marinho et al., 2017;Kutty et al., 2019) and morphological evidence (Rognes, 1997). Until recently, rhiinids were considered a subfamily of Calliphoridae, but removed from it and raised to a family level based on the results of the above studies. Additional evidence that might support the monophyly of Rhiniidae is the shape of the paraphallus (i.e. heavily sclerotized, dorsal part of the phallic tube). A paraphallus dorsally complete with a latero-distal margin expanded ventrally is found in Rhiniidae, whereas the paraphallus has a dorso-medial insertion of variable sizes with a latero-distal margin generally not expanded in other calliphorid linages (Thomas-Cabianca, unpublished data).
Our analyses support Cosmininae rendered paraphyletic by a monophyletic Rhiniinae. Additional evidence, which might support the monophyly of Rhiniinae is (i) desclerotization between the basi-and disti-phallus (these phallic sections are fused in Cosmininae), (ii) the absence of an epiphallus (present in Cosmininae) and (iii) a distal, prominent, globular paraphallus (less prominent in Cosmininae) (Thomas-Cabianca, unpublished data). Cosmininae exhibit extremely diverse morphological traits, which are still poorly studied. In addition, ecological associations and life cycles of Cosmininae species are mostly unknown (Cuthbertson, 1933;Cuthbertson, 1934;Peris, 1952;Zumpt, 1958;Kurahashi & Kirk-Spriggs, 2006). Cosmina Robineau-Desvoidy, Isomyia Walker, Rhyncomya Robineau-Desvoidy, Stomorhina Rondani and Sumatria Malloch emerged as monophyletic taxa, whereas the monophyly of Eurhyncomyia Malloch, Fainia Zumpt and Rhinia Robineau-Desvoidy could not be tested due to limited taxon sampling. We also confirmed that the enigmatic genus Villeneuviella Austen belongs to Rhiniidae. This is a large, yellow, night-flying species with rudimentary mouthparts that live in desert areas (Zumpt, 1957;Rognes, 2002). The larvae of Villeneuviella have been observed attacking termite workers (Grunin, 1957), which supports the placement of this genus in Rhiniidae based on the aforementioned hypothesis of an ant or termite association among rhiniids. Villeneuviella was recovered as sister to (Eurhyncomyia + Rhyncomya) in our ML concatenated analysis, but it was nested within Cosmina in our coalescent-based analysis. Both of these placements for Villeneuviella are weakly supported and will need to be further studied in future research using a larger molecular dataset and broader taxon sampling for the entire lineage.
No formal changes are proposed here until a more stable phylogenetic estimate can underpin any significant classification changes with regard to the calliphorids. Still, based on our phylogenetic analyses, two alternatives could be considered for achieving a more stable classification for Calliphoridae and related lineages in the future. These include (i) the monophyletic clades Bengaliinae, Chrysominae, Phumosiinae and clade B are raised to family-level status with clade B (composed of Calliphorinae, Luciliinae, Melanomyinae and Toxotarsinae) circumscribed as Calliphoridae, or, (ii) the families Rhiniidae and Rhinophoridae are synonymized under Calliphoridae to constitute subfamilies together with Bengaliinae, Calliphorinae, Chrysominae, Luciliinae, Melanomyinae, Phumosiinae and Toxotarsinae. Under any of these alternatives, Polleniidae would still be considered a separated family . However, a stable and robust classification of all Caliphoridae lineages can only be achieved with the inclusion of other lineages not included here that is Ameniinae, Aphyssurinae, Auchmeromyinae and Helicoboscinae.
Polleniidae. The cluster flies or Polleniidae were represented in our study by the species Pollenia rudis. Our UCE data strongly supports Polleniidae as a sister to Tachinidae. The sister-group relationship between Polleniidae and Tachinidae is consistent with phylogenetic estimations using molecular datasets (Cerretti et al., 2017;Cerretti et al., 2019;Kutty et al., 2019;Stireman et al., 2019). The phylogenetic and taxonomic limits of the lineage of Polleniidae have recently been studied and clearly defined with regard to Tachinidae and to controversial taxa, such as Alvamaja Rognes and Morinia Robineau-Desvoidy (including the carinata-group), which were historically assigned to Rhinophoridae but recently reassigned to Polleniidae . Pollenids are known as predators or parasitoids of earthworms (Rognes, 1991), but not all taxa seem to share these life strategies .

Conclusions
Our phylogenomic approach combining taxon-specific, protein-encoding, UCE probes with a large taxon sampling obtained a well-supported phylogeny for Oestroidea. Our analyses also provided a strong phylogenetic hypothesis at the subfamily level, within the major clades of Oestroidea, including the traditionally challenging Calliphoridae. Future studies using UCEs should include and phylogenetically place enigmatic taxa, such as families Ulurumyiidae, Mystacinobiidae and subfamilies Ameniinae, Aphyssurinae, Auchmeromyinae and Helicoboscinae of Calliphoridae. We would also expect our taxon-specific UCE probes to be suitable for studying higher-level phylogenetic relationships within the whole Calyptratae clade. Our UCE probes designed to capture only protein-encoding UCEs make our dataset suitable for larger and more integrative phylogenomic studies using both genomic and transcriptomic resources. Thus, our taxon-specific probe design provides a valuable toolset to address systematic questions, further our understanding of the timing of diversifications and help resolve long-standing controversial relationships within these fly radiations. Combining UCEs with other data sources (i.e. genome architecture, morphology, ecology and chemo-physiology) for an even broader taxon sampling could accelerate and advance the understanding of oestroid diverse morphologies, ecological roles in ecosystems and biogeography.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Table S1. Specimen data. Specimen identity, preservation method, targeted tissue for extraction, collection data, specimen identifiers and repositories at Natural History Museums, DNA input data, contig data, UCE capture data. Figure S1. Distribution of preservation methods in the specimens studied. Figure S2. Specimen age versus DNA extracted and UCEs captured coded for preservation method. Specimen age versus DNA input obtained from (A) specimens preserved in 96% Ethanol, (C) old DNA aliquots, (E) specimens preserved in liquid Nitrogen and (G) dry pinned specimens. Specimen age versus UCE captured from: (B) specimens preserved in 96% Ethanol, (D) old DNA aliquots, (F) specimens preserved in liquid Nitrogen and (H) dry pinned specimens.