The genome sequence of the thick-headed fly, Myopa tessellatipennis (Motschulsky, 1859)

We present a genome assembly from an individual female Myopa tessellatipennis (Arthropoda; Insecta; Diptera; Conopidae). The genome sequence is 249.3 megabases in span. Most of the assembly is scaffolded into four chromosomal pseudomolecules, including the assembled X sex chromosome. The mitochondrial genome has also been assembled and is 18.3 kilobases in length.


Background
Myopa tessellatipennis Motschulsky, 1859 (Diptera: Conopidae, Myopinae) is a member of the family commonly referred to as 'thick-headed flies', or more recently as 'bee-grabbers'. As far as is known, all members of this family develop as endoparasitoids of other insects, although life history data for many species is lacking. Hosts primarily comprise species of aculeate Hymenoptera, except in the large and mainly tropical subfamily Stylogasterinae (Stuke, 2017). Just over 800 valid species are currently recognised, occurring in all parts of the world except for the polar regions.
The genus Myopa currently comprises about 44 species, just over half of which are Palaearctic in distribution (Stuke, 2017). Myopa tessellatipennis lies within the 'polystigma-group', which comprises about 10 species in the Palaearctic. Nine Myopa species are known to occur in Britain (Smith, 1969), with tessellatipennis being confused under polystigma Rondani, 1857 until clarification (Smith, 1970). The British species can be identified using the keys of Clements (1995) or Smit (2013). Myopa tessellatipennis is mainly confined to the southern half of Britain, occurring most commonly in the south and east of England (UK Conopid Recording Scheme database). Elsewhere it occurs throughout the Western Palaearctic.
Host data is lacking for most Myopa species, but where known the hosts comprise bees of the genus Andrena (Hymenoptera: Andrenidae). Smit et al. (2018) reported a record of Myopa pellucida Robineau-Desvoidy, 1830 identified by DNA barcoding from a larva found within Andrena nitida Müller, 1776, but most of the host associations in the literature are tentative (Stuke, 2017). Myopa tessellatipennis occurs in a wide variety of mainly lowland habitats and is thought to parasitise Andrena barbilabris Kirby, 1802 and A. flavipes Panzer, 1799, although other host species may also be involved (Edwards, 1985; Nieuwenhuijsen, 2009; Stuke, 2017).
Typically, a female Myopa lies in wait for a host bee at nectaring sites, grabbing it in flight. Wrapping her body around the bee, she uses a specialised ovipositor to prise open a gap between the bee's sclerites and inject an egg into the abdominal cavity. The fly larva develops within the host and pupation occurs upon its death. Infestation by conopid larvae has been shown to significantly alter the foraging and other behaviour of the host (see Clements, 1997), in some cases by inducing fossorial behaviour in which the host digs into the ground to provide a protected overwintering site for the conopid pupa (Malfi et al., 2014; Müller, 1994). Most Myopa species fly in the spring or early summer, coinciding with their likely hosts, although a few also fly later in the season. The barcoding of conopids is desirable in tackling their complex taxonomy, and also in allowing the identification of larvae when encountered within their aculeate hosts, some of which are commercially important, for example as pollinators or in honey production.
A mating pair of Myopa tessellatipennis (Figure 1) was observed on 26 April 2021 in a rural garden in Somerset, south-west England, and the pair were sent live to the Natural History Museum, London. The high-quality genome sequence for a female M. tessellatipennis reported here has been generated as part of the Darwin Tree of Life project. It will aid in understanding the biology, physiology and ecology of the species.

Genome sequence report
The genome was sequenced from one female Myopa tessellatipennis (Figure 1) collected from Yeovil, Somerset, England, UK (latitude 50.97, longitude -2.68). A total of 77-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected six missing joins or mis-joins, reducing the scaffold number by 54.55%.
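The reported coverage and curation figures can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that sequencing yield is approximately coverage multiplied by assembly span, and it infers the pre-curation scaffold count from the five scaffolds in the final assembly and the stated 54.55% reduction. Neither derived value is reported in the text, and the function names are hypothetical.

```python
# Back-of-envelope checks on the reported figures (illustrative only).

ASSEMBLY_SPAN_BP = 249_275_748   # assembly span reported for idMyoTess1.1
HIFI_COVERAGE = 77               # reported fold coverage in HiFi reads
FINAL_SCAFFOLDS = 5              # scaffolds in the final assembly
REDUCTION_PCT = 54.55            # reported reduction in scaffold number


def approx_yield_gb(span_bp: int, coverage: float) -> float:
    """Approximate yield in gigabases, assuming yield = coverage x span."""
    return span_bp * coverage / 1e9


def scaffolds_before_curation(final_count: int, reduction_pct: float) -> int:
    """Infer the pre-curation scaffold count from the final count and % reduction."""
    return round(final_count / (1 - reduction_pct / 100))


hifi_gb = approx_yield_gb(ASSEMBLY_SPAN_BP, HIFI_COVERAGE)
before = scaffolds_before_curation(FINAL_SCAFFOLDS, REDUCTION_PCT)
print(f"Approximate HiFi yield: {hifi_gb:.1f} Gb")
print(f"Inferred scaffolds before curation: {before}")
```

Under these assumptions the reduction corresponds to going from 11 scaffolds to 5, since (11 - 5) / 11 is approximately 54.55%.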
The final assembly has a total length of 249.3 Mb in five sequence scaffolds with a scaffold N50 of 65.8 Mb (Table 1). Most (99.99%) of the assembly sequence was assigned to four chromosomal-level scaffolds, representing three autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5;

Sample acquisition and nucleic acid extraction
A female Myopa tessellatipennis (idMyoTess1) was collected from Yeovil, Somerset, England, UK (latitude 50.97, longitude -2.68) on 26 April 2021. The specimen was taken from a rural garden by Mike Ashworth (independent researcher) using an aerial net. The specimen was identified by Mike Ashworth and preserved at -80°C. This specimen was used for PacBio and Hi-C analysis.
A male specimen (idMyoTess2) was collected from Wytham Farm, Oxfordshire (latitude 51.79, longitude -1.32) on 19 April 2021 by netting. The specimen was collected and identified by Steven Falk (independent researcher), and was preserved on dry ice. This specimen was used for RNA sequencing.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The idMyoTess1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Abdomen tissue was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser.
RNA was extracted from idMyoTess2 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNase-free water and its concentration assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit RNA Broad-Range (BR) Assay kit. RNA integrity was assessed using the Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. Poly(A) RNA-Seq libraries were constructed using the NEB Ultra II RNA Library Prep kit. DNA and RNA sequencing was performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (RNA-Seq) instruments. Hi-C data were also generated from head tissue of idMyoTess1 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Jan Ševčík
University of Ostrava, Ostrava, Czech Republic
The first genomic data are presented for the relatively rare fly Myopa tessellatipennis. Both the chromosome and mitogenome assemblies appear to be of high quality. I have only the following minor comments.
The full taxonomic name of the species is Myopa tessellatipennis Motschulsky, 1859, as stated in the Background section, not Myopa tessellatipennis (Motschulsky, 1859) as used in the title and elsewhere in the text. Please correct.
It is not clear from the text whether the entire body of the specimen was used for DNA or RNA isolation. This should be specified in detail. I strongly recommend keeping at least one wing and the male or female terminalia as vouchers for further study, and not destroying the entire abdomen (or body).
It is also useful to take (before DNA/RNA isolation) detailed photographs of all important morphological characters used for the taxonomic identification of the specimen(s) and to include them in the paper. A habitus photo of mating specimens may be useful to attract the attention of potential readers, but it does not substantiate the exact identification of the species.
Why was a male from a different locality used for RNA isolation, and not the male from the mating pair in the photograph? Please provide more details about that.
Why was RNA isolated? Are transcriptomic data available? This should be commented on in the text.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Diptera taxonomy; molecular phylogenetics; DNA barcoding
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 1. a) The mating pair of Myopa tessellatipennis, which were taken for genome analysis. The female of this pair (NHMUK014036856) was used for genome assembly. b) Photograph of the Myopa tessellatipennis (idMyoTess1) specimen used for genome sequencing during preservation and processing.

Figure 2. Genome assembly of Myopa tessellatipennis, idMyoTess1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 249,275,748 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (74,632,285 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (65,812,558 and 46,318,963 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the diptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/idMyoTess1.1/dataset/CALSGB01/snail.
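For readers unfamiliar with the N50 and N90 statistics shown in the snail plot, the following minimal sketch computes them from a list of scaffold lengths. The lengths in the example are hypothetical placeholders, not the actual idMyoTess1.1 scaffold lengths, and the function name `nx` is likewise illustrative rather than part of any tool used here.

```python
def nx(lengths, fraction):
    """Return the Nx length: the length L such that scaffolds of length >= L
    together cover at least `fraction` of the total assembly span."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    target = sum(lengths) * fraction
    running = 0
    # Walk from the longest scaffold downwards, accumulating span.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return min(lengths)  # safeguard; not reached when fraction <= 1


# Hypothetical scaffold lengths in bp (NOT the real assembly values).
example = [80, 70, 50, 40, 10]
print("N50:", nx(example, 0.5))
print("N90:", nx(example, 0.9))
```

With a small number of chromosome-scale scaffolds, as in this assembly, the N50 is simply the length of one of the larger chromosomes.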

Figure 5. Genome assembly of Myopa tessellatipennis, idMyoTess1.1: Hi-C contact map. Hi-C contact map of the idMyoTess1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=dDPd6svVRwScfXFIRd8dMg.

Table 2
). The assembly has a BUSCO v5.3.2 (Manni et al., 2021) completeness of 96.3% (single 95.6%, duplicated 0.7%) using the diptera_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.

Table 3 contains a list of all software tool versions used, where appropriate.
Darwin Tree of Life Partner is subject to the Darwin Tree of Life Project Sampling Code of Practice. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. All efforts are undertaken to minimise the suffering of animals used for sequencing. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.
No competing interests were disclosed.

https://doi.org/10.21956/wellcomeopenres.21184.r59759
© 2023 Ševčík J. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.