The genome sequence of the Lesser Hornet Hoverfly, Volucella inanis (Linnaeus, 1758)

We present a genome assembly from an individual female Volucella inanis (the Lesser Hornet Hoverfly; Arthropoda; Insecta; Diptera; Syrphidae). The genome sequence is 961 megabases in span. Most of the assembly is scaffolded into six chromosomal pseudomolecules, including the assembled X sex chromosome. The mitochondrial genome has also been assembled and is 16.0 kilobases in length. Gene annotation of this assembly on Ensembl has identified 11,616 protein-coding genes.


Background
The Wasp Plumehorn or Lesser Hornet Hoverfly, Volucella inanis (Figure 1a), is a large hoverfly whose distinct yellow and black markings give it a wasp-like appearance. Although it is similar in appearance to the Hornet Hoverfly (Volucella zonaria), the smaller V. inanis can be identified by the markings at the base of the abdomen, which are yellow rather than chestnut-coloured, and by the more extensive yellow colouration of the sternites (Ball & Morris, 2015). V. inanis is noted to be a Batesian mimic of the European hornet (Vespa crabro) and the Common wasp (Vespula vulgaris), whose aposematic colours reduce the risk of predation (Hlaváček et al., 2022; Parmentier, 2020). This hoverfly is distributed across the eastern Palearctic, including most of central and southern Europe, Syria and North Africa, although its range is expanding northwards (Doyle et al., 2020; Rupp, 1989). Formerly restricted to the southeast in the UK, it has undergone a rapid range expansion over recent decades and can now be found northwards to Yorkshire (Ball & Morris, 2015). Adults are abundant from mid- to late summer at woodland edges or in green city spaces, feeding from the flowers of Buddleja, brambles, thistles, umbellifers, ivy, Snowberry and Devil's-bit scabious (Ball & Morris, 2015; Stubbs & Falk, 2002).
Females are often associated with the nests of the German wasp (Vespula germanica), Common wasp (Vespula vulgaris) and European hornet (Vespa crabro), attempting to lay eggs undetected near the nest entrances (Bothe, 1984; Speight, 2011). Upon hatching ~5 days later, the V. inanis larvae invade the host's nest (Bothe, 1984). These ectoparasitic larvae are very flattened, which distinguishes them from those of other Volucella species and allows them to fit inside the larval cells of their host, alongside the social wasps' larvae (Parmentier, 2020). The V. inanis larva first scavenges on the excretions of the developing wasp larva. However, once the wasp larva pupates, V. inanis consumes the whole organism (van Veen, 2004), later leaving the nest and pupating underground.
The sequencing of this high-quality genome, accomplished for the first time through the Darwin Tree of Life project, will support a deeper understanding of the biology and ecology of this species. This includes studies investigating the evolution of mimicry and ectoparasitism, the biogeographical impacts of climate change, and the conservation of important pollinator species.

Genome sequence report
The genome was sequenced from one female V. inanis specimen (Figure 1) collected from Wytham Great Wood (51.773, …). A total of 26-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 40-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 212 missing joins or mis-joins and removed 4 haplotypic duplications, reducing the scaffold number by 71.43% and increasing the scaffold N50 by 63.95%. The final assembly has a total length of 961.4 Mb in 52 sequence scaffolds with a scaffold N50 of 163.5 Mb (Table 1). Most (99.92%) of the assembly sequence was assigned to six chromosomal-level scaffolds, representing five autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). The assembly has a BUSCO v5.2.2 (Manni et al., 2021) completeness of 97% (single 96.5%, duplicated 0.5%) using the diptera_odb10 reference set (n = 3,285). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
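For readers less familiar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how such a value is conventionally computed from a list of scaffold lengths. The function name `nx` and the toy lengths are hypothetical and chosen only for illustration; they are not the scaffold lengths of this assembly, and this is not the code used by the authors.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a list of scaffold lengths.

    Nx is the length L such that scaffolds of length >= L together
    cover at least x% of the total assembly span (N50 when x = 50).
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold to the shortest, accumulating span.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= total * x:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), NOT taken from the assembly.
toy_lengths = [300, 200, 160, 150, 140, 10]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)
print(toy_n50, toy_n90)
```

Because the statistic depends on the longest scaffolds, joining contigs during curation can raise the N50 even though the total assembly span stays essentially unchanged.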

Isolate information: male
Raw data accessions: PolyA RNA-Seq, Illumina, ERR6464927

RNA was extracted from thorax tissue of idVolInan3 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNAse-free water and its concentration assessed using a … libraries were constructed using the NEB Ultra II RNA Library Prep kit. DNA and RNA sequencing were performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi), Illumina HiSeq 4000 (RNA-Seq) and HiSeq X Ten (10X) instruments. Hi-C data were also generated from head and thorax tissue of idVolInan2 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). One round of polishing was performed by aligning 10X Genomics read data to the assembly with Long Ranger ALIGN, calling variants with freebayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using SALSA2 (Ghurye et al., 2019). The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016) as described previously (Howe et al., 2021). Manual curation was performed using gEVAL, HiGlass (Kerpedjiev et al., 2018) and Pretext (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2022), which performed annotation using MitoFinder (Allio et al., 2020). The genome was analysed and BUSCO scores generated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of all software tool versions used, where appropriate.

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the V. inanis assembly (GCA_907269105.1). Annotation was created primarily through alignment of transcriptomic data to the genome, with gap filling via protein-to-genome alignments of a select set of proteins from UniProt (UniProt Consortium, 2019).

Ethics/compliance issues
I am wondering if the authors may have additional metadata they could provide on the specimens. What type of habitat were they collected in? Also, which flowers were they visiting? Are the second and third specimens female or male?
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: I am an entomologist and ecologist with a specialty in syrphids. I have never published a genome previously, but I am familiar with some of the methodologies and protocols. My knowledge regarding these types of papers, however, is limited.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

In this manuscript, Crowley and colleagues describe the genome of the Lesser Hornet Hoverfly (Volucella inanis). The paper is well-written, with appropriate methods containing all the information necessary to interpret the data. Of note, the genome of this hoverfly is a valuable resource that can give, for instance, insights into the ectoparasitic behavior of the V. inanis larvae in parasitizing larvae of other species.
Therefore, I recommend the publication of the manuscript in its current form.
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Bioinformatics; Host-Pathogen Interactions; Genomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Volucella inanis, idVolInan1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 961,462,834 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (292,306,469 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (163,465,417 and 141,167,267 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the diptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/idVolInan1.1/dataset/CAJSMH01/snail.
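As a rough illustration of the binning described in this caption, the short sketch below works out how many base pairs one of the 1,000 bins represents and what share of the plot the longest scaffold would occupy. The three input numbers are taken from the caption; the variable names are hypothetical, and the reading that the longest scaffold fills the first size-ordered bins is an assumption about how the plot is laid out rather than a statement from the source.

```python
# Values quoted in the Figure 2 caption.
ASSEMBLY_SPAN_BP = 961_462_834
LONGEST_SCAFFOLD_BP = 292_306_469
N_BINS = 1_000

# Each bin represents 0.1% of the assembly span.
bin_size_bp = ASSEMBLY_SPAN_BP / N_BINS

# Share of the total span held by the longest scaffold, and the
# number of bins it would fill if bins are ordered by scaffold size
# (assumed layout, see lead-in).
longest_share = LONGEST_SCAFFOLD_BP / ASSEMBLY_SPAN_BP
bins_for_longest = LONGEST_SCAFFOLD_BP / bin_size_bp

print(f"bin size: {bin_size_bp:,.0f} bp")
print(f"longest scaffold: {longest_share:.1%} of span, "
      f"about {bins_for_longest:.0f} of {N_BINS} bins")
```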

Figure 5. Genome assembly of Volucella inanis, idVolInan1.1: Hi-C contact map. Hi-C contact map of the idVolInan1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/app/?config=O-m3ajxFT8ugcnrEilKjLQ.

Table 3. Software tools and versions used.
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the Darwin Tree of Life Project Sampling Code of Practice. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Bioinformatics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Eric Aguiar
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Universidade Estadual de Santa Cruz, Ilhéus, State of Bahia, Brazil
https://doi.org/10.21956/wellcomeopenres.20954.r68815
© 2023 Diambra L.

Luis Diambra
National University of La Plata, Argentina
This manuscript from Crowley et al. reports the genome draft of the Diptera V. inanis. Data generation and analysis, used for several other genomes recently sequenced by the Tree of Life Project, are clearly explained and contain enough information to interpret this important genomic resource. Overall this is a quality paper and merits approval for indexing.