The genome sequence of the European hornet, Vespa crabro Linnaeus, 1758

We present a genome assembly from an individual female Vespa crabro (the European hornet; Arthropoda; Insecta; Hymenoptera; Vespidae). The genome sequence is 230 megabases in span. The majority of the assembly (94.93%) is scaffolded into 25 chromosomal pseudomolecules.


Background
The European hornet, Vespa crabro, is the largest social wasp in Europe, with workers measuring up to 23 mm and queens up to 30 mm (Archer & Turner, 2014). It is common and widespread throughout much of the Palaearctic, including throughout England and Wales, and has been introduced into North America (Archer, 1992). Vespa crabro occurs in a wide range of habitats but is particularly associated with mature deciduous woodlands and urban environments. The head and majority of the metasoma are predominantly yellow and the mesosoma, anterior segments of the metasoma and legs are reddish-brown with varying extents of black markings. This species is eusocial, living in colonies with a queen, workers and males. Whilst workers are capable of producing male offspring, this is rare due to the action of worker policing (Foster et al., 2002). Colonies are founded by overwintered queens from early April, with the first workers appearing in late June to early July (Bunn, 1988b). Nests are constructed out of a paper-like substance produced from the pulp of decaying wood mixed with saliva, often incorporating bark and twig fragments (Spradbery, 1973). The nest consists of up to around 1400 hexagonal cells arranged into 3-8 combs, covered by a nest envelope (Archer, 1993; Archer, 2008). Nests are usually located in aerial situations, particularly in hollow trees, but are also commonly located in human structures such as attics and outbuildings. Colonies may relocate the nest should it exceed the available space (Pawlyszyn, 1992). Nests are occasionally located underground, but such nests are more frequently relocated (Archer & Turner, 2014). The nests are well characterised as containing many species of inquilines, predators and parasitoids. Colonies build up throughout the summer, before the reproductive males and gynes are produced in September. Most queens are singly mated but double and triple mating also occurs, although paternity of the offspring produced by multiply-mated queens is heavily biased to a single male (Foster et al., 1999). Workers may persist until October or occasionally early November (Archer, 1993; Bunn, 1988b).
Female hornets are generalist predators, catching, killing, and preparing prey of various arthropods to take back to the nest to feed to the developing brood. Prey includes other species of social wasp, honeybees, flies, butterflies, moths and spiders (Pawlyszyn, 1994). In particular, returning honeybee foragers are frequently taken, although this species does not have a considerable impact on honeybee colonies due to honeybee defensive behaviours (Baracchi et al., 2010). Vespa crabro also often frequents sap runs, where it feeds on the exudations, and is also known to ring-bark twigs to stimulate sap flow (Bunn, 1988a; Edwards, 1980). Ivy (Hedera helix) flowers are visited when in bloom to feed on nectar (Pawlyszyn, 1994), particularly by males. The trophic preferences of this species overlap with those of the invasive Vespa velutina, meaning that where these species co-occur, V. crabro may be outperformed in instances of interspecific competition (Cini et al., 2018).
The species is not particularly aggressive, although females may sting if provoked or in defence of the nest. Major venom components, such as prepromastoparan, vespid chemotactic peptide precursor and vespakinin, are more highly enriched than in some other species of Vespa, meaning that the venom likely has a greater toxicity (Yoon et al., 2015).

Genome sequence report
The genome was sequenced from a single female V. crabro (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.338). A total of 70-fold coverage in Pacific Biosciences single-molecule long reads and 104-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 96 missing/misjoins, reducing the scaffold number by 41.67%, and increasing the scaffold N50 by 113.35% and assembly length by 0.01%.
The final assembly has a total length of 230 Mb in 106 sequence scaffolds with a scaffold N50 of 10 Mb (Table 1). Of the assembly sequence, 94.9% was assigned to 25 chromosomal-level scaffolds (numbered by sequence length) (Figure 2-Figure 5; Table 2). The assembly has a BUSCO v5.1.2 (Simão et al., 2015) completeness of 96.5% (single 96.3%, duplicated 0.3%) using the hymenoptera_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
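For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how an Nx value can be computed from a list of scaffold lengths. The function name `nx_length` and the toy lengths are hypothetical illustrations and are not taken from the assembly pipeline used for iyVesCrab1.

```python
def nx_length(lengths, percent=50):
    """Return the Nx length for a list of scaffold lengths.

    Nx is the length L such that scaffolds of length >= L together
    span at least `percent` percent of the total assembly length.
    Integer arithmetic is used to avoid floating-point rounding.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards, accumulating span.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length


# Toy example (hypothetical lengths, not the real assembly):
toy_lengths = [40, 30, 20, 10]
toy_n50 = nx_length(toy_lengths, 50)
toy_n90 = nx_length(toy_lengths, 90)
```

Applied to the real scaffold lengths (for example, read from a FASTA index), the same function should give values comparable to the N50 and N90 reported in Figure 2, assuming the same definition is used.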
The assembly was constructed into 25 chromosomes and seems in agreement with the expected karyotype from Hoshiba et al. (1989).Large centromeric regions were observed in the Hi-C map, notably on SUPER_1, SUPER_12 and SUPER_16 (~2-3 Mbp in size), and could explain the skew in genome content to AT. Scaffold_5 (~6 Mbp in size) appears to be a collapsed centromeric repeat and could not be confidently placed.

Sample acquisition and DNA extraction
A single female V. crabro was collected from Wytham Woods, Oxfordshire, UK (latitude 51.774, longitude -1.332) by Liam Crowley, University of Oxford, using a net. The sample was identified by the same individual and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The iyVesCrab1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Thorax tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse.
High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200-ng aliquot of extracted DNA using 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared to an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud sequencing libraries were constructed according

2. More detailed information about the invasion of V. velutina into Europe should be added (e.g. PMID: 33177635).
3. Explain why you used a female hornet (a few words about sex determination in Hymenoptera).
4. The place of specimen collection is presented twice in the report.

Figure 1. Image of the iyVesCrab1 specimen taken during preservation and processing.

Figure 2. Genome assembly of Vespa crabro, iyVesCrab1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 229,601,916 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (24,517,513 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 chromosome lengths (9,767,562 and 4,871,713 bp), respectively. The pale grey spiral shows the cumulative chromosome count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the hymenoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/iyVesCrab1.1/dataset/CAJUUB01/snail.
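The binning described in the Figure 2 legend can be illustrated with a short calculation. This is only a sketch of the arithmetic implied by the legend (1,000 bins, each 0.1% of the assembly span); the variable names are hypothetical and the code does not reproduce how BlobToolKit draws the plot.

```python
# Values quoted in the Figure 2 legend.
ASSEMBLY_SPAN_BP = 229_601_916
LONGEST_SCAFFOLD_BP = 24_517_513
N_BINS = 1_000

# Each bin represents 0.1% of the assembly span.
bin_size_bp = ASSEMBLY_SPAN_BP / N_BINS

# Fraction of the whole assembly contained in the longest scaffold.
longest_fraction = LONGEST_SCAFFOLD_BP / ASSEMBLY_SPAN_BP

# Approximate number of bins spanned by the longest scaffold.
bins_for_longest = LONGEST_SCAFFOLD_BP / bin_size_bp

print(f"bin size: {bin_size_bp:,.0f} bp")
print(f"longest scaffold: {longest_fraction:.1%} of span, "
      f"about {bins_for_longest:.0f} bins")
```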

Figure 3. Genome assembly of Vespa crabro, iyVesCrab1: GC coverage. BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/iyVesCrab1.1/dataset/CAJUUB01/blob.

Figure 4. Genome assembly of Vespa crabro, iyVesCrab1: cumulative sequence. BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/iyVesCrab1.1/dataset/CAJUUB01/cumulative.

Figure 5. Genome assembly of Vespa crabro, iyVesCrab1: Hi-C contact map. Hi-C contact map of the iyVesCrab1.1 assembly, visualised in HiGlass. Chromosomes are shown in size order from left to right and top to bottom.
https://doi.org/10.21956/wellcomeopenres.19402.r48317
© 2022 Nedoluzhko A. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Artem Nedoluzhko
Russian Academy of Sciences, Saint Petersburg, Russian Federation
I thank the Wellcome Open Research authorities for giving me the opportunity to review a genome report: "The genome sequence of the European hornet, Vespa crabro Linnaeus, 1758". The report of a high-level assembly of the European hornet genome is clearly presented and I have only minor comments for the authors:
1. You should give the genus name Vespa in full only once in the manuscript text and then use the abbreviations V. velutina and V. crabro.

Table 3. Software tools used.
document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Reviewer Report 18 February 2022