The genome sequence of the Lunar Hornet, Sesia bembeciformis (Hübner 1806)

We present a genome assembly from an individual male Sesia bembeciformis (the Lunar Hornet; Arthropoda; Insecta; Lepidoptera; Sesiidae). The genome sequence is 477.1 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 16.1 kilobases in length. Gene annotation of this assembly on Ensembl has identified 15,843 protein-coding genes.


Background
The Lunar Hornet, Sesia bembeciformis (Hübner 1806), is a moth of the family Sesiidae, commonly known as clearwings. This group are largely diurnal and named for their mostly scale-less, elongated wings which, together with the banded, elongated abdomens of many species, contribute to their Batesian mimicry of aposematic Hymenoptera. S. bembeciformis is a relatively large species, with a wingspan of 31 to 48 mm (Lastuvka & Lastuvka, 2001) and a broad abdomen with alternating yellow and black bands that enable it to mimic hornets (Vespa spp.).
Sesia bembeciformis feeds mainly on willows (Salix spp.), particularly Salix caprea and Salix cinerea (Goossens, 2020; Henwood et al., 2020; Lastuvka & Lastuvka, 2001). Some authors also state that the species is occasionally found on Populus spp. (Henwood et al., 2020; St John, 1890), which are the main foodplants of the related Sesia apiformis (which is also occasionally found on Salix spp.) (Lastuvka & Lastuvka, 2001). S. bembeciformis can be encountered in a variety of habitats where these foodplants occur, and may prefer more open areas with scattered trees, in sunny positions (Goossens, 2020). In the past it has occasionally been considered a pest of commercial willow crops: Johnson (1874) wrote of it 'doing considerable injury to the osier beds'.
In common with other clearwing species, S. bembeciformis larvae are rather nondescript, with creamy-white bodies, brown heads and prothoracic plates, and internal development on the foodplants (Henwood et al., 2020). Eggs are laid in small batches on the bark near the base of the tree, and larvae enter the trunk, often through a wound, where they feed initially on the sap (Goossens, 2020). They can take several years to develop, hibernating over winter and moving deeper into the trunk, often down into the roots. Before their final hibernation, larvae prepare an exit hole in the trunk (leaving a thin layer of bark to cover it) before spinning a loose cocoon above it in spring and emerging in mid-summer. Adults are on the wing in June and July (Randle et al., 2019) and usually emerge early in the morning, when they can be found resting on trunks above their emergence hole before engaging in active flight by day.
Males can be attracted to synthetic lures which mimic the sex pheromones produced by females, mostly during the morning before midday (Pühringer, 2021). These lures were only developed for this species relatively recently, being released commercially in 2020. They provide a quick and efficient method for recording the species and have revolutionised understanding of its distribution, as have lures developed for other clearwing species (e.g. Burman et al., 2016). Many clearwing species are rarely observed as adults without lures, because they resemble the Hymenoptera they mimic and because, although they fly by day, many (like S. bembeciformis) have reduced mouth-parts and do not visit flowers.
The main method for recording many species before the advent of lures has therefore been to search for the well-hidden larvae, and dedicated surveys for S. bembeciformis larval exit holes have revealed it in areas where it was previously entirely unknown (Crégu, 2015; Goossens, 2020; Lastuvka & Lastuvka, 2014), and in some cases showed it to be very common. In the UK, it is the most widespread clearwing species, known from all areas (Waring et al., 2017). Outside the UK it has a limited, mostly western European distribution (Lastuvka & Lastuvka, 2001).
The genome of Sesia bembeciformis was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Sesia bembeciformis, based on one male specimen from Wytham Woods, Oxfordshire, UK.

Genome sequence report
The genome was sequenced from one male S. bembeciformis (Figure 1) collected from Wytham Woods, Oxfordshire (latitude 51.775, longitude -1.315). A total of 27-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected four missing joins or mis-joins and removed ten haplotypic duplications, reducing the scaffold number by 18.52%.
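The reported 18.52% reduction can be checked against the final scaffold count. The short Python sketch below assumes that curation took the assembly from 54 to 44 scaffolds. The figure of 54 is not stated in this note; it is inferred here from the 44 scaffolds of the final assembly and the quoted percentage, and the function name is hypothetical.

```python
def percent_reduction(before: int, after: int) -> float:
    """Return the percentage decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be a positive count")
    return 100.0 * (before - after) / before


# ASSUMPTION: 54 scaffolds before curation (inferred, not stated in the note).
SCAFFOLDS_BEFORE = 54
# 44 scaffolds in the final assembly, as reported in the note.
SCAFFOLDS_AFTER = 44

reduction = percent_reduction(SCAFFOLDS_BEFORE, SCAFFOLDS_AFTER)
print(f"{reduction:.2f}%")  # prints 18.52%
```

Under that assumption the arithmetic agrees with the percentage quoted above, which suggests that curation removed ten scaffolds in net terms.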
The final assembly has a total length of 477.1 Mb in 44 sequence scaffolds with a scaffold N50 of 17.4 Mb (Table 1). Most (99.99%) of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). The assembly has a BUSCO v5.3.2 (Manni et al., 2021) completeness of 98.7% (single 98.2%, duplicated 0.5%) using the lepidoptera_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
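For readers unfamiliar with the metric, scaffold N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal Python sketch of that definition follows. The function name `nx` and the toy scaffold lengths are hypothetical and are not taken from the ilSesBemb1.1 assembly.

```python
from typing import Iterable


def nx(lengths: Iterable[int], fraction: float = 0.5) -> int:
    """Return the Nx length: the shortest scaffold among the longest
    scaffolds whose cumulative length reaches `fraction` of the total."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Toy example (hypothetical lengths, not real data).
toy_lengths = [10, 8, 6, 4, 2]
print(nx(toy_lengths))        # N50 of the toy set
print(nx(toy_lengths, 0.9))   # N90 of the toy set
```

Applied to the real scaffold lengths of an assembly, the same function with `fraction=0.9` would give the N90 value.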

Genome annotation report
The S. bembeciformis GCA_943735995.1 genome assembly was annotated using the Ensembl rapid annotation pipeline (Table 1; https://rapid.ensembl.org/Sesia_bembeciformis_GCA_943735995.1/). The resulting annotation includes 16,010 transcribed mRNAs from 15,843 protein-coding genes.

[…] distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from abdomen tissue of ilSesBemb2 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNase-free water and its concentration assessed using a Nanodrop spectrophotometer and Qubit Fluorometer using the Qubit RNA Broad-Range (BR) Assay kit. Analysis of the integrity of the RNA was done using Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. Poly(A) RNA-Seq libraries were constructed using the NEB Ultra II RNA Library Prep kit. DNA and RNA sequencing were performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (RNA-Seq).
Hi-C data were also generated from abdomen tissue of ilSesBemb1 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly
Assembly was carried out with Hifiasm (Cheng et al., 2021).

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the S. bembeciformis assembly (GCA_943735995.1) in Ensembl Rapid Release.

Ethics and compliance issues
The materials that have contributed to this genome note have been supplied by a Darwin

The authors present the genome sequence of the Lunar Hornet, Sesia bembeciformis (Hübner 1806). Two male Sesia bembeciformis were obtained from Wytham Woods, Oxfordshire, on 17 July 2021. The specimens were collected from the orchard using a pheromone lure. This is the next Sesiidae genome to be sequenced. The Sesia bembeciformis genome was sequenced as part of the Darwin Tree of Life project. The chromosome-scale assembly is of the same high quality as other genomes generated by DtoL. The pipeline is identical to that used for other genomes, which is great. The resulting annotation includes 16,010 transcribed mRNAs from 15,843 protein-coding genes. The introductory text is presented in a comprehensive and enjoyable manner for the reader. The photographs of the Sesia bembeciformis specimens could have been better, or properly corrected in a suitable graphics programme.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Broad Lepidopterology with particular emphasis on moths of the superfamily Noctuoidea
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Fahad Alqahtani
King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia
The authors of "The genome sequence of the Lunar Hornet, Sesia bembeciformis (Hübner 1806)," have successfully reconstructed the genome sequence of the Lunar Hornet. The assembly quality was evaluated using BUSCO, leading to high scores. The genome assembly contains 31 chromosomal-level scaffolds, including 30 autosomes and the Z sex chromosome, totaling 477,100,000 base pairs. The background is well-organized, with a clear and logical flow throughout the content, showcasing a significant contribution to the field of genomics.
However, there are a few minor issues that need to be addressed.
○ In the Methods section (Sequencing), the authors mentioned 10X Genomics read cloud DNA sequencing. However, the paper specified the use of Pacific Biosciences SEQUEL II, Hi-C Illumina, and PolyA RNA-Seq Illumina. Can you explain this ambiguity?
○ In the "Genome Sequence Report," the authors also stated that the genome was reconstructed from one male. However, in the Methods - Sample Acquisition and Nucleic Acid Extraction, it was mentioned that two males were collected. Can you explain this?
○ A third issue arises from the RNA-seq data, which was not mentioned in the genome assembly section. The genome assembly section requires more details, including the data used for Hifiasm (PacBio data is known to be used), as I understand the manuscript should be reproducible.
○ Moreover, in the Methods section (Genome Assembly, Curation, and Evaluation), it would be valuable to specify the closely related species used when MitoHiFi/Mitofinder guided the annotation of the mitochondrial genome.
○ Lastly, in the Methods section (Genome Assembly, Curation, and Evaluation), the term "Pretext" should align with the software tools table, which is mentioned as "PretextView."

The genome is of the same high quality as the other genomes produced by the DtoL project. The pipeline is the same as for the other genomes, which is excellent. All the links and datasets presented are publicly available.
The text of the introduction reads well and is interesting.
The photos of the specimens could have been better, i.e. more light, a scale and a side and dorsal photo.
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular evolution
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2).

Figure 2. Genome assembly of Sesia bembeciformis, ilSesBemb1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 477,151,446 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (23,076,235 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (17,397,286 and 10,737,896 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilSesBemb1.1/dataset/CALSEX01/snail.
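The binning described in the caption amounts to simple arithmetic on the figures quoted there. The Python sketch below works out the span of a single bin and the share of the assembly held by the longest scaffold. The constant and function names are hypothetical; the numbers are those given in the caption.

```python
# Values quoted in the Figure 2 caption.
ASSEMBLY_BP = 477_151_446   # total assembly span in base pairs
LONGEST_BP = 23_076_235     # longest scaffold in base pairs
N_BINS = 1_000              # number of size-ordered bins in the snail plot


def bin_span_bp(total_bp: int, n_bins: int) -> float:
    """Span in base pairs covered by one bin of the plot."""
    return total_bp / n_bins


def percent_of_assembly(part_bp: int, total_bp: int) -> float:
    """Share of the assembly, in percent, represented by `part_bp`."""
    return 100.0 * part_bp / total_bp


print(f"bin span: {bin_span_bp(ASSEMBLY_BP, N_BINS):,.0f} bp")
print(f"each bin: {100.0 / N_BINS:.1f}% of the assembly")
print(f"longest scaffold: {percent_of_assembly(LONGEST_BP, ASSEMBLY_BP):.2f}% of the assembly")
```

Each bin therefore spans a little under half a megabase, so the longest scaffold alone occupies several dozen bins of the plot.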

Figure 5. Genome assembly of Sesia bembeciformis, ilSesBemb1.1: Hi-C contact map. Hi-C contact map of the ilSesBemb1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=UU22HbFPTsSHjXIUIZh37w.

Reviewer Report 05 October 2023
https://doi.org/10.21956/wellcomeopenres.21187.r67884
© 2023 Alqahtani F. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Bioinformatics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 07 September 2023
https://doi.org/10.21956/wellcomeopenres.21187.r64898
© 2023 Nabholz B. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Benoit Nabholz
Universite de Montpellier, Montpellier, Occitanie, France
The authors present the chromosome scale assembly of the Lunar Hornet, Sesia bembeciformis. It is the 16th Sesiidae genome to be sequenced (most of them by the Darwin Tree of Life project, DtoL).

Table 2. Chromosomal pseudomolecules in the genome assembly of Sesia bembeciformis, ilSesBemb1.
Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the Darwin Tree of Life Project Sampling Code of Practice. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. All efforts are undertaken to minimise the suffering of animals used for sequencing. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators. Members of the Darwin Tree of Life Barcoding collective are listed here: https://doi.org/10.5281/zenodo.4893703. Members of the Tree of Life Core Informatics collective are listed here: https://doi.org/10.5281/zenodo.5013541. Members of the Darwin Tree of Life Consortium are listed here: https://doi.org/10.5281/zenodo.4783558.