The genome sequence of the Seraphim, Lobophora halterata (Hufnagel, 1767)

We present a genome assembly from an individual female Lobophora halterata (the Seraphim; Arthropoda; Insecta; Lepidoptera; Geometridae). The genome sequence is 315 megabases in span. The complete assembly is scaffolded into 32 chromosomal pseudomolecules with the Z and W sex chromosomes assembled. The mitochondrial genome has also been assembled and is 15.7 kilobases in length.


Background
Lobophora halterata (the Seraphim) is a delicately patterned moth in the family Geometridae. The adult has a series of wavy grey and brown stripes across broad white forewings, which at rest form effective camouflage against tree bark. The moth is found across central and northern Europe, with scattered records from Russia and Japan (GBIF Secretariat, 2021). L. halterata has a single generation per year in the UK: larvae feed on aspen and poplar, overwintering occurs as a pupa, and adults have a short flight period in May and June (Waring & Townsend, 2017). Males of this species have a particularly unusual hindwing feature that explains the origin of both the common name and the scientific name. The common name refers to a type of angel in Jewish, Christian and Islamic texts; the seraphim are generally described as six-winged angels that act as guardians of the throne of God (Holland, 2012). The Seraphim moth does not actually have six wings, but there is a 'concertina' or Z-fold on the trailing edge of the hindwing, giving the impression of a small third pair of wings lying on top of the hindwings. There is also an error in entomological etymology: in religious texts seraphim is the plural of seraph, whereas in the moth the plural term has become singular (Holland, 2012). The genus name Lobophora refers to this 'lobe' of wing tissue, as does the specific name halterata, which draws comparison to the lobe-like halteres of Diptera (Maitland Emmet, 1991). The function of the unusual hindwing lobe of L. halterata is unknown, although the fact that it is restricted to males suggests it is likely to have a sex-specific role, potentially associated with a scent organ (Hobby, 2009). The developmental genetic basis of the morphological feature is entirely unknown.
The genome of L. halterata was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for L. halterata, based on the ilLobHalt1 specimen from Wytham Woods, Oxfordshire, UK.

Genome sequence report
The genome was sequenced from an individual female L. halterata (Figure 1) collected from Wytham Woods, Berkshire, UK (51.77, -1.34). A total of 72-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected four missing joins or mis-joins, reducing the scaffold number by 10.53%.
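The curation and coverage figures can be cross-checked with simple arithmetic. The sketch below assumes that each of the four corrections removed exactly one scaffold, which would imply 38 scaffolds before curation; that pre-curation count is inferred from the reported 10.53% reduction and the final count of 34 scaffolds, and is not stated in the text. The HiFi yield is likewise only an estimate, taken as coverage multiplied by assembly span.

```python
# Cross-check of the reported curation and coverage figures (illustrative only).
FINAL_SCAFFOLDS = 34        # reported final scaffold count
CORRECTIONS = 4             # reported missing joins or mis-joins corrected
ASSEMBLY_SPAN_MB = 314.9    # reported assembly length in Mb
HIFI_COVERAGE = 72          # reported fold coverage in HiFi reads


def scaffold_reduction_percent(final_count: int, removed: int) -> float:
    """Percentage reduction, ASSUMING each correction removed one scaffold."""
    before = final_count + removed  # assumption: 38 scaffolds before curation
    return 100 * removed / before


def estimated_yield_gb(coverage: float, span_mb: float) -> float:
    """Approximate sequenced bases in Gb as coverage x assembly span."""
    return coverage * span_mb / 1000


reduction = scaffold_reduction_percent(FINAL_SCAFFOLDS, CORRECTIONS)
yield_gb = estimated_yield_gb(HIFI_COVERAGE, ASSEMBLY_SPAN_MB)
print(f"Scaffold reduction: {reduction:.2f}%")
print(f"Estimated HiFi yield: {yield_gb:.1f} Gb")
```

Under these assumptions the reduction comes out at 10.53%, matching the reported value, and the HiFi data would amount to roughly 22.7 Gb.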
The final assembly has a total length of 314.9 Mb in 34 sequence scaffolds with a scaffold N50 of 10.9 Mb (Table 1). The complete assembly sequence was assigned to 32 chromosomal-level scaffolds, representing 30 autosomes and the W and Z sex chromosomes. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2–Figure 5; Table 2). The mitochondrial genome was also assembled. The assembly has a BUSCO v5.3.2 (Manni et al., 2021) completeness of 98.3% (single 98.0%, duplicated 0.3%), using the lepidoptera_odb10 reference set. Evaluation of the assembly shows a consensus quality value (QV) of 71.9 and k-mer completeness of 100%. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
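For readers less familiar with these metrics, the following minimal sketch shows how a scaffold N50 is computed, how a Phred-scaled QV maps to a per-base error probability, and how the BUSCO completeness is the sum of its single-copy and duplicated fractions. The scaffold lengths used are hypothetical toy values, not the ilLobHalt1 scaffolds, and the function names are illustrative rather than taken from any assembly tool.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled consensus QV to a per-base error probability."""
    return 10 ** (-qv / 10)


# Hypothetical scaffold lengths in Mb (toy data, NOT the real assembly).
toy_lengths = [12.0, 11.0, 10.0, 9.0, 8.0]
toy_n50 = n50(toy_lengths)

# Reported values from the text.
error_rate = qv_to_error_rate(71.9)
busco_complete = round(98.0 + 0.3, 1)  # single + duplicated

print(f"Toy N50: {toy_n50} Mb")
print(f"Per-base error probability at QV 71.9: {error_rate:.2e}")
print(f"BUSCO complete: {busco_complete}%")
```

For the toy data the N50 is 10.0 Mb, since the three longest scaffolds are needed to reach half of the 50 Mb total. A QV of 71.9 corresponds to about 6.5 × 10⁻⁸ errors per base, or roughly one error in every 15 million bases.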

Methods
Sample acquisition and nucleic acid extraction
A female L. halterata (ilLobHalt1) was collected using a light trap from Wytham Woods, Berkshire, UK (latitude 51.77, longitude -1.34) by Douglas Boyes (University of Oxford). The sample was identified by Douglas Boyes and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The ilLobHalt1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Abdomen tissue was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser, receiving multiple impacts. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute (WSI) on the Pacific Biosciences SEQUEL II (HiFi) instrument. Hi-C data were also generated from head/thorax tissue of ilLobHalt1 using the Arima v2 kit and sequenced on the Illumina HiSeq X Ten instrument.

INSDC accession Chromosome Size (Mb) GC%
OW052018

The genome sequence is released openly for reuse. The Lobophora halterata genome sequencing initiative is part of the Darwin Tree of Life (DToL) project. All raw sequence data and the assembly have been deposited in INSDC databases. The genome will be annotated and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.

Satoshi Yamamoto
Institute for Agro-Environmental Sciences, NARO (NIAES), Tsukuba, Japan
This study presents a de novo genome assembly of Lobophora halterata (Lepidoptera; Geometridae) using a high-coverage HiFi long-read dataset. Scaffolding with Hi-C data subsequently yielded 32 chromosome-scale scaffolds. The assembly techniques employed conform to standard genome assembly methods, and the resulting chromosomal count is consistent with previous reports on chromosomes of geometrid species (for example, Suomalainen, 1965). As such, the assembly can be considered credible.
However, the methodology section does not sufficiently detail the methods employed for quality control of raw reads, the options and parameters utilized in the assembly program, or the approach for identifying sex chromosomes.
All data generated in this study have been deposited in public databases for accessibility. This study is deemed valuable given the relative scarcity of chromosome-scale genome assemblies for geometrid species.