The genome sequence of the sallow kitten, Furcula furcula (Clerck, 1759)

We present a genome assembly from an individual male Furcula furcula (the sallow kitten; Arthropoda; Insecta; Lepidoptera; Notodontidae). The genome sequence is 736 megabases in span. The entire assembly (100%) is scaffolded into 29 chromosomal pseudomolecules, with the Z sex chromosome assembled. The complete mitochondrial genome was also assembled and is 17.2 kilobases in length.


Background
The sallow kitten, Furcula furcula (Clerck, 1759), is a holarctic moth belonging to the family Notodontidae (prominent moths) that is commonly found throughout Europe, Asia, and North America (Heath et al., 1983; Miller et al., 2018; Okagaki, 1958). Adults have a wingspan of 30 to 36 mm, varying slightly across the species' geographic range, and are white to grey in colour with a large grey band on the dorsal wing surface (Heath et al., 1983). Larvae can grow up to 35 mm and are bright green with a purple or brown marking on the dorsal side. They can be identified by their modified anal prolegs, which form a forked tail-like appendage from which the genus derives its name ("furca" being Latin for fork) (Heath et al., 1983).
Larvae have been observed to feed on leaves of poplar, willow, birch, and beech trees (Robinson et al., 2010; Vorbrodt & Müller-Rutz, 1911). Once mature, larvae crawl down the tree trunk and make a hardened cocoon, consisting of silk and wood pulp, in which they pupate (Heath et al., 1983; Okagaki, 1958). The moth emerges in the summer, between May and September, with 1 to 3 generations emerging per year (Heath et al., 1983; Okagaki, 1958; Robinson et al., 2010). The number of generations depends on the climate of the region, with more generations emerging per year in warmer regions (Okagaki, 1958). Until recently, F. occidentalis (Lintner, 1878) was treated as a subspecies of F. furcula. Given the global distribution of F. furcula, a fully annotated genome will help provide the data needed to understand the link between its genotype and its broad larval host range, and will help distinguish this species from similar ones with overlapping distributions.

Genome sequence report
The genome was sequenced from a single male F. furcula collected from Wytham Woods, Berkshire, UK (Figure 1). A total of 28-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 62-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 20 missing joins or misjoins and removed 5 haplotypic duplications, reducing the assembly size by 0.44% and the scaffold number by 30.43%.
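The curation and coverage figures above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the reported 736 Mb span and 32 final scaffolds as the post-curation values, and uses the assembly span as a stand-in for genome size when converting fold coverage to sequenced bases.

```python
def value_before_reduction(after: float, percent_reduction: float) -> float:
    """Invert a percentage reduction: before * (1 - p / 100) = after."""
    return after / (1.0 - percent_reduction / 100.0)


def sequenced_bases_mb(coverage_fold: float, genome_span_mb: float) -> float:
    """Approximate total sequenced bases (Mb) as coverage times genome span."""
    return coverage_fold * genome_span_mb


# Reported post-curation values (assumed here to be the "after" figures).
FINAL_SCAFFOLDS = 32
FINAL_SPAN_MB = 736

# Implied pre-curation values from the reported percentage reductions.
scaffolds_before = value_before_reduction(FINAL_SCAFFOLDS, 30.43)
span_before_mb = value_before_reduction(FINAL_SPAN_MB, 0.44)

# Approximate data volumes implied by the reported fold coverage.
hifi_mb = sequenced_bases_mb(28, FINAL_SPAN_MB)
tenx_mb = sequenced_bases_mb(62, FINAL_SPAN_MB)

print(f"Implied scaffolds before curation: {scaffolds_before:.1f}")
print(f"Implied span before curation: {span_before_mb:.1f} Mb")
print(f"Approximate HiFi data: {hifi_mb / 1000:.1f} Gb")
print(f"Approximate 10X data: {tenx_mb / 1000:.1f} Gb")
```

Under these assumptions, a 30.43% reduction ending at 32 scaffolds implies roughly 46 scaffolds before curation (14 removed out of 46), which is consistent with the reported percentage.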
The final assembly has a total length of 736 Mb in 32 sequence scaffolds with a scaffold N50 of 27.06 Mb (Table 1). 100% of the assembly sequence was assigned to 29 chromosomal-level scaffolds, representing 28 autosomes (numbered by sequence length) and the Z sex chromosome (Figure 2 to Figure 5; Table 2).

... Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit.
Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from abdomen tissue of ilFurFurc1 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNase-free water and its concentration assessed

Genome assembly
Assembly was carried out with Hifiasm (Cheng et al., 2021); haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). One round of polishing was performed by aligning 10X Genomics read data to the assembly with longranger align and calling variants with freebayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using SALSA2 (Ghurye et al., 2019). The assembly was checked for contamination as described previously (Howe et al., 2021). Manual curation was performed using HiGlass (Kerpedjiev et al., 2018) and Pretext. The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2021), which performs annotation using MitoFinder (Allio et al., 2020). The genome was analysed and BUSCO scores generated within the BlobToolKit environment (Challis et al., 2020).