The genome sequence of the Lesser Yellow Underwing, Noctua comes Hübner, 1813

We present a genome assembly from an individual female Noctua comes (the Lesser Yellow Underwing; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence is 540.7 megabases in span. Most of the assembly is scaffolded into 32 chromosomal pseudomolecules, including the W and Z sex chromosomes. The mitochondrial genome has also been assembled and is 15.37 kilobases in length. Gene annotation of this assembly on Ensembl identified 18,001 protein-coding genes.


Background
Noctua comes, the Lesser Yellow Underwing, is a moth in the family Noctuidae and has the typical fat body and relatively narrow wings seen in this group of moths. Like other yellow underwings, it has forewings in varying shades of brown, and yellow hind wings with a black band towards their trailing edge. It can be distinguished from the similar Lunar Yellow Underwing Noctua orbona by its less clearly defined triangular black marking at the apex of the forewing, and from the Large Yellow Underwing Noctua pronuba, which lacks the large discal spot seen in N. comes (Skinner & Wilson, 2009).
Noctua comes is a common moth, found in most habitats throughout Britain and Ireland (Waring et al., 2017) and, unlike many moths, its numbers increased between 1968 and 2002 (Conrad et al., 2006). It has a single generation on the wing from June to October. The larvae feed nocturnally from August to May (Waring et al., 2017) and, like those of many noctuid moths, are known as cutworms because of their habit of causing damage to plant stems by feeding close to the ground (Bourner & Cory, 2004). The larvae feed on a wide range of small trees, shrubs and herbaceous plants, including common nettle Urtica dioica, broad-leaved dock Rumex obtusifolius, foxglove Digitalis purpurea, hawthorn Crataegus monogyna, blackthorn Prunus spinosa, sallow Salix spp., bramble Rubus fruticosus, broom Cytisus scoparius, and heather Calluna vulgaris (Skinner & Wilson, 2009; Waring et al., 2017). Where it has been introduced to North America it can also be a pest of crop plants such as tobacco Nicotiana spp. and grape Vitis spp. (Copley & Cannings, 2005; Crolla, 2008), and there have been attempts to find biological methods of control using various viruses (Bourner & Cory, 2004).
Like many moths, Noctua comes is important in the diet of bats (Mitschunas & Wagner, 2015; Norman et al., 1999). Moths can detect ultrasonic waves and are better at evading Greater Mouse-eared Bats Myotis myotis and Lesser Mouse-eared Bats Myotis blythii that have been ringed with metal and plastic rings positioned on the same limb so that they can rub together, because the frequency of the sound emitted by the rings is closer to the best auditory frequency of the moth than is that of the bats' calls (Norman et al., 1999).
We present a chromosomally complete genome sequence for Noctua comes, based on a female specimen from Wytham Woods, Oxfordshire, UK.

Genome sequence report
The genome was sequenced from a female Noctua comes (Figure 1) collected from Wytham Woods, Oxfordshire, UK (51.77, -1.34). A total of 43-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data.
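As a rough illustration of what the coverage figure implies, the sketch below multiplies the reported 43-fold coverage by the assembly span given in the Figure 2 caption (540,735,880 bp). It assumes that coverage was calculated against the assembly span, which the text does not state, so the result is only an order-of-magnitude estimate of HiFi yield and not a reported value. The function name is hypothetical.

```python
def implied_yield_bp(coverage_fold: int, span_bp: int) -> int:
    """Approximate total sequenced bases implied by a fold-coverage figure.

    Assumption: coverage = total read bases / assembly span.
    """
    return coverage_fold * span_bp


ASSEMBLY_SPAN_BP = 540_735_880  # assembly span from the Figure 2 caption
HIFI_COVERAGE = 43              # fold coverage stated in the text

yield_bp = implied_yield_bp(HIFI_COVERAGE, ASSEMBLY_SPAN_BP)
print(f"Implied HiFi yield: about {yield_bp / 1e9:.2f} Gb")
```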
The final assembly has a total length of 540.7 Mb in 45 sequence scaffolds with a scaffold N50 of 18.3 Mb (Table 1). The snail plot in Figure 2 provides a summary of the assembly statistics, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Most (99.83%) of the assembly sequence was assigned to 32 chromosomal-level scaffolds, representing 30 autosomes and the W and Z sex chromosomes. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). Chromosomes Z and W were identified by read coverage statistics. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
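For readers unfamiliar with the scaffold N50 statistic quoted above, the following minimal sketch shows how N50 and N90 are conventionally defined: the length of the scaffold at which the running total of size-sorted scaffolds first reaches 50% (or 90%) of the assembly span. The scaffold lengths used here are toy values chosen for illustration, not the lengths of the Noctua comes scaffolds, and the function name is hypothetical.

```python
def nx(lengths, percent=50):
    """Return the Nx length: the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least `percent` of the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Toy scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]

print("N50:", nx(toy_lengths, 50))
print("N90:", nx(toy_lengths, 90))
```

Because N90 requires covering more of the assembly, it is never larger than N50, which matches the ordering of the two values reported in the Figure 2 caption.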

Sample acquisition and nucleic acid extraction
A specimen of Noctua comes (specimen ID Ox000594, ToLID ilNocCome1) was collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.34) on 2020-07-05 using a light trap. The specimen was collected and identified by Douglas Boyes (University of Oxford). The specimen used for Hi-C and RNA sequencing (specimen ID Ox003031, ToLID ilNocCome2) was collected from the same location on 2022-07-22, also using a light trap. The specimen was collected by Liam Crowley (University of Oxford) and Finley Hutchinson (University of Exeter), and identified by Finley Hutchinson. The specimens were stored, handled and delivered on dry ice.
The workflow for high molecular weight (HMW) DNA extraction at the Wellcome Sanger Institute (WSI) includes a sequence of core procedures: sample preparation, sample homogenisation, DNA extraction, fragmentation, and clean-up. In sample preparation, the ilNocCome1 sample was weighed and dissected on dry ice (Jay et al., 2023). Tissue from the abdomen was homogenised using a PowerMasher II tissue disruptor (Denton et al., 2023a). HMW DNA was extracted using the Automated MagAttract v1 protocol (Sheerin et al., 2023). DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30 (Todorovic et al., 2023). Sheared DNA was purified by solid-phase reversible immobilisation (Strickland et al., 2023): in brief, the method employs a 1.8X ratio of AMPure PB beads to sample to eliminate shorter fragments and concentrate the DNA.
RNA was extracted from abdomen tissue of ilNocCome1 in the Tree of Life Laboratory at the WSI using the RNA Extraction: Automated MagMax™ mirVana protocol (do Amaral et al., 2023). The RNA concentration was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer using the Qubit RNA Broad-Range Assay kit. Analysis of the integrity of the RNA was done using the Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay.
Protocols developed by the WSI Tree of Life laboratory are publicly available on protocols.io (Denton et al., 2023b).

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. Poly(A) RNA-Seq libraries were constructed using the NEB Ultra II RNA Library Prep kit. DNA and RNA sequencing was performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (RNA-Seq) instruments. Hi-C data were also generated from head tissue of ilNocCome2 using the Arima2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly, curation and evaluation
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination and corrected as described previously (Howe et al., 2021). Manual curation was performed using HiGlass (Kerpedjiev et al., 2018) and PretextView (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2023), which runs MitoFinder (Allio et al., 2020) or MITOS (Bernt et al., 2013) and uses these annotations to select the final mitochondrial contig and to ensure the general quality of the sequence.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Noctua comes assembly (GCA_963082995.1) in Ensembl Rapid Release at the EBI.

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

The assembly is of high quality and will be useful to the research community. It contains 30 autosomes, the Z and W chromosomes and the mitochondrial genome. It would be helpful to label chromosomes in the Hi-C contact map (Figure 5). Some chromosomes seem to have a strange pattern of genome-wide contact, e.g. the second, fifth and eighth from last. Could this be an artefact of the assembly? Adding information about the expected chromosome number, e.g. from cytology studies, would be useful.

Maria Nilsson
Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
The genome of Noctua comes, the lesser yellow underwing moth, has been assembled into a highly contiguous assembly.
Two individuals of Noctua comes were used to generate the assembly, both captured using a light trap at the same location. One individual was used for long-read sequencing and a second one for RNA sequencing and Hi-C.
The QV is 71.2 and the BUSCO score is 99.0%, which indicates a very complete assembly. The contig N50 is 18.3 Mb and the genome is assembled into 45 scaffolds. The genome assembly is 540 Mb in size and the transcriptome analysis identified 18,001 genes.
I have three minor complaints about the readability of the introduction/Background:
1) The sentence "It has a single generation on the wing from June to October." does not make sense to me. Please improve if necessary.
2) The sentence in which the three species of moths are described. This is the second sentence in the Background, starting with "Like other yellow underwings....". It is a very long and complex sentence and can likely be shortened.
3) The sentence in the third paragraph starting with "Moths can detect ultrasonic waves and are... " seems irrelevant to the species Noctua comes.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format?

Figure 1. Photograph of the Noctua comes (ilNocCome1) specimen used for genome sequencing.

Figure 2. Genome assembly of Noctua comes, ilNocCome1.1: metrics. The BlobToolKit snail plot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 540,735,880 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (24,202,804 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (18,283,226 and 12,831,000 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/CAUJBG01/dataset/CAUJBG01/snail.

Figure 3. Genome assembly of Noctua comes, ilNocCome1.1: BlobToolKit GC-coverage plot. Sequences are coloured by phylum. Circles are sized in proportion to sequence length. Histograms show the distribution of sequence length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/CAUJBG01/dataset/CAUJBG01.1/blob.

Figure 4. Genome assembly of Noctua comes, ilNocCome1.1: BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all sequences. Coloured lines show cumulative lengths of sequences assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/CAUJBG01/dataset/CAUJBG01/cumulative.

Figure 5. Genome assembly of Noctua comes, ilNocCome1.1: Hi-C contact map of the ilNocCome1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=JlNOMLnKTpGGbP0ewYF8cA.

Table 3 contains a list of relevant software tool versions and sources.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Reviewer Expertise: evolutionary genomics, phylogenetics, insect evolution

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.