The genome sequence of the Northern Deep-brown Dart, Aporophyla lueneburgensis (Freyer, 1848)

We present a genome assembly from an individual female Aporophyla lueneburgensis (the Northern Deep-brown Dart; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence is 978.3 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 15.5 kilobases in length. Gene annotation of this assembly on Ensembl identified 12,580 protein-coding genes.


Background
Aporophyla is a genus of moths from the family Noctuidae found predominantly in Europe; most species in the genus have autumn flight periods. The Northern deep-brown dart A. lueneburgensis has brown or grey-brown forewings with a series of wavy markings forming a central darker band finely outlined in cream. The moth is widely distributed across Scotland and northern counties of England, with scattered and less frequent records from southern counties of England, Northern Ireland and Ireland (NBN Atlas Partnership, 2021; Randle et al., 2019; Thompson & Nelson, 2003). Although most common in northern latitudes of Europe and Scandinavia, the moth has also been recorded in Italy, Spain and Portugal (Corley et al., 2018; GBIF Secretariat, 2022). These reported distributions need further verification, as discussed below. A. lueneburgensis is univoltine, with adults on the wing in August and September, often in moorland and rough grassland habitats. Larvae feed on heather Calluna vulgaris or bird's-foot trefoil Lotus corniculatus in autumn and again in spring, overwintering at an early larval stage (Waring et al., 2017).
There has been taxonomic debate about whether A. lueneburgensis should be given species status. For a century, the moth now named A. lueneburgensis was described as a colour variant of the deep-brown dart Aporophyla lutulenta, and was considered either a subspecies or given a variety designation, var. luneburgensis. In the 1950s, it was proposed that the two forms could be different species (Wightman, 1954). "I am now quite satisfied that… two distinct species are involved" wrote Archibald Wightman, although confusingly he added "I can give no structural point of difference but I can say they are distinct" (Wightman, 1954). The species-level separation was not initially adopted (for example, South, 1961), but gradually found favour through the second half of the 20th century (for example, Skinner & Wilson, 2009; Waring et al., 2017). The distinctiveness of the two species was challenged by initial mitochondrial DNA barcode analyses (Orhant, 2012), before being supported once DNA barcodes from more specimens were obtained (Corley et al., 2018; Haslberger & Segerer, 2016). Recent molecular analyses clearly support the view that A. lueneburgensis and A. lutulenta are indeed distinct species, although more specimens need to be analysed to determine their geographic distributions accurately (Boyes et al., 2021).
Here we report the complete genome sequence of A. lueneburgensis. In phylogenetic analyses, the mitochondrial CO1 DNA barcode of the specimen used here groups in a clade with other A. lueneburgensis specimens, distinct from A. lutulenta (Boyes et al., 2021). A complete genome sequence will facilitate studies into colour pattern evolution and adaptation to specific food plants, and contribute to research into lepidopteran genome evolution.

Genome sequence report
The genome was sequenced from one female Aporophyla lueneburgensis (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.34). A total of 34-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 33 missing joins or mis-joins and removed 14 haplotypic duplications, reducing the assembly length by 1.27% and the scaffold number by 13.64%, and increasing the scaffold N50 by 0.63%.
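The curation percentages above are simple relative changes between pre- and post-curation values. The following minimal sketch shows the arithmetic for the scaffold count. The pre-curation count of 88 scaffolds is not reported in this note; it is an assumption back-calculated from the final count of 76 scaffolds and the reported 13.64% reduction, and the function name is illustrative only.

```python
def percent_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after`."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Assumption: 88 pre-curation scaffolds (back-calculated, not reported in the text).
pre_curation_scaffolds = 88
# Final scaffold count reported for the curated assembly.
post_curation_scaffolds = 76

change = percent_change(pre_curation_scaffolds, post_curation_scaffolds)
print(f"{change:.2f}%")  # prints "-13.64%"
```

The same function applies to the assembly length and scaffold N50 changes once the corresponding pre-curation values are known.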
The final assembly has a total length of 978.3 Mb in 76 sequence scaffolds with a scaffold N50 of 32.1 Mb (Table 1). Most (99.12%) of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2 to Figure 5; Table 2). There is half-coverage of the Z chromosome in the Hi-C map, but no W chromosome, indicating that the specimen is most likely a Z0 female (Sahara et al., 2012). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
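For readers unfamiliar with the scaffold N50 statistic quoted above, the sketch below illustrates the standard definition: the length L such that scaffolds of length L or longer together contain at least half of the total assembly length. This is a generic illustration, not the pipeline used to produce the reported value; the function name and the toy scaffold lengths are hypothetical and are not taken from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths in Mb (hypothetical, for illustration only).
toy_scaffolds = [40, 35, 30, 10, 5]
print(n50(toy_scaffolds))  # prints 35
```

In the toy example the total is 120 Mb; the two longest scaffolds (40 + 35 = 75 Mb) are the first to reach half of that, so the N50 is 35 Mb.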
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found here.

Table 1 (excerpt): Genome annotation
Number of protein-coding genes: 12,580
Number of non-protein-coding genes: 1,675
Number of gene transcripts: 21,617
* Assembly metric benchmarks are adapted from column VGP-2020 of "

(Manni et al., 2021; Simão et al., 2015) were calculated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of software tool versions and sources.

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the Aporophyla lueneburgensis assembly (GCA_932294355.1). Annotation was created primarily through alignment of transcriptomic data to the genome, with gap filling via protein-to-genome alignments of a select set of proteins from UniProt (UniProt Consortium, 2019).

In this note the authors present the genome sequence of the Northern Deep-brown Dart, Aporophyla lueneburgensis. The methods are sound and have been used numerous times by the Tree of Life Consortium to produce high-quality genomes such as this one. The completeness and quality of the genome are very high and are validated by various metrics such as k-mer completeness, BUSCO and the percentage of the assembly mapped to chromosomes. The note is straightforward and well written, and I have only very minor comments and suggestions. I particularly appreciated the discussion of the status of A. lueneburgensis as a species through history. However, after such an introduction I was expecting more comparative analysis between the two species, which could better highlight the importance of whole-genome sequencing. Instead, the controversy had already been resolved using mitochondrial DNA barcodes. I therefore think that more emphasis could have been given to explaining the interest in studying colour pattern and specific food plant evolution in this species.

○ As a suggestion for future sequencing projects: the picture provided is of low quality and does not show the individual completely. The wings hide the body and the ventral side is not visible. I think that having more pictures of the sequenced individual (as supplementary material) could be helpful (see the next point for an example).

○ The authors deduce that the sequenced individual is a Z0 female from the half-coverage of the Z chromosome and the absence of a W chromosome. While I agree with their interpretation, in future genomes it might be worth examining the genitalia too. More pictures of the ventral side of the abdomen could have helped to verify this after sequencing.

○ Another interesting piece of information would be a karyotype of the sequenced species, to confirm the 31 chromosomal-level scaffolds found by the authors.

○ The Methods section is extremely concise; without more details I assume that all the programs listed were used with default options. I think a sentence should be added to clarify that (hence the 'Partly' for sufficient details in the methods section). Otherwise, either a column could be added to Table 3 to include the commands used, or a reference could be given to an article with those details.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others?