The genome sequence of the Brindled Flat-body, Agonopterix arenella (Denis & Schiffermüller, 1775)

We present a genome assembly from an individual male Agonopterix arenella (the Brindled Flat-body; Arthropoda; Insecta; Lepidoptera; Depressariidae). The genome sequence is 545.8 megabases in span. Most of the assembly is scaffolded into 30 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 15.3 kilobases in length.


Background
Moths in the genus Agonopterix have a flattened oval shape when at rest, formed by overlapping their rounded wings directly above a compressed abdomen. Many Agonopterix species hibernate as adults, and the flat shape may facilitate hiding in bark crevices or leaf litter (Harper et al., 2002). Some species in the genus can be difficult to distinguish as adults; detailed descriptions and useful taxonomic keys, based on species found in the Netherlands, have been published (Huisman, 2012).
Agonopterix arenella, sometimes called the Brindled Flat-body, is a common moth found widely across northern Europe and Scandinavia, including most of Britain and Ireland (GBIF Secretariat, 2022; Harper et al., 2002; Sterling & Parsons, 2018). The adult moth has buff-coloured forewings marked with a brown 'smudged' blotch and several sharper brown-black dots forming a paw print impression. A. arenella is attracted to light and adults can be recorded throughout the year, with numbers peaking sharply in autumn and spring before and after hibernation (Asher et al., 2013; Huisman, 2012; NBN Atlas, 2021). Eggs are laid in spring on the leaves of the food plant, usually thistles Carduus and Cirsium spp., knapweeds Centaurea spp., burdocks Arctium spp. or saw-wort Serratula tinctoria. The larvae initially mine inside the leaf blade before spinning a feeding web on the underside of leaves (Harper et al., 2002). The pupal stage is brief before the adults emerge in autumn.
An assembled genome sequence for A. arenella will contribute to the growing set of genomic resources for understanding lepidopteran biology.

Genome sequence report
The genome was sequenced from one male A. arenella (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.34). A total of 42-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 79-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected seven missing joins or mis-joins and removed three haplotypic duplications, reducing the assembly length by 1.93% and the scaffold number by 15.79%, and decreasing the scaffold N50 by 2.06%.
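The curation percentages in this paragraph are signed relative changes, where a negative value means a decrease. The following is a minimal sketch of that arithmetic. The pre-curation scaffold count of 38 is an assumption inferred from the reported 15.79% reduction and the final count of 32 scaffolds; it is not stated in the text.

```python
def percent_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after` (negative = decrease)."""
    if before == 0:
        raise ValueError("`before` must be non-zero")
    return (after - before) / before * 100.0


# Assumption: 38 pre-curation scaffolds is inferred, not reported in the note.
scaffolds_before = 38
scaffolds_after = 32  # final scaffold count reported in the note

print(f"{percent_change(scaffolds_before, scaffolds_after):.2f}%")  # prints -15.79%
```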
The final assembly has a total length of 545.8 Mb in 32 sequence scaffolds with a scaffold N50 of 18.9 Mb (Table 1). Most (99.98%) of the assembly sequence was assigned to 30 chromosomal-level scaffolds, representing 29 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The estimated Quality Value (QV) of the final assembly is 60.2 with k-mer based completeness of 100%, and the assembly has a BUSCO v5.3.2 (Manni et al., 2021) completeness of 98.7% (single 98.0%, duplicated 0.7%) using the lepidoptera_odb10 reference set (n = 5,286).
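To put the reported QV in context, the sketch below converts a quality value to a per-base error probability. It assumes the QV follows the standard Phred scale, QV = -10 log10(P); the function names are illustrative and are not taken from any tool used in this note.

```python
import math


def qv_to_error_probability(qv: float) -> float:
    """Per-base error probability for a Phred-scaled quality value."""
    return 10 ** (-qv / 10)


def error_probability_to_qv(p: float) -> float:
    """Phred-scaled quality value for a per-base error probability."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


p = qv_to_error_probability(60.2)  # QV reported for the final assembly
print(f"{p:.2e}")  # prints 9.55e-07
```

Under this assumption, a QV of 60.2 corresponds to roughly one expected base error per million bases.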

Sample acquisition and nucleic acid extraction
A male A. arenella specimen (ilAgoAren1) was collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.34) on 8 September 2020. The specimen was taken from woodland habitat by Douglas Boyes (University of Oxford) using a light trap. The specimen was identified by the collector and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilAgoAren1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Whole organism tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 20 ng aliquot of extracted DNA using the 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on [...] Table 3 contains a list of software tool versions and sources.

Ethics and compliance issues
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to

Luca Livraghi
George Washington University, Washington, USA
The paper reports a very high quality assembly generated from an individual male Agonopterix arenella moth. Some basic biology of the moth is discussed, as well as the rationale for sequencing. The genome generated is of excellent quality, with almost complete chromosomal scaffolding, large N50 and very high BUSCO completeness scores. This includes 30 chromosomal level scaffolds, including the Z chromosome. The assembly has no signs of contaminants and a GC% content consistent with other high quality lepidopteran assemblies.
This is another fantastic addition to the ever-growing repertoire of lepidopteran genomes assembled by DToL, and will no doubt be useful to the community for comparative genomics studies.
Some minor comments:
1. The methods mention the sample was collected on 8 September 2020 and snap frozen on dry ice, but storage conditions post snap freeze are not detailed (any buffer? dry storage at -80? How long was the sample stored before DNA extraction?).
2. The methods lack some detail on what tissue was used for which library preparation. It mentions "tissue set aside for Hi-C sequencing." The methods could benefit from detailing the origin of the tissue for each library prep, if information is available (e.g. 10X library from full organism, HiFi from subset of DNA extraction, Hi-C from subset of tissue x).
3. There is no mention of genome annotation methods beyond the mitochondrial genome. The ENA entry mentions "transcriptomic data." Was this collected for genome annotation? If so, from which tissues, and how was this used to annotate the genome? If the genome will be annotated at a later date it would be helpful to add a note. This is mentioned in the Data availability but not immediately clear from the text.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Evolutionary developmental biology.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Michael Hiller
LOEWE-Centre for Translational Biodiversity Genomics (TBG), Senckenberg Nature Research Society, Frankfurt Am Main, Germany
The data note presents a high quality genome of the moth Agonopterix arenella. The assembly is of exceptional quality, with high contig N50 values, chromosome-level scaffolds and a high base accuracy (QV60). This genome adds to genomic resources that will enable comparative lepidopteran studies.
The manuscript is well written and the methods are clearly described.
I have no comments and hence recommend indexing.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Comparative genomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Agonopterix arenella, ilAgoAren1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 545,825,857 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (30,785,055 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (18,923,089 and 14,436,485 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilAgoAren1.1/dataset/CAKMJH01/snail.
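The N50 and N90 values quoted in this caption follow the usual Nx definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly. The sketch below illustrates that definition and the 0.1% bin size. The scaffold lengths are made-up toy values, not the ilAgoAren1.1 scaffolds, and the function is an illustration rather than BlobToolKit's implementation.

```python
def nx(lengths, fraction):
    """Length of the scaffold at which the cumulative length of scaffolds,
    sorted longest first, reaches `fraction` of the total assembly length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    threshold = sum(lengths) * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return min(lengths)  # guard against floating-point edge cases


# Toy scaffold lengths (hypothetical, total = 150)
toy = [50, 40, 30, 20, 10]
print(nx(toy, 0.5))  # prints 40
print(nx(toy, 0.9))  # prints 20

# Each of the 1,000 snail plot bins spans 0.1% of the assembly
assembly_span = 545_825_857  # bp, from the caption
print(f"{assembly_span / 1000:,.0f} bp per bin")  # prints 545,826 bp per bin
```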

Figure 5. Genome assembly of Agonopterix arenella, ilAgoAren1.1: Hi-C contact map. Hi-C contact map of the ilAgoAren1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=eBh637UMSs-8lwIy7MFttg.

Peer Review Current Peer Review Status: Version 1
Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.