The genome sequence of the Ruddy Flat-body, Agonopterix subpropinquella (Stainton, 1849)

We present a genome assembly from an individual male Agonopterix subpropinquella (the Ruddy Flat-body; Arthropoda; Insecta; Lepidoptera; Depressariidae). The genome sequence is 667.9 megabases in span. Most of the assembly is scaffolded into 28 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 16.5 kilobases in length. Gene annotation of this assembly on Ensembl identified 18,796 protein coding genes.


Background
Agonopterix subpropinquella, the Ruddy Flat-body, is a small moth of the family Depressariidae. The species has a pan-European, mainly coastal, distribution (GBIF Secretariat, 2023), and is found widely across Britain and Ireland (Harper et al., 2002). Like many species in its genus, the imago is drab-coloured and indistinctly patterned; indeed, the species' scientific name is likely a reference to its similarity to other members of the genus (Emmet, 1991). However, the species also has a distinctive form, f. rhodochrella, with exaggerated blackish colouring on the head, thorax, and forewing (Harper et al., 2002).
In common with many members of its genus, the species overwinters as an adult (Harper et al., 2002). Adults are on the wing between August and May, hibernating over the winter, and may be disturbed from hibernation by beating thatch or dense vegetation (Sterling & Parsons, 2018). Eggs are laid in May on knapweeds (Centaurea spp.) or thistles (Cirsium spp.) (Harper et al., 2002). The larva is green and initially mines the foodplant before feeding in a silken spinning (Harper et al., 2002). Larvae feed between June and July, pupating from July to August in earth or amongst detritus (Harper et al., 2002).
The genome of the Ruddy Flat-body, Agonopterix subpropinquella, was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Agonopterix subpropinquella, based on one male specimen from Wytham Woods, Oxfordshire.

Genome sequence report
The genome was sequenced from one male Agonopterix subpropinquella (Figure 1) collected from Wytham Woods, Oxfordshire, UK (51.77, -1.34). A total of 28-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 63-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 56 missing joins or mis-joins and removed 12 haplotypic duplications, reducing the assembly length by 0.43% and the scaffold number by 50.79%. The final assembly has a total length of 667.9 Mb in 31 sequence scaffolds with a scaffold N50 of 25.1 Mb (Table 1). Most (99.98%) of the assembly sequence was assigned to 28 chromosomal-level scaffolds, representing 27 autosomes and the Z sex chromosome. A summary of the assembly statistics is shown in Figure 2, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found at https://links.tol.sanger.ac.uk/species/1857958.
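As a rough illustration of the summary statistics quoted above, the following Python sketch computes a scaffold N50 and a percentage reduction. The scaffold lengths are hypothetical toy values, not the real ilAgoSubp1.1 scaffolds, and the pre-curation count of 63 scaffolds is an inference from the reported figures (31 final scaffolds, 50.79% reduction) rather than a number stated in this report.

```python
def scaffold_n50(lengths):
    """Return the N50: the length of the scaffold at which the running
    total, taken from longest to shortest, first reaches half the span."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_span = sum(lengths) / 2
    running_total = 0
    for length in sorted(lengths, reverse=True):
        running_total += length
        if running_total >= half_span:
            return length


def percent_reduction(before, after):
    """Percentage by which a quantity shrinks going from before to after."""
    return 100.0 * (before - after) / before


# Hypothetical toy scaffold lengths, for illustration only.
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = scaffold_n50(toy_lengths)

# Assumption: 63 pre-curation scaffolds, inferred from the 31 final
# scaffolds and the reported 50.79% reduction; not stated in the report.
scaffold_drop = round(percent_reduction(63, 31), 2)
```

With the toy lengths the total span is 150, so the running total passes the halfway mark of 75 at the second scaffold and the N50 is 40. The inferred count of 63 reproduces the reported 50.79% reduction in scaffold number.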

Sample acquisition and nucleic acid extraction
A male Agonopterix subpropinquella (specimen ID Ox000822, ToLID ilAgoSubp1) was collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.34) on 2020-08-01 using a light trap. The specimen was collected and identified by Douglas Boyes (University of Oxford) and preserved on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilAgoSubp1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Tissue from the whole organism was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 20 ng aliquot of extracted DNA using the 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a […]
A Hi-C map for the final assembly was produced using bwa-mem2 (Vasimuddin et al., 2019) in the Cooler file format (Abdennur & Mirny, 2020). To assess the assembly metrics, the k-mer completeness and QV consensus quality values were calculated in Merqury (Rhie et al., 2020). This work was done using Nextflow (Di Tommaso et al., 2017) DSL2 pipelines "sanger-tol/readmapping" (Surana et al., 2023a) and "sanger-tol/genomenote" (Surana et al., 2023b). The genome was analysed within the BlobToolKit environment (Challis et al., 2020) and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were calculated.
Table 3 contains a list of relevant software tool versions and sources.
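The QV consensus quality values mentioned above are reported on the Phred scale, where QV = -10 log10(error probability). The short Python sketch below converts between the two. The QV of 60 is a hypothetical example, since no QV for ilAgoSubp1.1 is given in this report, and the function names are illustrative rather than part of Merqury.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


# Hypothetical QV, chosen for illustration; not a value from this report.
example_qv = 60.0
example_error_rate = qv_to_error_rate(example_qv)

# Expected number of consensus errors if that hypothetical QV applied
# uniformly across the 667,905,724 bp assembly span.
expected_errors = example_error_rate * 667_905_724
```

Under that assumed QV, the error probability is about one in a million bases, which would correspond to several hundred expected errors across an assembly of this size.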

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Agonopterix subpropinquella assembly (GCA_922987775.1) in Ensembl Rapid Release.

Wellcome Sanger Institute – Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.

Wei Zhang
Peking University, Beijing, Beijing, China
This manuscript reports a chromosome-level genome assembly of the Ruddy Flat-body (Agonopterix subpropinquella) with the Z chromosome assembled. This genome assembly has an N50 value of 25.1 Mb. It also has considerable BUSCO statistics and annotations. The methods section is well written. I think the overall quality of this genome assembly is valid and will benefit relevant studies. Therefore, I support the acceptance of the manuscript to be indexed.

ABSTRACT
Include a sentence on the sequencing strategy employed to generate the reported chromosomal-level genome assembly for the target species.

BACKGROUND
Kindly incorporate a sentence regarding the genome sizes of moths, accompanied by a few examples, to provide the reader with an understanding of the average genome size of moths.
Change "…Like many species in its genus…" to "…Like many of its congeners…"
Question: What was the rationale behind selecting a homogametic male for sequencing rather than a heterogametic female? The reviewer is merely inquisitive given that the research gap is now on the assembly of the W sex chromosome.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Genome Assembly; Ecological Genomics; Conservation Genomics; Invasive Genomics; Population Genomics; Phylogenomics; Molecular Phylogenetics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Agonopterix subpropinquella, ilAgoSubp1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 667,905,724 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (68,253,362 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (25,090,608 and 16,437,072 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Agonopterix%20subpropinquella/dataset/CAKLPO01.1/snail.
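The bin arithmetic in the Figure 2 legend can be checked with a few lines of Python. This is only a sketch using the numbers quoted in the legend; the variable names are illustrative, and nothing here assumes how BlobToolKit scales the plot radius internally.

```python
# Values quoted in the Figure 2 legend.
ASSEMBLY_SPAN_BP = 667_905_724
N_BINS = 1_000
LONGEST_SCAFFOLD_BP = 68_253_362
SCAFFOLD_N50_BP = 25_090_608
SCAFFOLD_N90_BP = 16_437_072

# Each of the 1,000 bins covers an equal share of the assembly span.
bin_percent = 100 / N_BINS
bin_span_bp = ASSEMBLY_SPAN_BP / N_BINS

# N50 and N90 relative to the longest scaffold (plain length ratios only;
# no assumption is made about the radial scaling used in the plot).
n50_to_longest = SCAFFOLD_N50_BP / LONGEST_SCAFFOLD_BP
n90_to_longest = SCAFFOLD_N90_BP / LONGEST_SCAFFOLD_BP
```

Each bin therefore spans roughly 668 kb, and the N50 scaffold is a little over one third the length of the longest scaffold.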

Figure 5. Genome assembly of Agonopterix subpropinquella, ilAgoSubp1.1: Hi-C contact map of the ilAgoSubp1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=RQh-QoTJRnastq9ymSznYA.

© 2024 Çağatay N. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Naciye Sena Çağatay
University of Liverpool, Liverpool, England, UK
The complete genome of the Ruddy Flat-body Agonopterix subpropinquella was assembled by Boyes and Hammond using PacBio HiFi sequencing, 10X Genomics and Hi-C Illumina sequencing techniques. The analyses were explained very well in each method. The genome assembly is of high quality and the figures shown are informative. I have a minor revision below.

Background
Can you check these words?
"Like many species in its genus, the imago is": "image"
"Eggs are lain in May on knapweeds": "laid"
Q1: I was just wondering, there is no information about morphological identification. How did you determine the species? Why did you use a male sample?

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular entomology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Simo Njabulo Maduna
Norwegian Institute of Bioeconomy Research, Svanvik, Norway
The near-complete genome of the Ruddy Flat-body Agonopterix subpropinquella was assembled by Boyes and Hammond through the utilization of PacBio HiFi sequencing, as well as 10X Genomics and Hi-C Illumina sequencing techniques. The authors present the initial and superior reference genome for the Ruddy Flat-body. Their manuscript is skilfully composed and includes an elaborate bioinformatics pipeline and succinct findings, accompanied by genome assembly metric benchmarks to facilitate comprehension. Below I present minor revisions.

Table 3. Software tools: versions and sources.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.