The genome sequence of the Dotted Border, Agriopis marginaria (Fabricius, 1776)

We present a genome assembly from an individual male Agriopis marginaria (the Dotted Border, Arthropoda; Insecta; Lepidoptera; Geometridae). The genome sequence is 500.9 megabases in span. Most of the assembly is scaffolded into 29 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 16.9 kilobases in length. Gene annotation of this assembly on Ensembl identified 12,443 protein coding genes.


Background
The Dotted Border, Agriopis marginaria, is a geometrid moth. The female is brachypterous and spider-like, while the male is fully winged and moderately large. The male forewing varies from greyish to brown or even melanic, usually with a paler median fascia, and is distinguished by around seven prominent dark dots along the termen of both wings. The moth emerges in early spring, occasionally from late December. In the UK it usually flies from February to late April, with a peak in mid-March that has advanced over 30 years (Randle et al., 2019).
The Dotted Border is found in a wide variety of wooded and open scrubby habitats. The larvae are polyphagous, feeding on a diversity of deciduous trees and shrubs, especially oaks.
A. marginaria is generally common, sometimes abundant, and widespread in the western Palaearctic only, from Scandinavia to the circum-Mediterranean, but has relatively few records for eastern Europe and western Russia, Iceland and Northern Africa (GBIF Secretariat, 2022). In the UK it is also widespread and often common, with relatively fewer records towards the north (NBN Atlas Partnership, 2021). Populations show a significant decline in abundance and a moderate decline in distribution since 1970 (Conrad et al., 2006; Randle et al., 2019).
There was a single DNA barcode cluster on BOLD (8 March 2023), the BIN BOLD:AAC0355, another (BOLD:AES2645) apparently being artefactual (the available 347 bp of a sequence from Czech Republic are of the same haplotype as some members of the other BIN).
A. marginaria is a member of the ennomine tribe Boarmiini, and Agriopis (Hübner, 1825) fell sister to the genus Calamodes (Guenée, 1857) in the study of Murillo-Ramos et al. (2021). The lineage is estimated to have diverged from Calamodes around 30 Ma, and A. marginaria from A. aurantiaria about 9.6 Ma. Agriopis is currently placed in the tribe Bistonini. The genome sequence will be useful in further phylogenetic studies, and also for understanding the evolution of traits such as larval polyphagy and adult flightlessness, as well as the development of melanism (see e.g., Majerus (1998)) in comparison to the Peppered Moth, Biston betularia (L.), whose genome is available (Boyes & Wright, 2022).

Genome sequence report
The genome was sequenced from one male Agriopis marginaria (Figure 1) collected from High Wycombe (see Methods). A total of 34-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 25 missing joins or mis-joins and removed 17 haplotypic duplications, reducing the assembly length by 3.12% and the scaffold number by 24.39%, and decreasing the scaffold N50 by 5.06%.
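The percentages and coverage quoted above can be related to the final assembly totals with simple arithmetic. The sketch below is illustrative only: the function name `value_before` is hypothetical, the pre-curation values it prints are back-calculated estimates rather than reported figures, and it assumes the final totals for this assembly (500.9 Mb in 31 scaffolds) are the post-curation values.

```python
def value_before(after, percent_reduction):
    """Back-calculate a pre-curation value from the post-curation value
    and the reported percentage reduction (hypothetical helper)."""
    return after / (1.0 - percent_reduction / 100.0)

# Final (post-curation) totals reported for the assembly.
final_length_mb = 500.9
final_scaffolds = 31

# Estimated pre-curation values (derived here, not reported in the text).
length_before_mb = value_before(final_length_mb, 3.12)
scaffolds_before = value_before(final_scaffolds, 24.39)

# Approximate HiFi yield implied by 34-fold coverage of the final span.
hifi_yield_gb = 34 * final_length_mb / 1000.0

print(f"Estimated pre-curation length: {length_before_mb:.1f} Mb")
print(f"Estimated pre-curation scaffolds: {round(scaffolds_before)}")
print(f"Implied HiFi yield: {hifi_yield_gb:.1f} Gb")
```

Under these assumptions the scaffold count works out to a whole number (41 scaffolds reduced to 31), which is consistent with the reported 24.39% reduction.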
The final assembly has a total length of 500.9 Mb in 31 sequence scaffolds with a scaffold N50 of 18.4 Mb (Table 1). Most (99.99%) of the assembly sequence was assigned to 29 chromosomal-level scaffolds, representing 28 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2–Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
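For readers unfamiliar with the scaffold N50 statistic, a minimal sketch of its usual definition follows: the length of the scaffold at which the cumulative length, counted from the longest scaffold downwards, first reaches half of the assembly span. The function name `n50` and the toy lengths are illustrative assumptions and are not the scaffold lengths of this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that takes us to half the total span.
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one non-zero length")

# Toy example with made-up lengths (in Mb).
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembly FASTA or its index rather than typed in by hand.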

Genome annotation report
The Agriopis marginaria genome assembly (GCA_932305915.1) was annotated using the Ensembl rapid annotation pipeline (Table 1; Accession Number: GCA_932305915.1). The resulting annotation includes 21,487 transcribed mRNAs from 12,443 protein-coding and 1,975 non-coding genes.
[...] light trap. The specimen was identified by the collector and snap-frozen on dry ice. This specimen was used for RNA sequencing.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilAgrMarg1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing.
Abdomen tissue was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser, receiving multiple impacts. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. HMW DNA was sheared to an average fragment size of 12–20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible [...]
[...] sequencing was performed by the Scientific Operations core at the WSI on the Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (RNA-Seq) instruments. Hi-C data were also generated from head and thorax tissue of ilAgrMarg1 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.
To evaluate the assembly, MerquryFK was used to estimate consensus quality (QV) scores and k-mer completeness (Rhie et al., 2020). The genome was analysed, and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were generated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of software tool versions and sources.
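As a rough illustration of what the k-mer-based metrics measure, the sketch below follows the QV and completeness definitions as we understand them from the Merqury method (Rhie et al., 2020). It is not the MerquryFK implementation: the function names, argument names and example counts are assumptions made for illustration, and no values from this assembly are used.

```python
import math

def kmer_qv(asm_only_kmers, total_asm_kmers, k):
    """Phred-scaled consensus quality from k-mer counts.

    asm_only_kmers: k-mers found in the assembly but absent from the reads
    total_asm_kmers: all k-mers found in the assembly
    """
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers observed
    shared_fraction = 1.0 - asm_only_kmers / total_asm_kmers
    # Probability that a single base is correct, assuming independent errors.
    base_accuracy = shared_fraction ** (1.0 / k)
    error_rate = 1.0 - base_accuracy
    return -10.0 * math.log10(error_rate)

def kmer_completeness(found_in_assembly, reliable_read_kmers):
    """Percentage of reliable read k-mers that are present in the assembly."""
    return 100.0 * found_in_assembly / reliable_read_kmers

# Made-up counts, for illustration only.
print(round(kmer_qv(1, 1000, 21), 1))
print(kmer_completeness(95, 100))
```

A higher QV corresponds to fewer assembly k-mers that lack read support, and completeness approaches 100% as more of the reliable read k-mers are recovered in the assembly.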

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the Agriopis marginaria assembly (GCA_932305915.1). Annotation was created primarily through alignment of transcriptomic data to the genome, [...]
The genome sequence is released openly for reuse. The Agriopis marginaria genome sequencing initiative is part of the Darwin Tree of Life (DToL) project. All raw sequence data and the assembly have been deposited in INSDC databases. Raw data and assembly accession identifiers are reported in Table 1.