The genome sequence of the Dark Arches Apamea monoglypha (Hufnagel, 1766)

We present a genome assembly from an individual male Apamea monoglypha (the Dark Arches; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence is 576 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 16.5 kilobases in length. Gene annotation of this assembly on Ensembl has identified 17,963 protein-coding genes.

Background
The Dark Arches, Apamea monoglypha (Hufnagel, 1766), is a large (45-55 mm wingspan) noctuid moth that is common in Europe and has scattered records from elsewhere across the western Palearctic. It can be extremely abundant in some locations in the south of the UK. A recent review of macro-moth status classified A. monoglypha as being widespread and abundant in Great Britain and placed it in the 'Least Concern' IUCN Red List category (Fox et al., 2019). Adults are primarily on the wing from June to September, with a peak abundance in July; the larvae feed on a range of grasses before overwintering among the bases and roots of these plants.
There is sometimes a second brood later in the year in the more southern parts of the UK (Knill-Jones, 2005).
A. monoglypha is easily recognised by its size, distinct oval and kidney markings, and a 'W'-shaped line at the outer edge (termen) of the forewings; overall colouration is variable, with specimens ranging from a light cream colour through to almost fully black. A melanic form (f. aethiops (Tutt, 1891)) has been recorded, which lacks the typical markings. Kettlewell considered the melanic form to be an example of ancient and non-industrial melanism (albeit with localised incidences of industrial melanism) under the control of a single locus with the melanic form dominant (Kettlewell, 1973), although possibly with influence from other genetic or environmental factors (Bishop et al., 1976). The more variable background colouration is likely polygenic (Cockayne, 1938; Fraiers et al., 1994).
A genome assembly for Apamea monoglypha will be invaluable in identifying the genetic basis of colour polymorphism in this species and will facilitate further research into this often abundant and ecologically important species.

Genome sequence report
The genome was sequenced from one male A. monoglypha specimen (Figure 1) collected from Wytham Woods, UK (latitude 51.77, longitude -1.34). A total of 25-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 60-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 52 missing joins or mis-joins and removed 14 haplotypic duplications, reducing the assembly length by 1.02% and the scaffold number by 54.79%, and increasing the scaffold N50 by 7.26%.
The final assembly has a total length of 575.7 Mb in 33 sequence scaffolds with a scaffold N50 of 19.9 Mb (Table 1). Most (99.99%) of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). The assembly has a BUSCO v5.3.2 (Manni et al., 2021) completeness of 99.0% using the lepidoptera_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.

Sample acquisition and nucleic acid extraction
One A. monoglypha specimen (ilApaMono1) was collected in Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.77, longitude -1.34) on 20 July 2020 using a light trap. The specimen was collected and identified by Douglas Boyes (University of Oxford) and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilApaMono1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Abdomen tissue was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser, receiving multiple impacts. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 20 ng aliquot of extracted DNA using 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and Qubit Fluorometer and Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (10X) instruments. Hi-C data were also generated from head and thorax tissue of ilApaMono1 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly
Assembly was carried out with Hifiasm (Cheng et al., 2021), and haplotypic duplication was identified and removed with [...]. Table 3 contains a list of all software tool versions used, where appropriate.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Apamea monoglypha assembly (GCA_911387795.2) in Ensembl Rapid Release.

Min-jin Han
State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, BeiBei, Chongqing, China
I greatly admire the authors for completing the high-quality genome of this species. It would be even better if there could be improvements in the annotation of genomic repetitive sequences and the annotation and quality assessment of protein-coding genes. Additionally, in most moths the female is the heterogametic sex. If a female individual could also be sequenced, the genome would be more representative.

Hongmei Li-Byarlay
Central State University, Wilberforce, Ohio, USA
This data note titled "The genome sequence of the Dark Arches Apamea monoglypha (Hufnagel, 1766)" aligns with the objectives proposed by the Darwin Tree of Life Consortium and consequently by the Earth Biogenome Project. This genome project aimed to enrich our current knowledge in genetics and conservation. It provided unique information about colour polymorphism, evolution, and ecology.
The rationale for creating the genomic data is clearly described. The protocols are appropriate and provide adequate information and detail of the technical procedure allowing others to repeat it.
The dataset is presented in a valid and accessible format, and deposited in the NCBI database.
For genetics, a female moth has a heterogametic sex chromosome system, with most females having a WZ constitution. Males have a ZZ constitution. Is there a plan to sequence a female sample to capture the full WZ constitution? It would be helpful to provide genetic information on this species in the introduction and to point out how the male genome sequence plays a role in studying this species or other macro-moth species.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Entomology, Genomics, Molecular Biology, Behavioral Ecology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Apamea monoglypha, ilApaMono1.2: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 575,683,298 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (34,587,019 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (19,918,846 and 14,112,846 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilApaMono1.2/dataset/CAJVQS02/snail.
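For readers unfamiliar with the N50 and N90 values quoted in the Figure 2 legend, the following is a minimal Python sketch of how such statistics are conventionally computed from a list of scaffold lengths. It is illustrative only and is not the code used by BlobToolKit or by the authors; the function name nx_length and the toy scaffold lengths are hypothetical, since the individual scaffold lengths of ilApaMono1.2 are not listed in this note.

```python
def nx_length(lengths, percent):
    """Return the Nx scaffold length for an integer percentage x.

    Scaffolds are sorted from longest to shortest; the Nx value is the
    length of the scaffold at which the running total first reaches
    x percent of the summed assembly length (x = 50 gives N50).
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1-100")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return min(lengths)  # not reached for valid input; kept as a guard


# Hypothetical toy assembly (arbitrary units), NOT the ilApaMono1.2 scaffolds.
toy_lengths = [40, 30, 20, 5, 3, 2]
toy_n50 = nx_length(toy_lengths, 50)
toy_n90 = nx_length(toy_lengths, 90)
print(toy_n50, toy_n90)
```

Applied to the real scaffold lengths of an assembly (for example, as read from a FASTA index), the same function would be expected to reproduce the N50 and N90 values reported for that assembly, assuming the same definition is in use.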

Figure 3. Genome assembly of Apamea monoglypha, ilApaMono1.2: GC coverage. BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilApaMono1.2/dataset/CAJVQS02/blob.

Figure 4. Genome assembly of Apamea monoglypha, ilApaMono1.2: cumulative sequence. BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilApaMono1.2/dataset/CAJVQS02/cumulative.

Figure 5. Genome assembly of Apamea monoglypha, ilApaMono1.2: Hi-C contact map. Hi-C contact map of the ilApaMono1.2 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=QFTmrUfDS3i4DilDoNn-Cw.

Open Peer Review Current Peer Review Status: Version 1
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Partly
Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.