The genome sequence of the Large Ear, Amphipoea lucens (Freyer, 1845)

We present a genome assembly from an individual male Amphipoea lucens (the Large Ear; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence is 647.7 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 15.3 kilobases in length.


Background
The Large Ear, Amphipoea lucens, is a medium-sized noctuid moth, often marginally larger than other species in the genus (forewing length up to 17 mm), but with similar markings such as usually orange orbicular and reniform stigmata on the forewing. The moth is thus hard to separate on external features, despite strong bands on the underside of the wings; in the male genitalia the longer arm of the clasper is usually curved distally (Waring et al., 2017), and there are also genitalic differences in the female (British Lepidoptera). It is monovoltine, flying from late July to early October in the UK (Randle et al., 2019), and overwintering as an egg.
The Large Ear is found in damp habitats, such as acid moorland and marshes, the larva feeding on purple moor grass (Molinia caerulea (L.) Moench) and common cottongrass (Eriophorum angustifolium Honck.); the adult can be found at flowers of rushes and heathers (Waring et al., 2017).
Amphipoea lucens is generally widespread in the western Palaearctic, from southern Scandinavia to Italy; there seem to be relatively few records for western Russia, and it is found as far east as northern Japan and Korea (GBIF Secretariat, 2022). Populations in the UK appear to have declined since 1970 (Conrad et al., 2006; Randle et al., 2019). In the UK it is local and more prevalent towards the north, as far as the Shetlands (NBN Atlas Partnership, 2021), where records have been considered to represent immigrant individuals (Waring et al., 2017).
The Large Ear is currently placed in the noctuid tribe Apameini, and the genus was included in a distal part of this tribe in Toussaint et al. (2012) (Figure 1) as sister to a group containing the genera Lateroligia, Coenobia, Luperina, Mesapamea and Mesoligia. The genome sequence should be useful not only in phylogeny but also in studies of speciation, considering that Amphipoea species are notoriously difficult to identify externally. The current picture is further clouded by mitochondrial data, in that two DNA barcode clusters occur on BOLD (8 March 2023), both found in the UK. Some exemplars fall in the cluster BOLD:AAB5368 (mostly identified as Amphipoea fucosa (Freyer, 1830), others as A. lucens and a few as A. crinanensis (Burrows, 1908)), whilst other exemplars fall in the BIN BOLD:AAC7752 (most identified as A. oculea (Linnaeus, 1761) and a few as A. lucens). BOLD:AAC7752 is about 2.37% pairwise divergent from BOLD:AAB5368.
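To illustrate what a pairwise divergence figure such as 2.37% means, the following minimal sketch computes an uncorrected p-distance between two aligned sequences. It assumes the simplest possible distance (proportion of differing sites, skipping gaps and ambiguous bases); the distance model actually used by BOLD may differ, and the function name `p_distance` and the toy sequences are hypothetical, not real barcode data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, so the denominator is the number of comparable sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # not a comparable site
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (hypothetical): one mismatch in ten sites
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
print(f"{100 * p_distance(toy_a, toy_b):.2f}%")  # 10.00%
```

Under this simple measure, a divergence of 2.37% over a full-length 658 bp COI barcode would correspond to roughly 16 differing sites.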
The genome of Amphipoea lucens was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Amphipoea lucens, based on one male specimen from Beinn Eighe National Nature Reserve, Scotland.

Genome sequence report
The genome was sequenced from one male Amphipoea lucens (Figure 1) collected from Beinn Eighe (see Methods). A total of 36-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 17 missing joins or mis-joins and removed nine haplotypic duplications, reducing the assembly length by 1.96% and the scaffold number by 10.42%.
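The figures in this paragraph can be related to the final assembly statistics with some simple arithmetic. The sketch below is illustrative only: it assumes that coverage is expressed relative to the final assembly span (it may instead have been calculated against an estimated genome size), and it back-calculates approximate pre-curation values from the reported percentage reductions. The constant and function names are hypothetical.

```python
ASSEMBLY_SPAN_BP = 647_718_368  # final assembly span (Figure 2 legend)
COVERAGE_FOLD = 36              # reported HiFi coverage
FINAL_SCAFFOLDS = 43            # final scaffold count (Table 1)
LENGTH_REDUCTION = 0.0196       # 1.96% reduction in length during curation
SCAFFOLD_REDUCTION = 0.1042     # 10.42% reduction in scaffold number


def value_before_reduction(final_value: float, fractional_reduction: float) -> float:
    """Invert final = initial * (1 - reduction) to recover the initial value."""
    return final_value / (1.0 - fractional_reduction)


# Approximate HiFi yield implied by 36-fold coverage (assumption: relative to span)
hifi_bases = COVERAGE_FOLD * ASSEMBLY_SPAN_BP
pre_curation_length = value_before_reduction(ASSEMBLY_SPAN_BP, LENGTH_REDUCTION)
pre_curation_scaffolds = value_before_reduction(FINAL_SCAFFOLDS, SCAFFOLD_REDUCTION)

print(f"HiFi yield: {hifi_bases / 1e9:.1f} Gb")                    # 23.3 Gb
print(f"Pre-curation length: {pre_curation_length / 1e6:.1f} Mb")  # 660.7 Mb
print(f"Pre-curation scaffolds: {pre_curation_scaffolds:.1f}")     # 48.0
```

The scaffold figure is consistent with curation reducing 48 scaffolds to 43 (5/48 is about 10.42%).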
The final assembly has a total length of 647.7 Mb in 43 sequence scaffolds with a scaffold N50 of 22.3 Mb (Table 1). Most (99.91%) of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
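For readers unfamiliar with the scaffold N50 statistic quoted above, the following minimal sketch shows one common way to compute it: sort scaffold lengths in descending order and report the length at which the running total first reaches half of the assembly span (N90 uses 90%). The function name `nx` and the toy lengths are hypothetical; the real values for this assembly are given in Table 1 and the Figure 2 legend.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx length: the length of the scaffold at which the running
    total of descending-sorted scaffold lengths first reaches `fraction`
    of the total span (fraction=0.5 gives N50, 0.9 gives N90)."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    threshold = fraction * sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Toy scaffold lengths (hypothetical), total span 150
toy_lengths = [50, 40, 30, 20, 10]
print(nx(toy_lengths, 0.5))  # 40
print(nx(toy_lengths, 0.9))  # 20
```

In the toy example, the two longest scaffolds (50 + 40 = 90) are the first to cover at least half of the span of 150, so the N50 is 40.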
The estimated Quality Value (QV) of the final assembly is 67.9 with k-mer completeness of 100%, and the assembly has a BUSCO v5.3.2 completeness of 98.7% (single 98.1%, duplicated 0.6%) using the lepidoptera_odb10 reference set (n = 5,286).
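As a rough guide to interpreting these figures, the sketch below converts a Phred-scaled QV into a per-base error probability using the standard relation QV = -10 log10(error rate), and converts the BUSCO percentage into an approximate gene count. The function names are hypothetical, and because the reported percentages are rounded, the derived counts are approximate.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability implied by a Phred-scaled quality value."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Phred-scaled quality value for a given per-base error probability."""
    return -10 * math.log10(error_rate)


def busco_count(percent: float, n_total: int) -> int:
    """Approximate number of BUSCO genes corresponding to a percentage."""
    return round(percent / 100 * n_total)


error_rate = qv_to_error_rate(67.9)
expected_errors = error_rate * 647_718_368  # assembly span in bp

print(f"Error rate: {error_rate:.2e} per base")              # 1.62e-07 per base
print(f"Expected errors in assembly: {expected_errors:.0f}")  # about 105
print(f"Complete BUSCOs: {busco_count(98.7, 5286)}")          # 5217
```

A QV of 67.9 therefore corresponds to roughly one consensus error per six million bases.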
Metadata about the specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found here.

Sample acquisition and nucleic acid extraction
A male Amphipoea lucens specimen (ilAmpLuce6) was collected from Beinn Eighe National Nature Reserve, Scotland (latitude 57.63, longitude -5.35) on 10 September 2021, using an aerial net. The specimen was collected and identified by David Lees (Natural History Museum) and dry frozen at -80°C.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilAmpLuce6 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Head and thorax tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample.
The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit.
Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on the Pacific Biosciences SEQUEL II (HiFi) instrument. Hi-C data were also generated from tissue of ilAmpLuce6 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Morthala Shankara Sai Reddy
Department of Entomology, Dr Rajendra Prasad Central Agricultural University, Samastipur, Bihar, India

The report presents impressive work for the field of insect genomics with the sequencing and assembly of the genome of Amphipoea lucens, a captivating medium-sized noctuid moth. The meticulous effort involved in collecting a male specimen from Scotland and employing advanced techniques like Pacific Biosciences long reads and chromosome conformation Hi-C data has resulted in a remarkable accomplishment. The comprehensive assembly, composed of 43 sequence scaffolds spanning 647.7 Mb, showcases the dedication and expertise of the researchers. The assignment of 31 chromosomal-level scaffolds, including the identification of autosomes and the Z sex chromosome, demonstrates the meticulousness of the analysis. The assembly quality, highlighted by a high QV score of 67.9 and a remarkable BUSCO completeness of 98.7%, attests to the precision and accuracy of the work. Furthermore, the successful assembly of the mitochondrial genome adds to the significance of this endeavor. This invaluable genome sequence promises to greatly contribute to our understanding of the phylogenetics and speciation processes within the fascinating genus Amphipoea. Overall, this study represents a significant leap forward in genomic research, shedding light on the intricate molecular landscape of this captivating moth species.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes

The article is on the genome of the Large Ear, a moth found across Europe. The method used for the assembly is good, following the latest workflows. It was great to see such a high level of completion achieved, and I am looking forward to seeing the completion of annotation and comparative analysis thereafter.
Just a couple of minor edit suggestions: I noticed that "Large Ear" and the scientific name are used interchangeably between paragraphs; it would be good to have consistency. The version of MerquryFK is missing.
It might be good to briefly mention the throughput, N50 etc of pacbio data generated and hi-c data.
Again great genome report, looking forward to the next stage.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

Figure 2. Genome assembly of Amphipoea lucens, ilAmpLuce6.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 647,718,368 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (33,593,231 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (22,308,434 and 16,611,211 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilAmpLuce6.1/dataset/CANNSA01/snail.

Figure 5. Genome assembly of Amphipoea lucens, ilAmpLuce6.1: Hi-C contact map. Hi-C contact map of the ilAmpLuce6.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=LlzrRwP5TQK2GbsjsUomLg.

Table 3. Software tools and versions used (columns: software tool, version, source).
The genome sequence is released openly for reuse. The Amphipoea lucens genome sequencing initiative is part of the Darwin Tree of Life (DToL) project. All raw sequence data and the assembly have been deposited in INSDC databases. The genome will be annotated using available RNA-Seq data and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1. Members of the Tree of Life Core Informatics collective are listed here: https://doi.org/10.5281/zenodo.5013541. Members of the Darwin Tree of Life Consortium are listed here: https://doi.org/10.5281/zenodo.4783558.

Guide to the Moths of Great Britain and Ireland: Third Edition. Bloomsbury Wildlife Guides, 2017.

○ Once Amphipoea lucens is used, except when it is used as the first word of new sentences, you can use A. lucens throughout the paper.
○ Use either a common name or a scientific name throughout.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.