The genome sequence of the Peppered Grey, Eudonia truncicolella (Stainton, 1849)

We present a genome assembly from an individual male Eudonia truncicolella (the Peppered Grey; Arthropoda; Insecta; Lepidoptera; Crambidae). The genome sequence is 499.1 megabases in span. Most of the assembly is scaffolded into 30 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 15.38 kilobases in length.


Background
The Peppered Grey (otherwise commonly known as the Ground-moss Grey: Parsons & Clancy, 2023), Eudonia truncicolella (Stainton, 1849), is a greyish scopariine pyralid moth with a mottled appearance and roughly scaled darker markings, measuring about 18-23 mm in wingspan (Goater et al., 1986) and 9-11 mm in length (Lewis, 2023). It is superficially similar to several scopariine species, others of which exhibit a rather X-shaped dark reniform stigma. In the United Kingdom it is particularly similar to the Moorland Grey, E. murana (Curtis, 1827), which is absent from southern England and Ireland and from which it can reliably be separated by dissection. The dark postmedian line of the forewing has a more indented appearance than in E. murana, but the forewing of that species has a sclerotised hook on the underside in addition to the usual coupling system (Parsons & Clancy, 2023: 302). It is also quite similar to Scoparia ambigualis (Treitschke, 1829), whose forewing has greyer markings and a yellowish hue (Parsons & Clancy, 2023). For an online guide to British species see Lewis (2023). However, the female genitalia are similar to those of E. lacustrata (Panzer, 1804), and elsewhere in the Palaearctic (China), the male genitalia are similar to those of E. wolongensis (Li et al., 2012), and those of both sexes (see Lepiforum, 2023; Parsons & Clancy, 2023: 467, 478) are similar to E. apicifusca Sasaki, 1999 (Li et al., 2012), so DNA barcoding is also helpful for verification.
The Peppered Grey is widespread in woodland in the British Isles, also occurring in heathland and gardens, and is locally common, ranging northwards as far as the Inner Hebrides and Orkney, and westwards to the Isle of Man and Ireland (Goater et al., 1986; Parsons & Clancy, 2023). It is also very widespread in the Palaearctic, from western Ireland, Spain and northern Scandinavia as far east as China and Japan (GBIF Secretariat, 2023; Li et al., 2012).
The adult moth flies from mid-June to October (mostly in July and August) in Britain, resting by day on tree trunks, from which it is very flighty, and after dusk it has been recorded visiting the flowers of Field Scabious Knautia arvensis (Parsons & Clancy, 2023). The dark brownish green larva, with a dorsal dark line flanked by rather quadrate spots and lateral crescentic spots on each segment (Lepiforum, 2023; Parsons & Clancy, 2023: 312), feeds from September to June in silken galleries under moss such as Hypnum cupressiforme (Lepiforum, 2023), Dicranum scoparium, and Campylopus sp. (Parsons & Clancy, 2023) or sometimes under stones (Goater et al., 1986). It usually pupates as an orange-brown pupa in moss, from June to July (Goater et al., 1986).
DNA barcode sequences on BOLD (30/11/2023) belong to a single BIN cluster, BOLD:AAB1558, with up to about 1.28% intraspecific divergence, and its two nearest neighbours on BOLD exhibit about 2.8-4% pairwise divergence; these are Eudonia "sp. 1 WL-2017" from China (BOLD:AED8759) and then Scoparia exhibitalis Walker, [1866] from Australia (BOLD:AAC3288). E. truncicolella was treated in a phylogeny by Léger et al. (2019) based on five nuclear genes and one mitochondrial gene and belongs to a clade comprising Eudonia lacustrata and E. mercurella (L., 1758) among the described species treated there.
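To make the divergence percentages quoted above concrete, the following is a minimal sketch of how an uncorrected pairwise divergence (p-distance) between two aligned barcode sequences can be computed. It is an illustration only: the function name `p_distance` and the toy sequences are hypothetical, the sequences are not real barcode data, and BOLD's own distance model may differ from the uncorrected measure assumed here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise divergence between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, so the result is differences / compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # site cannot be compared
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical toy alignment (NOT real barcode data): 50 sites, 2 substitutions.
toy_a = "ACGT" * 12 + "AC"
toy_b = toy_a[:10] + "T" + toy_a[11:35] + "A" + toy_a[36:]

divergence_percent = 100 * p_distance(toy_a, toy_b)
print(f"pairwise divergence: {divergence_percent:.2f}%")
```

With real data, the two inputs would be barcode sequences taken from a common alignment, and the value would be compared against thresholds such as the 1.28% and 2.8-4% figures reported in the text.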
The genome of the Peppered Grey, Eudonia truncicolella, was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Eudonia truncicolella, based on one male specimen from Beinn Eighe National Nature Reserve, Scotland.

Genome sequence report
The genome was sequenced from one male Eudonia truncicolella (Figure 1) collected from Beinn Eighe National Nature Reserve (see Methods). A total of 51-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 5 missing joins or mis-joins and removed one haplotypic duplication, increasing the scaffold count by 1.
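As a rough illustration of what 51-fold coverage means, the sketch below relates fold coverage to the total number of read bases and the genome span. It assumes that coverage is expressed relative to the final assembly span (499,132,009 bp, as given in the Figure 2 caption); the note does not state which denominator was used, and the helper names are hypothetical.

```python
def fold_coverage(total_read_bases: float, genome_span_bp: float) -> float:
    """Fold coverage = total sequenced bases / genome span."""
    if genome_span_bp <= 0:
        raise ValueError("genome span must be positive")
    return total_read_bases / genome_span_bp


def bases_for_coverage(coverage: float, genome_span_bp: float) -> float:
    """Total read bases implied by a given fold coverage."""
    return coverage * genome_span_bp


# Values taken from the text; using the assembly span as denominator is an assumption.
ASSEMBLY_SPAN_BP = 499_132_009
REPORTED_COVERAGE = 51

implied_bases = bases_for_coverage(REPORTED_COVERAGE, ASSEMBLY_SPAN_BP)
print(f"implied HiFi yield: {implied_bases / 1e9:.1f} Gb")
```

Under this assumption, the reported coverage corresponds to roughly 25.5 Gb of HiFi read data.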
The final assembly has a total length of 499.1 Mb in 46 sequence scaffolds with a scaffold N50 of 18.4 Mb (Table 1). The snailplot in Figure 2 provides a summary of the assembly statistics, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Most (99.89%) of the assembly sequence was assigned to 30 chromosomal-level scaffolds, representing 29 autosomes and the Z sex chromosome. The Z chromosome was assigned by alignment to that of Eudonia lacustrata (GCA_947562085.1) (Boyes et al., 2023). Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
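For readers unfamiliar with the scaffold N50 statistic quoted above, the following minimal sketch shows how N50 and N90 are conventionally derived from a list of scaffold lengths: scaffolds are sorted from longest to shortest, and Nx is the length of the scaffold at which the running total first reaches x% of the assembly. The function name `nx` and the toy lengths are hypothetical and do not correspond to the ilEudTrun2.1 scaffolds.

```python
from typing import Sequence


def nx(lengths: Sequence[int], percent: int) -> int:
    """Return the Nx value (e.g. percent=50 for N50) of a set of scaffold lengths."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


# Hypothetical toy scaffold lengths (arbitrary units), total = 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)
print(f"N50 = {toy_n50}, N90 = {toy_n90}")
```

Applied to the 46 scaffold lengths of the deposited assembly, the same procedure would be expected to reproduce the N50 and N90 values reported in Table 1 and Figure 2.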

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on a Pacific Biosciences SEQUEL II instrument. Hi-C data were also generated from thorax tissue of ilEudTrun2 using the Arima2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly, curation and evaluation
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination and corrected as described previously (Howe et al., 2021). Manual curation was performed using HiGlass (Kerpedjiev et al., 2018) and Pretext (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi [...] et al., 2020) or MITOS (Bernt et al., 2013) and uses these annotations to select the final mitochondrial contig and to ensure the general quality of the sequence.
Table 3 contains a list of relevant software tool versions and sources.

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use.
The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

One aspect that could improve understanding would be to provide information on the chromosome count, since most of the assembly sequences were assigned to 30 chromosome-level scaffolds.
Additionally, I suggest including information about the extent of differences between the two haplotypes deposited in the data. This would be valuable for future study of inter- and intraspecies differences.
Lastly, it would be informative to mention the similarity between the Z chromosome of E. truncicolella and that of E. lacustrata, considering that the Z chromosome was assigned using the sequence of the latter.
The following is a report from the co-reviewer: I am impressed by the thorough genome data acquisition protocol and the use of the latest and most suitable tools for assembly, data processing, and evaluation. However, to make the research outcomes more widely usable by researchers specializing in moss and other insects, I have a few minor comments.
1. The significance of Eudonia truncicolella as a subject for molecular research should be described [in Background].
Although the taxonomy, distribution, and ecological knowledge of E. truncicolella are comprehensively covered in the Background, if there have been molecular studies using this or closely related species, it would be beneficial to include details of such studies.This would help convey the significance of sequencing the genome of this species more effectively to the readers.
2. From Background, it is clear that E. truncicolella often has morphological similarities with closely related species, making it difficult to fully cover the classification system based solely on morphological characteristics. In addition to the phrase "DNA barcoding is also helpful for verification," it would be helpful to add a sentence emphasizing the utility of DNA barcoding. For example, a statement like "Combining morphological species taxonomy with DNA barcoding establishes the classification system for this species" could be added.

This genome note reports a high-quality assembly and annotation of the genome of the Peppered Grey, Eudonia truncicolella.
The background section is well written, and there is also plenty of species information, which adds to the data note. The methods used were appropriate for this assembly, and the resultant genome is of high quality and well validated. The data will be an invaluable resource for moth researchers in genomics and entomology.
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Bioinformatics and molecular genetics of insects
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Eudonia truncicolella, ilEudTrun2.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 499,132,009 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (27,890,741 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (18,437,561 and 11,974,810 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Eudonia%20truncicolella/dataset/CASGGC01/snail.

Figure 5. Genome assembly of Eudonia truncicolella, ilEudTrun2.1: Hi-C contact map of the ilEudTrun2.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=aWzNX_veSgmZ678sikdPvw.

3. The reference to BOLD should include a citation unless there is a specific reason not to [in Background]. See [Ref 1].
4. The method of species identification for the specimen should be briefly supplemented [in Methods]. It should be briefly described whether the species identification was based on body markings, genitalia morphology, or another morphological comparison, or a combination of these factors.

© 2024 Lopes T. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Thayná Lopes, Laboratório de Citogenética e Entomologia Molecular, Universidade Estadual de Londrina, Londrina, State of Paraná, Brazil