The genome sequence of the garden grass-veneer, Chrysoteuchia culmella (Linnaeus, 1758)

We present a genome assembly from an individual male Chrysoteuchia culmella (the garden grass-veneer; Arthropoda; Insecta; Lepidoptera; Crambidae). The genome sequence is 645 megabases in span. The majority of the assembly (99.81%) is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. The complete mitochondrial genome was also assembled and is 15.4 kilobases in length. Gene annotation of this assembly on Ensembl has identified 21,251 protein-coding genes.


Background
The garden grass-veneer, Chrysoteuchia culmella (Linnaeus, 1758), is a micro-moth of the subfamily Crambinae. It is common in grassland, rough meadows and gardens throughout much of Europe, including the British Isles ("Chrysoteuchia Culmella (Linnaeus, 1758)" 2010). It is recognised by its angled subterminal line, golden metallic cilia and size, with a wingspan of 20-24 mm ("Chrysoteuchia Culmella (Garden Grass-Veneer)" 2012). The eggs are laid on various grasses and, after hatching, the larvae feed from September to April on the stem bases of grasses. After pupating in May in a cocoon near the ground, the species is on the wing from mid-May to mid-September (and occasionally until late October). During this time, it can be readily disturbed from grasses during the day and is attracted to light at night (Langmaid et al., 2018). C. culmella larvae are frequent hosts of the endoparasitic larvae of Eriothrix rufomaculata, a parasitoid fly (Paston & Rotheray, 2009). We present a complete genome assembly for C. culmella as part of the Darwin Tree of Life project at the Wellcome Sanger Institute, which aims to sequence the genomes of 70,000 species of eukaryotic organisms in Britain and Ireland.

Genome sequence report
The genome was sequenced from a single male C. culmella (ilChrCulm1) collected from Wytham Woods, Berkshire, UK (Figure 1). A total of 43-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 56-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 18 missing/misjoins and removed 2 haplotypic duplications, reducing the assembly size by 0.64% and the scaffold number by 17.39%, and increasing the scaffold N50 by 4.73%.
The final assembly has a total length of 645 Mb in 57 sequence scaffolds with a scaffold N50 of 22.8 Mb (Table 1). The majority, 99.81%, of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes (numbered by sequence length) and the Z sex chromosome (Figure 2–Figure 5; Table 2).
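For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal Python sketch of how it is conventionally computed. It is an illustration only, not the pipeline used for this assembly; the function name `n50` and the toy scaffold lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total span.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in Mb (not the real assembly).
toy_scaffolds = [40, 30, 20, 10, 5, 5]
print(n50(toy_scaffolds))  # 30
```

In the toy example the total span is 110 Mb, and the two longest scaffolds (40 + 30 = 70 Mb) are the first to cover half of it, so the N50 is 30 Mb.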
The assembly has a BUSCO v5. 13.2 (Manni et al., 2021) completeness of 98.6% (single 98.3%, duplicated 0.3%) using the lepidoptera_odb10 reference set (n=5,286). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
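The BUSCO figures above combine as simple arithmetic: completeness is the sum of the single-copy and duplicated percentages. The sketch below shows that relationship and converts the percentages into approximate marker counts. The function name `busco_summary` is hypothetical, and the counts are estimates only, because the reported percentages are rounded to one decimal place.

```python
def busco_summary(single_pct, duplicated_pct, n_markers):
    """Combine BUSCO single-copy and duplicated percentages.

    Counts are approximate: they are back-calculated from
    percentages that were already rounded in the report.
    """
    complete_pct = single_pct + duplicated_pct
    return {
        "complete_pct": round(complete_pct, 1),
        "complete_count": round(n_markers * complete_pct / 100),
        "single_count": round(n_markers * single_pct / 100),
        "duplicated_count": round(n_markers * duplicated_pct / 100),
    }


# Values reported in the text for lepidoptera_odb10 (n = 5,286).
summary = busco_summary(98.3, 0.3, 5286)
print(summary["complete_pct"])  # 98.6
```

Under these assumptions, roughly 5,212 of the 5,286 reference markers would be recovered as complete.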

Methods
Sample acquisition and nucleic acid extraction
Two C. culmella specimens (ilChrCulm1, genome assembly; ilChrCulm2, Hi-C and RNA-Seq) were collected using a light trap from Wytham Woods, Berkshire, UK (latitude 51.772, longitude -1.338) by Douglas Boyes (University of Oxford). The specimens were identified by Douglas Boyes and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The ilChrCulm1 sample was weighed and dissected on dry ice. Whole organism tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200-ng aliquot of extracted DNA using a 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared to an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from whole organism tissue of ilChrCulm2 in the Tree of Life Laboratory at the WSI using TRIzol, [...]
Table 3 contains a list of all software tool versions used, where appropriate.

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the Chrysoteuchia culmella assembly (GCA_910589605.1). Annotation was created primarily through alignment of transcriptomic data to the genome, with gap filling via protein-to-genome alignments of a select set of proteins from UniProt (UniProt Consortium, 2019).

This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Ma Cristina Del Rincón-Castro
Universidad de Guanajuato, Campus Irapuato-Salamanca, Mexico
The article deals with the genome assembly of a male Chrysoteuchia culmella (the garden grass-veneer; Arthropoda; Insecta; Lepidoptera; Crambidae). The authors report that the genome sequence is 645 megabases in span. The majority of the assembly (99.81%) is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. The article presents complete information for a database of genomic data on this insect; its approval is recommended.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.