The genome sequence of the Pinion-streaked Snout, Schrankia costaestrigalis (Stephens, 1834)

We present a genome assembly from an individual male Schrankia costaestrigalis (the Pinion-streaked Snout; Arthropoda; Insecta; Lepidoptera; Erebidae). The genome sequence is 572.0 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 16.1 kilobases in length. Gene annotation of this assembly on Ensembl identified 19,453 protein coding genes.


Background
The Pinion-streaked Snout Schrankia costaestrigalis is a slender moth in the family Erebidae, found widely across Europe. The species has also been recorded in Russia, China, Japan, the Canary Islands, Australia and New Zealand (GBIF Secretariat, 2022). S. costaestrigalis rests with its narrow forewings held flat over the hindwings, giving the moth a triangular appearance; the moth also has prominent palps that project forward. The forewings are pale grey-brown crossed by a darker jagged band outlined in black; this wing pattern is similar to that of a closely related species, the white-line snout S. taenialis. A third moth, the 'Autumnal Snout', was originally given species status (S. intermedialis) but later suggested to be a possible naturally-occurring hybrid between S. costaestrigalis and S. taenialis (Waring et al., 2017); support for the hybrid hypothesis comes from a pattern of shared bands produced by PCR amplification of microsatellite DNA (Anderson et al., 2007). There is scope for testing the hybrid hypothesis further using a larger number of samples and additional genomic markers.
In Britain, S. costaestrigalis is found predominantly in marshes, bogs, fens and damp woodland across England, Wales, Scotland and Northern Ireland (NBN Atlas Partnership, 2022; Thompson & Nelson, 2003). The adult moth is on the wing from June to October in southern England; this extended flight period is thought to reflect two generations per year with the larvae of the second generation overwintering (Waring et al., 2017). There has long been uncertainty over the larval food plant used in the wild. In captivity, larvae have been reared on the flowers of wild thyme Thymus sp. (Waring et al., 2017), with an unverified report that this is supplemented by cannibalistic tendencies (Stokoe, 1948). More recent findings from China and Japan suggest that the natural food is more likely to be underground roots and tubers. For example, S. costaestrigalis has emerged as a new crop pest of potato Solanum tuberosum in Guangxi Province, China, with the larvae living underground and eating tubers; at one affected site crop losses approached 90% across almost 300 hectares of crop (Xian et al., 2022; Zeng et al., 2020). Similarly, in Tanegashima Island, Japan, S. costaestrigalis has been recorded as a pest of broad bean Vicia faba, again with larvae living underground and eating roots and root nodules (Yoshimatsu & Nishioka, 1995). The latter authors note that the adult moths can also live in underground spaces in the soil, a habit comparable to the cave-dwelling lifestyle of two Schrankia species in Hawaii (Howarth et al., 2020) and S. costaestrigalis in Tenerife (Oromí, 2018).
We report here the complete genome sequence of S. costaestrigalis, obtained from a single individual collected from a fenland habitat in southern England and scaffolded to chromosome level using chromatin conformation data from a second individual collected at the same site. A genome sequence for this species will facilitate research into the poorly understood biology of this unusual moth and may prove beneficial in designing control strategies when appropriate.

Genome sequence report
The genome was sequenced from one male Schrankia costaestrigalis (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.31). A total of 38-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 79-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 101 missing joins or mis-joins and removed 34 haplotypic duplications, reducing the assembly length by 2.33% and the scaffold number by 15.09%, and decreasing the scaffold N50 by 4.6%.
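As a rough illustration of what the curation percentages imply, the sketch below back-calculates approximate pre-curation values from the final figures reported in this note (572.0 Mb in 45 scaffolds). It assumes that each percentage is a reduction relative to the pre-curation value; the helper name `pre_curation_value` is hypothetical, and the results are estimates rather than the recorded pre-curation statistics.

```python
# Back-calculate approximate pre-curation values from the reported
# post-curation figures and percentage reductions.
# Assumption: each percentage is a reduction relative to the pre-curation value.


def pre_curation_value(final_value, percent_reduction):
    """Return the value before a reduction of `percent_reduction` percent."""
    return final_value / (1.0 - percent_reduction / 100.0)


final_length_mb = 572.0  # final assembly span reported in this note
final_scaffolds = 45     # final scaffold count reported in this note

pre_length_mb = pre_curation_value(final_length_mb, 2.33)
pre_scaffolds = pre_curation_value(final_scaffolds, 15.09)

print(f"Estimated pre-curation length: {pre_length_mb:.1f} Mb")
print(f"Estimated pre-curation scaffold count: {pre_scaffolds:.1f}")
```

Under this assumption the scaffold count works out very close to a whole number, which suggests the reported percentages are internally consistent with the final scaffold total.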
The final assembly has a total length of 572.0 Mb in 45 sequence scaffolds with a scaffold N50 of 20.1 Mb (Table 1). Most (99.91%) of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
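For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how an Nx value can be computed from a list of scaffold lengths. The function name `nx` and the toy lengths are hypothetical and for illustration only; they are not the scaffold lengths of this assembly, and this is not the pipeline code used to produce Table 1.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx scaffold length for the given fraction of total span.

    The Nx value is the length of the scaffold at which the running total,
    taken over scaffolds sorted from longest to shortest, first reaches
    `fraction` of the total assembly span (fraction=0.5 gives N50).
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    threshold = sum(lengths) * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return min(lengths)  # only reached if fraction > 1


# Toy scaffold lengths (hypothetical, arbitrary units).
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = nx(toy_lengths, 0.5)
toy_n90 = nx(toy_lengths, 0.9)
print(toy_n50, toy_n90)
```

Applied to real scaffold lengths read from an assembly FASTA index, the same function would be expected to give the N50 and N90 values reported in the figure captions.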
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found at https://links.tol.sanger.ac.uk/species/411963.

Sample acquisition and nucleic acid extraction
Two Schrankia costaestrigalis specimens (ilSchCost1 and ilSchCost2) were collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.77, longitude -1.31) on 24 August 2019. The specimens were caught using a light trap in fenland habitat by Douglas Boyes (University of Oxford). The specimens were identified by the collector using a field identification, and then snap-frozen on dry ice. Individual ilSchCost1 (specimen Ox000214) was used for genome sequencing, while ilSchCost2 (specimen Ox000215) was used for Hi-C scaffolding.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilSchCost1 sample was weighed and dissected on dry ice. Whole organism tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle.
High molecular weight (HMW) DNA was extracted using the

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Schrankia costaestrigalis assembly (GCA_905475405.1, Apr 2021) in Ensembl Rapid Release.

Ethics and compliance issues
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the Darwin Tree of Life Project Sampling Code of Practice.
The genome is of high quality and the pipeline is the same as for the other genomes produced by the Darwin Tree of Life Project (DToL). All links and datasets are publicly available.
Comments:
- The abstract indicates that the genome assembly is from a single male, whereas the methods indicate that two individuals were used.
- Interestingly, the number of annotated CDS (using the BRAKER2 pipeline in Ensembl Rapid Release) varies significantly between Lepidoptera in the DToL. Here it is in the higher range (~19,000), whereas it is often around 16,000. This suggests that these annotations include proteins associated with transposable elements.
- Should the link to the annotation available in the Ensembl Rapid Release be included in the Data Availability section?
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular evolution
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Nicolas Dowdy
Milwaukee Public Museum, Milwaukee, Wisconsin, USA
The article provides a good discussion of what is known about moths in the genus Schrankia (S. costaestrigalis in particular) and presents a new genomic resource for this economically important species. The quality of the results is somewhat lower than that of comparable erebid genomes produced by this group in their past work; however, this genome appears to be of high quality by current standards.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Evolution and Systematics of Lepidoptera
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Schrankia costaestrigalis, ilSchCost1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 571,991,991 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (27,469,003 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (20,082,482 and 13,139,496 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilSchCost1.1/dataset/CAJQGD01.1/snail.
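The size-ordered binning described in the Figure 2 caption can be sketched as follows. This is an illustrative reading of the caption, not BlobToolKit's implementation: the function name `snail_bins` and the toy scaffold lengths are hypothetical, and each bin is assumed to take the length of the scaffold that covers the cumulative position at the end of that bin.

```python
from bisect import bisect_left
from itertools import accumulate


def snail_bins(lengths, n_bins=1000):
    """Assign a scaffold length to each of n_bins equal slices of the assembly.

    Scaffolds are sorted from longest to shortest; bin i (1-based) takes the
    length of the scaffold containing cumulative position total * i / n_bins.
    """
    ordered = sorted(lengths, reverse=True)
    cumulative = list(accumulate(ordered))
    total = cumulative[-1]
    bins = []
    for i in range(1, n_bins + 1):
        position = total * i / n_bins
        # Clamp the index to guard against floating-point overshoot.
        idx = min(bisect_left(cumulative, position), len(ordered) - 1)
        bins.append(ordered[idx])
    return bins


# Toy example with hypothetical lengths and only 10 bins for readability.
toy_bins = snail_bins([50, 30, 20], n_bins=10)
print(toy_bins)

# With 1,000 bins, each bin of the assembly in Figure 2 spans this many bp.
bin_span_bp = 571_991_991 / 1000
print(f"{bin_span_bp:,.0f} bp per bin")
```

Under this reading, the value in the bin that ends at 50% of the cumulative span corresponds to the N50 arc, and the bin ending at 90% to the N90 arc.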

Figure 5. Genome assembly of Schrankia costaestrigalis, ilSchCost1.1: Hi-C contact map. Hi-C contact map of the ilSchCost1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=NlebYlC5SAKV3EjzWxZAzA.

Reviewer Report 02 May 2024
https://doi.org/10.21956/wellcomeopenres.21491.r80006
© 2024 Dowdy N. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.