The genome sequence of the Coxcomb Prominent, Ptilodon capucinus (Linnaeus, 1758)

We present a genome assembly from an individual male Ptilodon capucinus (the Coxcomb Prominent; Arthropoda; Insecta; Lepidoptera; Notodontidae). The genome sequence is 348.7 megabases in span. The assembly is scaffolded into 31 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 15.38 kilobases in length. Gene annotation of this assembly on Ensembl identified 16,968 protein coding genes.


Background
Ptilodon capucinus (Coxcomb Prominent) is a notodontid moth that is common throughout the Palearctic from Ireland to Japan. It is found across Britain and Ireland but has undergone a significant decline in both distribution and abundance in the last 50 years (Randle et al., 2019). It is found in woodlands, scrub, and gardens.
This medium-sized moth (forewing length 17-22 mm) varies in colour between light and dark brown (Waring et al., 2017). Like all notodontid moths, it has a small tuft of scales on its back which gives rise to its common name of prominent. It is believed that this tuft breaks up the outline of the moth, affording some protection from predators. Coxcomb refers to the white quiff of scales which decorates its head. This is thought to resemble a form of jester's hat (Marren, 2019), or perhaps the crest of a cockerel. The result of its colouration is that when in the resting position, the wings are folded down, and the moth resembles a dead leaf (Heath & Emmet, 1983).
The moth has two generations a year in the southern part of Britain, flying from April to June, and August to September. In good years there can be two generations in the northern part of its range. The caterpillar is green, with two red projections towards the end of the body. It has an interesting threat response, whereby it curls its head back over its body when alarmed. The larva is polyphagous, eating a wide range of leaves of trees and shrubs. It overwinters as a pupa, often under tree roots (Heath & Emmet, 1983).
The genome of P. capucinus was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for P. capucinus based on one male specimen from Wytham Woods, Oxfordshire, UK.

Genome sequence report
The genome was sequenced from one male Ptilodon capucinus (Figure 1) collected from Wytham Woods, Oxfordshire, UK (51.77, -1.34). A total of 55-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 121-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 2 missing joins or mis-joins and removed 1 haplotypic duplication, reducing the scaffold number by 5.88%. The final assembly has a total length of 348.7 Mb in 31 sequence scaffolds with a scaffold N50 of 12.6 Mb (Table 1). All of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
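As a rough illustration of what the fold-coverage figures imply, the sketch below converts coverage into an approximate sequencing yield. It assumes that coverage is expressed relative to the 348.7 Mb assembly span; the genome size actually used to compute coverage is not stated here, so the results are indicative only. The function name `expected_yield_gb` is hypothetical.

```python
def expected_yield_gb(coverage: float, genome_span_mb: float) -> float:
    """Approximate sequenced bases (in Gb) implied by a fold-coverage value.

    Assumption: coverage = total sequenced bases / genome span.
    """
    return coverage * genome_span_mb / 1000.0


# Assembly span taken from the text; using it as the coverage
# denominator is an assumption made for this sketch.
ASSEMBLY_SPAN_MB = 348.7

hifi_gb = expected_yield_gb(55, ASSEMBLY_SPAN_MB)   # PacBio HiFi reads
tenx_gb = expected_yield_gb(121, ASSEMBLY_SPAN_MB)  # 10X read clouds

print(f"HiFi: ~{hifi_gb:.1f} Gb; 10X: ~{tenx_gb:.1f} Gb")
```

Under this assumption the HiFi data amount to roughly 19 Gb and the 10X data to roughly 42 Gb.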
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found at https://links.tol.sanger.ac.uk/species/987449.
The resulting annotation includes 17,172 transcribed mRNAs from 16,968 protein-coding genes.

Sample acquisition and nucleic acid extraction
The specimen selected for genome sequencing was a male Ptilodon capucinus (specimen number Ox000813; individual ilPtiCapc1) collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.34) on 2020-08-01. The specimen was taken from a woodland habitat by Douglas Boyes (University of Oxford) using a light trap. The specimen was identified by the collector and then preserved on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilPtiCapc1 specimen was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Thorax tissue was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser, receiving multiple impacts. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 20 ng aliquot of extracted DNA using the 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from abdomen tissue of ilPtiCapc1 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNAse-free water and its concentration assessed using a [...]
A Hi-C map for the final assembly was produced using bwa-mem2 (Vasimuddin et al., 2019) in the Cooler file format (Abdennur & Mirny, 2020). To assess the assembly metrics, the k-mer completeness and QV consensus quality [...] et al., 2020) and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were calculated.
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Ptilodon capucinus assembly (GCA_914767695.1) in Ensembl Rapid Release.

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible.
The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

James Mallet
Harvard University, Cambridge, Massachusetts, USA
There's nothing much to say. The paper reports a genome assembly of the relevant species. Most of the reads mapped to "31 chromosomal pseudomolecules," as they should, given that the mode of chromosome numbers in the Lepidoptera is 31. The mitochondrial genome was also assembled successfully. As far as I can see the identity of the chromosomes was not verified against those of other species. The Z chromosome was reportedly identified as OU611807.1, but it is not explained how, since the sequence data was from a male specimen, and the Z would therefore normally be at the same coverage as autosomes. However, this seems typical for the work in this series of papers, and the chromosomes would certainly be identifiable via comparative study in the future.
Overall, an impressive and successful assembly which will be useful for further and ongoing work on the comparative genomics of Lepidoptera.
Is the rationale for creating the dataset(s) clearly described?

Vlad Dincă
University of Oulu, Oulu, Finland
In this study, the authors have successfully generated a chromosome-level genome assembly of a male Ptilodon capucinus (Lepidoptera, Notodontidae). The mitochondrial genome has been assembled as well.
Although I am not an expert in several of the methodological approaches used, as far as I can tell, the manuscript is technically sound and uses methodologies and pipelines that are fairly well established through use by the Sanger Institute.
The genome appears to be of high quality and has a BUSCO v5.3.2 completeness of 98.7%. It is a pity that the W sex chromosome is lacking (a male was sequenced).
Reference genomes such as this one represent a valuable resource for the scientific community and each new addition provides new opportunities for research.
A few additional comments are included below.
Numerous sources list the species as "Ptilodon capucina". Although this study is obviously not focused on nomenclatural aspects, I wonder whether a short explanation (in the Background section) of this name discrepancy is necessary (perhaps it is a case of gender agreement issue, which is the subject of notable debate in some groups, e.g. European butterflies). Briefly addressing this may be useful given that we are dealing with the first reference genome for this species and it is good to avoid any potential ambiguities, even if slight.
I see Ptilodontinae is listed as subfamily. However, various sources seem to list this species under Notodontinae. I tried to find more information about this, but I was not able to find much, at least not based on DNA data. What I could find does not seem to provide strong support for Ptilodontinae, although these studies used either limited DNA data (Kobayashi & Nonaka 2016), or were not focused on this issue (St Laurent R, et al. 2023 [Ref 1]). It is possible that I may have missed some key study (Notodontidae are not my main area of taxonomic expertise). Nevertheless, I wanted to mention this subfamily aspect that may also need a short clarification/mention in the Background section.
I wonder how much is known in terms of genetic structure for P. capucinus. Even if limited data is

Figure 2. Genome assembly of Ptilodon capucinus, ilPtiCapc1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 348,711,871 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (15,614,753 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (12,643,213 and 8,629,352 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilPtiCapc1.1/dataset/ilPtiCapc1_1.1/snail.
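The N50 and N90 values and the bin width quoted in the caption follow from simple definitions, sketched below. This is a minimal illustration using the common definition of Nx (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly); it is not the BlobToolKit implementation, and the function name `nx` and the toy scaffold lengths are hypothetical.

```python
from typing import Sequence


def nx(lengths: Sequence[int], percent: int) -> int:
    """Return the Nx scaffold length for a list of scaffold lengths.

    Scaffolds are taken from longest to shortest until their cumulative
    length reaches `percent` % of the total; the length of the last
    scaffold taken is Nx. Integer arithmetic avoids rounding issues.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length
    return min(lengths)  # only reached if percent > 100


# Toy scaffold lengths (hypothetical, not taken from the assembly).
toy_lengths = [30, 25, 20, 15, 10]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)

# Span of one of the 1,000 bins (0.1% of the assembly length in the caption).
bin_span_bp = 348_711_871 / 1000

print(toy_n50, toy_n90, bin_span_bp)
```

Each bin in the plot therefore spans roughly 348.7 kb of the assembly.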

Figure 5. Genome assembly of Ptilodon capucinus, ilPtiCapc1.1 alternate haplotype: Hi-C contact map of the ilPtiCapc1.1 alternate haplotype assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=BqET2dZSTnSs7zhjHqvgWg.

Open Peer Review Current Peer Review Status: Version 1
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Partly
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Reviewer Expertise: Comparative genomics, evolutionary biology of the Lepidoptera
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.