The genome sequence of a leaf beetle, Cryptocephalus moraei (Linnaeus, 1758)

We present a genome assembly from an individual female Cryptocephalus moraei (a leaf beetle; Arthropoda; Insecta; Coleoptera; Chrysomelidae). The genome sequence is 500.5 megabases in span. Most of the assembly is scaffolded into 15 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 15.68 kilobases in length.


Background
Cryptocephalus moraei (Linnaeus, 1758) is a small beetle measuring 3 to 4 mm (Brock, 2021). It belongs to the family Chrysomelidae, commonly referred to as leaf beetles, and species of the genus Cryptocephalus are often called pot beetles. The species is shiny black, with strongly punctured striae on the elytra. It has a distinctive pattern that allows for reliable field identification: four small yellow or orange patches, one in the lateral and one in the posterior part of each elytron (Figure 1). The species is fully winged and capable of flight (Cox, 2004).
Cryptocephalus moraei feeds on the leaves and pollen of St John's-wort Hypericum sp., with a preference for H. perforatum L. (Erber, 1988; James, 2018). The beetle prefers open, rough habitat on chalk or gravel, calcareous grassland and heathland, but can also be found in broad-leaved woodland (Cox, 1991; James, 2018). It is widespread in England and Wales, particularly south of The Wash, but scarcer and more scattered further north; there is a single record from Scotland, from North Ayrshire on 27/07/1895 (specimen at the Glasgow Museum) (Burgess, 2020; Chrysomelidae Recording Scheme, 2023).
This beetle is oviparous. The adults mate in the spring and the female produces a few hundred eggs over several weeks. When ovipositing, the female covers the fertilised egg in excrement and deposits it on the ground. There are four larval instars. The newly hatched larvae retain their case and expand it by adding more excrement. When mature, the larva attaches the case to a leaf or bark, closes the opening and pupates inside. Cryptocephalus moraei overwinters as a larva (Cox, 1991; UK Beetles, no date). The adults are active from May to September, peaking in June and July (UK Beetles, no date).
The high-quality genome of Cryptocephalus moraei was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Cryptocephalus moraei based on two specimens from Parsonage Moor and Cothill Fen Nature Reserves, England.

Genome sequence report
The genome was sequenced from one female Cryptocephalus moraei (Figure 1) collected from Dry Sandford Pit Nature Reserve (51.69,. A total of 35-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 43 missing joins or mis-joins and removed 6 haplotypic duplications, reducing the assembly length by 0.36% and the scaffold number by 9.15%, and decreasing the scaffold N50 by 0.47%.

The final assembly has a total length of 500.5 Mb in 129 sequence scaffolds with a scaffold N50 of 35.8 Mb (Table 1). Most (98.26%) of the assembly sequence was assigned to 15 chromosomal-level scaffolds, representing 14 autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). The X chromosome was identified using that of the Phyllotreta cruciferae genome (GCA_917563865.1). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
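The scaffold N50 quoted above is a length-weighted median: scaffolds are sorted from longest to shortest, and the N50 is the length of the scaffold at which the running total first reaches half of the assembly span. The following is a minimal sketch of that calculation, not the pipeline used for this assembly; the function name `scaffold_nx` and the toy scaffold lengths are hypothetical and chosen only for illustration.

```python
def scaffold_nx(lengths, fraction=0.5):
    """Return the Nx length for a list of scaffold lengths.

    Scaffolds are visited longest-first; the result is the length of the
    scaffold at which the cumulative span first reaches `fraction` of the
    total (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return min(lengths)  # defensive fallback; not reached for valid input


# Hypothetical toy lengths in bp, NOT the icCryMora2.1 scaffolds.
toy_lengths = [50, 40, 30, 20, 10]
n50 = scaffold_nx(toy_lengths, 0.5)
n90 = scaffold_nx(toy_lengths, 0.9)
```

Because the statistic is weighted by length, removing many short scaffolds during curation changes the scaffold count far more than it changes the N50, which is consistent with the small N50 shift reported above.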
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found at https://links.tol.sanger.ac.uk/species/204949.

The specimen was prepared for DNA extraction at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The icCryMora2 sample was weighed and dissected on dry ice. Whole organism tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. DNA was extracted at the WSI Scientific Operations core using the Qiagen MagAttract HMW DNA kit, according to the manufacturer's instructions.

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on a Pacific Biosciences SEQUEL II (HiFi) instrument. Hi-C data were also generated from whole organism tissue of icCryMora1 using the Arima2 kit and sequenced on the icCryMora2 instrument.

A Hi-C map for the final assembly was produced using bwa-mem2 (Vasimuddin et al., 2019) in the Cooler file format (Abdennur & Mirny, 2020). To assess the assembly metrics, the k-mer completeness and QV consensus quality values were calculated in Merqury (Rhie et al., 2020). This work was done using Nextflow (Di Tommaso et al., 2017) DSL2 pipelines "sanger-tol/readmapping" (Surana et al., 2023a) and "sanger-tol/genomenote" (Surana et al., 2023b). The genome was analysed within the BlobToolKit environment (Challis et al., 2020) and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were calculated.
Table 3 contains a list of relevant software tool versions and sources.
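The k-mer completeness and QV consensus quality values mentioned above can be related to raw k-mer counts with a short calculation. The sketch below assumes the k-mer survival model described for Merqury (Rhie et al., 2020), in which k-mers present in the assembly but absent from the reads are treated as errors; it is not Merqury's own code, and the function names, the k-mer size and all counts are hypothetical.

```python
import math


def kmer_qv(asm_only_kmers, total_asm_kmers, k):
    """Phred-scaled consensus quality (QV) estimated from k-mer counts.

    Assumes a k-mer is error-free only if all k of its bases are correct:
        p_kmer_ok  = 1 - asm_only_kmers / total_asm_kmers
        base_error = 1 - p_kmer_ok ** (1 / k)
        QV         = -10 * log10(base_error)
    """
    if total_asm_kmers <= 0 or not 0 <= asm_only_kmers <= total_asm_kmers:
        raise ValueError("counts must satisfy 0 <= asm_only <= total, total > 0")
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers observed
    p_kmer_ok = 1 - asm_only_kmers / total_asm_kmers
    base_error = 1 - p_kmer_ok ** (1 / k)
    return -10 * math.log10(base_error)


def kmer_completeness(shared_kmers, read_kmers):
    """Fraction of reliable read k-mers that are also found in the assembly."""
    if read_kmers <= 0 or not 0 <= shared_kmers <= read_kmers:
        raise ValueError("counts must satisfy 0 <= shared <= read, read > 0")
    return shared_kmers / read_kmers


# Hypothetical counts for illustration only (not values from this assembly).
qv = kmer_qv(asm_only_kmers=10, total_asm_kmers=1_000_000, k=21)
completeness = kmer_completeness(shared_kmers=950, read_kmers=1_000)
```

On the Phred scale each additional 10 QV units corresponds to a tenfold lower estimated per-base error rate, so QV 40 means roughly one error per 10,000 bases and QV 60 roughly one per million.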

Legal and ethical review process for Darwin Tree of Life Partner submitted materials
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner.
The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use.
The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible.
The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

In the present study, the authors developed a reference genome for the leaf beetle Cryptocephalus moraei. This paper clearly reports the results obtained and relevant scientific information is also provided. However, in my view, some aspects can be improved. Below I have reported my comments.
Since this paper provides the reference genome of a species, in my view, it is important to give a more comprehensive overview of the ecology/phenology and distribution of the species, not referring only to the UK administrative boundaries. Regarding the distribution, I suggest referring to Staines et al., 2011 [Ref 1]. The following host plants are also reported in the literature: 1.
The fact that some C. moraei specimens were discovered in ~10,000-year-old deposits at Rodbaston Hall is interesting, but without insight into the evolutionary context this sentence becomes irrelevant and unlinked with the previous section. If information on the closest species to C. moraei is available in the literature, I suggest its integration to improve this short paragraph.

Reviewer Expertise: Insect evolution, insect-plant-microbiota interactions, insect symbiosis, Chrysomelidae systematics

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Huai-Jun Xue
Nankai University, Tianjin, China

In the present study, a genome was assembled to the chromosomal level for an individual female of a leaf beetle species, Cryptocephalus moraei, but without gene annotation or further analysis. In my opinion, the methods adopted in this study are feasible. I have no further comments.

Reviewer Expertise: Evolution, speciation

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 1. Photographs of the Cryptocephalus moraei specimen NHMUK014036740 (icCryMora2) taken during sample preservation and processing. A. Habitus of the specimen in dorsal view. B. The specimen in ventral view.

Figure 2. Genome assembly of Cryptocephalus moraei, icCryMora2.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 500,560,161 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (46,835,023 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (35,831,563 and 24,393,232 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the endopterygota_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icCryMora2.1/dataset/CAMIUK01/snail.
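The size-ordered binning described in the Figure 2 caption can be illustrated with a short calculation. The sketch below is one reading of that description and not BlobToolKit's implementation: scaffolds are sorted longest-first, the cumulative span is cut into equal slices, and each slice records the length of the scaffold in which it ends. The function name `size_ordered_bins` and the toy lengths are hypothetical; only the assembly span and the bin count are taken from the caption.

```python
def size_ordered_bins(lengths, n_bins=1000):
    """Return one scaffold length per bin of cumulative assembly span.

    Scaffolds are sorted longest-first. Bin b (1-based) ends at
    total * b / n_bins bases, and its value is the length of the scaffold
    containing that position. Integer arithmetic avoids rounding issues.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    bins = []
    idx = 0
    cumulative = ordered[0]
    for b in range(1, n_bins + 1):
        # Advance until the running span covers the end of bin b.
        while cumulative * n_bins < total * b and idx < len(ordered) - 1:
            idx += 1
            cumulative += ordered[idx]
        bins.append(ordered[idx])
    return bins


# Width of one bin for the span given in the caption (0.1% of the assembly).
bin_width_bp = 500_560_161 / 1000

# Hypothetical toy scaffold lengths in bp, using 10 bins for readability.
toy_bins = size_ordered_bins([50, 40, 30, 20, 10], n_bins=10)
```

Under this reading, the value in the bin that ends at 50% of the span is the N50 and the value in the bin that ends at 90% is the N90, which is why both can be drawn as arcs on the same circular axis.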

Figure 5. Genome assembly of Cryptocephalus moraei, icCryMora2.1: Hi-C contact map of the icCryMora2.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=AU7k8c4tT3WzJdclLXS5zA.

2. Figure 1. I suggest adding the scale bar in each panel. A higher magnification of the
3.

Reviewer Report 03 November 2023
https://doi.org/10.21956/wellcomeopenres.21625.r68954
© 2023 Xue H. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

A minor issue: "Cryptocephalus moraei feeds on the leaves and pollen of St John's-wort Hypericum sp., with a preference for H. perforatum L." If C. moraei feeds on more than one species of Hypericum, this should be "Hypericum spp."

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes

Competing Interests: No competing interests were disclosed.

Table 3. Software tools: versions and sources.

The genome sequence is released openly for reuse. The Cryptocephalus moraei genome sequencing initiative is part of the Darwin Tree of Life (DToL) project. All raw sequence data and the assembly have been deposited in INSDC databases. The genome will be annotated using available RNA-Seq data and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.