The genome sequence of the Cow Parsley Leaf Beetle, Chrysolina oricalcia (O.F. Müller, 1776)

We present a genome assembly from an individual male Chrysolina oricalcia (the Cow Parsley Leaf Beetle; Arthropoda; Insecta; Coleoptera; Chrysomelidae). The genome sequence is 1,423.4 megabases in span. Most of the assembly is scaffolded into 22 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 16.93 kilobases in length. Gene annotation of this assembly on Ensembl identified 35,990 protein-coding genes.

Chrysolina oricalcia, the Cow Parsley Leaf Beetle, is a 6-9.5 mm long, unicolorous dark blue (sometimes greenish, purple, coppery, or almost black) leaf beetle. It can be distinguished from other British species of the genus by the very regular, coarse, but rather sparse puncture rows on the otherwise smooth and shining elytra, in combination with the sharp and almost straight groove running parallel to the pronotal margins (Duff, 2016). The latter character is a distinguishing feature of the subgenus Sulcicollis.
Members of Chrysomelidae are all phytophagous. While some species are polyphagous, most are strictly associated with certain families or species of higher plants (monophagous or oligophagous), both for larval development and for adult feeding. Chrysolina oricalcia is considered an oligophagous species, using various Apiaceae as its larval and adult host plants, but with a strong preference for Cow Parsley, Anthriscus sylvestris (Rheinheimer & Hassler, 2018). Adults of C. oricalcia are mostly crepuscular or nocturnal. They hatch in late summer, but are found most frequently from April to June, after re-emerging from diapause. Females are ovoviviparous (Bontems, 1985). Larvae are external leaf feeders on their host plants, where they can be encountered from April until late summer, before pupating in the soil (Rheinheimer & Hassler, 2018). Cox and Broad (2020) reported parasitism of the larvae of C. oricalcia by the ichneumonid wasp Nepiesta mandibularis (Holmgren).
Chrysolina oricalcia is the only species of its subgenus occurring in the British Isles. Its overall distribution includes most of Europe, excluding the Iberian Peninsula and the extreme North, as well as parts of Turkey and Mongolia. Up to the 1990s it was considered a scarce and potentially declining species in the UK, and was given the status "Notable B" in Hyman and Parsons (1992). It has since increased in abundance, however, and can now be considered a common species, particularly in the South-East of England, East Anglia, and the West Midlands (Cox, 2007; James, 2018), although it is known from only a few localities in Scotland (Ramsay, 2000). It was reassessed as "Least Concern" in the most recent conservation status review by Hubble (2014). The species is most often found in humid woodland ecotones, which can include parks and gardens, wherever large stands of Anthriscus sylvestris grow (Rheinheimer & Hassler, 2018; own observations in SE England).
The high-quality genome of Chrysolina oricalcia was sequenced from a single specimen (NHMUK014111041; SAMEA7520950) from Wigmore Park, Luton, UK (Figure 1b). It will aid research on taxonomy, phylogeny, and biology of this species. The genome was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland.

Genome sequence report
The genome was sequenced from one male Chrysolina oricalcia (Figure 1b) collected from Wigmore Park, Luton

Amendments from Version 1
We have responded to peer review comments by correcting "Pretext" to "PretextView" and we have corrected the assembly accession number for genome annotation.
Any further responses from the reviewers can be found at the end of the article.
Most (99.84%) of the assembly sequence was assigned to 22 chromosomal pseudomolecules, including the X sex chromosome (Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found at https://links.tol.sanger.ac.uk/species/1587174.

Genome annotation report
The Chrysolina oricalcia genome assembly (GCA_944452925.1) was annotated using the Ensembl rapid annotation pipeline (Table 1; https://rapid.ensembl.org/Chrysolina_oricalcia_GCA_944452925.1/Info/Index). The resulting annotation includes 36,271 transcribed mRNAs from 35,990 protein-coding genes.
[...] High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi) and HiSeq X Ten (10X) instruments. Hi-C data were also generated from remaining abdomen tissue of icChrOric1 using the Arima2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly, curation and evaluation
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). One round of polishing was performed by aligning 10X Genomics read data to the assembly with Long Ranger ALIGN and calling variants with FreeBayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016) as described previously (Howe et al., 2021). Manual curation was performed using gEVAL, HiGlass (Kerpedjiev et al., 2018) and PretextView (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2023), which runs MitoFinder (Allio et al., 2020) and uses these annotations to select the final mitochondrial contig and to ensure the general quality of the sequence.
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Chrysolina oricalcia assembly (GCA_944452925.1) in Ensembl Rapid Release.

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material
[...] chromosome level assembly for the 99% of the sequences. The availability of such data is really useful, especially in the field of molecular ecology and conservation genetics. The manuscript is well written, the methods are clearly described and the results as well.
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Animal genetics and genomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Overall, your Data Note is nicely and succinctly written, and also timely, as I am certain it will be appreciated in the entomological community and more broadly across evolutionary genomics. The megadiversity of Coleoptera is vastly under-sampled and under-appreciated, and thus these new genomic resources will contribute to a growing understanding of genome diversity and evolution in this clade. I have a couple of suggestions here that are designed to improve the clarity, impact, and accessibility of your Data Note. Please don't hesitate to reach out with any questions if helpful.
A couple of suggestions:
1. The Genome Sequence Report section could benefit from a few sentences describing the quality of the genome annotation. For example, the Rapid pipeline suggested ~36k distinct protein-coding genes. Is it possible to elaborate here? This seems a little on the high side, but BUSCO suggested only 0.8% are duplicated. For the sake of understanding assembly quality, it would be helpful to at least qualitatively compare your assembly to other recently assembled beetle genomes. For example, is this number comparable to a small sampling of other insect genomes of similar quality? A brief sentence or two would be helpful to place your Data Note/Annotation quality in perspective. Data Notes discourage further analyses or conclusions, so I don't suggest any formal analyses here, just a little insight into the quality of the annotation, which many would appreciate.
2. Increasing the size and font of your figures would help interpretability. In the in-text rendering it is difficult to see the small points, etc. (e.g., Fig. 3).
3. Figure 2 could benefit from some clarification in the caption and/or figure itself. For example, color description, bp resolution, and would it be possible to add chromosome labels?
4. Would it be possible to provide a (even coarse-scale) breakdown of the repeat element composition (e.g., RepeatMasker suite) of the C. oricalcia genome?
5. As a side note, I really appreciate the inclusion of interactive figures (very nice!).

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Bioinformatics, Computational Genomics, Phylogenetics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
The authors stated that they successfully reconstructed a high-completeness genome assembly at a chromosomal level and high-quality annotation for a male Cow Parsley Leaf Beetle Chrysolina oricalcia (O.F. Müller, 1776). The assembly was performed using three sequencing technologies: Pacific Biosciences SEQUEL II, 10X Genomics Illumina, and Hi-C Illumina. The genome size is approximately 1,423.4 Mb. The completeness of the genome assembly was determined using BUSCO analysis, in which 99.3% of common genes were completely present. However, there are minor comments that need to be addressed:
○ I observed that the authors stated the sample was collected from a male Chrysolina oricalcia. However, according to the NCBI BioSample database, the sample's sex is listed as "NOT COLLECTED." It would be better for the BioSample information to be updated to reflect the male sex of the sample.
○ Additionally, the photo of Chrysolina oricalcia in Figure 1b should be improved for better visualization.
○ Regarding the mitochondrial genome annotation, the MitoFinder tool employs closely related species to guide the annotation. The authors should mention which related species was used. Furthermore, since the authors utilized both MITOS and MitoFinder for annotation, a detailed explanation of how these tools were used to select the final mitochondrial contig would increase the reader's understanding.
○ In the Genome Assembly section, the tool "Pretext" should be changed to "PretextView."
Reviewer Expertise: Bioinformatics

Figure 1. a) Chrysolina oricalcia photographed by Michael Geiser in Gunnersbury Park on 2023-05-21 (not the specimen used for genome sequencing). b) Photograph of the Chrysolina oricalcia specimen (NHMUK014111041) used for genome sequencing.

Figure 2. Genome assembly of Chrysolina oricalcia, icChrOric1.2: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 1,423,453,393 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (85,697,287 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (69,622,592 and 53,506,149 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the endopterygota_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icChrOric1.2/dataset/CALYCE02/snail.
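For readers unfamiliar with the N50 and N90 values quoted in the Figure 2 caption, the following is a minimal sketch of how such statistics are conventionally computed from a list of scaffold lengths. It is not BlobToolKit's implementation; the function name `nx_length` and the toy scaffold lengths are illustrative assumptions, and only the assembly span is taken from the caption.

```python
from typing import Iterable


def nx_length(lengths: Iterable[int], percent: int = 50) -> int:
    """Return the Nx length: the largest length L such that scaffolds of
    length >= L together cover at least `percent` % of the total span."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


# Toy scaffold lengths (hypothetical, NOT the icChrOric1.2 scaffolds).
toy = [80, 70, 50, 40, 30, 20, 10]
n50 = nx_length(toy, 50)
n90 = nx_length(toy, 90)

# Each of the 1,000 snail-plot bins covers 0.1% of the assembly span
# reported in the caption (1,423,453,393 bp).
bin_size = 1_423_453_393 // 1000
```

Applied to the real scaffold lengths of an assembly, the same function would be expected to reproduce the N50 and N90 values reported for it, provided the tool that produced those values uses the same convention.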

Figure 5. Genome assembly of Chrysolina oricalcia, icChrOric1.2: Hi-C contact map of the icChrOric1.2 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=I92pMEcZQxuyGSgz8qBnKQ.
Thank you for consideration of my review of "The genome sequence of the Cow Parsley Leaf Beetle, Chrysolina oricalcia (O.F. Müller, 1776)", which describes new genomic data, assembly, annotation, and associated resources for an interesting Chrysomelid beetle inhabiting the British Isles. The authors generated a battery of next-gen sequencing data for C. oricalcia, including HiFi, 10x Genomics, and Hi-C data. They assembled 82 contigs spanning ~79 Mb of the genome (21 autosomes + X chromosome), and applied the Ensembl rapid pipeline to annotate the assembly.

© 2024 Alqahtani F. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Fahad Alqahtani, King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia

○ Lastly, the paper has a confusion about versions: when showing chromosomes, it mentions the second version, but then goes back to the first version when talking about genome annotation. Explaining this mix-up would help keep the paper clear and accurate.
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

Table 1. Genome data for Chrysolina oricalcia, icChrOric1.2. Project accession data
A total of 45-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 23-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 92 missing joins or misjoins and removed two haplotypic duplications, reducing the scaffold number by 43.15%, and increasing the scaffold N50 by 99.91%. The final assembly has a total length of 1,423.4 Mb in 82 sequence scaffolds with a scaffold N50 of 69.6 Mb (Table 1).