The genome sequence of a click beetle, Agrypnus murinus (Linnaeus, 1758)

We present a genome assembly from an individual male Agrypnus murinus (a click beetle; Arthropoda; Insecta; Coleoptera; Elateridae). The genome sequence is 1,578.5 megabases in span. Most of the assembly is scaffolded into 9 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 18.23 kilobases in length. Gene annotation of this assembly on Ensembl identified 42,204 protein-coding genes.


Background
Agrypnus murinus (Order: Coleoptera, Family: Elateridae) is a species of click beetle with a Palearctic distribution, with records throughout western Europe and parts of Asia (GBIF Secretariat, 2022). Adults grow up to 17 mm in length and have a distinctive dense, dark brown-grey, scale-like pubescence covering the fused elytra.
Agrypnus murinus inhabits scrubland and grasslands across the UK. It is more commonly found across mid- and southern England and Wales, although there are records of it as far north as Yorkshire (NBN Atlas Partnership, 2021). It is widespread across Europe, except in the far northern regions. Eggs are laid in soil during late summer, and during the autumn A. murinus emerge in their larval form, known as wireworms. The larvae feed on the roots of plants and on other insects. They can cause significant damage to crops such as potatoes and are therefore considered a significant agricultural pest (Dettner & Beran, 2000).
The genome sequence of this beetle will be important in future studies involving environmental DNA traces or investigating invertebrate communities for analysis of agricultural soils and pest control. The genome of A. murinus has now been sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for A. murinus, based on one male specimen from Wytham Woods, Oxfordshire, UK.

Genome sequence report
The genome was sequenced from one male Agrypnus murinus (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.34). A total of 28-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 50-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 405 missing joins or mis-joins and removed 10 haplotypic duplications, reducing the assembly length by 0.12% and the scaffold number by 57.18%, and increasing the scaffold N50 by 7.32%. The final assembly has a total length of 1,578.5 Mb in 161 sequence scaffolds with a scaffold N50 of 177.3 Mb (Table 1). Most (99.59%) of the assembly sequence was assigned to 9 chromosomal-level scaffolds, representing 8 autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found at https://links.tol.sanger.ac.uk/species/195168.
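The scaffold N50 quoted above is the scaffold length at which scaffolds of that length or longer together cover at least half of the assembly span. The following is a minimal Python sketch of that calculation, not the pipeline's own implementation. The function name `nx` and the toy lengths are hypothetical and chosen only for illustration; they are not the icAgrMuri1.1 scaffold lengths.

```python
from typing import Iterable


def nx(lengths: Iterable[int], fraction: float = 0.5) -> int:
    """Return the Nx length: the length L such that scaffolds of length >= L
    together cover at least `fraction` of the total assembly span."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Hypothetical toy lengths (not the real scaffold lengths of this assembly).
toy_lengths = [100, 80, 60, 40, 20]
n50 = nx(toy_lengths, 0.5)  # half of the 300 bp span is reached at the 80 bp scaffold
n90 = nx(toy_lengths, 0.9)  # 90% of the span is reached at the 40 bp scaffold
print(n50, n90)
```

With real data, the list of lengths could be taken from a FASTA index or any table of scaffold sizes; the same function then gives N50 with `fraction=0.5` and N90 with `fraction=0.9`.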

Sample acquisition and nucleic acid extraction
A male Agrypnus murinus (icAgrMuri1) was collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.34) on 2020-06-20 by netting. Liam Crowley (University of Oxford) collected and identified the specimen. The specimen was snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The icAgrMuri1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Abdomen tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 20 ng aliquot of extracted DNA using the 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit.
Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (10X) instruments. Hi-C data were also generated from
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Agrypnus murinus assembly (GCA_929113105.1) in Ensembl Rapid Release.
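A gene count such as the one reported for this annotation can be checked by tallying gene records in the annotation file. The following is a minimal Python sketch, assuming the annotation is available as a tab-separated GFF3 file in which each gene is a record with `gene` in the type column. The function name `count_features` and the toy records are hypothetical and are not taken from the actual annotation.

```python
from typing import Iterable


def count_features(gff3_lines: Iterable[str], feature_type: str = "gene") -> int:
    """Count GFF3 records whose type column (column 3) equals feature_type."""
    count = 0
    for line in gff3_lines:
        # Skip directives, comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A well-formed GFF3 record has exactly nine tab-separated columns.
        if len(fields) == 9 and fields[2] == feature_type:
            count += 1
    return count


# Hypothetical toy records, not taken from the real annotation.
toy_gff3 = [
    "##gff-version 3",
    "chr1\tBRAKER2\tgene\t100\t900\t.\t+\t.\tID=g1",
    "chr1\tBRAKER2\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1",
    "chr1\tBRAKER2\tgene\t2000\t2600\t.\t-\t.\tID=g2",
]
n_genes = count_features(toy_gff3)
print(n_genes)
```

For a real file, an open text file handle can be passed in place of the list, since the function only iterates over lines.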

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website here. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
The rationale for creating the dataset is well-justified. The sequencing and assembly methods employed are both reasonable and reproducible. Additionally, the raw data, assembly, and annotations are accessible.
I approve of the manuscript in its current form. However, there are a few comments that could enhance the manuscript and the dataset for user convenience:
1. Although the link to the annotations is provided in the genome annotation report section, it is recommended to also include this link in the data availability section to improve visibility.
2. In the mitochondrial genome assembly method, "MitoFinder or MITOS" was used for the annotation. Please clarify which tool was used for this species. In addition, it would be beneficial to provide the mitochondrial genome annotation as well.
3. The predicted gene count is 42,204, which seems unusually high based on my experience with beetle genomes, unless a large-scale duplication has occurred in this species. This high number might result from fragmented gene predictions by BRAKER2, particularly if only the protein mode was utilized. I have noticed transcriptomic sequencing data under BioProject accession PRJEB46297. A more refined version of gene predictions using transcriptomic data and the complete Ensembl pipeline would be advantageous for users of this genome.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: comparative genomics, evolutionary biology, bioinformatics, phylogenetics, beetle
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

R Axel W Wiberg
Stockholm University, Stockholm, Sweden
The Data Note describes the genome assembly of the beetle species Agrypnus murinus using the DTOL standardised pipeline. Overall I think the manuscript is very good.
The rationale for creating the dataset is well described, although I think there could be some important additions here (see below).
It is technically sound with the methods/protocols well described and referenced.
The references to the DTOL nextflow pipelines and other software used ensure that replication would be straightforward.
The results and datasets are mostly clearly presented (see below). The genome sequence and the annotation are easily available.
I think the article is valid in its current form, but do have a few additional comments:
1) In the 1st paragraph of the "Background" section the authors write: "Adults grow up to 17 mm in length.." There is often a sexual dimorphism in insects. I would like to see sizes for males and females separately, or, if there is no sexual dimorphism, for that to be stated.
2) Coleopterans vary a lot in their sex-determination systems and sex-chromosome karyology. While most are XY, XO systems are common in several groups. For this species, and especially because the specimen sequenced here is male, it would be really useful to note in the text that, while the karyology of Agrypnus murinus seems unknown at the moment, other close relatives are XO. Indeed, among the Elateridae, XO systems reach nearly 70% of examined species. Thus, a Y chromosome is not expected in the assembly. Adding this information would add context for those using the genome and increase the generality of the note.
See the very useful database here: http://coleoguy.github.io/karyotypes/ and the citations here:
In addition, the data note mentions 8 autosomes and the X chromosome in the assembly. Is the general karyology of the species well known? Are 8 autosomes expected? I'm not sure whether there is information on the full karyotype of the species.
3) The ~42,000 genes seem on the high end from my experience. In another Coleopteran genome that I was involved in, annotation was quite fragmented, at least in part due to the high number of repetitive elements. I think it would be useful to include in this Data Note (and perhaps in DTOL's standardised pipelines in general) a simple RepeatModeler and RepeatMasker analysis to get at least a rough picture of the repeat content.
RepeatModeler: https://www.repeatmasker.org/RepeatModeler/
RepeatMasker: https://www.repeatmasker.org/
It is in general a useful thing for a genome assembly to have, as it helps in all sorts of downstream applications.

4) Figure 5 would benefit from some kind of x- and y-axis labels, at least for the largest scaffolds. Which is the X chromosome? One can work it out from Table 2, but axis labels would be better I think.

5) As far as I can tell, the mtGenome is present as a contig in the assembly file, but the annotation files do not include the mtGenome. Would it be possible to integrate the MITOS annotation somehow? Or provide it separately? This would make the resource much more accessible and useful.

Yes
Are the datasets clearly presented in a useable and accessible format? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: evolutionary biology, comparative and population genomics, sexual selection and sexual conflict
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Agrypnus murinus, icAgrMuri1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 1,578,523,859 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (255,265,360 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (177,293,933 and 135,112,989 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the endopterygota_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icAgrMuri1.1/dataset/CAKMYN01/snail.

Figure 3. Genome assembly of Agrypnus murinus, icAgrMuri1.1: BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icAgrMuri1.1/dataset/CAKMYN01/blob.
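The GC axis of a GC-coverage plot is a per-scaffold GC proportion. The following is a minimal Python sketch of one way to compute it, under the assumption that ambiguous bases such as N are left out of the denominator; this is an illustrative convention and is not confirmed here as BlobToolKit's exact method. The function name `gc_fraction` and the toy sequence are hypothetical.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous bases such as N are ignored, and soft-masked
    (lowercase) bases are counted like uppercase ones.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        return 0.0  # no unambiguous bases, for example a run of Ns
    return gc / (gc + at)


# Hypothetical toy sequence: 4 G/C bases, 4 A/T bases and 2 Ns.
toy_sequence = "ATGCGCNNAT"
toy_gc = gc_fraction(toy_sequence)
print(toy_gc)
```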

Figure 4. Genome assembly of Agrypnus murinus, icAgrMuri1.1: BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icAgrMuri1.1/dataset/CAKMYN01/cumulative.

Figure 5. Genome assembly of Agrypnus murinus, icAgrMuri1.1: Hi-C contact map of the icAgrMuri1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=EbECSrE6ShqDN7sF_hOHsA.

Table 3. Software tools: versions and sources.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.