The genome sequence of a soldier beetle, Malthinus flaveolus (Herbst, 1786)

We present a genome assembly from an individual female Malthinus flaveolus (soldier beetle; Arthropoda; Insecta; Coleoptera; Cantharidae). The genome sequence is 236.7 megabases in span. Most of the assembly is scaffolded into 6 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 19.27 kilobases in length. Gene annotation of this assembly on Ensembl identified 16,617 protein coding genes.

Background
The family Cantharidae, commonly known as "soldier beetles", comprises over 6,000 described species worldwide, of which 41 species are recorded from Britain and Ireland (Duff, 2018). The British species are grouped into three subfamilies: Cantharinae, with 24 species; Silinae, with just one (Silis ruficollis); and Malthininae, with the remaining 16 species. The subfamilies of Cantharidae have recently been subject to a molecular phylogenetic study (Motyka et al., 2023). Malthininae includes the smallest and most difficult-to-identify species, and also some of the least-known British soldier beetles. Malthinus flaveolus Herbst, 1786 is the most frequently recorded of the four British species in the genus Malthinus Latreille, 1806.
In some of the literature from continental Europe, the species is listed under the name Malthinus punctatus Fourcroy, 1785, which, however, is a junior homonym. M. flaveolus is accepted as the valid name for the species in most recent taxonomic catalogues (e.g. Constantin, 2014; Fanti, 2014; Kazantsev, 2012; Kazantsev & Brancucci, 2007).
The British members of Malthininae can be identified using Duff (2020) or Dahlgren (1979). They include the smallest members of Cantharidae, with a body length between 1.5 and 6 mm. A key feature of the British Malthininae is the somewhat shortened elytra, which leave a part of the hind wings exposed when the beetle is resting. This, however, is not true for all members of this subfamily worldwide. Malthinus is distinguished from Malthodes by the distinctive head shape, with a rather narrow "neck" and protruding eyes (particularly large in the males), located at the widest part of the head. While all British members of Malthodes are black or dark brown when fully coloured (often excluding the apices of the elytra and/or the prothorax), Malthinus have at least partially yellow legs, and often a mostly yellowish or pale brownish underside. Malthinus flaveolus is distinguished from the other three British Malthinus species by the combination of colour (entirely yellow legs and underside, bright yellow elytral apices) and the absence of distinct rows of impressed punctures on the elytra (which are present in M. seriepunctatus and M. balteatus). The pattern of the pronotum is also a good key character: in M. frontalis, the pronotum is entirely black; in M. seriepunctatus and M. balteatus, it features a central hourglass-shaped black pattern on a yellow or reddish-brown background (sometimes reduced to two disconnected triangles at the base and apex); in M. flaveolus, the pronotum is yellow with a black stripe or dot on each side of the middle line, not extending to the margins. In some M. flaveolus individuals, these two stripes can be expanded into a single black blotch (leaving the margins and parts of the middle line yellow); in others the black stripes are reduced in size, or even completely disappear, leaving the pronotum completely yellow. At 5-6 mm, M. flaveolus is one of the largest members of Malthininae in Britain, but still small compared to most Cantharinae. As in most soldier beetles, females are on average larger than males.
M. flaveolus is a species endemic to Europe, widely distributed between northern Spain and Russia west of the Urals, including Britain and Ireland and parts of Fennoscandia (Kazantsev & Brancucci, 2007). In Britain and Ireland, it is widespread and common in England and Wales, relatively scarce in Scotland and Ireland, and seemingly absent from the Isle of Man and all the islands surrounding Scotland (Alexander, 2003; NBN Atlas Partnership, 2021).
M. flaveolus is a species of woodlands and hedgerows, with adults active mainly in June to July (Bretzendorfer, 2017). Adult beetles are assumed to be predators hunting smaller arthropods on foliage, even though exact observations on the feeding habits of Malthininae are unfortunately still scarce (Alexander, 1991; Ramsdale, 2002). They are frequently found resting on the underside of leaves of various trees and shrubs, particularly along the edges of forests, and can be collected using a beating tray or a flight interception trap. Larvae of Malthininae are saproxylic predators, usually found under bark and in dead wood, while larvae of Cantharinae usually inhabit leaf litter (Alexander, 1991). Their larval biology explains why all British members of Malthininae seem to be associated with woodlands, while some members of Cantharinae and Silinae frequent more open areas. It may also explain the clear drop-off in abundance between England and the less forested areas in northern Scotland. The larval morphology of M. flaveolus was studied by Fitton (1976).
Here we present a chromosomally complete genome sequence for Malthinus flaveolus, derived from a female specimen from Wytham Woods, Oxfordshire. It is hoped that this genome will aid research in the taxonomy and phylogeny of the group, as well as potential applications for environmental DNA and research on the pre-imaginal stages.

Genome sequence report
The genome was sequenced from one female Malthinus flaveolus (Figure 1) collected from Wytham Woods, Oxfordshire, UK (51.77, -1.33). A total of 103-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 4 missing joins or mis-joins, reducing the scaffold number by 11.11%. The final assembly has a total length of 236.7 Mb in 7 sequence scaffolds with a scaffold N50 of 36.5 Mb (Table 1). The snail plot in Figure 2 provides a summary of the assembly statistics, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Most (99.98%) of the assembly sequence was assigned to 6 chromosomal-level scaffolds, representing 5 autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). The X chromosome was identified based on synteny with the genome assembly of Cantharis rufa (GCA_947369205.1) (Sivell et al., 2023). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
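As a rough illustration of the coverage figure quoted above, fold coverage is conventionally estimated as the total number of sequenced bases divided by the assembly span. The sketch below is not part of the authors' pipeline: the function name is hypothetical, and the example read total is an assumed value chosen only so that the result lands close to the reported 103-fold figure.

```python
def fold_coverage(total_read_bases: int, assembly_span_bp: int) -> float:
    """Estimate mean sequencing depth as read bases divided by assembly span."""
    if assembly_span_bp <= 0:
        raise ValueError("assembly span must be positive")
    return total_read_bases / assembly_span_bp


# Assembly span as given in the Figure 2 caption (236,723,918 bp).
ASSEMBLY_SPAN_BP = 236_723_918

# Hypothetical HiFi read total, NOT a value reported in this note;
# it is chosen purely to illustrate a depth of roughly 103x.
EXAMPLE_READ_BASES = 24_400_000_000

if __name__ == "__main__":
    depth = fold_coverage(EXAMPLE_READ_BASES, ASSEMBLY_SPAN_BP)
    print(f"approximate depth: {depth:.1f}x")
```

The same relationship can be inverted: multiplying the assembly span by the target depth gives the approximate sequencing yield needed for a genome of this size.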

Genome annotation report
The Malthinus flaveolus genome assembly (GCA_950108345.1) was annotated at the European Bioinformatics Institute (EBI) using the Ensembl rapid annotation pipeline. The resulting annotation includes 16,934 transcribed mRNAs from 16,617 protein-coding genes (Table 1; https://rapid.ensembl.org/Malthinus_flaveolus_GCA_950108345.1/Info/Index).

Sample acquisition and nucleic acid extraction
The workflow for high molecular weight (HMW) DNA extraction at the Wellcome Sanger Institute (WSI) includes a sequence of core procedures: sample preparation, sample homogenisation, DNA extraction, fragmentation, and clean-up. In sample preparation, the icMalFlav1 sample was weighed and dissected on dry ice (Jay et al., 2023). Tissue from the whole organism was homogenised using a PowerMasher II tissue disruptor (Denton et al., 2023a).
HMW DNA was extracted in the WSI Scientific Operations core using the Automated MagAttract v2 protocol (Oatley et al., 2023). The DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 31 (Bates et al., 2023). Sheared DNA was purified by solid-phase reversible immobilisation (Strickland et al., 2023): in brief, the method employs a 1.8X ratio of AMPure PB beads to sample to eliminate shorter fragments and concentrate the DNA. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
Protocols developed by the WSI Tree of Life laboratory are publicly available on protocols.io (Denton et al., 2023b).

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on a Pacific Biosciences SEQUEL II instrument. Hi-C data were also generated from remaining tissue of icMalFlav1 using the Arima2 kit and sequenced on the Illumina NovaSeq 6000 instrument.
The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2023), which runs MitoFinder (Allio et al., 2020) or MITOS (Bernt et al., 2013) and uses these annotations to select the final mitochondrial contig and to ensure the general quality of the sequence.
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Malthinus flaveolus assembly (GCA_950108345.1) in Ensembl Rapid Release at the EBI.
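The gene and transcript totals reported for this annotation (16,934 mRNAs from 16,617 protein-coding genes, or about 1.02 transcripts per gene) can in principle be rechecked by tallying feature types in a GFF3 file. The following is a minimal sketch, assuming that genes and transcripts are labelled `gene` and `mRNA` in the third column; the labels used in the actual Ensembl files may differ, and the function name and toy records are hypothetical.

```python
from collections import Counter


def count_feature_types(gff3_lines):
    """Count GFF3 features by the type given in column 3.

    Comment/directive lines (starting with '#'), blank lines, and lines
    without exactly nine tab-separated columns are skipped.
    """
    counts = Counter()
    for raw in gff3_lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


# Toy records for illustration only; these are not taken from the annotation.
TOY_GFF3 = [
    "##gff-version 3",
    "1\ttoy\tgene\t1\t900\t.\t+\t.\tID=gene1",
    "1\ttoy\tmRNA\t1\t900\t.\t+\t.\tID=tx1;Parent=gene1",
    "1\ttoy\tmRNA\t1\t700\t.\t+\t.\tID=tx2;Parent=gene1",
    "1\ttoy\tgene\t1200\t2000\t.\t-\t.\tID=gene2",
    "1\ttoy\tmRNA\t1200\t2000\t.\t-\t.\tID=tx3;Parent=gene2",
]

if __name__ == "__main__":
    counts = count_feature_types(TOY_GFF3)
    print(counts["gene"], counts["mRNA"], counts["mRNA"] / counts["gene"])
```

Passing an open file handle instead of the toy list works the same way, since the function only iterates over lines.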

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website here. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

Yangzi Wang
Johannes Gutenberg University Mainz Institute of Organismic and Molecular Evolution (Ringgold ID: 163392), Mainz, Rhineland-Palatinate, Germany
The article presents a high-quality genome assembly of Malthinus flaveolus, generated using Pacific Biosciences HiFi long-read sequencing and Hi-C chromosomal conformation capture. This work has resulted in a valuable genomic resource for the scientific community, providing the first complete genome assembly and gene annotation for this species. The raw sequencing data, the final assembled genome, and the gene annotations are well documented and deposited in a public database, making the data easily accessible for future research. Overall, the project appears to have been conducted with a high level of professionalism and technical expertise. However, there are some issues in the manuscript that prevent it from being suitable for Indexing at this stage:
1. Typos, e.g.:
In the first paragraph of "Background", the authors mention three subfamilies within the family Cantharidae: "Cantharidae, with 24 species, Silinae with just one (Silis ruficollis), and the remaining 16 species are classified under Malthininae." Here, "Cantharidae" seems incorrect. Please review and correct this.
Third paragraph of "Background": The sentence "A key feature of the British Malthininae are the somewhat shortened elytra, which leave a part of the hind wings exposed when the beetle is resting." should use "is" instead of "are".

2. An issue concerning the identification of the sex chromosome:
In the "genome sequence report" section, the manuscript states, "The X chromosome was identified based on synteny with the genome assembly of Cantharis rufa (GCA_947369205.1) (Sivell et al., 2023)." While it is exciting that the authors have attempted to identify the sex chromosome, the reference paper for the Cantharis rufa genome (Sivell et al., 2023) does not provide a clear explanation of how the sex chromosomes were identified. This makes the current claim less reliable. The authors should provide more robust evidence or additional analyses to support the identification of the sex chromosome.

Na Ra Shin
University of Memphis, Memphis, USA
This data note provides a high-quality reference genome assembly for the soldier beetle, Malthinus flaveolus, and includes a detailed introduction to the species. The clarity of the data note is enhanced by its comprehensive description of sample information, HMW DNA extraction procedures, and sequencing steps.

Minor comments:
1) It would be helpful to include the BUSCO score in the abstract.
2) Please ensure that the full name is provided before using any abbreviations (e.g., M. punctatus).
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound?
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Insect genome and transcriptome
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Malthinus flaveolus, icMalFlav1.1: metrics. The BlobToolKit snail plot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 236,723,918 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (95,637,330 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (36,462,998 and 26,549,421 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the endopterygota_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icMalFlav1_1/dataset/icMalFlav1_1/snail.
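For readers unfamiliar with the N50 and N90 metrics named in the caption above, the following minimal sketch shows how they are conventionally computed from a list of scaffold lengths. It is illustrative only and is not the BlobToolKit implementation; the function name and the toy lengths are assumptions made for this example.

```python
def nx(lengths, fraction):
    """Return the Nx scaffold length.

    Scaffolds are sorted from longest to shortest and summed until the
    running total reaches `fraction` of the assembly span; the length of
    the scaffold that crosses that threshold is returned
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # safeguard against floating-point rounding


# Toy scaffold lengths (hypothetical, not from the icMalFlav1.1 assembly).
TOY_LENGTHS = [50, 40, 30, 20, 10]

if __name__ == "__main__":
    print("N50:", nx(TOY_LENGTHS, 0.5))
    print("N90:", nx(TOY_LENGTHS, 0.9))
```

With the toy lengths, half of the 150-unit span is reached at the second-longest scaffold, so N50 is 40; 90% is reached at the fourth, so N90 is 20.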
A female Malthinus flaveolus (specimen ID Ox001636, ToLID icMalFlav1) was collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.33) on 2021-07-08 by potting. The specimen was collected and identified by Mark Telfer (independent researcher), and then snap-frozen on dry ice. The workflow for high molecular weight (HMW) DNA extraction at the Wellcome Sanger Institute (WSI) includes a sequence

Figure 3. Genome assembly of Malthinus flaveolus, icMalFlav1.1: BlobToolKit GC-coverage plot. Sequences are coloured by phylum. Circles are sized in proportion to sequence length. Histograms show the distribution of sequence length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icMalFlav1_1/dataset/icMalFlav1_1/blob.

Figure 4. Genome assembly of Malthinus flaveolus, icMalFlav1.1: BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all sequences. Coloured lines show cumulative lengths of sequences assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icMalFlav1_1/dataset/icMalFlav1_1/cumulative.

Figure 5. Genome assembly of Malthinus flaveolus, icMalFlav1.1: Hi-C contact map of the icMalFlav1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=eujcxwmbQrixlQ_Cei5ejA.

© 2024 Shin N. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Partly
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Reviewer Report 16 September 2024 https://doi.org/10.21956/wellcomeopenres.23326.r96342