The genome sequence of Philonthus cognatus (Stephens, 1832) (Coleoptera, Staphylinidae), a rove beetle

We present a genome assembly from an individual male Philonthus cognatus (a rove beetle; Arthropoda; Insecta; Coleoptera; Staphylinidae). The genome sequence is 1,030.6 megabases in span. Most of the assembly is scaffolded into 12 chromosomal pseudomolecules, including the X and Y sex chromosomes. The mitochondrial genome has also been assembled and is 20.7 kilobases in length. Gene annotation of this assembly on Ensembl identified 29,629 protein coding genes.


Background
Philonthus cognatus is a relatively large (8-11 mm), widespread rove beetle (family Staphylinidae), commonly found in damp soils in woodland and grassland habitats across the UK and western Palearctic. It was introduced into North America in the 19th century and is currently present in the USA and Canada (Majka & Klimaszewski, 2008). Adults are typically black in colour, although some individuals may have a bronze or greenish sheen, and the underside of the basal antennal segment is yellow, making this species one of the more easily recognised UK rove beetles. Breeding takes place in late spring to early summer and as a result there are two peaks in abundance, with adults most commonly found in the spring and early summer (March to June) and again in autumn (August to October) (NBN Atlas Partnership, 2021), although they are present all year round. P. cognatus, like many other rove beetles, is an important predator of agricultural pests such as aphids (Dennis & Wratten, 1991; Kollat-Palenga & Basedow, 2000).
The species was named by James Stephens in his multi-volume synopsis of British insects, in the 1832 volume perhaps more famous for referencing some specimens collected by a "C. Darwin Esq." while he was still a student at Cambridge (Barlow, 1958; Stephens, 1832). Stephens described P. cognatus as being found "within the metropolitan district, but not common", which seems to contradict its current status as a very common and widespread beetle across the UK and its 'least concern' classification in a recent review of the status of the beetles of Great Britain (Boyce, 2022).
Rove beetles possess defensive glands that produce complex chemical secretions for defence (Brückner et al., 2021; Parker, 2017) or that have antimicrobial activity (Lusebrink et al., 2008), and the P. cognatus genome assembly will likely prove useful in identifying the biosynthetic pathways involved in the production of these secretions, as well as resolving the polyphyletic status of the species-rich genus Philonthus (Chani-Posse et al., 2018).

Genome sequence report
The genome was sequenced from one male Philonthus cognatus (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.34). A total of 29-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 47-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected four missing or mis-joins and removed two haplotypic duplications, reducing the assembly length by 0.31% and the scaffold number by 2.17%, and increasing the scaffold N50 by 0.67%.
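The relationship between fold coverage and sequencing yield can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes coverage is measured against the final assembly span of 1,030.6 Mb (coverage is often reported against an estimated genome size instead), and the function names are hypothetical.

```python
def fold_coverage(total_bases_mb: float, genome_span_mb: float) -> float:
    """Mean sequencing depth: sequenced bases divided by genome span."""
    if genome_span_mb <= 0:
        raise ValueError("genome_span_mb must be positive")
    return total_bases_mb / genome_span_mb


def yield_for_coverage(coverage: float, genome_span_mb: float) -> float:
    """Sequenced bases (in Mb) implied by a given fold coverage."""
    if genome_span_mb <= 0:
        raise ValueError("genome_span_mb must be positive")
    return coverage * genome_span_mb


# Assumption: the assembly span reported in this note is used as the denominator.
ASSEMBLY_SPAN_MB = 1030.6

hifi_yield_mb = yield_for_coverage(29, ASSEMBLY_SPAN_MB)  # PacBio HiFi reads
tenx_yield_mb = yield_for_coverage(47, ASSEMBLY_SPAN_MB)  # 10X read clouds

print(f"Implied HiFi yield: {hifi_yield_mb / 1000:.1f} Gb")
print(f"Implied 10X yield:  {tenx_yield_mb / 1000:.1f} Gb")
```

Under that assumption, 29-fold and 47-fold coverage correspond to roughly 30 Gb and 48 Gb of sequence respectively.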
The final assembly has a total length of 1,030.6 Mb in 45 sequence scaffolds with a scaffold N50 of 10.8.9 Mb (Table 1). Most (98.54%) of the assembly sequence was assigned to 12 chromosomal-level scaffolds, representing 10 autosomes, and the X and Y sex chromosomes. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
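For readers unfamiliar with the summary statistics quoted above, the sketch below shows how a scaffold N50 and the fraction of sequence held in the largest scaffolds are conventionally computed. The function names and the scaffold lengths are hypothetical toy values chosen for illustration; they are not the lengths of this assembly.

```python
from typing import Sequence


def scaffold_n50(lengths: Sequence[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half the assembly."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def fraction_in_largest(lengths: Sequence[int], n: int) -> float:
    """Fraction of total sequence contained in the n largest scaffolds."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    largest = sorted(lengths, reverse=True)[:n]
    return sum(largest) / sum(lengths)


# Hypothetical toy scaffold lengths (arbitrary units), not data from this assembly.
toy_lengths = [100, 80, 60, 40, 20]

print(scaffold_n50(toy_lengths))            # N50 of the toy set
print(fraction_in_largest(toy_lengths, 2))  # share held by the two largest
```

Applied to a real assembly, the second function with n equal to the number of chromosomal scaffolds gives the kind of percentage reported here for sequence assigned to chromosomes.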

Genome annotation report
The Philonthus cognatus genome assembly GCA_932526585.1 was annotated using the Ensembl rapid annotation pipeline (Table 1; Ensembl accession number GCA_932526585.1). The resulting annotation includes 29,922 transcribed mRNAs from 29,629 protein-coding genes.

Methods
Sample acquisition and nucleic acid extraction
A male P. cognatus (icPhiCogn1) was collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.77, longitude -1.34) on 8 December 2020. The specimen was taken from woodland habitat by Liam Crowley (University of Oxford) by potting. The specimen was identified by Mark Telfer (independent researcher) and snap-frozen on dry ice.
The sample was dissected on dry ice, with head tissue set aside for Hi-C sequencing. Abdomen tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 20-ng aliquot of extracted DNA using the 0.8X AMpure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system [...]. Sequencing was performed on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (RNA-Seq and 10X) instruments. Hi-C data were also generated from head tissue of icPhiCogn1 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly, curation and evaluation
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). One round of polishing was performed by aligning 10X Genomics read data to the assembly with Long Ranger ALIGN, calling variants with FreeBayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016) as described previously (Howe et al., 2021). Manual curation was performed using gEVAL, HiGlass (Kerpedjiev et al., 2018) and Pretext (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2022), which performed annotation using MitoFinder (Allio et al., 2020).
To evaluate the assembly, MerquryFK was used to estimate consensus quality (QV) scores and k-mer completeness (Rhie et al., 2020). The genome was analysed and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were generated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of software tool versions and sources.
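As an illustration of what a k-mer-based QV expresses, the sketch below follows the approach described for Merqury (Rhie et al., 2020): k-mers found only in the assembly and not in the reads are treated as evidence of consensus errors, converted to a per-base error rate and then to a Phred-scaled score. It is assumed here, not confirmed by this note, that MerquryFK reports an equivalent quantity. The function names and input counts are hypothetical.

```python
import math


def kmer_qv(asm_only_kmers: int, total_asm_kmers: int, k: int) -> float:
    """Phred-scaled consensus quality from assembly-only k-mer counts.

    per-base error = 1 - (1 - asm_only / total) ** (1 / k)
    QV             = -10 * log10(per-base error)
    """
    if k <= 0 or total_asm_kmers <= 0:
        raise ValueError("k and total_asm_kmers must be positive")
    if not 0 <= asm_only_kmers <= total_asm_kmers:
        raise ValueError("asm_only_kmers must lie between 0 and total_asm_kmers")
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers observed
    error = 1.0 - (1.0 - asm_only_kmers / total_asm_kmers) ** (1.0 / k)
    return -10.0 * math.log10(error)


def kmer_completeness(read_kmers_in_asm: int, reliable_read_kmers: int) -> float:
    """Fraction of reliable read k-mers that are also present in the assembly."""
    if reliable_read_kmers <= 0:
        raise ValueError("reliable_read_kmers must be positive")
    return read_kmers_in_asm / reliable_read_kmers


# Hypothetical counts for illustration only; not values from this assembly.
print(f"QV: {kmer_qv(asm_only_kmers=2_000, total_asm_kmers=1_000_000_000, k=21):.1f}")
print(f"Completeness: {kmer_completeness(990, 1_000):.1%}")
```

On this scale each 10-point increase corresponds to a tenfold lower estimated error rate, so QV 30 means about one error per 1,000 bases and QV 50 about one per 100,000.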

Ladislav Bocak
Biodiversity & Molecular Evolution, Czech Advanced Technology and Research Institute, Olomouc, Czech Republic
Crowley et al. have presented a chromosome-level genome assembly of an individual male Philonthus cognatus, a rove beetle from the family Staphylinidae in Coleoptera: Polyphaga. The genome sequence data have been scaffolded into 12 chromosomal pseudomolecules, including the assembled X and Y sex chromosomes. The total genome size is over 1 Gb, and the assembly is highly complete. The study is a great contribution to the resources for further phylogenomic research.
The authors include a photograph of the sequenced specimen, but in contrast with the quality of the assembly, the photo is extremely poor and does not contribute to the presentation (lateral view with low resolution and poor focus). I would recommend that the authors provide a more detailed higher classification of the genus: subfamily, superfamily, and suborder.
The presentation of the research is clear and holds importance for future advancements in the field.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes