The genome sequence of the silver-studded blue, Plebejus argus (Linnaeus, 1758)

We present a genome assembly from an individual male Plebejus argus (silver-studded blue; Arthropoda; Insecta; Lepidoptera; Lycaenidae). The genome sequence is 382 megabases in span. The entire assembly (100%) is scaffolded into 23 chromosomal pseudomolecules with the Z sex chromosome assembled. The complete mitochondrial genome was also assembled and is 27.4 kilobases in length. Gene annotation of this assembly on Ensembl identified 12,693 protein coding genes.


Background
The silver-studded blue butterfly, Plebejus argus (Linnaeus, 1758), belongs to the Lycaenidae family. The species is distributed across the Palearctic, including the UK, where it has declined in numbers considerably over the last 100 years (Harris, 2008). The butterfly derives its name from the submarginal row of silvery-blue 'studs' present on the underside of the hindwing. In males, the upper side of the wings is blue with black borders, whilst in females it is brown with a row of submarginal orange spots.
The silver-studded blue is mainly found on coastal heathland, mossland, grassland, and dunes in the UK (Thomas, 1993), but across its range it can also be found in mountainous areas, up to alpine habitats. The silver-studded blue is a poor disperser, with few individuals travelling further than 20-50 m daily (Hovestadt & Nieminen, 2009; Ravenscroft, 1990; Sielezniew et al., 2011), hindering movement of the species to new habitats.
The silver-studded blue seeks relatively warm environments in the northern part of its distribution range. UK populations are mainly restricted to the south and east of England and Wales, and it is considered extinct in the North (Thomas, 1993). However, where populations do occur, these often consist of large numbers of individuals. While P. argus is considered a species of Least Concern according to the IUCN Red List for Europe (van Swaay et al., 2010), it is listed as vulnerable on the UK Red List (Fox et al., 2022). Factors contributing to its decline in the UK include habitat loss and fragmentation arising from urbanisation, agriculture and habitat succession (Brookes et al., 1997; Thomas, 1993; Thomas et al., 1999).
The silver-studded blue is mostly univoltine, although some populations may have more than one generation per year.
The caterpillars are polyphagous, and a notable variety of host plants have been recorded. Additionally, P. argus has an obligate mutualistic relationship with ants from the genus Lasius, which safeguard its eggs (which are laid close to Lasius ant nests) and tend its caterpillars, protecting them from parasites and predators, in return for a sugary secretion produced by the caterpillar. The caterpillars are nocturnal and spend the day protected inside the ant nest, where they pupate (Jordano et al., 1992; Ravenscroft, 1990; Seymour et al., 2003; Thomas, 1993). The requirement for the presence of their ant mutualists further limits the habitats that the silver-studded blue can colonise. Furthermore, the species will seldom recolonise habitats if the distance between habitat fragments becomes too large, owing to its poor capacity for dispersal (Ravenscroft, 1990; Thomas, 1993).
P. argus has 23 chromosome pairs (Lorković, 1941). The genome sequence of the silver-studded blue may help to enhance understanding of its genetic diversity and population biology, and ultimately assist with conservation efforts (Brookes et al., 1997). In particular, there are many recorded subspecies of the silver-studded blue, and population genetic studies using genome re-sequencing data may help to resolve the validity and relationships among these.

Genome sequence report
The genome was sequenced from a single male P. argus (Figure 1) collected from Românași, Zalău, Romania. A total of 23-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 109-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 182 missing joins or misjoins and removed 26 haplotypic duplications, reducing the assembly size by 2.31% and the scaffold number by 66.38%, and increasing the scaffold N50 by 18.92%.
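As a rough cross-check of the coverage figures above, fold coverage multiplied by genome span gives an approximate sequencing yield. The following is a minimal sketch that assumes the 382 Mb assembly span as the genome size (the report does not state which size estimate the coverage values were calculated against); the function and variable names are illustrative only.

```python
# Approximate sequencing yield implied by a fold-coverage figure.
# Assumption: the 382 Mb assembly span stands in for the genome size.

GENOME_SPAN_BP = 382e6  # assembly span reported in the text


def approx_yield_bp(coverage_fold: float, genome_span_bp: float) -> float:
    """Return the approximate number of sequenced bases for a given coverage."""
    return coverage_fold * genome_span_bp


hifi_yield_gb = approx_yield_bp(23, GENOME_SPAN_BP) / 1e9    # PacBio HiFi reads
tenx_yield_gb = approx_yield_bp(109, GENOME_SPAN_BP) / 1e9   # 10X read clouds

print(f"HiFi: ~{hifi_yield_gb:.1f} Gb, 10X: ~{tenx_yield_gb:.1f} Gb")
```

These are order-of-magnitude estimates derived from the stated coverage, not reported data volumes.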
The final assembly has a total length of 382 Mb in 38 sequence scaffolds with a scaffold N50 of 16.7 Mb (Table 1). The assembled sequence (100%) was assigned to 23 chromosomal-level scaffolds, representing 22 autosomes (numbered by sequence length) and the Z sex chromosome (Figure 2–Figure 5; Table 2). The assembly has a BUSCO v5.3.2 (Manni et al., 2021) completeness of 96.9% (single 96.5%, duplicated 0.4%) using the lepidoptera_odb10 reference set (n = 5,286). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
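For readers unfamiliar with the scaffold N50 statistic quoted above, it is the length of the scaffold at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length. The sketch below illustrates that standard definition; the scaffold lengths are toy values chosen for illustration, not the lengths from this assembly, and the function name is hypothetical.

```python
from typing import Iterable


def scaffold_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # N50 is the first length at which the running sum covers half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example (not real data): total = 30, half = 15.
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = scaffold_n50(toy_lengths)
print(toy_n50)
```

In the toy example the running sum passes half the total at the second-longest scaffold, so the N50 is 8.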

Sample acquisition and nucleic acid extraction
A single male P. argus specimen (ilPleArgu1; genome assembly, Hi-C) was collected using a handnet from Românași, Zalău, Romania (latitude 47.119, longitude 23.165) by Alex Hayward (University of Exeter), Konrad Lohse, Dominik Laetsch (University of Edinburgh) and Roger Vila (Institut de Biologia    instruments. Hi-C data were generated in the Tree of Life laboratory from remaining whole organism tissue of ilPleArgu1 using the Arima v1 kit and sequenced on a HiSeq 10X instrument.

Genome assembly
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). One round of polishing was performed by aligning 10X Genomics read data to the assembly with longranger align and calling variants with freebayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using SALSA2 (Ghurye et al., 2019). The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016) as described previously (Howe et al., 2021). Manual curation (Howe et al., 2021) was performed using gEVAL, HiGlass (Kerpedjiev et al., 2018) and Pretext (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2021), which performs annotation using MitoFinder (Allio et al., 2020). The genome was analysed and BUSCO scores were generated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of all software tool versions used, where appropriate.
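The workflow described above can be summarised as an ordered list of stages and the tools named for each. This is only an outline sketched from the text: it records no commands, flags, or versions (see Table 3 for versions), and the relative placement of the mitochondrial assembly and the BlobToolKit analysis is an assumption, since the text does not say when they were run.

```python
# Outline of the assembly workflow as (stage, tools) pairs, taken from the text.
# No command lines are given because the source does not report them.
PIPELINE = [
    ("primary assembly", ["Hifiasm"]),
    ("haplotypic duplication removal", ["purge_dups"]),
    ("polishing with 10X reads", ["longranger align", "freebayes"]),
    ("Hi-C scaffolding", ["SALSA2"]),
    ("contamination check and correction", ["gEVAL"]),
    ("manual curation", ["gEVAL", "HiGlass", "Pretext"]),
    ("mitochondrial assembly and annotation", ["MitoHiFi", "MitoFinder"]),
    ("analysis and BUSCO scoring", ["BlobToolKit"]),
]

for step, (stage, tools) in enumerate(PIPELINE, start=1):
    print(f"{step}. {stage}: {', '.join(tools)}")
```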

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the Plebejus argus assembly (GCA_905404155.1). Annotation was created primarily through alignment of transcriptomic data to the genome, with gap filling via protein-to-genome alignments of a select set of proteins from UniProt (UniProt Consortium, 2019).

Bin Liang
College of Life Science, University of Inner Mongolia, Hohhot, China
This paper presents a high-quality de novo genome assembly for Plebejus argus, with genome annotation based on transcriptomic data and UniProt proteins. The authors used Pacific Biosciences, Illumina HiSeq 10X and Hi-C data for a mixed assembly, which was scaffolded into 23 chromosomal pseudomolecules including the Z sex chromosome, and also assembled the mitochondrial genome. The N50 index, BUSCO gene integrity, GC coverage, cumulative sequence, Hi-C contact map and chromosome assembly information are displayed, reflecting the completeness and accuracy of the results. In summary, this high-quality genome provides an important reference for genomic and ecological studies of blue butterflies.
My only question concerns the size of the mitochondrial genome. In the abstract, the authors state that the mitogenome is 27.4 kilobases in length. I think this is an error. Its length is about 15,390 bp (accession: FR989949.3), similar to another published report.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Entomology, bioinformatics, ecology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.