The genome sequence of the black needle fly, Leuctra nigra (Olivier, 1811)

We present a genome assembly from an individual male Leuctra nigra (black needle fly; Arthropoda; Insecta; Plecoptera; Leuctridae). The genome sequence is 536.3 megabases in span. Most of the assembly is scaffolded into 13 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 17.6 kilobases in length.


Background
The stonefly Leuctra nigra (Figure 1) is a western Palearctic species found across central Europe from France to Ukraine and north to Fennoscandia. It is found throughout Britain and Ireland, generally in northern and western areas, although there are scattered records from south-east England and the south of Ireland.
It is considered a eurytherm (Ravizza & Vinçon, 1998) and is typically found in low densities in silty, sandy and fine gravel habitats, but also amongst woody debris, twigs, roots and logs (Baars & Kelly-Quinn, 2006; O'Connor & Costello, 1997). Some of the streams it inhabits can be very small and easily overlooked. This species is known to predominate in acidified moorland and coniferous forest streams in both Denmark and Great Britain (Friberg et al., 1998; Thomsen & Friberg, 2002; Stoner et al., 1984; Murphy et al., 2013). In Ireland, L. nigra is much rarer, although it has been recorded in episodically acidic streams (Feeley, 2012; Feeley & Kelly-Quinn, 2014). Adults are also generally found in low densities amongst vegetation near the associated streams (O'Connor & Costello, 1997).
The life cycle of Leuctra nigra is highly variable across Europe. Several studies have reported a one-year (univoltine) life cycle (Brinck, 1949; Hynes, 1941). However, other studies have found that larvae take two years to develop in northern England and Denmark (Elliott, 1987; Thomsen & Friberg, 2002), or a mixture of both one and two years in acidic and iron-rich watercourses (Hildrew et al., 1980). Larvae are opportunistic feeders utilising a range of allochthonous and autochthonous plant material (Thomsen & Friberg, 2002; Graf et al., 2002; Graf et al., 2009; Henderson et al., 1990; Lillehammer, 1988). The high-quality genome sequence described here is, to our knowledge, the first reported for Leuctra nigra, and has been generated as part of the Darwin Tree of Life project. It will aid in understanding the biology, physiology and ecology of the species.

Genome sequence report
The genome was sequenced from one male Leuctra nigra collected from River Taff Fawr, Garwnant, UK (latitude 51.808259, longitude -3.44498). A total of 32-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 458 missing joins or mis-joins and removed seven haplotypic duplications, reducing the scaffold number by 47.9%, and increasing the scaffold N50 by 4.62%.
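The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes that fold coverage is measured against the 536.3 Mb assembly span and that the 47.9% reduction is relative to the pre-curation scaffold count, with 174 being the final scaffold count given in this report. The derived yield and pre-curation count are estimates under those assumptions, not values reported by the authors, and the function names are hypothetical.

```python
# Values taken from the report; derived quantities are estimates only.
ASSEMBLY_SPAN_MB = 536.3    # total assembly length in megabases
HIFI_COVERAGE = 32          # fold coverage in PacBio HiFi reads
FINAL_SCAFFOLDS = 174       # scaffold count after manual curation
SCAFFOLD_REDUCTION = 0.479  # fractional reduction in scaffold number


def approx_yield_gb(span_mb: float, coverage: float) -> float:
    """Approximate read yield in Gb, assuming coverage = yield / assembly span."""
    return span_mb * coverage / 1000.0


def scaffolds_before_curation(final_count: int, reduction: float) -> int:
    """Estimate the pre-curation scaffold count implied by a fractional reduction."""
    return round(final_count / (1.0 - reduction))


if __name__ == "__main__":
    yield_gb = approx_yield_gb(ASSEMBLY_SPAN_MB, HIFI_COVERAGE)
    before = scaffolds_before_curation(FINAL_SCAFFOLDS, SCAFFOLD_REDUCTION)
    print(f"Approximate HiFi yield: {yield_gb:.1f} Gb")
    print(f"Estimated scaffolds before curation: {before}")
```

Under these assumptions, the reported coverage corresponds to roughly 17 Gb of HiFi data, and the curation step would have started from about 334 scaffolds.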
The final assembly has a total length of 536.3 Mb in 174 sequence scaffolds with a scaffold N50 of 39.2 Mb (Table 1). Most (98.52%) of the assembly sequence was assigned to 13 chromosomal-level scaffolds, representing 12 autosomes and the X sex chromosome. The X chromosome was found at half coverage and no Y chromosome was found. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2–Figure 5; Table 2). The assembly has a BUSCO v5.3.2 (Manni et al., 2021) completeness of 98.9% (single 97%, duplicated 1.9%) using the insecta_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
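For readers unfamiliar with the summary statistics above, the following minimal sketch shows how a scaffold N50 is conventionally computed and how the BUSCO completeness figure relates to its single-copy and duplicated components. It is not the authors' code. The toy scaffold lengths are hypothetical, and only the BUSCO percentages come from the report.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the largest length L such that scaffolds of
    length >= L together cover at least half of the total length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # fallback; not reached for non-negative lengths


def busco_complete(single_pct: float, duplicated_pct: float) -> float:
    """Complete BUSCOs are the sum of single-copy and duplicated BUSCOs."""
    return round(single_pct + duplicated_pct, 1)


if __name__ == "__main__":
    toy_lengths_mb = [50, 40, 30, 20, 10]  # hypothetical example, not real data
    print("Toy N50 (Mb):", n50(toy_lengths_mb))
    print("BUSCO complete (%):", busco_complete(97.0, 1.9))
```

In the toy example the total is 150 Mb, so the running sum first reaches half (75 Mb) at the 40 Mb scaffold. The BUSCO components in the report (97% single, 1.9% duplicated) sum to the stated 98.9% completeness.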

Sample acquisition and nucleic acid extraction
Two Leuctra nigra specimens (ipLeuNigr1 and ipLeuNigr2) were collected from River Taff Fawr, Garwnant, Wales, UK (latitude 51.808259, longitude -3.44498) using a kick-net on 19 March 2019. The specimens were collected and identified by Caleala Clifford (Natural Resources Wales) and snap-frozen in a dry shipper at the Natural History Museum, London.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ipLeuNigr2 sample was weighed and dissected on dry ice. The tissue was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser, receiving multiple impacts. High molecular weight (HMW) DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
Sequencing was performed by the Operations core at the WSI on a Pacific Biosciences SEQUEL II (HiFi) instrument. Hi-C data were also generated from ipLeuNigr1 using the Arima v2 kit and sequenced on the HiSeq X Ten instrument.

Genome assembly
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination and corrected as described previously (Howe et al., 2021). Manual curation was performed using HiGlass (Kerpedjiev et al., 2018) and Pretext (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2022), which performed annotation using MitoFinder (Allio et al., 2020). The genome was analysed and BUSCO scores generated within the BlobToolKit environment (Challis et al., 2020).
[...] to minimise the suffering of animals used for sequencing. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer [...]
The genome sequence is released openly for reuse. The Leuctra nigra genome sequencing initiative is part of the Darwin Tree of Life (DToL) project. All raw sequence data and the assembly have been deposited in INSDC databases. The genome will be annotated using available RNA-Seq data and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.

Shigeyuki Koshikawa
Hokkaido University, Sapporo, Japan
Takuma Niida
Graduate School of Environmental Science, Hokkaido University, Sapporo, Hokkaido Prefecture, Japan
This is a report on the genome sequence of a stonefly species, Leuctra nigra. It is a well-organized and concise report, especially from a technical point of view.
In the Introduction, the authors described the habitat, but they should clearly state whether they are talking about nymphs or adults. Are we correct in understanding that the larvae are in water, such as streams, and the adults are on land? Is it the nymphs or the adults that are found among the logs?
>Genome sequence report
>Manual assembly curation corrected 458 missing joins or mis-joins and removed seven haplotypic duplications, reducing the scaffold number by 47.9%, and increasing the scaffold N50 by 4.62%
In the "Genome sequence report", we wondered what information was used as the basis for manually fixing the assembly. Does it mean that there were references obtained from short reads? Do seven haplotypic duplications mean that there were seven pairs of similar sequences in the scaffolds?
In Figure 5, we thought it would be easier to understand if the names (numbers) of the chromosomes were written on the vertical and horizontal axes.
In the Methods, were nymphs or adults used in this study? Since the authors described that they were collected by kick-net, I assume they used nymphs collected from the water.
Is the rationale for creating the dataset(s) clearly described?