The genome sequence of a tachinid fly, Nowickia ferox (Panzer, 1809)

We present a genome assembly from an individual female Nowickia ferox (a tachinid fly; Arthropoda; Insecta; Diptera; Tachinidae). The genome sequence is 670.7 megabases in span. Most of the assembly is scaffolded into 6 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 17.19 kilobases in length. Gene annotation of this assembly on Ensembl identified 27,893 protein-coding genes.


Background
Nowickia ferox (Panzer, 1809) is a large tachinid fly with a maximum body length of ~18 mm. It is common south of a line drawn between South Wales and the Wash, with a few scattered records further north. It is most likely to be confused with Tachina fera (Linnaeus, 1761), but N. ferox is dark black and orange with black legs, whereas T. fera is brown and orange with orange/brown legs. The host for this species is usually the Dark Arches moth, Apamea monoglypha (Hufnagel, 1766) (Belshaw, 1993). The adults nectar on a wide range of flowers and can be seen from mid-June to late September on heathland, woodland margins and gardens, in one brood (Tschorsnig & Herting, 1994).
There has been a move (O'Hara, 2020) to classify Nowickia as a subgenus of Tachina, but European taxonomists have yet to adopt this reorganisation.
The genome of Nowickia ferox was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Nowickia ferox, based on one female specimen from Wytham Woods.

Genome sequence report
The genome was sequenced from one female Nowickia ferox (Figure 1) collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (51.77, -1.33). A total of 41-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 79 missing joins or mis-joins and removed 7 haplotypic duplications, increasing the assembly length by 2.4% and the scaffold number by 22.56%, and increasing the scaffold N50 by 7.23%. The final assembly has a total length of 670.7 Mb in 103 sequence scaffolds with a scaffold N50 of 116.6 Mb (Table 1). Most (99.95%) of the assembly sequence was assigned to 6 chromosomal-level scaffolds, representing 5 autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found at https://links.tol.sanger.ac.uk/species/613196.
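As an illustration of how the headline figures in this section relate to one another, the following minimal Python sketch computes a scaffold N50 from a list of scaffold lengths and estimates the HiFi data volume implied by 41-fold coverage of a 670.7 Mb assembly. The scaffold lengths are invented for illustration and are not the idNowFero1.1 values; the function names are hypothetical, and this is not the pipeline used for the assembly.

```python
def scaffold_n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together cover at least half of the total assembly span."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no scaffold lengths supplied")


def approx_read_bases(assembly_span_bp, fold_coverage):
    """Approximate total sequenced bases implied by a given fold coverage."""
    return assembly_span_bp * fold_coverage


# Toy scaffold lengths in bp (illustrative only, not real data).
toy_lengths = [150, 120, 110, 100, 90, 80, 5, 3, 2]
toy_n50 = scaffold_n50(toy_lengths)

# Figures taken from the text: 670.7 Mb span, 41-fold HiFi coverage.
hifi_bases = approx_read_bases(670_700_000, 41)
print(toy_n50, round(hifi_bases / 1e9, 1))
```

Under these assumptions, 41-fold coverage of the reported span corresponds to roughly 27.5 Gb of HiFi sequence; the true read yield is given in the project metadata rather than by this estimate.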

Sample acquisition and nucleic acid extraction
A female Nowickia ferox (idNowFero1) was collected from Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.33) on 2020-08-04 by netting. The specimen was collected and identified by Steven Falk (University of Oxford) and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The idNowFero1 sample was weighed and dissected on dry ice with tissue set aside for Hi-C sequencing. Thorax tissue was cryogenically disrupted to a fine powder using a Covaris cryoPREP Automated Dry Pulveriser, receiving multiple impacts. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on a Pacific Biosciences SEQUEL II (HiFi) instrument. Hi-C data were also generated from head tissue of idNowFero1 using the Arima2 kit and sequenced on the Illumina NovaSeq 6000 instrument.
et al., 2020) and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were calculated.
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Nowickia ferox assembly (GCA_936439885.1) in Ensembl Rapid Release.

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

Phillip Shults
USDA-ARS, Manhattan, USA
Here the author presents the genome of a tachinid fly, Nowickia ferox.
The methods for how the specimen was sequenced and the pipelines used in genome assembly are clearly explained and align with current methods. The appropriate statistics and data are reported to assess the quality of the genome. A major positive for me was the interactive versions of the figures. The authors did a very nice job of this work.
My only real question is about the Hi-C map. Why are there so many contact points across different chromosomes?
Here are a few additional comments to consider.
Regarding "the Wash", perhaps there is a different way to explain the distribution of this species.
In table 2, chromosome 2 is not centered.
Consider rephrasing to "…Nanodrop spectrophotometer and Qubit Fluorometer using a Qubit dsDNA High Sensitivity Assay kit".

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format?
I have examined the assembly results and find them to be excellent. I searched for the data in the repositories indicated and saw they were available as described. Their assembly results are in line with other publications from this initiative, including of tachinid flies by the authors.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Evolutionary Biology, Bioinformatics, Comparative Genomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Nowickia ferox, idNowFero1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 670,715,075 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (154,429,712 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (116,629,194 and 104,111,281 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the diptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/idNowFero1.1/dataset/CAKZFG01/snail.
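The binning described in the Figure 2 caption can be checked with simple arithmetic. The sketch below uses only the numbers quoted in the caption (total span, longest scaffold, 1,000 bins); the variable names are hypothetical, and it is a back-of-the-envelope check rather than a reproduction of how BlobToolKit draws the plot.

```python
ASSEMBLY_SPAN_BP = 670_715_075      # total assembly span from the caption
LONGEST_SCAFFOLD_BP = 154_429_712   # longest scaffold from the caption
N_BINS = 1_000                      # size-ordered bins around the circumference

# Each bin represents 0.1% of the assembly span.
bin_span_bp = ASSEMBLY_SPAN_BP / N_BINS

# Share of the assembly, and number of bins, taken up by the longest scaffold.
longest_fraction = LONGEST_SCAFFOLD_BP / ASSEMBLY_SPAN_BP
bins_for_longest = LONGEST_SCAFFOLD_BP / bin_span_bp

print(f"{bin_span_bp:,.0f} bp per bin")
print(f"longest scaffold: {longest_fraction:.1%} of span, ~{bins_for_longest:.0f} bins")
```

Each bin therefore corresponds to roughly 0.67 Mb, and the longest scaffold alone accounts for about 23% of the assembly.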

Figure 3. Genome assembly of Nowickia ferox, idNowFero1.1: BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/idNowFero1.1/dataset/CAKZFG01/blob.
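For readers unfamiliar with the GC axis of a GC-coverage plot, the following minimal Python sketch shows one common way to compute the GC proportion of a scaffold. The function name and the toy sequence are hypothetical, and ignoring N bases is an assumption made here; the exact calculation performed by BlobToolKit is not described in this note.

```python
def gc_fraction(sequence):
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous bases such as N are ignored (an assumption of this sketch).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        return 0.0
    return gc / (gc + at)


# Toy scaffold sequence (illustrative only, not real data).
toy_scaffold = "ATGCGCNNATTA"
toy_gc = gc_fraction(toy_scaffold)
print(f"GC proportion: {toy_gc:.2f}")
```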

Figure 4. Genome assembly of Nowickia ferox, BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/idNowFero1.1/dataset/CAKZFG01/cumulative.

Table 1. Genome data for Nowickia ferox, idNowFero1.1.
Project accession data
fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and Qubit Fluorometer and Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.