The genome sequence of the meadow field syrph, Eupeodes latifasciatus (Macquart, 1829)

We present a genome assembly from an individual female Eupeodes latifasciatus (meadow field syrph; Arthropoda; Insecta; Diptera; Syrphidae). The genome sequence is 846 megabases in span. The majority of the assembly (96.8%) is scaffolded into 4 chromosomal pseudomolecules, with the X sex chromosome assembled. The complete mitochondrial genome was also assembled and is 18.5 kilobases in length. Gene annotation of this assembly on Ensembl has identified 12,848 protein-coding genes.


Background
The meadow field syrph, Eupeodes latifasciatus, is a hoverfly of the family Syrphidae. Its wingspan is between 6.5 and 8.5 mm. It resembles another hoverfly species, Eupeodes corollae, but can be distinguished by the yellow markings on its body, which are fused into bands on segments three and four (van Veen, 2004).
E. latifasciatus can be found across the Palaearctic from the south of Fennoscandia to the Mediterranean basin (Peck, 1988). It is widespread in the UK but occurs more frequently in the south, preferring lush vegetation and damp meadows to gardens (Eupeodes Latifasciatus (Macquart, 1829) n.d.). Flowers commonly visited by E. latifasciatus include white umbellifers, Euphorbia, and Ranunculus (de Buck, 1990). While adults feed only on nectar, E. latifasciatus larvae feed on small insects of the order Hemiptera, such as aphids and scale insects (Stubbs & Falk, 1983). The flight period is usually from May to September but extends from April to October in southern Europe. This high-quality E. latifasciatus genome was assembled as part of the Darwin Tree of Life project, which aims to genetically describe all species found in the UK.

Genome sequence report
The genome was sequenced from a single female E. latifasciatus collected from Wytham Woods, Berkshire, UK (Figure 1). A total of 27-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 47-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 492 missing/misjoins and removed 17 haplotypic duplications, reducing the assembly size by 1.20% and the scaffold number by 38.33%, and increasing the scaffold N50 by 437.36%.
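The curation statistics above are relative changes between the pre-curation and post-curation assemblies. As a minimal sketch of that arithmetic, the Python below computes a percentage change and back-calculates an approximate pre-curation value from a final value. The function names are hypothetical, and the pre-curation figures it yields are not reported in this text: they are estimates derived from the rounded percentages and from the final scaffold count (436) and scaffold N50 (189.4 Mb) given for the final assembly.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("`before` must be non-zero")
    return (after - before) / before * 100.0


def value_before(after, pct_change):
    """Invert percent_change: recover `before` from `after` and the change in percent."""
    factor = 1.0 + pct_change / 100.0
    if factor == 0:
        raise ValueError("a change of -100% cannot be inverted")
    return after / factor


# Final values taken from the text; the "before" values are back-calculated
# estimates (assumption), limited by rounding of the reported percentages.
scaffolds_before = value_before(436, -38.33)   # scaffold number fell by 38.33%
n50_before_mb = value_before(189.4, 437.36)    # scaffold N50 rose by 437.36%

print(f"Estimated pre-curation scaffold count: {scaffolds_before:.0f}")
print(f"Estimated pre-curation scaffold N50: {n50_before_mb:.1f} Mb")
```

Note that an increase of 437.36% means the curated N50 is roughly 5.4 times the pre-curation value, not 4.4 times, because the change is added to the original 100%.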
The final assembly has a total length of 846 Mb in 436 sequence scaffolds with a scaffold N50 of 189.4 Mb (Table 1). The majority, 96.8%, of the assembly sequence was assigned to 4 chromosomal-level scaffolds, representing 3 autosomes (numbered by sequence length) and the X sex chromosome (Figure 2–Figure 5; Table 2). Two regions of this assembly are particularly fragmented: the centromeric and pericentromeric region of chromosome 1 and all of chromosome X.
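For readers unfamiliar with the scaffold N50 statistic quoted above, it is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The Python below is a minimal sketch of that definition, not the tool used to produce Table 1; the function name and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not data from this genome.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

In the toy example the two longest scaffolds already cover half of the total length, so the N50 equals the length of the second of them. A large N50 relative to the assembly span, as reported here, indicates that most of the sequence sits in a few chromosome-scale scaffolds.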

Genome annotation report
The idEupLati1.1 genome has been annotated using the Ensembl rapid annotation pipeline (    Illumina HiSeq 4000 (RNA-Seq) instruments. Hi-C data were generated in the Tree of Life laboratory from head tissue of idEupLati1 using the Arima v2 kit and sequenced on a NovaSeq 6000 instrument.

Genome assembly
Assembly was carried out with Hifiasm (Cheng et al., 2021); haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). One round of polishing was performed by aligning 10X Genomics read data to the assembly with longranger align and calling variants with freebayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using SALSA2 (Ghurye et al., 2019). The assembly was checked for contamination as described previously (Howe et al., 2021). Manual curation was performed using HiGlass (Kerpedjiev et al., 2018) and Pretext. The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2021), which performs annotation using MitoFinder (Allio et al., 2020). The genome was analysed and BUSCO scores generated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of all software tool versions used, where appropriate.

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016)