The genome sequence of the Early Bumblebee, Bombus pratorum (Linnaeus, 1761)

We present a genome assembly from an individual female Bombus pratorum (the Early Bumblebee; Arthropoda; Insecta; Hymenoptera; Apidae). The genome sequence is 285.1 megabases in span. Most of the assembly is scaffolded into 18 chromosomal pseudomolecules. The mitochondrial genome has also been assembled and is 21.5 kilobases in length. Gene annotation of this assembly on Ensembl identified 13,746 protein coding genes.


Background
The Early Bumblebee, Bombus pratorum, is one of the seven most common and widespread species of bumblebees in the UK. It is generally common throughout its range, which includes much of Europe across to the Near East. It can be found in many different habitats, predominantly gardens and woodlands. The abundance of this species has been shown to correlate positively with the extent of gardens and allotments in the landscape (Foster et al., 2017). It is a small bumblebee species, less than 17 mm in length, covered in black hairs with bands of yellow hairs across the pronotum and second tergite, and red hairs on the apical segments of the abdomen. It is a eusocial species with reproductive queens and males, and non-reproductive workers. Males have yellow hairs on the head and face.
It typically has the earliest appearing workers of any UK bumblebee, with workers of the first brood appearing from as early as February (Benton, 2006). Bombus pratorum is frequently bivoltine in the UK, particularly in the south, becoming increasingly univoltine further northwards (Prŷs-Jones & Corbet, 1987). Males and new queens from the first colony cycle can be produced from May and June, respectively. The colony cycle is remarkably short, with reproductives being produced in as little as 3 months after founding (Edwards & Jenner, 2005), and its new queens are among the first of any UK bumblebee species to enter overwintering diapause (Alford, 1969). Nests are constructed in a variety of situations, both above- and below-ground, including in old small-mammal burrows and aerial cavities such as bird boxes and holes in trees (Lye et al., 2012). Colonies are relatively small, usually peaking at fewer than 100 workers. In UK agricultural landscapes, nest density is estimated at 26 nests per square kilometre, and the minimum estimate of the maximum foraging range is 674 m (Knight et al., 2005).
It is polylectic, although a preference for pollen from Fabaceae has been found (Goulson & Darvill, 2004). It visits a wide range of flowers for nectar, particularly favouring Blackthorn (Prunus spinosa), Bramble (Rubus fruticosus agg.) and Raspberry (Rubus idaeus), and is an important pollinator of soft fruits.
A complete genome sequence for this species will facilitate studies into the evolution of eusociality, conservation of important pollinator species, reproductive evolution and foraging behaviour.

Genome sequence report
The genome was sequenced from one female Bombus pratorum specimen (Figure 1) collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.33). A total of 60-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 88-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 32 missing joins or mis-joins, reducing the scaffold number by 24.24%, and increasing the scaffold N50 by 20.98%.
The final assembly has a total length of 285.1 Mb in 50 sequence scaffolds with a scaffold N50 of 16.5 Mb (Table 1). Most (96.23%) of the assembly sequence was assigned to 18 chromosomal-level scaffolds.
Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2–Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
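For readers unfamiliar with the scaffold N50 statistic reported above, the following is a minimal Python sketch of how an Nx value can be computed from a list of scaffold lengths. It is an illustration only, not the pipeline used for this assembly; the function name `nx` and the toy lengths are hypothetical and are not taken from the iyBomPrat1.1 data.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx scaffold length.

    Nx is the length L such that scaffolds of length >= L together
    contain at least `fraction` of the total assembly span
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    total = sum(lengths)
    running = 0
    # Walk the scaffolds from longest to shortest, accumulating span
    # until the requested fraction of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Hypothetical toy assembly (lengths in bp), for illustration only.
toy_lengths = [400, 300, 150, 100, 50]
print("N50:", nx(toy_lengths))        # 400 + 300 = 700 >= 500, so N50 is 300
print("N90:", nx(toy_lengths, 0.9))   # 400 + 300 + 150 + 100 = 950 >= 900, so N90 is 100
```

Applied to the real scaffold lengths, the same calculation would be expected to reproduce the reported scaffold N50, assuming the standard definition is the one used by the assembly statistics tools.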

Sample acquisition and nucleic acid extraction
A female Bombus pratorum (iyBomPrat1) was collected from Wytham Woods, Oxfordshire (biological vice-county: Berkshire), UK (latitude 51.77, longitude -1.33) on 20 August 2019. The specimen was taken from woodland habitat by Liam Crowley (University of Oxford) by netting. The specimen was identified by the collector and snap-frozen on dry ice. This specimen was used for genome sequencing and Hi-C scaffolding.
A second female B. pratorum specimen (iyBomPrat2) was used for RNA sequencing. The iyBomPrat2 specimen was collected by Olga Sivell (Natural History Museum) from woodland edge in Luton, UK (latitude 51.88, longitude -0.37) on 6 May 2020. The specimen was identified by Duncan Sivell (Natural History Museum) and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The iyBomPrat1 sample was weighed and dissected on dry ice with abdomen tissue set aside for Hi-C sequencing. Head and thorax tissue was disrupted using [...]
RNA was extracted from head tissue of iyBomPrat2 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNase-free water and its concentration assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit RNA Broad-Range (BR) Assay kit. The integrity of the RNA was analysed using the Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay.
[...] et al., 2020). To evaluate the assembly, MerquryFK was used to estimate consensus quality (QV) scores and k-mer completeness (Rhie et al., 2020). The genome was analysed and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were calculated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of software tool versions and sources.
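As a brief illustration of the consensus quality (QV) score mentioned above, the sketch below converts a per-base error rate to a Phred-scaled QV, and estimates that error rate from k-mer counts following the definition given for Merqury (Rhie et al., 2020). It is assumed here, not confirmed by this text, that MerquryFK uses the same definition. The function names and example counts are hypothetical.

```python
import math


def phred_qv(error_rate):
    """Phred-scaled quality: QV = -10 * log10(per-base error rate)."""
    if error_rate == 0:
        return math.inf  # no detectable errors
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in [0, 1]")
    return -10 * math.log10(error_rate)


def kmer_qv(shared_kmers, total_kmers, k):
    """Estimate QV from k-mer counts (Merqury-style definition, assumed).

    shared_kmers: assembly k-mers that are also found in the read set
    total_kmers:  all k-mers in the assembly
    A base is taken to be correct with probability
    P = (shared / total) ** (1 / k), so the error rate is 1 - P.
    """
    if total_kmers <= 0 or not 0 <= shared_kmers <= total_kmers:
        raise ValueError("need 0 <= shared_kmers <= total_kmers, total_kmers > 0")
    p_correct = (shared_kmers / total_kmers) ** (1 / k)
    return phred_qv(1 - p_correct)


# Hypothetical counts, for illustration only (not from this assembly).
print(round(phred_qv(0.001), 2))
print(round(kmer_qv(999_000, 1_000_000, 21), 2))
```

On this scale, each increase of 10 in QV corresponds to a tenfold reduction in the estimated per-base error rate.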

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the Bombus pratorum


Figure 2. Genome assembly of Bombus pratorum, iyBomPrat1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 285,072,354 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (22,727,794 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (16,494,161 and 11,389,935 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the hymenoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/iyBomPrat1.1/dataset/CAKNEW01/snail.

Figure 3. Genome assembly of Bombus pratorum, iyBomPrat1.1: GC coverage. BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/iyBomPrat1.1/dataset/CAKNEW01/blob.

Figure 4. Genome assembly of Bombus pratorum, iyBomPrat1.1: cumulative sequence. BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/iyBomPrat1.1/dataset/CAKNEW01/cumulative.

Figure 5. Genome assembly of Bombus pratorum, iyBomPrat1.1: Hi-C contact map. Hi-C contact map of the iyBomPrat1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=X2H_v2CFRT2Nc-jNlyp_Tg.