Identification of runs of homozygosity in Western honey bees (Apis mellifera) using whole‐genome sequencing data

Abstract Runs of homozygosity (ROH) are continuous homozygous segments that arise through the transmission of haplotypes that are identical by descent. The length and distribution of ROH segments provide insights into the genetic diversity of populations and can be associated with selection signatures. Here, we analyzed reconstructed whole‐genome queen genotypes, from a pool‐seq data experiment including 265 Western honeybee colonies from Apis mellifera mellifera and Apis mellifera carnica. Integrating individual ROH patterns and admixture levels in a dynamic population network visualization allowed us to ascertain major differences between the two subspecies. Within A. m. mellifera, we identified well‐defined substructures according to the genetic origin of the queens. Despite the current applied conservation efforts, we pinpointed 79 admixed queens. Genomic inbreeding (F ROH) strongly varied within and between the identified subpopulations. Conserved A. m. mellifera from Switzerland had the highest mean F ROH (3.39%), while queens originating from a conservation area in France, which were also highly admixed, showed significantly lower F ROH (0.45%). The majority of A. m. carnica queens were also highly admixed, except 12 purebred queens with a mean F ROH of 2.33%. Within the breed‐specific ROH islands, we identified 14 coding genes for A. m. mellifera and five for A. m. carnica, respectively. Local adaption of A. m. mellifera could be suggested by the identification of genes involved in the response to ultraviolet light (Crh‐BP, Uvop) and body size (Hex70a, Hex70b), while the A. m. carnica specific genes Cpr3 and Cpr4 are most likely associated with the lighter striping pattern, a morphological phenotype expected in this subspecies. We demonstrated that queen genotypes derived from pooled workers are useful tool to unravel the population dynamics in A. mellifera and provide fundamental information to conserve native honey bees.

genome queen genotypes, from a pool-seq data experiment including 265 Western honeybee colonies from Apis mellifera mellifera and Apis mellifera carnica. Integrating individual ROH patterns and admixture levels in a dynamic population network visualization allowed us to ascertain major differences between the two subspecies. Within A. m. mellifera, we identified well-defined substructures according to the genetic origin of the queens. Despite the current applied conservation efforts, we pinpointed 79 admixed queens. Genomic inbreeding (F ROH ) strongly varied within and between the identified subpopulations. Conserved A. m. mellifera from Switzerland had the highest mean F ROH (3.39%), while queens originating from a conservation area in France, which were also highly admixed, showed significantly lower F ROH (0.45%). The majority of A. m. carnica queens were also highly admixed, except 12 purebred queens with a mean F ROH of 2.33%. Within the breed-specific ROH islands, we identified 14 coding genes for A. m. mellifera and five for A. m. carnica, respectively. Local adaption of A. m. mellifera could be suggested by the identification of genes involved in the response to ultraviolet light (Crh-BP, Uvop) and body size (Hex70a, Hex70b), while the A. m. carnica specific genes Cpr3 and Cpr4 are most likely associated with the lighter striping pattern, a morphological phenotype expected in this subspecies. We demonstrated that queen genotypes derived from pooled workers are useful tool to unravel the population dynamics in A. mellifera and provide fundamental information to conserve native honey bees.

| INTRODUC TI ON
The Western honey bee (Apis mellifera, hereafter honey bee), is a key pollinator of agricultural crops (Klein et al., 2007). To date, more than 27 subspecies have been reported globally, which can be grouped into four distinct lineages, namely M (Western and Northern Europe), C (Eastern Europe), O (Near East and Central Asia), and A (Africa; Cridland et al., 2017;Ruttner, 1988). These lineages are characterized by genetic differences leading to variable morphology, physiology, and behavior (Ruttner, 1988). Honey bees are commonly kept in hives for honey production and pollination purposes. Varying selection pressures have been applied by humans to honey bees within their native range: in Europe, several selection programs have been initiated to increase their productivity (Adam, 1983;Büchler et al., 2010;Chauzat et al., 2013;Guichard et al., 2020;Uzunov et al., 2017), while in Africa the majority of honey bees evolved without large-scale selection (Dietemann et al., 2009).
In the beginning of the 19th century, importation of foreign honey bees among European regions began to increase, which profoundly reshaped the genetic structure of this species (Parejo et al., 2020). Historically, native honey bees of Europe mainly belong to A, M, and C evolutionary lineages. They are locally adapted to different climatic and geographical regions, resulting in several subspecies (Momeni et al., 2021;Ruttner, 1988). Nevertheless, beekeepers in Northern Europe continue to replace native honey bees (e.g., A. m. mellifera) with honey bees of South-European origin (e.g., A. m. carnica and A. m. ligustica), as these subspecies are considered to be more productive, gentle and calm (Bouga et al., 2011;Guichard et al., 2021). In North America, most honey bees are hybrids of these two historically imported strains, and key selection traits in United States (US) breeding programs are productivity and resistance traits to certain pathogens (Saelao et al., 2020). In Northern Europe, the favored use of South-European honey bees has led to multiple admixture events between subspecies and the extinction of native honey bees (Bieńkowska et al., 2021;Ruttner, 1995). Furthermore, these bees are also threatened by the widespread use of stabilized hybrid strains such as Buckfast (Adam, 1983;Bieńkowska et al., 2021).
The relocation of subspecies accompanied by admixture is a major risk factor for the loss of local adaptation and genetic diversity of honey bees (De la Rúa et al., 2009). Therefore, in Europe, several conservation programs have been initiated to maintain the genetic diversity of native honey bees, by establishing conservation areas on islands (e.g., Denmark, Scotland, and the Canary Islands) or on the mainland (e.g., France, Norway, Slovenia, and Austria), and excluding hybrids and invasive breeds mainly by their morphotype or behavior  (Soland-Reckeweg et al., 2009). Nowadays, an additional conservation area exists in canton Obwalden. The two conservatories encompass a total area of 830 km 2 and ~1050 colonies (Parejo et al., 2016). To limit admixture events with other non-native subspecies (e.g., A. m. carnica and Buckfast), these areas are typically located in remote alpine valleys. Besides the maintenance of the conservation areas, the breeding association of A. m. mellifera (mellifera.ch) established a selection program including several mating stations.
These stations are also located in geographically isolated areas and consist of 10 to 20 selected drone-producing colonies. Currently, an ancestry-informative marker panel (microsatellites or single nucleotide polymorphisms; SNPs) is applied to determine the hybridization of conserved and selected A. m. mellifera queens, and queens with an admixture level greater than 10% are replaced with purebred A. m. mellifera (Parejo et al., 2018). However, the replacement of admixed queens is expected to lead to an increase in inbreeding that could be detrimental to the small conserved A. m. mellifera population. Given that the survival of honey bees is strongly dependent on their genetic diversity (Jones et al., 2004;Kryger, 1990;Mattila et al., 2012;Mattila & Seeley, 2014;Oldroyd et al., 1992), monitoring of inbreeding in small conserved populations, such as A. m. mellifera in Switzerland, is crucial.
Estimates of inbreeding indicate the probability that an animal receives alleles that are identical by descent from each parent. This can be estimated using genetic markers, while pedigree-based estimations require prior knowledge of individual ancestry (Kardos et al., 2015), which in case of the honey bee is often not available.
Runs of homozygosity (ROH), caused by inheritance of parental haplotypes that are identical by descent, are one of the common methods to estimate inbreeding levels without ancestry information (McQuillan et al., 2008). The length of ROH segments can be used to ascertain historical changes in population size and structure including admixture (few and short ROH segments), current inbreeding (multiple and long ROH segments), and a recent bottleneck (multiple and short ROH segments); see (Ceballos et al., 2018) for a complete review. Furthermore, it is possible to derive the genomic inbreeding coefficient (F ROH ) of an animal by dividing the sum of all homozygous segments (S ROH ) by the length of the analyzed genome (McQuillan et al., 2008). Numerous studies have demonstrated that overlapping ROH segments across individuals, so-called ROH islands can be found in breed-specific selection signatures in cattle (Purfield et al., 2012;Zhang et al., 2015), sheep (Mastrangelo et al., 2017;Purfield et al., 2017;Signer-Hasler et al., 2019), and horses Grilz-Seger et al., 2018;Metzger et al., 2015), as well as in cultivated plants such as avocados

K E Y W O R D S
Apis mellifera carnica, Apis mellifera mellifera, conservation, genetic diversity, pooled sequences, selection signatures

T A X O N O M Y C L A S S I F I C A T I O N
Conservation genetics (Rubinstein et al., 2019), almonds (Pavan et al., 2021), and pears (Kumar et al., 2020).
To date, mostly drone genomes were used to assess the genetic diversity of honey bees, as their haploid nature facilitates cost-efficient whole-genome sequencing (Parejo et al., 2016). Due to the hemizygosity of drones, ROH cannot be estimated based on such data and it becomes likely to overestimate genetic relationships and subsequently inbreeding, compared to other livestock species . Another disadvantage of honey bee drones is that they only explain part of the genetic diversity, as multiple paternal origins are involved in the formation of honey bee colonies (Estoup et al., 1994;Neumann et al., 1999;Tarpy et al., 2004). However, genotyping of honey bee queens for the evaluation of admixture and genomic inbreeding without harming them remains difficult (Bubnič et al., 2020;Madella et al., 2020).
Therefore, a novel method for deriving queen genotypes based on pooled sequences of diploid worker bees was recently presented , which could enable more genomic studies requiring diploid data in honey bees and other haplo-diploid eusocial insects.
In this study, we investigated the utility of queen genotypes derived from pooled worker sequences to ascertain population substructures and to identify ROH segments in honey bees.
Furthermore, we integrated estimates of individual admixture and F ROH in a dynamic population network visualization to enhance the genetic monitoring of conserved A. m. mellifera. Finally, we screened the genomes for ROH islands to detect genes associated with geographic adaptations and human-mediated selection within A. m. mellifera and A. m. carnica.

| Sampled colonies
We sampled 265 honey bee colonies from two different subspecies, namely A. m. mellifera (MEL) and A. m. carnica (CAR).
Conserved MEL colonies were sampled in Switzerland (CS_CH) and France (CS_FR). The majority of the MEL colonies came from the selection program in Switzerland (SL_CH), which represents five different paternal origins (P1-P5), that is, drone-producing colonies headed by sister queens. The sample size, geographic origin, and location of the five different paternal origins and conserved MEL colonies are summarized in Table 1. It should be noted that P1 is located in close proximity to the conservation area (CS_CH) and that P4 and P5 have a common maternal origin. The

| Reconstruction of queen genotypes and quality control
We used the method described in Eynard et al. (2022) to reconstruct honeybee queen genotypes. In brief, this method follows a two-step procedure using two statistical models. First, the genetic composi- was performed across colonies within such a group. On average our pool-seq data showed the same sequencing depth (~30X), which was used to simulate the aforementioned statistical models. Therefore, we expect the same genotype errors and accuracies, previously reported by Eynard et al. (2022).
After the reconstruction of genome-wide queen genotypes, we removed 99,555 SNPs with multiple alternative alleles and 207,904 SNPs with an excessively high and low sequencing depth.
Furthermore, we excluded 771,835 homozygous SNPs to account for the very large non recombining, low polymorphic regions within the honey bee genome (Wragg et al., 2022). Finally, missing genotypes of the remaining 5,944,683 SNPs were imputed with BEAGLE 5.2 (Browning et al., 2018) to detect ROH segments along the genome, while for the population structure analyses queen genotypes were further edited for minor allelic frequency (MAF > 5%), which resulted in 1,609,447 genome-wide SNPs.

| Dynamic population network
To ascertain the high-resolution population structure of honey bees, we performed a dynamic population network visualization. The different components involved in the so-called NetView approach are described in detail by Neuditschko et al. (2012) and Steinig et al. (2016). Briefly, we computed genetic distances by subtracting pairwise relationships identical-by-state (IBS), as provided by PLINK v.1.9 (Chang et al., 2015), from 1 and applied the algorithm in its default setting (number of k nearest neighbors k-NN = 10). To illustrate the genetic relatedness between neighboring honey bee queens, we associated the thickness of edges (connecting lines) with the magnitude of the genetic distance, with thicker edges corresponding to lower genetic distances. To identify highly inbred honey bee queens, we scaled the node size of each queen based on the individual F ROH .
The node color denotes the sampled subpopulations and the individual level of admixture at K = 2 and K = 7 (the optimal number of clusters).

| Admixture
Queen admixture levels and genetic distances (F ST ) between the subspecies were determined using the program Admixture 1.23 (Alexander et al., 2009). We ran Admixture for 100 iterations increasing K from 2 to 10. Convergence between independent runs at the same K was monitored by comparing the resulting log-likelihood scores (LLs) following 100 iterations, and was inferred from stabilized LLs with less than 1 LL unit of variation between runs. Cross validation error estimation for each K was performed to determine the optimal number of clusters. Admixture results increasing K from 2 to 7 were visualized with the program Distruct 1.1 (Rosenberg, 2004) and integrated in the dynamic population network, as described above.

| Runs of homozygosity
Continuous homozygous segments were determined with an overlapping window approach implemented in PLINK v.1.9 (Chang et al., 2015) including the aforementioned 5,944,683 genome-wide SNPs. The following settings were applied: a minimum SNP density of one SNP per 40 kb, a maximum gap length of 100 kb, and a minimum length of homozygous segment of 200 kb. The total number of ROH (N ROH ), the total length of ROH segments (S ROH ), and the average length of ROH (L ROH ) were summarized for the two subspecies (CAR and MEL) and the respective subpopulations. The genomic-based inbreeding coefficients (F ROH ) were calculated by dividing S ROH by the length of the autosomal genome (L AUTO ), which was set to 220.76 Mb (Wallberg et al., 2019). Differences between subspecies and subpopulations were investigated using t-tests (for the two subspecies) and ANOVA with post hoc Tukey's honestly significant difference (HSD) tests at a significance level of α < 0.05 as implemented in the R package multcompView (Graves et al., 2015). We also correlated F ROH with the admixture proportions at K = 2 for each subspecies as implemented in the statistical computing software R (R Core Team, 2020). Furthermore, we compared F ROH of 74 SL_CH queens with pedigree-based inbreeding coefficients (F PED ). Pedigree-based inbreeding coefficients of the selected queens were calculated following the method described by Brascamp and Bijma (2014) based on the pedigree information of 1082 A. m. mellifera queens (Guichard et al., 2020) born between 1991 and 2017. The identity of the queen, of her mother and the grand-mother (queen of the drone producing colonies) were largely known and used to establish a pedigree file, from which an inverse relationship matrix between all entries was calculated to determine F PED (Guichard et al., 2020).

| Homozygosity islands and gene functions
Homozygosity islands of the three different groups (CAR, SL_CH, and CS_CH) were determined based on overlapping homozygous regions present in more than 50% of the queens with <10% admixture applying the R package detectRUNS (Biscarini et al., 2019).
Considering the small sample size, we used all CAR with admixture proportions <10% for the identification of breed-specific ROH islands. Finally, we used the NCBI genome data viewer (https://www. ncbi.nlm.nih.gov/genom e/gdv/), and the reference genome assembly Amel_HAv3.1 (Wallberg et al., 2019) to identify genes located in ROH islands and specified the known functions of the identified genes by conducting a literature review.

| Dynamic population network
The dynamic population network separated CAR (Figure 1a, were directly connected with four CS_CH queens, while the remaining CS_FR queens were frequently distributed over the network. Furthermore, CS_CH queens were the nearest neighbors of five SL_ CH queens originating from three different strains (P1, P4, and P5), while P1 showed the strongest genetic relationship with this cluster.
Compared to P1 and P2, the three remaining strains (P3-P5) did not build a distinct population cluster. P3 queens were distributed over the network without a discernible pattern and the majority of P4 and P5 queens were highly related to each other, while especially P5 queens built two small sub-clusters each including a P4 queen. Such a small sub-cluster was also evident in CAR including seven highly related queens. The association of the node size with F ROH illustrates that the majority of CS_CH and three CAR queens, included in the F I G U R E 1 Dynamic population network and model-based clustering of honey bees (Apis mellifera). (a) Dynamic population network, where each queen is illustrated by a node, with individual node size proportional to F ROH , while the node color represents the sample origin.
The thickness of edges varies in the proportion of the genetic distance to visualize individual relationships between the colonies. The topology of the network clearly differentiated Apis mellifera carnica (CAR, dashed circle) from Apis mellifera mellifera (MEL) and described well-defined substructure within MEL according to the genetic origin. MEL queens allocated in the immediate neighborhood of CAR are indicated by "*", while CS_FR queens directly connecting with CS_CH queens are highlighted by "+". (b) Model-based clustering assignment of honey bees using 2-7 clusters (K). Queens are presented by a single vertical column divided into K colors. Each color represents one cluster and the length of the colored segment corresponds to the individual membership proportion in that cluster.
aforementioned sub-cluster, showed the highest F ROH . Furthermore, it can be noted that CAR located in the neighborhood of MEL (and vice versa), as well as queens not clustering with their strains show in general lower F ROH (Figure 1a).

| Admixture
Based on the cross-validation error estimation increasing K from 2 to 10, an optimal cluster solution at K = 7 was determined ( Figure S1).

| Runs of homozygosity
The number of ROH segments (N ROH ), the total length of ROH (S ROH ), and the F ROH were significantly different between MEL and CAR, while the mean segment length was equal (L ROH = 0.34 ± 0.10 Mb, Table 2). in the new subset (MEL <10% ). Only 12 CAR queens remained after removing all samples with an admixture proportion >10% (CAR <10% ), 10 from Switzerland (CAR_CH), and two from the US (CAR_US).
The number, total length, mean length of segments, and mean genomic inbreeding coefficient all increased (N ROH = 13.67 ± 5.16, S ROH = 5.14 ± 2.53, L ROH = 0.36 ± 0.06, F ROH = 2.33 ± 1.15), but the SD increased in all parameters except for the L ROH .
Summarizing the F ROH results of all MEL queens according to the a priori defined subpopulations underscored some results from the dynamic population network (Figure 3)
Chromosomes 4, 7, 12, 13, 14, and 16 did not bear any homozygosity islands for either MEL subpopulation. Five homozygosity islands were located near the starting end of the chromosomes. The largest homozygosity island was on chromosome 2 for CS_CH and covered 943 kb (common with SL_CH <10% over two smaller regions of 16 and 36 kb, respectively). The shortest was for MEL <10% on chromosome 3 and spanned 15 kb. There were substantially more uncharacterized genes than annotated genes within the ROH islands (i.e., 264 uncharacterized loci and 12 annotated genes in MEL <10% , respectively, 447 and 10 for CS_CH).

| DISCUSS ION
We demonstrated that queen genotypes derived from pooled honey bee workers can be successfully applied to ascertain high-resolution population structures, including the computation of F ROH and the detection of breed-and subpopulation-specific ROH islands. However, it should be noticed, that the applied queen reconstruction procedure assumes that queens are mated to drones of similar ancestry.
Our pool-seq data, where the majority of colonies were derived from breeders and conservatories fulfilled this prerequisite, while for other data settings (e.g., artificial insemination), the applicability of the queen reconstruction procedures needs to be further investigated. The applied population structure analyses clearly differentiated MEL from CAR (F ST = 0.45), despite the occurrence of highly admixed MEL and CAR queens and simultaneously highlighted the challenges to conserve native honey bees due to lack of control over mating. Therefore, the dynamic population network illustrated that a successful honey bee conservation program requires an appropriate F I G U R E 2 Dynamic network visualizations of honey bees (Apis mellifera) associated with admixture proportions. Each queen is illustrated by a node, with individual node size proportional to F ROH , while the node color represents the individual levels of admixture at K = 2 (a) and K = 7 (optimal cluster solution (b)). The thickness of edges varies in the proportion of the genetic distance to visualize individual relationship between the colonies. The topology of both networks illuminates that in general highly admixed queens also show low F ROH .

K=2 K=7
(b) (a) CAR. The population structure of CS_FR and the genetic origin of some SL_CH indicate that current applied conservation strategies including the geographical locations are not suitable for in situ conservation. Ex situ conservation by means of artificial insemination (Cobey et al., 2013), could be a more efficient alternative to maintain the gene pool of native honey bees.
In spite of the fact that SL_CH are carefully selected to contribute to the local genetic diversity, MEL showed significantly higher livestock populations such as cattle (Purfield et al., 2012) and goats (Bertolini et al., 2018). The population admixture also had an effect on the relationship between F ROH and F PED , with a slightly negative correlation (r = −.22) indicating poor agreement between the two methods, compared to commonly observed values in livestock, such as goats (r = .50; Burren et al., 2016) and sheep (0.18 < r < .70; Purfield et al., 2017). In absence of instrumental insemination, the paternal origin of honey bees is not precisely known, as honey bee queens naturally mate in flight with 10 to 20 drones (polyandrous mating system; Estoup et al., 1994;Neumann et al., 1999;Tarpy et al., 2004). Hence, the paternal origin must be estimated by restricting paternal origins to the drone-producing colonies located at the mating station. However, the proportion of foreign drones contributing to the mating remains unknown. Therefore, F PED of queens from insufficiently isolated mating stations (with higher admixture proportions) is overestimated, while a low pedigree completeness results in lower F PED compared to F ROH . To improve the pedigree quality of honeybees, we suggest confirming the parental origin with a marker-based parentage analysis or by performing artificial insemination.
Within MEL-specific homozygosity islands, we identified two genes that are directly associated with the current applied selection traits, including increased productivity and swarming drive (Bouga et al., 2011;Guichard et al., 2021). Based on highly selected A. m.
ligustica strains, it has already been demonstrated that RpL35, identified in the SL_CH <10% -specific island, controls royal jelly production and larval growth (Ararso et al., 2018). Furthermore, the differential expression of Ndufs1, found in a CS_CH <10% -specific ROH island, may also increase foraging behavior (Guo et al., 2019), and consequently, productivity.
The gene Crh-BP embedded in a ROH island for SL_CH <10% , is involved in the resistance to ultraviolet (UV) exposure, and therefore suggests adaptive mechanisms due to the ancestral geographical origin of the subspecies. The gene Crh-BP was shown to be upregulated in honey bees in response to UV exposure and heat stress (Even et al., 2012). Therefore, homozygosity in this gene could indicate local adaptation to lower sun exposure and temperatures of Northern Europe by potential downregulation of this gene. The homozygous state of the Uvop gene in SL_CH <10% is associated with retinal development and the circadian rhythm (Lichtenstein et al., 2018), which may enable SL_CH <10% to deal with seasonally more variable sun exposure. Diurnal mammal species also produce different quantities of UV-sensitive pigments depending on their ecological niche (Emerling et al., 1819). Furthermore, it has recently been demonstrated that genes involved in the response to UV exposure are associated with the local adaptation of horse breeds .
Another indication of signatures of selection related to the geographical distribution of MEL can be found in the homozygosity of the genes Hex70a and Hex70b, encoding for two hexamerin proteins of the same name. Similar to vitellogenin, hexamerins bind to juvenile hormone (JH) and are storage proteins in the larval fat body, providing amino acids for the development into the adult stage (Martins et al., 2010;Telfer & Kunkel, 1991). Hexamerins also appear to be involved in ovary and testes development, and spermatogenesis in drones (Martins et al., 2011). More storage proteins in the larval fat body imply both larger and more long-lived bees, essential for colony survival during the winter. Although the quantity of storage proteins mostly depends on the pollen supply and quality (Basualdo et al., 2013;Frias et al., 2016) (Ruttner, 1988), therefore objective studies measuring multiple workers from diverse Apis mellifera subspecies are necessary to confirm this hypothesis.
Several characterized genes in a homozygous state for MEL shared functions associated with stress response (ATP5G2 (Watts et al., 2018), Crh-BP (Even et al., 2012), Hex70b (Aronstein et al., 2010)), and immunity (Ctl5 (Lin et al., 2020)). Two genes, Wrnexo and Tert, are involved in DNA structure and integrity, and therefore are thought to be associated with longevity (Hornstein, 2008;Robertson & Gordon, 2006;Rossi et al., 2010). However, at the current stage of research, it is not clear whether the homozygosity state of these genes has a positive or negative effect on the aforementioned functions. Therefore, fine-tuned gene expression studies are required to assess the direction of selection within the MEL subspecies.
We also identified genes related to the exterior phenotype used to distinguish the two subspecies. The gene Chmp1 identified in a CS_CH <10% ROH island is known to influence the veining pattern in Drosophila (Valentine et al., 2014), which might explain the morphological differences in vein patterns used to classify individuals into subspecies (Bouga et al., 2011;Ruttner, 1988). Several cuticular protein-coding genes (Cpr3 and Cpr4 in particular), present in the private ROH island of CAR <10% may be involved in the CAR-specific morphotype of broader hairy stripes (Figure 1a), as they affect the thickness and coloring of the exoskeleton (Costa et al., 2016;Soares et al., 2013). We found another gene related to the cuticle (Grp, glycine-rich cuticle protein) in a CS_CH <10%specific island, although its functions have not been formally de- In summary, we have described a number of novel aspects to investigate the genetic diversity of honey bees that are of potential interest. First, the application of queen genotypes derived from pooled honey bee workers to ascertain fine-scale population structures. Second, the identification of ROH segments to compute genomic inbreeding of honey bee queens. Finally, the identification of genes associated with geographic adaptation and human-mediated selection by means of ROH islands. Therefore, we believe that ROH derived from whole-genome sequencing data will be of invaluable benefit to investigate complex population structures in honey bees and other insects.

ACK N OWLED G M ENTS
We thank Lucie Genestout (

DATA AVA I L A B I L I T Y S TAT E M E N T
Swiss, Swedish, Norwegian, and American bee sequence data will be deposited at the European Nucleotide Archive (ENA: http://www. ebi.ac.uk/ena), while French bee sequence data remain the property of the Beestrong Consortium. However, data are available from the authors upon reasonable request.