Potential costs of learning have no detectable impact on reproductive success for bumble bees


Learning and memory are cognitive traits that allow animals to acquire, retain and recall information about their environments (Shettleworth, 2010) to make decisions that may increase fitness and survival prospects (e.g., Dukas & Bernays, 2000; Maille & Schradin, 2016; Shaw et al., 2019; Sonnenberg et al., 2019), but are also associated with physiological costs. These are generally grouped into costs that are constitutive or induced (Burns et al., 2010). Constitutive costs describe evolutionary costs associated with maintaining neural infrastructure and are paid by an individual irrespective of whether this infrastructure is put to use (Aiello & Wheeler, 1995; Niven & Laughlin, 2008). For example, in Drosophila melanogaster, individuals bred from high-learning lines that do not undergo any learning trials show decreased survival probability and larval competitive ability, compared with control lines, suggesting that evolutionary investment in learning ability comes at a cost (Burger et al., 2008; Mery & Kawecki, 2003). Conversely, the active processes of learning and memory formation also consume energy, resulting in proximate trade-offs with other traits that are also energetically expensive (induced costs; Mery & Kawecki, 2004). Here, we investigated the induced costs of memory formation in a social insect model, the bumble bee Bombus terrestris audax, at a stage in the life cycle when energy budget requirements are particularly high.
Increased energetic requirements during learning and memory formation may be driven by structural and molecular changes that occur in the brain. Formation of long-term memories requires de novo protein synthesis (Menzel, 2012; Tully et al., 1994), and previous work has also identified changes to neural structures that occur during learning and memory formation (Cabirol et al., 2018; Hourcade et al., 2010; Li et al., 2017). This reconfiguration process is likely to be costly (Niven, 2016), and in fruit flies (D. melanogaster) neurons in the mushroom bodies (part of the brain involved in learning and memory in insects) show increased energy consumption following long-term memory formation (Plaçais et al., 2017). Accordingly, classical conditioning is associated with a subsequent increase in sucrose consumption (Plaçais et al., 2017). These energetic costs may reduce available energy budgets for investment in other processes, since elicitation of long-term memory formation has been shown to reduce survival and egg laying in the same species (Mery & Kawecki, 2004, 2005). In honey bees, Apis mellifera, associative learning trials have been shown to be followed by lower levels of trehalose (a precursor to glucose) in haemolymph, and again, with reduced survival (Jaumann et al., 2013). Proximate trade-offs have also been identified between learning and immune system activation (Alghamdi et al., 2008; Mallon et al., 2003).
Potential costs are likely to have relatively greater impacts on individuals at vulnerable stages during the life cycle. Bumble bees are annual eusocial insects in which a colony is founded by a single queen in the spring (Goulson, 2010). When mated queens emerge from diapause, they are effectively solitary individuals until their colony is founded and must therefore perform all tasks that will later be taken on by workers, in addition to nest searching, nest building and reproduction (Riveros & Gronenberg, 2009). This includes foraging to feed the brood, which places demands on learning and memory (Klein et al., 2017) and has been linked to foraging success (Pull et al., 2022; Raine & Chittka, 2008). Accordingly, bumble bee queens have been shown not only to successfully complete associative learning tasks quickly, but also to perform them better than workers (Evans & Raine, 2014; Muth, 2021), suggesting they invest relatively heavily in learning and memory processes.
Here we asked whether the demands of learning to identify rewarding flower species come at a cost to colony founding success in bumble bee queens, using a laboratory protocol that allows isolation of learning from movement between flowers. We trained queens on a visual associative learning task over a 6-day period, in which they had to repeatedly learn to associate a colour (blue/yellow) with a sugar reward, over multiple reversals. In honey bees, there is evidence to suggest that learning even a single association leads to synaptic reorganization in the brain. For example, Hourcade et al. (2010) trained individuals to a single odour association and found an increase in synaptic densities in the mushroom body lip, compared with bees in the control groups, suggesting that neural changes can be elicited by learning a single stimulus association.
We compared colony initiation success, brood production and offspring size in queens exposed to the learning protocol with control groups of queens either (1) exposed to the same stimuli and rewards, in the absence of pairing (to preclude learning), or (2) not exposed to the task at all. Since the effects of any stressor may not become apparent unless energy is restricted, we exposed half of our queens in all groups to a low-quality diet, by providing them with a lower concentration of ad libitum sucrose solution throughout the experiment, following a fully crossed design. We tracked colony formation all the way from initial egg laying to emergence of the first brood. Thus, our study represents a unique direct test of the impact of repeated exposure to a learning task on reproductive success.

Queen Production and Diapause
Bumble bee queens (N = 210) were hatched and mated at Koppert Biological Systems, Slovakia and placed into diapause at 2–4 °C and 90–95% relative humidity in three experimental blocks, each staggered by 1 week. Queens were free from parasites for the duration of the experiment. At weeks 8–10 of diapause, queens were shipped (at 4 °C, without breaking their diapause) to Royal Holloway University of London, U.K., where they completed the remaining period of diapause (total diapause time = 12 weeks). Queens were then weighed and placed into individual Perspex nestboxes (67 × 127 mm and 50 mm high) and maintained in a temperature-controlled room at 27–29 °C and 55–60% relative humidity. Ten queens did not awake from diapause and thus did not contribute further to the experiment. Egg laying was stimulated through introduction of an 8:16 h light:dark cycle for the first 14 days postdiapause, and queens were then housed under continuous darkness/red light for the remainder of the experiment. Queens were assigned to diet and learning treatment groups (see below) based on their mass, such that there was an equal distribution of mass in each treatment group (linear model, postdiapause mass did not significantly improve the model fit, the Akaike information criterion difference (ΔAIC) between full model and null model = 9.35; mean ± SE mass = 814.61 ± 4.79 mg).

Diet Treatment
Queens were allocated to either a high-quality (40% w/w sucrose solution) or low-quality (20% w/w sucrose solution) diet. When carbohydrates are restricted in this way, consumption may be slightly increased, but bees are unable to compensate fully for the lower sugar concentration by consuming a proportionately higher volume of liquid (Brown & Brown, 2020; see also our Results). Fresh sucrose of the assigned concentration was provided ad libitum for the duration of the experiment. We also provided queens from both diet treatments with the same ad libitum pollen (polyfloral fresh-frozen honeybee-collected corbicular pollen, Agralan, U.K.; pollen patty 4:1 pollen:water). Fresh sucrose and pollen were replaced every 3 days, and we measured the consumption of sucrose and pollen when replacing food (Advanced Portable Balance Scout STX123 120 g, OHAUS Corporation ±1 mg). Nestboxes were also cleaned on these days.

Learning Trials
Following Muth et al. (2018) and Muth (2021), we adapted a visual associative learning task in which queens learnt to associate a colour with a sucrose reward in an unrestrained set-up (Fig. 1). Two hours before learning trials commenced, queens were removed from their boxes and placed into individual tubes (clear acrylic tubes, 24 mm internal diameter, 150 mm length), which were sealed at one end and contained a Perspex disc with openings for reward delivery at the other. Tubes were covered with black material such that only the first 3 cm and entrance disc were exposed to light. Queens were initially kept in the tubes for 2 h in the dark, to allow time to acclimatize. Trials were performed under natural daylight at approximately 24 °C. Each day we began by giving queens two motivation trials, during which they were presented with a clear pipette tip filled with 5 ml 50% w/w sucrose solution and allowed to drink this fully.
Queens were then presented with the first of three blocks of 10 learning trials over 2 days (five trials per day), in which we simultaneously presented a blue and a yellow painted pipette tip (Beautiful Blue Gloss Finish, Paint Factory; Sunshine Yellow Gloss Finish, Paint Factory) filled with either 5 ml 50% w/w sucrose solution (reward) or 5 ml distilled water (no reward). We adapted the original protocol to use this novel method of reward delivery so that all queens received equal volumes of sucrose across the experiment. Once a queen had made a choice (approach, antennation and extension of the proboscis to the colour, or prolonged biting of the coloured tip) the liquid was dispensed from the pipette tip so that she could drink it, and the pipette tip was removed from the set-up. To ensure exposure to both the positive and negative associations, in each trial, the queen was additionally allowed to drink from the colour she had not initially selected, prior to it being removed. Trials during which a queen inspected both colours but did not make a choice for over 5 min were marked as unsuccessful. The intertrial interval was 12 min, which is known to induce long-term memory formation (Menzel, 2012; Menzel et al., 2001). On the final trial of each block (i.e. trial 10) we tested queens with both colours unrewarded (both filled with distilled water). Starting colours were randomized between groups. For each queen, we then repeated the entire process twice, reversing the colour association in each case (but see small adjustment for block 1 in Fig. A1 and Appendix). Thus, each queen participated in learning trials for 6 consecutive days, with an unrewarded test trial at the end of every 2-day learning bout followed by reversal of the rewarding colour.
We included two control groups: (1) an exposure control and (2) a full control. Queens in the exposure control group underwent the same protocol as queens in the learning treatment, but the presented coloured pipette tips were empty (we allowed queens to interact with the pipette tips in each trial). Queens were then fed the total volume of 50% sucrose solution that they would have received across the learning trials at the end of training each day. Queens in the full control group were not removed from their nestboxes and received an equivalent volume of 50% sucrose solution delivered directly into their boxes for each day of training. The volume of sucrose was adjusted based on the number of rewarding trials experienced by queens in the learning treatment that day (accounting for the unrewarded test trial every 10th trial).
We noticed that some queens became inactive in between trials, and they appeared to be resting. To allow queens to participate in trials voluntarily, we left inactive queens undisturbed and recommenced from the last completed trial when the queen became active again. We measured the additional time spent by queens in the tube for each day of learning (see Appendix).

Colony Monitoring
Following testing, we carried out daily inspections and noted the presence of eggs, callows and their sex, and any larvae or pupae that had been discarded, to monitor colony founding. Callows were removed from the colony within 24 h of hatching, and their thorax width measured (mm; Axminster Digital Electronic Callipers, ±0.01 mm). After 55 days, queens were euthanized, and their thorax width was also measured. We dissected the remaining brood and counted the eggs, larvae and pupae remaining.

Ethical Note
No licences were required for these experiments. However, we ensured high welfare of our bumble bees through regular feeding and cleaning of housing conditions as described above, and use of red light to minimize disturbance. All queens were euthanized using liquid nitrogen. Our experiments meet the approval of our Institutional ethics board.

Data Analysis
We used (generalized) linear models and mixed-effects models ((G)LM/(G)LMM) for data analysis. For each model set, we created a full model, the null model and all subsets of fixed factors, while retaining any random factors. We selected the best model from this set based on the AIC (where two competing best models were within ΔAIC 2.00 of each other, we selected the simplest; Burnham & Anderson, 2002). We then estimated each parameter and its 95% confidence interval (CI) from the final model.
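The selection rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis code used in the study: the function names `all_subsets` and `select_model`, the term labels and the AIC values are all hypothetical, and the sketch assumes that "simplest" means the candidate with the fewest fixed-effect terms.

```python
from itertools import combinations


def all_subsets(terms):
    """Return every subset of fixed-effect terms, from the null model to the full model."""
    return [subset for k in range(len(terms) + 1) for subset in combinations(terms, k)]


def select_model(aic_by_model, threshold=2.0):
    """Pick the lowest-AIC model, preferring the simplest one within `threshold` AIC units of it."""
    best_aic = min(aic_by_model.values())
    candidates = [m for m, aic in aic_by_model.items() if aic - best_aic <= threshold]
    # Assumption: "simplest" = fewest fixed-effect terms; ties are broken by lower AIC.
    return min(candidates, key=lambda m: (len(m), aic_by_model[m]))


# Hypothetical AIC values for a candidate set; these are NOT values from the study.
aic_by_model = {
    (): 260.1,                                       # null model
    ("diet",): 241.3,
    ("learning",): 262.0,
    ("diet", "learning"): 240.2,
    ("diet", "learning", "diet:learning"): 243.9,    # full model
}

chosen = select_model(aic_by_model)
print(chosen)
```

With these made-up values, the two-term model has the lowest AIC, but the diet-only model lies within 2 AIC units of it and has fewer terms, so the diet-only model is the one retained.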
To confirm that queens in the learning treatment learnt the association, we modelled the proportion of correct choices (response variable) against trial number, including diet treatment and rewarding colour as additional fixed effects, and individual as a random effect (binomial error structure; link function = 'logit'). Trial numbers reflected the trial (1–10) for that colour prior to a reversal. To analyse unrewarded test trials (the last trial for each colour in that set of 10 trials), we followed a similar approach but did not include trial number. To check that additional time spent in tubes by queens was not an effect of diet treatment, and did not affect learning, we tested for differences between means of groups (t test) and correlated additional time and learning success, respectively. For all models of colony founding measures (oviposition probability, oviposition timing, callow production, callow thorax width, total brood production and discarded brood), we used diet treatment, learning treatment and their interaction as fixed predictors. In addition, we included queen thorax width as a covariate in models of oviposition probability, and callow type (worker or male) as a covariate in models of callow thorax width. In models of callow thorax width, we accounted for multiple callows per colony by including colony as a random factor. Total brood production was calculated by adding together the total numbers of callows, larvae and pupae in the brood at the end of the experiment, and any larvae, pupae or callows found dead during the experiment. Discarded brood was the measure of any larvae, pupae or callows found dead during or at the end of the experiment. The response variable error structures were binomial (link function = 'logit') for oviposition probability (yes/no), Poisson (link function = 'log') for total brood production, negative binomial (link function = 'logit') for discarded brood, and normal for callow thorax width. To account for some queens producing no callows, callow production was modelled using a zero-inflated model with a binomial (probability of callow production) and negative binomial (callow count) error structure. To determine whether oviposition timing differed between treatments, we used a Cox proportional hazards model with diet treatment, learning treatment and their interaction as predictors, and day of oviposition as the response. Additionally, we used only queens in the learning treatment to ask whether oviposition probability, and separately the total number of callows produced, were predicted by individual learning performance (i.e. the total number of correct choices made across all learning trials for that individual). We used a binomial and negative binomial error structure for these models, respectively.
Finally, we modelled whether the consumption of sucrose and pollen differed between treatments. We used diet treatment, learning treatment and their interaction as covariates, and queen as a random effect. For measures of sucrose consumption, we adjusted values for evaporation, as low diet sucrose had a 1.25× higher evaporation rate. Pollen measures were not adjusted for evaporation, as the pollen provided was the same across both diet treatments, and measuring pollen evaporation is inaccurate as queens will often add sucrose to the pollen. For pollen consumption, we performed a square root transformation and included day after oviposition for each measure of pollen consumption as a covariate.

Learning Trials
A total of 68 queens performed 2040 individual learning trials across a 6-day period. The probability of making a correct choice increased significantly across each set of 10 trials (GLMM: trial parameter estimate: 0.21; 95% CI: 0.18 to 0.25; Fig. 2a, Table A1), and was also significantly higher when the rewarding colour was yellow compared with blue (colour parameter estimate: 0.99; 95% CI: 0.78 to 1.21; Fig. A2, Table A1). There was no effect of diet quality on learning (diet parameter was not included in the final model; Fig. 2b). When looking only at the unrewarded test trials (the last trial for each colour in that set of 10 trials), the probability of making a correct choice was 0.84 ± 0.14 (mean ± variance). There was no effect of diet or colour type on correct choices made during test trials (GLMM; neither diet nor colour parameters were included in the final model; Table A2). Thus, overall, the conditioning protocol was effective in eliciting learning, as intended.

Colony Founding
There was no effect of learning treatment on any colony founding measures (treatment was not included in the final model). However, diet quality significantly decreased the probability of egg laying, with 49 versus 83% of queens laying eggs on a low-quality versus high-quality diet, respectively, over the 55-day experimental period (GLM: diet parameter estimate = −1.63; 95% CI: −2.30 to −0.99; Fig. 3a, Table A3). Queens on the low-quality diet also appeared to lay eggs later than queens on a high-quality diet (Cox proportional hazards model: diet coefficient: −1.04; 95% CI: −1.40 to −0.68; median (range) day low-quality diet: 55 (10–55); high-quality diet: 24.5 (7–55); Fig. 3b, Table A4). There was no interaction between learning treatment and diet on oviposition probability or timing (the interaction term was not included in the final model).
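As an aid to reading these coefficients, the short Python sketch below back-transforms the reported estimates onto the odds-ratio and hazard-ratio scales and compares the logit coefficient with the raw percentages given above. It is a worked illustration rather than part of the original analysis: the variable names are hypothetical, and because the fitted model may also have adjusted for covariates (queen thorax width was included in oviposition probability models), the raw and model-based values need not agree exactly.

```python
import math


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


# Raw proportions of queens laying eggs, as reported in the text.
p_low, p_high = 0.49, 0.83

# Odds ratio implied by the raw proportions (low-quality relative to high-quality diet).
raw_odds_ratio = odds(p_low) / odds(p_high)
raw_log_odds_ratio = math.log(raw_odds_ratio)

# Back-transformed model estimates (logit GLM and Cox model coefficients from the text).
model_odds_ratio = math.exp(-1.63)
hazard_ratio = math.exp(-1.04)

print(f"raw odds ratio: {raw_odds_ratio:.3f} (log scale: {raw_log_odds_ratio:.2f})")
print(f"model odds ratio: {model_odds_ratio:.3f}")
print(f"hazard ratio for oviposition: {hazard_ratio:.3f}")
```

On this reading, the odds of egg laying on the low-quality diet were roughly one-fifth of those on the high-quality diet, and the daily hazard of first oviposition was roughly one-third.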
When looking only at queens in the learning treatment, learning performance (i.e. the proportion of correct choices made across all learning trials) did not predict probability of egg laying (GLM: trial choice did not make the final model; Fig. A3a, Table A7) or the number of offspring produced (GLM: callow number did not make the final model; Fig. A3b, Table A8).
Total brood production (all callows, plus larvae and pupae in the brood at the end of the experiment, and any dead or discarded larvae, pupae and callows) was significantly lower for queens on a low-quality diet (GLM: diet parameter estimate: −0.80; 95% CI: −1.24 to −0.34; Fig. A4, Table A9). Discarded brood (dead larvae, pupae and callows discarded during the experiment or found in the brood at the end of the experiment) were found in 95 of 200 colonies. Queens on a low-quality diet had significantly fewer discarded brood (GLM: diet parameter estimate: −0.68; 95% CI: −1.19 to −0.16; Table A10).
Queen mortality was low, with only eight queens dying over the 55-day experimental period (five in the low-quality diet treatment and three in the high-quality diet treatment).

Consumption
Queens on the low-quality diet consumed 1.25× more sucrose than queens on the high-quality diet (mean ± SE consumption over a 3-day period: low diet = 3452 ± 60 mg; high diet = 2771 ± 35 mg; diet parameter estimate: 646.80; 95% CI: 213.71 to 1079.97; Fig. A5a, Table A11). Pollen consumption was significantly lower for queens on the low-quality diet (diet parameter estimate: −2.72; 95% CI: −3.58 to −1.85; Fig. A5b) and increased significantly with days after oviposition across all treatments (day postoviposition parameter estimate: 0.34; 95% CI: 0.32 to 0.36; Table A12). There was no difference in sucrose or pollen consumption between learning treatments (learning treatment was not retained in the final model).
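The arithmetic behind the incomplete compensation can be made explicit with a brief Python sketch. It is illustrative only and rests on two assumptions not stated in the text: that the reported means are masses of sucrose solution (already adjusted for evaporation), and that the nominal w/w concentrations (20% and 40%) can be applied directly to those masses. All variable names are hypothetical.

```python
# Mean solution consumption over a 3-day period, in mg, as reported in the text.
solution_low_mg = 3452.0    # low-quality diet, 20% w/w sucrose
solution_high_mg = 2771.0   # high-quality diet, 40% w/w sucrose

# Assumption: sugar mass = w/w concentration x mass of solution consumed.
sugar_low_mg = 0.20 * solution_low_mg
sugar_high_mg = 0.40 * solution_high_mg

consumption_ratio = solution_low_mg / solution_high_mg
sugar_ratio = sugar_low_mg / sugar_high_mg

# Solution mass a low-diet queen would need to match the sugar intake of a high-diet queen.
full_compensation_mg = sugar_high_mg / 0.20

print(f"solution consumed, low vs high diet: {consumption_ratio:.2f}x")
print(f"sugar ingested, low vs high diet: {sugar_ratio:.2f}x")
print(f"solution needed for full compensation: {full_compensation_mg:.0f} mg")
```

Under these assumptions, queens on the low-quality diet drank about a quarter more solution but still ingested only around 60% of the sugar taken in by queens on the high-quality diet; matching it would have required doubling their intake.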

DISCUSSION
Aspects of learning and memory are predicted to bring about induced energetic costs (Burns et al., 2010), yet we have limited evidence of the potential proximate impacts on other energy-demanding processes that result when an individual invests in the learning process. We predicted a trade-off between investment in learning and reproductive output based on previous studies in which individuals that learn show lower survival, egg laying and immune function compared with nonlearning individuals (e.g. Mery & Kawecki, 2004; Jaumann et al., 2013; Mallon et al., 2003). Our bumble bee queens successfully learnt and reversed the association between a colour and a sucrose reward, with, on average, a >80% chance of making a correct choice during unrewarded test trials. However, we found no evidence for an impact of learning on reproductive success.
Energetic investment in learning is likely to be dependent on the difficulty of the learning task. We designed our learning task to maximize complexity by (1) using an intertrial interval of 12 min to induce long-term memory formation, which is likely to carry a relatively higher cost compared with other memory phases as it requires de novo protein synthesis (Menzel, 2012); (2) adding a reversal learning element, during which interference from previously rewarding memories may increase the cost in coding/overwriting with new ones (Tello-Ramos et al., 2019); and (3) performing trials over 6 consecutive days, mimicking the potential foraging patterns of new queens after emergence from hibernation (Goulson, 2010). Given that foraging individuals often focus on a single flower species until it becomes unrewarding (known as floral constancy; Chittka et al., 1999), we expected our task to be ecologically relevant. Nevertheless, foraging bees likely employ multiple modes of learning, including using visual and olfactory cues to identify flowers (Menzel, 1993); thus, performing only one learning assay may be an oversimplification of the investment that occurs when foraging in the wild. Furthermore, individuals may use both short- and long-term memory to make within- and between-patch decisions when foraging (Pull et al., 2022). There is evidence to suggest potential trade-offs between memory phases (Lagasse et al., 2012); thus, underlying costs may not be revealed when measuring a single memory type under laboratory conditions.
Individuals may compensate for the expression of costly traits by increasing their energetic intake (Plaçais et al., 2017), and memory formation may be restricted under energy-limited scenarios (Plaçais & Preat, 2013). To limit potential compensation for energy invested in memory formation in our queens, and because costs may not be revealed unless other stressors are present, we added a nutritional limitation by feeding half of our queens on a low-quality diet with a 50% lower carbohydrate concentration. Queens in this group were exposed to the low-quality diet from emergence and for 48 h prior to starting learning trials, and then for the duration of the experiment. We found queens fed a low-quality diet consumed 1.25× more artificial nectar compared with queens fed a high-quality diet; however, this did not vary between learning treatment groups. We therefore found no evidence for compensatory energy intake in learners on a low-quality diet, and feeding on a low-quality diet did not appear to affect learning performance.
One potential explanation for the lack of a detected cost is that our study precluded stressors other than nutrition, such that queens did not incur the costs of flight, thermoregulation or infection that they would in the wild (…, 2000; Silvola, 1984). Uniquely, our queens were laboratory bred, and therefore aseasonal, free from parasites, with no previous exposure to the external environment and a standardized diapause time. While this allowed us to standardize for potential confounds (e.g. parasites may negatively affect learning; Gegear et al., 2006; previous experiences affect learning; Cheng & Wignall, 2006), in the wild stressors may act synergistically (Goulson et al., 2015; Siviter et al., 2021), meaning a potential trade-off may not be large enough to detect in such a controlled set-up. Further, the ecological costs incurred during the learning process may be greater than potential physiological costs, and these are difficult to recreate in laboratory studies (Liefting, 2022). These include making unfavourable decisions during the learning process (Dunlap & Stephens, 2016; Laverty & Plowright, 1988), memory interference of previous memories leading to mistakes in new learning tasks (Cheng & Wignall, 2006), and certain memory types being maladaptive in different environments (Pull et al., 2022). The consequences of such mistakes are likely to be more severe in the wild than in the laboratory (e.g. there is no cost of predation or extreme weather in the laboratory), meaning a potential cost is not detected in such controlled studies. For example, bumble bee workers that showed relatively better learning abilities in a laboratory assay had shorter foraging careers when released into the wild, suggesting a potential cost to learning proficiency was revealed when foraging in a natural setting (Evans et al., 2017). However, within the confines of our protocol, we can suggest that the potential costs of a simple associative learning task do not appear to negatively affect reproductive success in laboratory-bred bumble bee queens.
While we did not detect a cost of learning that impacted life history traits, we found that diet treatment was a strong predictor of colony founding success. Queens fed a relatively low-quality diet, manipulated by providing artificial nectar with a 50% lower sucrose concentration compared to the high-quality diet, had a >50% lower probability of egg laying, delayed egg laying, and were ca. 50% less likely to rear offspring. Nectar is the main source of carbohydrates for bees (Brodschneider & Crailsheim, 2010), and we predict that the low-quality diet restricted energy available for ovary maturation and brood incubation (Cartar & Dill, 1991; Vogt et al., 1998), resulting in reduced and/or delayed egg laying and fewer offspring. However, nectar quality did not appear to affect offspring size, suggesting a potential number versus size trade-off in offspring production, with all queens investing similar energy in each of their offspring, but energy-deprived queens being limited in the number of offspring they can invest in. Fewer and later production of workers is likely to negatively impact colony growth and the future production of sexuals (Pomeroy & Plowright, 1982).
Another factor that affects brood production and offspring size is pollen quality (Brodschneider & Crailsheim, 2010), which we did not vary between diet treatments in our experiment. We chose not to manipulate pollen diets, as the brain uses glucose as its primary energy source (Sokoloff, 1999) and nectar quality is thus likely to be more relevant for learning and memory traits. Further, the ratio of protein to carbohydrate could be important to consider. This ratio affects survival, growth and ovary activation in honey bees (Helm et al., 2017; Pirk et al., 2010) and may have impacted the number of offspring produced by queens on our low-quality diet. Our results add to a growing body of evidence suggesting that both nectar availability and pollen quality and/or diversity are important for queen-right colony growth and development (Leza et al., 2018; Rotheray et al., 2017; Watrobska et al., 2021; Watrous et al., 2019; Woodard et al., 2019). Given ongoing bumble bee population declines that have been linked in part to land-use change and floral resource availability (Goulson et al., 2015; Woodard & Jha, 2017), further work is needed to identify the nutritional needs of bumble bee queens at the colony-founding life cycle stage.
In conclusion, we found that any potential energetic costs of visual associative learning do not appear to impact reproductive success in bumble bee queens when they emerge from diapause in a controlled laboratory set-up, suggesting that detecting a potential cost is dependent on environmental variables and the interaction with other stressors. Nectar quality did not affect learning performance but was an important predictor of colony success across all treatment groups, with queens fed an artificial nectar of higher sucrose concentration showing relatively higher reproductive success compared with queens fed a diet with a lower sucrose concentration. Our results draw into question the widespread assumption that investing in the active learning process is a trade-off against other energetically costly traits.

Methods: Learning Trials
During the first block of the experiment, we began the first learning trials by switching the rewarding colour after five, instead of 10, trials. However, we were not convinced that five trials allowed queens enough time to sufficiently learn the association (Fig. A1). We therefore decided to extend each colour learning time to 10 trials, as per the protocol outlined in the methods of the main text. This allowed queens to consolidate memories from the first day of trials during the second day, before having to reverse the association. The switch in protocol occurred on day 3 of learning trials for the first block (meaning only the first 2 days of trials for the first block of the experiment were affected). For the main analysis of learning (probability of making a correct choice), we excluded trials 6–10 for queens from block 1 (N = 110 of 2040 trials removed). However, including these data did not change the overall result, and continued to show that the probability of making a correct choice increased significantly with trial number (GLMM: trial parameter estimate: 0.20; 95% CI: 0.16 to 0.23) and was significantly higher when the rewarding colour was yellow (colour parameter estimate: 1.03; 95% CI: 0.82 to 1.24).

Results: Additional Time Spent in Tubes
Sixty-four of 68 queens appeared inactive during one or more learning trials, and therefore spent additional time in their learning tubes. The queens appeared to be resting, so we left them undisturbed until they became active again, and then picked up from the last completed trial. Forty-one trials (from eight different queens) were marked as unsuccessful as queens did not become active again despite waiting until the end of the day. The total additional time spent in a tube by a queen across the 6 days did not differ between diet treatments (mean additional time per queen each day ± SE = 56 ± 5 min; t test: t(62) = 0.002, P = 1.00) and additional time in the tube was also not correlated with learning score (Pearson correlation: r(377) = 0.06, P = 0.23).

Proportion mean ± variance of correct choices made during associative learning trials by queens in block 1 of the experiment (we had three experimental blocks, each staggered by 1 week). Queens completed five learning trials per day, on 6 consecutive days. Solid grey bars show the final test trial (unrewarded) before a colour reversal. Initially, the rewarding colour was reversed after five trials (days 1–2), but we decided to extend the number of trials for each colour, so that reversals occurred every 10 trials. Queens in experimental blocks 2 and 3 (not shown) underwent colour reversals every 10 trials, as per the protocol in the main text.

Total brood production (mean ± SE), which is the total number of callows that hatched over the 55-day experiment, plus any larvae and pupae dissected from colonies at the end of the experiment (including any dead larvae or pupae found during or at the end of the experiment but excluding eggs) across learning and diet treatments: high-quality diet (solid lines); low-quality diet (dashed lines); learning group (pink); exposure control (yellow); full control (green).

Figure 1. Bumble bee queens were trained on an associative learning protocol adapted from Muth et al. (2018) and Muth (2021). (a) Timeline of learning trials for queens in the learning group. Queens were presented with a yellow and a blue pipette tip, one filled with 5 ml 50% w/w sucrose (rewarded) and the other filled with 5 ml water (unrewarded), for five trials per day over a 6-day period. The rewarding colour was reversed every 10 trials. On trial 10 (the final trial prior to colour reversal, indicated by grey shading), both pipette tips were presented unrewarded. (b) Queens in the learning and exposure control groups were both presented with two motivation trials at the start of each training day (clear pipette tip baited with 5 ml sucrose, indicated by a grey droplet). Queens in the learning group were then simultaneously presented with one rewarded (grey droplet) and one unrewarded (5 ml water, white droplet) pipette tip, whereas queens in the exposure control group were presented with both pipette tips unfilled and allowed to interact with them but were not rewarded. Queens in the full control group did not leave their nestboxes. Queens in both control groups received the equivalent volume of sucrose as queens in the learning group at the end of trials for that day, and the volume was adjusted each day based on how many rewarding trials queens in the learning treatment had experienced. (c) Example of queens in the set-up. Queens in the learning and exposure control groups were transferred to cylindrical tubes for learning trials. Tubes had a Perspex disc with an opening for reward delivery at one end and were covered with black material to reduce stress. Coloured pipette tips were presented through openings in the Perspex discs.

Figure 2. (a) Learning curve showing the proportion mean ± variance (calculated as variance = p(1 − p), where p is the probability of making a correct choice across all bees for that trial) of correct choices for each trial across the 6-day learning period (five trials per day, total trials = 30, N queens per trial = 63–68). Grey bars show the final trial of each colour, which was unrewarded. The rewarding colour (blue/yellow) was reversed every 10 trials. (b) Proportion of correct choices made by queens fed a high-quality and low-quality diet (40% and 20% w/w sucrose solution, respectively) across all learning trials. The horizontal line shows the median, the box shows the interquartile range, the whiskers show the values within 1.5× the interquartile range and the points show the outliers.
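
The variance defined in the Figure 2 caption, p(1 − p), can be computed per trial as in the following minimal sketch. The function name `proportion_and_variance` and the example choices are hypothetical and are not study data; only the formula comes from the caption.

```python
def proportion_and_variance(choices):
    """Return (p, p * (1 - p)) for one trial.

    choices: list of 0/1 outcomes across queens, where 1 = correct choice.
    """
    if not choices:
        raise ValueError("need at least one choice")
    p = sum(choices) / len(choices)  # proportion of correct choices
    return p, p * (1 - p)            # variance as defined in the caption


# Hypothetical outcomes for a single trial (illustrative only, not study data)
example_trial = [1, 1, 0, 1]
p, variance = proportion_and_variance(example_trial)
print(f"p = {p}, variance = {variance}")
```

With this definition the variance is largest (0.25) when queens choose at chance (p = 0.5) and shrinks towards zero as choices become uniformly correct or incorrect.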

Figure 3. Colony founding measures taken over the 55-day experimental period in bumble bee queens. Treatments: high-quality diet (solid lines); low-quality diet (dashed lines); learning group (pink); exposure control (yellow); full control (green). (a) Mean ± variance probability of oviposition (N = 200 queens across all treatments); (b) timing of oviposition; (c) mean ± SE number of offspring (workers and males combined) produced by queens in each treatment (points show the raw data); (d) thorax widths of offspring (workers and males combined) that hatched during the experiment. The horizontal line shows the median, the box shows the interquartile range, the whiskers show the values within 1.5× the interquartile range, the dark points show the outliers and the light points show the raw data.
Figure A1. Proportion mean ± variance of correct choices made during associative learning trials by queens in block 1 of the experiment (we had three experimental blocks, each staggered by 1 week). Queens completed five learning trials per day, on 6 consecutive days. Solid grey bars show the final test trial (unrewarded) before a colour reversal. Initially, the rewarding colour was reversed after five trials (days 1–2), but we decided to extend the number of trials for each colour, so that reversals occurred every 10 trials. Queens in experimental blocks 2 and 3 (not shown) underwent colour reversals every 10 trials, as per the protocol in the main text.

Figure A2. Proportion mean ± variance of correct choices made by queens presented with a yellow (yellow line) or blue (blue line) rewarding stimulus, across 30 trials. Every 10th trial was an unrewarded test trial (grey dashed line), after which the rewarding colour was reversed. Error bars show the variance around the proportion.