Motivation, accuracy and positive feedback through experience explain innovative problem solving and its repeatability


Acquiring resources in changing environments is a major challenge faced by animals and a key determinant of fitness. Innovation, the generation of a novel behaviour or use of a known behaviour in a novel context, most commonly achieved through a problem-solving process, is one mechanism that a wide range of animals use to meet this challenge (Seed & Mayer, 2017). Comparative analysis has provided evidence for selection acting on innovativeness across species, because it helps animals find new food sources, or adapt to new environments and seasonal changes (Daniels, Fanelli, Gilbert, & Benson-Amram, 2019; Lefebvre, Reader, & Sol, 2004; Reader, 2003; Reader & Laland, 2002; Sol, Lefebvre, & Rodríguez-Teijeiro, 2005; Webster & Lefebvre, 2001). Furthermore, there is growing evidence of a link between innovativeness and fitness within populations (Cauchard, Boogert, Lefebvre, Dubois, & Doligez, 2013; Cole, Morand-Ferron, Hinks, & Quinn, 2012; Preiszner et al., 2017), and that innovation enables invasive or urbanized species to make use of novel resources (Daniels et al., 2019; Griffin & Diquelou, 2015; Griffin, Diquelou, & Perea, 2014). Although the underlying proximate causes of individual variation in innovativeness are diverse (for example, infection by parasites, Dunn, Cole, & Quinn, 2011; social factors, Thornton & Samson, 2012; natal environment effects, Kotrschal & Taborsky, 2010), repeatability analyses suggest differences between individuals are consistent, pointing to intrinsic, potentially additive genetic, sources of variation (Cauchoix et al., 2018; Cole, Cram, & Quinn, 2011; Morand-Ferron, Cole, Rawles, & Quinn, 2011). One of the major challenges in the field is that innovativeness is a composite trait driven by a range of disparate behavioural processes that selection may act on independently and that may explain consistent differences in performance between individuals. These processes include cognition and motivation, as well as personality traits like exploration, persistence and neophobia (Griffin & Guez, 2014; Lermite, Peneaux, & Griffin, 2017; Seed & Call, 2010; Taylor, Hunt, Medina, & Gray, 2009). Thus, a key objective is to determine which processes drive innovativeness and explain the consistent individual differences observed.
Personality, defined as within-individual behavioural consistency across time and contexts (Réale, Reader, Sol, McDougall, & Dingemanse, 2007), provides a framework for exploring constraints on behavioural plasticity (Dall, Houston, & McNamara, 2004) and individual problem-solving performance (Hopper et al., 2014; Morton, Lee, & Buchanan-Smith, 2013). Personality traits have attracted particular attention because they predict individual variation in a wide range of behavioural traits (Aplin, Farine, Mann, & Sheldon, 2014; Cole & Quinn, 2012, 2014). Studies in the wild (Dingemanse, Both, Drent, van Oers, & van Noordwijk, 2002; Highcock & Carter, 2014) and in the laboratory (David, Auclair, & Cézilly, 2012; van Oers & Naguib, 2013) show that the personality trait 'early-life exploratory behaviour' (more specifically in this case, repeatable differences in the reaction to both a novel environment and objects; Drent, van Oers, & van Noordwijk, 2003) can influence how individuals retrieve information from their environment (Smit & van Oers, 2019), how quickly they solve problems (Hopper et al., 2014), and the degree of behavioural flexibility shown (Coppens, de Boer Sietse, & Koolhaas Jaap, 2010). In particular, fast-exploring (hereafter 'fast') individuals may be quicker to interact with or solve tasks (Benson-Amram & Holekamp, 2012; Trompf & Brown, 2014) but show less behavioural plasticity (Amy, van Oers, & Naguib, 2012; Jolles, Briggs, Araya-Ajoy, & Boogert, 2019; Logan, 2016c et al., 2012; Zandberg, Quinn, Naguib, & van Oers, 2017). Additionally, neophobia (the fear of novel food, objects or places; Greenberg & Mettke-Hofmann, 2001) can constrain both the latency to approach a novel object and engagement in tasks. For example, individual hyenas, Crocuta crocuta, that showed greater persistence, activity or lower neophobia were faster to solve a problem (Johnson-Ulrich et al., 2018). However, the evolutionary significance of links between innovation and personality traits, as defined in Réale et al. (2007), is often unclear because the genetic basis for the personality variation is usually unknown (Cole et al., 2011), except in those few cases where personality-selective breeding lines have been used (Drent et al., 2003; van Oers, Drent, de Goede, & van Noordwijk, 2004; van Oers, de Jong, van Noordwijk, Kempenaers, & Drent, 2005). Moreover, the role of other personality traits at different stages of innovative problem solving (e.g. interacting with a problem, solving a problem and ceasing to perform outdated solutions) and its interactions with other factors such as stress and motivation remain largely unexplored. Individual differences may be especially pronounced under stress (Suomi, 2004), but this has scarcely been tested. Note that although all behavioural variation can be defined as personality in a statistical sense (e.g. Dingemanse & Dochtermann, 2013), here we follow Réale et al. (2007), who focused on five kinds of behavioural traits, including exploration behaviour, that inherently capture variation in many other behavioural traits.
Motivation is expected to be an important driver of innovative behaviour (Laland & Reader, 1999; Sol et al., 2012) and to affect all stages of innovation. The 'necessity drives innovation' hypothesis states that innovative behaviours commonly occur when individuals are in need (Reader & Laland, 2003), that is, when they are motivated (Laland & Reader, 1999). For example, subdominant or juvenile individuals are often assumed to be more likely to innovate because they are less competitive when foraging (Morand-Ferron et al., 2011; Thornton & Samson, 2012). The rarely tested assumption in these studies is that hunger acts as the motivating factor driving innovation. In animal behaviour studies, food deprivation is commonly applied to ensure trial participation (Birch, 1945; Overington et al., 2011; Sol et al., 2012), or when attempting to control for confounding effects of motivation (Ebel & Call, 2018; van Horik & Madden, 2016). However, the extent to which motivation may influence innovative problem-solving behaviour at an individual level has scarcely been examined explicitly (Griffin & Guez, 2014).
Here we explore behavioural processes that are predicted to cause variation during sequential innovative problem solving, using second- and third-generation birds selected for personality. Selection lines are a powerful means to investigate inherent effects of personality on problem-solving performance as opposed to simple phenotype–phenotype correlations. We used a device that incorporated three different extractive foraging access points to provide a more complete measure of individual performance. The solutions relied on different motor skills, thus limiting the effects of individual motor skill bias, and previous motor skill experience carrying over to solving new access points. We examined variation in three different behavioural assays involved in innovative problem solving: (1) latency to touch the novel apparatus; (2) accuracy when interacting with any access point on the device; (3) problem-solving success within each trial. Then we examined (4) the individual's overall innovativeness (the number of different access points solved at least once across all trials). We considered a range of potential explanatory factors for these different behavioural facets, including extrinsic motivation (hunger state, the only experimentally manipulated factor), inhibitory control, previous experience and personality (fast/slow selection lines). In line with the theory that individual differences may be more pronounced under stress (Suomi, 2004), we investigated the interaction between motivation and personality, assuming that birds in the high-motivation (food-deprived) treatment were more stressed than those in the low-motivation treatment. Finally, to determine whether individual differences were consistent, we estimated repeatability for (1)–(3) and examined whether controlling for fixed effects modified our estimates of repeatability. Repeatability sets the upper limit of heritability and is fundamental in studies on the evolutionary ecology of innovation and behaviour generally. Although uncontrolled confounding effects can potentially lead to an underestimate of repeatability, more commonly they lead to overestimates (pseudorepeatability) and sometimes explain repeatability entirely (Catry, Ruxton, Ratcliffe, Hamer, & Furness, 1999; Dingemanse & Dochtermann, 2013; Westneat, Hatch, Wetzel, & Ensminger, 2011).
We predicted that: (1) birds in the high-motivation treatment group would have a reduced latency to touch the device, show increased accuracy (i.e. a high proportion of interactions with functional components, rather than nonfunctional components of the device), be more likely to solve an access point and solve more of them; (2) fast explorers would have a shorter latency to touch the device and lower accuracy when interacting with the device than slow explorers, but they may have a higher likelihood of solving due to higher exploration of the device; (3) previous experience would enable innovation, by causing a decrease in latency to touch the device, an increase in accuracy when interacting with it, and an increased likelihood of solving; (4) likelihood of solving in a trial would increase with accuracy (i.e. with higher frequency of interactions with functional components); and (5) birds with higher inhibition ability would be more likely to adjust their behaviour to solve multiple access points.

METHODS
All experiments were carried out at the Netherlands Institute of Ecology (NIOO-KNAW), on 36 captive-bred great tits, Parus major. All birds included in the study were adult (2 years or older). Seventeen birds were not related to each other, five had one sibling and 14 shared more than one sibling; we assume relatedness between individuals had no bearing on the results. Birds were housed individually in standard cages (0.9 × 0.5 m and 0.5 m high) containing three perches and a water bath. Birds were in auditory contact but were visually isolated to prevent social learning. All birds had ad libitum access to water and a maintenance diet (ground beef heart, commercial egg food, fruit and calcium) unless otherwise stated. One bird did not participate in any of the experiments and was thus excluded from any analysis.

Personality
Birds came from the second and third generation of bidirectionally phenotypically selected great tits, based on personality for 'fast exploration' (fast, N = 18) and 'slow exploration' (slow, N = 18). The measure of 'exploration' used during the selection process was a combination of two novel object tests where the latency to touch a novel object was recorded (e.g. a pink panther toy or an AA battery taped to a wooden stick), and one novel environment test where birds were released into a room and the latency to land on the fourth out of five artificial trees was recorded (for further details on selection and personality lines see Drent et al., 2003). The birds in the final selection lines used here underwent these same assays after fledging to confirm their personality type. As the specific aim of this study was to investigate the effects of artificially selected personality lines on problem solving and because the birds' behaviour matched their selected personality type, we analysed personality according to their selection history only (i.e. fast or slow selection lines).

Motivation
Individuals were randomly assigned to one of two motivation treatment groups for the duration of the experiment based on hunger state. The low-motivation group consisted of sated individuals, given full access to maintenance diet up to the start of the trial. Additionally, to ensure they were sated, they were given three wax moth, Achroia grisella, larvae 30 min before trials began, and invariably ate them all. The high-motivation group consisted of food-deprived birds, which had all sources of food removed from their cage for 1 h before the trial (Hämäläinen, Rowland, Mappes, & Thorogood, 2019). For welfare purposes, all birds had access to water during the trials. Motivation treatment was spread across the selection lines in four categories: high motivation, fast (female N = 4 and male N = 5); high motivation, slow (female N = 5 and male N = 4); low motivation, fast (female N = 5 and male N = 4); low motivation, slow (female N = 5 and male N = 3).

Lever-pulling Propensity
All trials described here and in the following section were carried out in the birds' individual home cages, under natural winter diurnal light cycles. To establish whether the birds had a pre-existing tendency to lever pull (von Bayern, Heathcote, Rutz, & Kacelnik, 2009), and because some birds may have had experience of lever pulling in previous experiments while others had not, we measured lever-pulling propensity prior to testing them on the multiaccess device. We presented all birds with an opaque PVC rectangular tube containing a lever-supported platform with half a wax moth larva (Zandberg et al., 2017). We used an opaque device to test whether birds had a propensity to pull a stick, independent of a visual food reward cue, because the previous device that had been used was also opaque (Zandberg et al., 2017), and because we did not want the birds to have experience with the main innovation test device beforehand. All birds were given up to four trials (30 min per trial) to obtain the food reward, by pulling the lever horizontally, causing the platform and reward to drop. Individuals that solved this opaque task at least once were classified (in the main analysis on the multiaccess problem-solving task described below) as having previous experience with solving the lever-pulling task. All birds progressed to the multiaccess task irrespective of their performance in this opaque device (Fig. 1).

Multiaccess Problem-Solving Task
Birds were presented with a multiaccess problem-solving apparatus (Fig. 2) with three distinct solutions that required different motor skills (see below), to obtain a preferred food reward (a wax moth larva). The apparatus was an upright Perspex cylinder (5 cm diameter and 16 cm high), with a platform holding the food reward. The platform was supported by a lever, which when pulled from the outside of the device caused the platform to drop, releasing the food reward below the device (solution 1). A second possible solution was to move a door that could be pushed left or right, to gain access to the food reward on the platform (solution 2). A third possible solution was to pull a string from the top of the device, which was attached to a second larva (solution 3). Each of these access points involved different motor action(s) including pulling (solution 1), pushing (solution 2) and coordinating both grasping and pulling (solution 3).
Experiments were scheduled evenly across mornings and afternoons for both treatments and personality lines. In each trial, subjects were presented with the device and given 30 min to solve any of the access points. Birds were given two trials per day, back-to-back, without being fed between trials. Following their second trial, their maintenance diet was returned until testing the following day. The experiment ended when they had solved all three access points three times, or when they had failed to solve any over three consecutive trials (total number of trials 3–13). Once an individual solved the same access point across three separate trials, that access point (door, lever or string) was fused, mimicking natural depletion of that food source, which meant that solving that access point was no longer possible, although it remained present and visible. We allowed birds to solve each access point three times to increase the chance that the behaviour became fixed in their repertoire. To solve a novel solution, they would need to behave flexibly, which we predicted would be guided by inhibitory control. Great tits from selection lines in this facility readily participate in experiments, so we assumed the three trials were sufficient to allow them to overcome any neophobic response.
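The fusing and stopping rules described above can be sketched as a small state tracker. This is an illustrative Python sketch only, not the authors' procedure or code: the class name `BirdState`, its methods and the constants are hypothetical, and it assumes that a trial is recorded as the set of access points solved in that trial (empty if none).

```python
from dataclasses import dataclass, field

ACCESS_POINTS = ("lever", "door", "string")
SOLVES_TO_FUSE = 3         # assumed: fused once solved in three separate trials
MAX_CONSECUTIVE_FAILS = 3  # assumed: testing ends after three unsolved trials in a row


@dataclass
class BirdState:
    """Hypothetical per-bird record of progress through the multiaccess task."""
    solves: dict = field(default_factory=lambda: {ap: 0 for ap in ACCESS_POINTS})
    consecutive_fails: int = 0
    trials_run: int = 0

    def fused(self):
        """Access points that can no longer be solved."""
        return {ap for ap, n in self.solves.items() if n >= SOLVES_TO_FUSE}

    def finished(self):
        """True once every access point is fused or the fail limit is reached."""
        all_fused = len(self.fused()) == len(ACCESS_POINTS)
        return all_fused or self.consecutive_fails >= MAX_CONSECUTIVE_FAILS

    def record_trial(self, solved=()):
        """Record one trial; `solved` lists the access points solved in it."""
        if self.finished():
            raise ValueError("testing has already ended for this bird")
        solved = set(solved)
        if solved & self.fused():
            raise ValueError("a fused access point cannot be solved")
        self.trials_run += 1
        if solved:
            for ap in solved:
                self.solves[ap] += 1
            self.consecutive_fails = 0
        else:
            self.consecutive_fails += 1

    def innovativeness(self):
        """Number of different access points solved at least once."""
        return sum(1 for n in self.solves.values() if n > 0)
```

Under these assumptions, a bird that never solves stops after three trials with an innovativeness of 0, whereas a bird that solves one access point per trial fuses all three after nine trials.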
All trials were recorded using a Panasonic HC-V250EB-K camera mounted on a tripod, covered in camouflage tape and positioned 1 m from the cage. Videos were analysed using Behavioural Observation Research Interactive Software (BORIS; Friard & Gamba, 2016). Observers were blind to the personality assigned to the birds but were aware of the motivation treatment group. Ten per cent of videos were coded by a second person. Interrater reliability was assessed using a Kendall's tau correlation test for agreement on the following measures: total number of touches to the device per trial, P < 0.001; touches to functional access points on the device per trial, P < 0.01; touches to anything other than functional access points on the device per trial, P < 0.005.

Inhibition Task
To generate an independent estimate of each individual's motor inhibition, we used a classical detour-reaching task (Beran, 2015; Boogert, Anderson, Peters, Searcy, & Nowicki, 2011; Rothbart & Posner, 1985), which tests to what extent the birds could control the prepotent response of pecking straight towards a food reward visible within a transparent Perspex tube. To pass the test, birds had to obtain the reward by accessing it through the opening on the side (Thorndike, 1911). The detour task was performed on a subset of 20 birds, prior to the problem-solving task (number of days between end of the detour-reaching task and first test day on the multiaccess device: mean ± SE = 11 ± 0.46, minimum = 8, maximum = 12) to control for carryover experience with the transparent Perspex. Birds were not food deprived before this task.
Figure 1. To quantify their previous experience and propensity to pull sticks, individuals were initially presented with an opaque tube with a lever. They had four trials (30 min each) in which to pull the lever. Once they solved the task once, they were classified as having previous experience solving a lever task. All birds progressed to the multiaccess problem-solving task where they were presented with the transparent experimental device in which three access points were functional. Each bird had to solve the task using the same access point three times, before moving onto the next phase, where the previously solved access point was fused, leaving the remaining functional access points. This process was repeated for the other two access points. At any point of the testing, if a bird failed over three consecutive trials, participation in the experiment ended for that bird. Dashed arrows indicate there is an alternative progression to complete the experiment.
Figure 2. The multiaccess problem-solving device given to birds in their home cage. The apparatus had three different access types to retrieve the food reward inside: a lever, a swing door and a string.
There were three phases to this task: habituation, training and test phases. Birds participated in one phase per day, with progression through the phases occurring over consecutive days (duration of testing: mean ± SE = 1.64 ± 0.18 days, minimum = 1, maximum = 4). In the habituation and training phases, the Perspex tube was opaque (covered with black tape). To familiarize the birds with the apparatus, a wax moth larva was placed at the opening edge of the tube. Birds passed the habituation phase when they had eaten the reward three consecutive times. During the training phase, individuals had to obtain the food reward located in the centre of the opaque tube without touching any other part of the device. Training was completed when this was done successfully during four of five consecutive trials, ensuring the birds had the motor skills and experience necessary to move around the tube to successfully obtain the larva. During the test phase, the food reward was placed in the centre of a transparent tube. Birds had to remove the food reward without pecking on any other part of the device to complete the trial successfully. Inhibitory control scores were quantified as the number of trials it took individuals to complete four of five consecutive trials correctly. All trials were a maximum of 3 min each and observed remotely by livestreaming to a mobile phone using a Wi-Fi-enabled SJCAM SJ4000 camera (Shenzhen Zhencheng Technology, Shenzhen, China).
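The inhibitory control score defined above (trials needed to reach four correct out of five consecutive trials) can be illustrated with a short Python sketch. The function name and the exact windowing rule are assumptions made for illustration: here the criterion counts as met on the first trial at which at least four of the most recent five trials were correct, so four correct trials in a row already satisfy it. The text does not state how the authors handled this edge case.

```python
def trials_to_criterion(outcomes, needed=4, window=5):
    """Return the 1-based trial number at which the criterion is first met.

    outcomes: sequence of booleans, True for a correct trial (reward taken
    without pecking elsewhere on the device), in chronological order.
    Returns None if the criterion is never reached.
    """
    for end in range(1, len(outcomes) + 1):
        # Look at the most recent `window` trials up to and including this one.
        recent = outcomes[max(0, end - window):end]
        if sum(recent) >= needed:
            return end
    return None


# Hypothetical bird: fails trials 1 and 4, meets the criterion on trial 6.
example = [False, True, True, False, True, True, True]
score = trials_to_criterion(example)
```

A lower score then corresponds to better inhibitory control, because the bird needed fewer trials to stop pecking directly at the visible reward.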

Ethical Note
We performed the experiment in accordance with the ASAB/ABS guidelines. All experiments were approved by an ethical committee (DEC-KNAW licence no. NIOO 14.12 to K.V.O.) and daily health checks were carried out to ensure the birds' welfare. Birds were returned to the stock population after the experiment.

Statistical Analysis
We tested whether multiple factors influenced different response variables at different stages of sequential innovative problem-solving performance: (1) latency to touch, (2) accuracy, (3) likelihood of solving and (4) innovativeness. Separate analyses were conducted using R Studio (R Studio Team, 2019) on each of the four phases described above (1–4), and we repeated these models on the subset of birds (N = 22) that completed the inhibition task. For touch latency and accuracy, we conducted general linear mixed models (GLMMs) using the nlme package (Pinheiro, Bates, DebRoy, Sarkar, & R Core Team, 2019) fitted with a normal distribution; for likelihood of solving we ran a GLMM using the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) fitted with a binomial distribution; and for innovativeness we ran a general linear model (GLM) with Poisson distribution (see Table A1 for a full list of variables and their definitions). In line with Whittingham, Stephens, Bradbury, and Freckleton (2006), we retained all variables of biological significance in the initial models to test specific hypotheses. For model selection, we used Akaike's information criterion (AIC) to measure goodness of fit (reported in table legends) and likelihood ratio tests to determine which model explained more variance. We compared full models (with the interaction between motivation and personality) to null models, and then compared full models to reduced models (i.e. without the interaction between motivation and personality). We dropped the interaction term from the model if the likelihood ratio test was nonsignificant (alpha = 0.05). To confirm that this hypothesis testing approach did not lead to a Type 2 error due to overfitting, we further reduced each model to the minimum adequate model using backwards reduction (see Tables A2–A5).
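The model comparison step described above can be sketched numerically. The following Python sketch is illustrative only and is not the authors' R code: the function names are hypothetical, the log-likelihood values are invented for the example, and it assumes the full and reduced models differ by a single parameter (the motivation × personality interaction), so the likelihood ratio statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math


def aic(log_lik, n_params):
    """Akaike's information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_lik


def lrt_one_df(log_lik_full, log_lik_reduced):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function has the closed form erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (log_lik_full - log_lik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for a model with and without the interaction.
ll_full, ll_reduced = -250.10, -250.40
stat, p = lrt_one_df(ll_full, ll_reduced)
drop_interaction = p >= 0.05  # nonsignificant at alpha = 0.05: drop the term
```

With these invented values the statistic is small and the interaction would be dropped, which mirrors the decision rule stated in the text without reproducing any reported result.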
We checked that all models met assumptions (homogeneity, normality of residuals and collinearity of explanatory variables) using the DHARMa package in R (Hartig, 2020). We calculated confidence intervals (CI) for the random factor and residuals in each model using the package nlme in R (Pinheiro & Bates, 2006). In the legend of each table, we report marginal R² (defined as the proportion of variance in the dependent variable that is explained by the fixed factors only), and conditional R² (defined as the proportion of variance in the dependent variable that is explained by the fixed and random factors), or pseudo R² (the marginal R² of a Poisson GLM, which does not include random factors). We tested the following four models.
(1) Latency to touch the device was log transformed to fit a Gaussian distribution (total trials = 226). The following fixed effects were included in our model: motivation (low or high), trial number, selected personality lines (fast or slow exploring), previous experience of solving any functioning access point including the opaque device (no or yes) and sex (male or female). Individual bird identity was included as a random effect to control for repeated measures and to test repeatability of individual differences.
(2) Accuracy was defined as the number of touches to a functioning access point divided by the total number of touches to any part of the device per trial. Fixed effects included interaction rate (total number of touches to any part of the device per min, per trial), motivation group, trial number, selected personality lines, previous experience of any functioning (but not fused) access point (including previous experience of lever-pulling propensity on opaque device), fused trial (where any of the solutions were fused and therefore unavailable, as a fused access point may decrease accuracy) and sex. Individual bird identity was included as a random effect as subjects completed multiple trials.
(3) To test which factors predicted solving within each trial (binary; N = 224), we included the following fixed effects: accuracy, previous experience, motivation group, personality, sex, trial number and fused trial. Individual bird identity was included as a random term. To limit overparameterization in the model, we did not include latency to touch in this analysis (but see analysis on number of different solves).
(4) We tested which factors affected innovativeness defined as the number of different access points solved by an individual (N = 35). Birds solved either 0, 1, 2 or 3 different access points. We included the following explanatory variables: hunger, personality, sex, latency to touch the device in the first trial only and inhibitory control. As this analysis was conducted on the number of different access points solved across all trials, we did not include variables that are trial specific (i.e. previous experience and accuracy).
Finally, we determined individual repeatability of the response variables in each of the first three questions above (latency to touch the device, accuracy and solving within a trial), using the rptR package, estimating repeatability (intraclass correlation) and CIs from Gaussian, binary, proportion and Poisson data (Stoffel, Nakagawa, & Schielzeth, 2017). We report unadjusted and adjusted repeatability, to encompass repeatability before and after controlling for influential fixed effects (Cauchoix et al., 2018). Unadjusted repeatability measures the between-individual variation in a given behaviour, while adjusted repeatability controls for fixed effects that could influence individual behaviour, because they explain either between- or within-individual components of variation. For both adjusted and unadjusted repeatability, we included individual identity as a random effect. Data and R code are included in the Supplementary material.
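To make the accuracy measure and the idea of unadjusted repeatability concrete, the sketch below computes accuracy per trial and then a simple intraclass correlation across individuals. It is a simplified stand-in, not the mixed-model approach of the rptR package used in the study: it uses the classical one-way ANOVA estimator of repeatability, the function names are hypothetical, and the touch counts are invented for illustration.

```python
def accuracy(functional_touches, total_touches):
    """Touches to functioning access points divided by all touches in a trial."""
    if total_touches == 0:
        return None  # undefined when the bird never touched the device
    return functional_touches / total_touches


def anova_repeatability(groups):
    """One-way ANOVA estimate of repeatability (intraclass correlation).

    groups: dict mapping individual identity to a list of measurements.
    R = s2_among / (s2_among + s2_within), where s2_within is the
    within-individual mean square and s2_among = (MS_among - MS_within) / n0.
    The estimate can be negative when among-individual variance is small.
    """
    groups = {k: v for k, v in groups.items() if len(v) > 0}
    a = len(groups)
    n_total = sum(len(v) for v in groups.values())
    if a < 2 or n_total <= a:
        raise ValueError("need at least two individuals and repeated measures")
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    grand_mean = sum(sum(v) for v in groups.values()) / n_total
    ss_among = sum(len(v) * (means[k] - grand_mean) ** 2 for k, v in groups.items())
    ss_within = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)
    ms_among = ss_among / (a - 1)
    ms_within = ss_within / (n_total - a)
    # n0 corrects for unequal numbers of trials per individual.
    n0 = (n_total - sum(len(v) ** 2 for v in groups.values()) / n_total) / (a - 1)
    s2_among = (ms_among - ms_within) / n0
    if s2_among + ms_within == 0:
        raise ValueError("no variance in the data")
    return s2_among / (s2_among + ms_within)


# Invented (functional touches, total touches) per trial for three birds.
touches = {
    "bird_A": [(8, 10), (7, 10), (9, 10)],
    "bird_B": [(2, 10), (3, 10), (1, 10)],
    "bird_C": [(5, 10), (6, 10), (4, 10)],
}
acc = {bird: [accuracy(f, t) for f, t in trials] for bird, trials in touches.items()}
r_unadjusted = anova_repeatability(acc)
```

In this invented data set the birds differ consistently from one another, so the estimate is high. An adjusted repeatability in the sense used in the text would additionally remove variance explained by fixed effects before partitioning the remainder, which requires a mixed model rather than this ANOVA shortcut.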

Latency to Touch the Multiaccess Device
Latency to touch the multiaccess device decreased over consecutive trials (Table 1, Fig. A1). The high-motivation group took less time to touch the device than the low-motivation group. Latency to touch the device did not differ between the personality selection lines. There was a nonsignificant trend for sex, suggesting that males took less time to touch the device than females. The variance of the random effect (individual bird identity) and the residual are indicated in Table 1. There was no effect of previous experience. The interaction between motivation and personality was not significant (b ± SE = -0.47 ± 0.60, t = -0.78, P = 0.44). Inhibitory score had no effect on latency to touch the device (see Table A6).

Accuracy
Birds were more accurate if they had previous experience solving any functioning access point, including solving the opaque device before the main experiment (Table 2, Fig. 3). There was a nonsignificant trend for slow birds being more accurate than fast birds. Birds tended to be less accurate in trials where there was a fused access point. The variance of the random effect (individual bird identity) and residual are indicated in Table 2. There was no effect of motivation group, interaction rate, sex or trial number on accuracy. The interaction between motivation and personality was not significant (b ± SE = -0.06 ± 0.15, t = -0.39, P = 0.70). Inhibition was unrelated to accuracy (see Table A7).

Solving Within a Trial
Nineteen of the 35 birds pulled the lever on the opaque device. Of the 35 birds that participated in the multiaccess task, 12 solved one access point, four solved two access points, seven solved all three access points and 12 did not solve any (Fig. 4). Three birds solved three different access points over three consecutive trials while the device was fully operational (all access points functioning). One bird solved two access points in one trial, solving the string and then the lever in their fourth trial. We include both solutions as separate observations in our analysis. The lever was solved by 23 different birds, the door was solved by 10 birds and the string by seven birds.
Figure 3. The effect of previous experience (whether there was an access point available that the bird had solved previously including the opaque device) on accuracy (the number of touches to functional parts of the device divided by all touches to the device per trial; see Table 2). Note that a previously solved access point could still be available as birds had to solve each access point three times before it was fused, thus making it unavailable. Smaller points represent individual birds (which have been jittered along the X axis to reduce overlap; as a result, any remaining overlap results in darker points). The large point represents the mean and the error bars represent SEs.
Food-deprived birds were more likely to solve an access point (Table 3). Higher accuracy and previous experience also predicted solving likelihood within a trial. The variance of the individual identity random effect and the residual are indicated in Table 3. There was no effect of personality, sex, trial number, whether it was a fused trial or not, total number of touches to device per trial or inhibition (see Table A8). The interaction between motivation and personality was not significant (b ± SE = -0.46 ± 0.90, t = -0.51, P = 0.61). Follow-up post hoc analysis, using a Fisher's exact test, revealed a correlation trend between lever pulling on the opaque and multiaccess device (P = 0.07). Further analysis, investigating the order in which the multiaccess device was solved, using a Fisher's exact test, showed that the lever was more likely to be solved first (P < 0.001), while there was no difference between the string or door (Fig. 4).

Innovativeness: Number of Access Points Solved
Highly motivated birds solved more novel access points than low-motivated birds (Table 4, Fig. 5). There was no effect of personality, sex, latency to touch the device in the first trial only (Table 4) or inhibition (Table A9). The interaction between personality and motivation was nonsignificant (b ± SE = -0.86 ± 0.79, z = -1.09, P = 0.28).

Repeatability
Latency to touch the device was repeatable, but repeatability decreased when adjusted for significant fixed effects (Table 5). Accuracy was also repeatable, and its repeatability increased when adjusted for significant effects. Solving performance within a trial was also repeatable, but repeatability disappeared entirely when adjusted for all significant fixed effects. To further investigate which factors were reducing the individual repeatability between the unadjusted and adjusted R values for solving access points within a trial, we removed each fixed effect individually and reran the repeatability model (see Table 5). Adjusted repeatability changed only when a significant fixed effect was excluded. Adjusted repeatability was significant when accuracy was excluded and when motivation was excluded, while adjusted repeatability without previous experience only approached significance. There was no change in adjusted repeatability for any factor that did not affect solving performance.

DISCUSSION
Our study sought to explore factors that drive individual variation and repeatability at various stages of innovative problem-solving performance (Fig. 6). We showed that hunger-induced motivation affected multiple problem-solving stages, that previous experience influenced accuracy, and that hunger, accuracy and previous experience influenced problem-solving success. Personality and inhibitory control had little or no effect. Solvers of the opaque lever-pulling device tended to solve the lever on the multiaccess device. Furthermore, birds were more likely to solve the lever first, but showed no preference between the door and string. All traits were significantly repeatable; however, the repeatability of problem solving was explained entirely by motivation, accuracy and experience.
Figure 5. Number of access points solved in each motivation group (see Table 4). Smaller points represent individual birds (which have been jittered along the X axis to reduce overlap; as a result, any remaining overlap results in darker points). The black points represent the mean number of access points solved in each motivation group and the error bars represent SEs.

Motivation Drives Innovation
Although motivation is often viewed as a confounding variable, if considered at all when examining mechanisms underlying problem-solving tasks (reviewed in Griffin & Guez, 2014), it also underpins the 'necessity drives innovation' hypothesis (Reader & Laland, 2003). In support of this hypothesis, motivation was the major driver of an individual's latency to touch the device, to solve the same access point repeatedly and to innovate multiple times in our experimental set-up. Previous studies reported that task engagement increased with increased food deprivation, thus facilitating problem solving (Griffin et al., 2014; Sol et al., 2012), but motivation itself did not predict problem solving (Griffin & Guez, 2014; van Horik & Madden, 2016). Likewise, the relationship between problem solving and motivation, as measured by body weight or body condition, is inconclusive: at times an effect is present (Laland & Reader, 1999; Mateos-Gonzalez, Quesada, & Senar, 2011) and at other times not (Cole et al., 2011; Thornton & Samson, 2012; but see Griffin & Guez, 2014, for a full review). This variability in results across studies may be in part due to differences in how motivation is defined and how problem solving is measured. Our results emphasize the importance of controlling for motivation and standardizing the length of time animals are food deprived in captive experiments, as well as acknowledging that not knowing an animal's motivational state may be a weakness of cognitive experiments conducted in the wild. Nevertheless, controlling for motivational effects generally is unlikely to be straightforward (Auersperg, Gajdon, & Bayern, 2012; Griffin & Guez, 2014; Morand-Ferron et al., 2016; Morand-Ferron & Quinn, 2011), not least because whether food deprivation removes, or just changes, individual variation remains unclear.

Personality
Considerable evidence suggests that personality traits defined by Réale et al. (2007) influence individual problem-solving performance (Greenberg, 2003; Johnson-Ulrich et al., 2018; Sol, Griffin, Bartomeus, & Boyce, 2011). However, in our study, personality selection lines with known genetic provenance for object neophobia and novel environment exploration did not predict latency to touch the device; nor did they predict problem-solving behaviour, in terms of success within trials, or the number of different innovations reached. We predicted that the effects of hunger-induced motivation could mask effects of artificially selected personality lines on problem-solving behaviour, but the interaction between motivation and personality had no effect on any problem-solving measure, suggesting that our ability to detect the effect of personality on an individual's capacity to solve problems was not confounded by motivation or vice versa. Furthermore, while there was a nonsignificant tendency for slow birds to be more accurate, this did not translate into a higher likelihood of solving problems or greater innovativeness for slow birds. Previous work in this same population, using a lever-pulling task, also found no link between personality and innovative problem-solving performance (Zandberg et al., 2017). The absence of an effect of personality on problem-solving performance in that study, and here, could be influenced by the composite nature of 'exploration' used in our selection lines (Verbeek, Drent, & Wiepkema, 1994). Moreover, latency to touch, which may be considered a measure of neophobia, may have been confounded with associative learning when considering latency to touch across multiple trials. Nevertheless, our results emphasize the challenge of examining links between personality traits and innovative problem solving, not least because of the inherently composite nature of both behaviours.

Table 5 note: Unadjusted values are from mixed models with only individual as a random effect. Adjusted values also include significant fixed effects for each of superscripts a, b and c as shown in Tables 1, 2 and 3, respectively. In addition, for superscript c, adjusted repeatabilities are also shown when single fixed effects were removed. Significant results (P < 0.05) are highlighted in bold.

Inhibitory Control
Inhibitory control is an integral part of behavioural flexibility (MacLean et al., 2014; Manrique, Völter, & Call, 2013), both of which are beneficial for problem solving, allowing animals to overcome outdated information. Contrary to our predictions, individuals that exhibited high inhibitory control were no more likely to generate a novel solution to the task than those with low inhibitory control, even when the reward contingencies changed (i.e. when an access point was fused), a time when behavioural flexibility is required. This lack of correlation may be because changing one's behaviour is necessary but not sufficient to solve a problem (Logan, 2016a, 2016b). Moreover, the validity of the detour-reaching task as a test for inhibitory control remains under debate because performance does not necessarily correlate with other tasks that aim to measure inhibitory control, or because previous experience of transparency and persistence may influence performance (Kabadayi, Bobrowicz, & Osvath, 2018; van Horik et al., 2018). Neither Johnson-Ulrich et al. (2018) nor Daniels et al. (2019) found a correlation between problem solving and inhibitory control, even when inhibitory control was measured using an alternative paradigm to the detour-reaching task. Thus, we conclude that there is no case for motor inhibition affecting behavioural flexibility in the context of problem solving, but it remains possible that it reflects other facets of behavioural flexibility (reviewed in Bari & Robbins, 2013).

Previous Experience
Birds with previous experience of having solved the opaque lever device, or indeed any of the three access points during the main trials, were more accurate and had higher solving success in subsequent trials. Furthermore, performance improved with experience over repeated problem-solving attempts with regard to that particular solving method, perhaps owing to instrumental conditioning. Thus, measures commonly attributed to an individual's cognitive performance, such as how quickly it solves a problem or its ability to solve multiple novel problems, may be a function of its previous experience (Rowe & Healy, 2014; Sih & Del Giudice, 2012). We acknowledge the constraints in controlling for all experiences animals may have had with features of an experimental apparatus, especially if based on simple generalizable rules. Nevertheless, tasks could be designed such that they include multiple access points that vary in modality (e.g. smell and touch: sensory versus motor), in the appearance of the specific materials they use (e.g. white plastic versus black plastic) and/or in the required motor skills, as we have attempted to do here (Auersperg, Bayern, Gajdon, Huber, & Kacelnik, 2011; Griffin & Guez, 2014; Manrique et al., 2013; Overington et al., 2009). This paradigm may facilitate the testing of true innovations that are not confounded by previous experience or, alternatively, allow explicit tests of what kinds of experiences facilitate future innovations.

Repeatability, Pseudorepeatability and Positive Feedback
Our results demonstrate repeatable individual differences across two behaviours involved in problem solving (latency to touch the device and accuracy when interacting with the device), and for problem-solving success itself. Adjusted and unadjusted repeatabilities differed for all three behaviours. For latency, repeatability decreased but remained significant after controlling for hunger-induced motivation, suggesting that some of the between-individual differences in the unadjusted repeatability were caused by hunger. In contrast, for accuracy, repeatability increased (and again remained significant) after controlling for the effects of previous experience, suggesting that some of the within-individual variation (the error component) in the unadjusted analysis was explained by previous experience. And for problem-solving success, repeatability was lost after controlling for accuracy, hunger and previous experience (i.e. consistent individual differences in problem-solving performance were explained entirely by these three factors). Thus, repeatable problem-solving behaviour arose because of a complex set of interactions between different factors which themselves differed consistently between individuals.
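The three outcomes described above can be read against the usual variance-ratio definition of repeatability. The block below is an illustrative sketch rather than the model fitted in this study: the symbols $V_{B}$ (between-individual variance), $V_{W}$ (within-individual variance) and $V_{F,B}$, $V_{F,W}$ (the parts of each that a fixed effect accounts for) are notation assumed here, and the clean additive split ignores link-scale issues that arise in generalized mixed models.

```latex
% Unadjusted repeatability: share of total variance lying between individuals
R = \frac{V_{B}}{V_{B} + V_{W}}

% Adjusted repeatability: the fixed effects absorb V_{F,B} of the
% between-individual variance and V_{F,W} of the within-individual variance
R_{\mathrm{adj}} = \frac{V_{B} - V_{F,B}}{\left(V_{B} - V_{F,B}\right) + \left(V_{W} - V_{F,W}\right)}

% Case 1 (latency, hunger):       V_{F,W} \approx 0,\; 0 < V_{F,B} < V_{B}
%                                 \;\Rightarrow\; 0 < R_{\mathrm{adj}} < R
% Case 2 (accuracy, experience):  V_{F,B} \approx 0,\; V_{F,W} > 0
%                                 \;\Rightarrow\; R_{\mathrm{adj}} > R
% Case 3 (solving success):       V_{F,B} \approx V_{B}
%                                 \;\Rightarrow\; R_{\mathrm{adj}} \approx 0
```

Under these assumptions, a covariate that differs mainly between birds lowers repeatability when controlled for, whereas one that changes mainly from trial to trial within a bird raises it.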
The significance of these findings is tied to the nature of the specific factor involved. First, in the case of hunger, designed to manipulate motivation, each individual only experienced one of two treatments, a potentially reversible effect, suggesting that the component of the unadjusted between-individual difference explained by hunger was inflated, resulting in pseudorepeatability. Although some sources of motivation are probably permanent, either through a permanent environment (Wilson, 2018) or intrinsic motivation (Ebel & Call, 2018; Gajdon, Lichtnegger, & Huber, 2014; Polizzi di Sorrentino et al., 2014; Taffoni et al., 2014), this pseudorepeatability demonstrates that failure to control for motivation caused by temporary factors can inflate the intrinsic between-individual differences that researchers are attempting to characterize; that is, those differences that are caused by permanent environment or intrinsic effects. Second, accuracy explained some of the between-individual variation, suggesting that the mechanisms underlying accurate interaction with the device themselves vary consistently between individuals, and explain some of the between-individual differences in problem-solving performance. It appears likely that these mechanisms are intrinsic rather than reversible, since motivation is controlled for in these analyses. Third, experience also caused some of the between-individual differences in problem-solving performance, and since experience is not reversible, and by definition carries forward into the next stage of the sequential problem-solving process, this suggests a positive feedback loop driving consistent between-individual differences in problem-solving behaviour. Although the role of feedback loops in driving differences in individual behaviour is well known (Dall et al., 2004; Sih et al., 2015), and examples of positive feedbacks are common in nature (Kishida et al., 2011), to our knowledge none have explained consistent between-individual differences. In this case we assume the feedback caused by experience leads to a permanent effect, although it remains possible that individuals eventually forget the experience.
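The pseudorepeatability argument can be illustrated with a small simulation. The sketch below is not the analysis used in this study (which relied on mixed models): it uses a simple one-way ANOVA estimator on synthetic Gaussian data, and the function names `anova_repeatability` and `simulate`, the effect sizes and the sample sizes are all assumptions chosen for illustration. Each simulated bird receives only one of two treatments, so the treatment effect is indistinguishable from an individual difference unless it is removed.

```python
import random
from statistics import mean


def anova_repeatability(groups):
    """One-way ANOVA estimate of repeatability (intraclass correlation).

    groups: one list of repeated measures per individual; all lists must
    have the same length k (balanced design, assumed for simplicity).
    """
    n, k = len(groups), len(groups[0])
    means = [mean(g) for g in groups]
    grand = mean(means)  # equals the overall mean in a balanced design
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    ) / (n * (k - 1))
    var_between = max((ms_between - ms_within) / k, 0.0)
    return var_between / (var_between + ms_within)


def simulate(n_birds=200, n_trials=6, treatment_effect=2.0, seed=1):
    """Return (unadjusted, adjusted) repeatability for synthetic data."""
    rng = random.Random(seed)
    raw, adjusted = [], []
    for i in range(n_birds):
        intrinsic = rng.gauss(0.0, 1.0)        # permanent individual effect
        shift = treatment_effect * (i % 2)     # one treatment per bird
        trials = [intrinsic + shift + rng.gauss(0.0, 1.0)
                  for _ in range(n_trials)]
        raw.append(trials)
        # Adjustment: subtract the known simulated treatment effect. A real
        # analysis would instead estimate it as a fixed effect in the model.
        adjusted.append([y - shift for y in trials])
    return anova_repeatability(raw), anova_repeatability(adjusted)


if __name__ == "__main__":
    r_raw, r_adj = simulate()
    print(f"unadjusted R = {r_raw:.2f}, adjusted R = {r_adj:.2f}")
```

With these assumed settings the intrinsic and residual variances are both 1, so the adjusted estimate should sit near 0.5, while the unadjusted estimate is pushed higher because the treatment adds variance between birds but none within them.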
Our results highlight the challenges of characterizing consistent individual variation in sequential problem-solving performance as a measure of overall innovativeness. More generally, they provide a demonstration of how between-individual differences in innovation can be explained by inflated estimates of within-individual variation in motivation, inflated between-individual variation in accuracy, and by feedback loops involving previous experience. Much of the focus in studies on the evolutionary ecology of behaviour in general has been on the evolutionary processes that drive intrinsic individual variation. Our results support the idea that complex sources of variation underlying single traits are likely to make predicting the selective consequences of this variation challenging.

Figure 1. Routes of progression through the multiaccess problem-solving experiment. To quantify their previous experience and propensity to pull sticks, individuals were initially presented with an opaque tube with a lever. They had four trials (30 min each) in which to pull the lever. Once they had solved the task once, they were classified as having previous experience solving a lever task. All birds progressed to the multiaccess problem-solving task, where they were presented with the transparent experimental device in which three access points were functional. Each bird had to solve the task using the same access point three times before moving on to the next phase, where the previously solved access point was fused, leaving the remaining functional access points. This process was repeated for the other two access points. At any point of the testing, if a bird failed over three consecutive trials, participation in the experiment ended for that bird. Dashed arrows indicate there is an alternative progression to complete the experiment.
Figure 3. The effect of previous experience (whether there was an access point available that the bird had solved previously, including the opaque device) on accuracy (the number of touches to functional parts of the device divided by all touches to the device per trial; see Table 2). Note that a previously solved access point could still be available, as birds had to solve each access point three times before it was fused, thus making it unavailable. Smaller points represent individual birds (which have been jittered along the X axis to reduce overlap; as a result, any remaining overlap results in darker points). The large point represents the mean and the error bars represent SEs.

Figure 4. The frequency of access points solved, grouped by access point, ordered by trial number, as indicated.

Figure 5. Effect of motivation (high or low) on the number of different access points solved (see Table 4). Smaller points represent individual birds (which have been jittered along the X axis to reduce overlap; as a result, any remaining overlap results in darker points). The black points represent the mean number of access points solved in each motivation group and the error bars represent SEs.

Figure 6. Schematic of the study's main results, with the four dependent variables aligned in the centre; arrows indicate the influence of explanatory variables (left or right side). Dashed arrows indicate a nonsignificant tendency; the absence of an arrow indicates a nonsignificant relationship. Note that no test was performed between previous experience or accuracy and innovativeness, because the two former variables were measured per trial, while the latter measure accrued across all trials.

Table 1
Full model outputs from GLMM with factors affecting latency to touch the device per trial. a Low (reference level is high). b Slow (reference level is fast). c Male (reference level is female). d Yes (reference level is no).

Table 3
Full model outputs from GLMM with factors affecting solving within trials. a Yes (reference level is no). b Low (reference level is high). c Slow (reference level is fast). d Male (reference level is female). e Yes (reference level is no).

Table 4
Full model outputs from GLM with factors affecting the number of different access points solved by an individual. N = 35, df = 24, pseudo-R² = 0.24, AIC = 99.60. Significant result (P < 0.05) is highlighted in bold. a Low (reference level is high). b Slow (reference level is fast). c Male (reference level is female).

Table 5
Repeatability (adjusted and unadjusted) estimates for the three main components of problem-solving behaviour during the experiment