Does song complexity correlate with problem-solving performance in flocks of zebra finches?

The ‘cognitive capacity hypothesis’ states that song complexity could potentially be used by prospective mates to assess an individual’s overall cognitive ability. Several recent studies have provided support for the cognitive capacity hypothesis, demonstrating that individuals with more complex songs or larger song repertoires performed better on various learning tasks. These studies all measured individuals’ learning performance in social isolation. However, for gregarious species such as the zebra finch, Taeniopygia guttata, testing individuals in a group context is socially and ecologically more relevant if song complexity is to be a meaningful indicator of cognitive ability. We tested whether song complexity correlated with performance on a suite of novel foraging problems in flocks of male zebra finches, starting by replicating the lid-flipping task used by Boogert et al. (Animal Behaviour, 2008, 76, 1735–1741), who provided the first support for the cognitive capacity hypothesis in zebra finches isolated during testing. We also presented flocks with a barrier task and two types of novel food. We found that males’ song complexity scores did not correlate with their latency to solve any of these novel foraging problems in a social context. Individuals that solved the tasks likewise did not have more complex songs than nonsolvers. However, performance was positively correlated across the different foraging tasks. These results raise doubts as to whether the song complexity measures used by Boogert et al. are predictors of problem-solving performance, and perhaps cognitive ability, in a more ecologically relevant, social setting. Stress responsiveness might instead explain the association between song complexity and foraging task performance among isolated zebra finches reported by

males with smaller song repertoires. However, song sparrow males' song repertoire size did not correlate with their performance on the same lid-flipping task as used by Boogert, Giraldeau, et al. (2008), nor with performance on a colour association task (Boogert, Anderson, et al., 2011). Finally, in starlings, males that had experienced stress during development sang shorter song bouts and also performed worse on a spatial learning task than control males, but their performance on a novel foraging task did not differ (Farrell et al., 2012).
Although the first test of the cognitive capacity hypothesis by Boogert, Giraldeau, et al. (2008) using zebra finches provided a promising result, it measured males' performance only on a single novel foraging task. The subsequent studies by Boogert, Anderson, et al. (2011), Boogert, Fawcett, et al. (2011), Farrell et al. (2012) and Sewall et al. (2013) suggest that the picture might be more complicated when testing males on a range of cognitive tasks; in these studies, males' song complexity correlated positively with performance on some tasks, but not on others. Although species might be expected to differ depending on their life history and the importance of specific (cognitive) skills in their daily survival and reproductive success (Buchanan et al., 2013), the extent to which the initial zebra finch result holds up when males are tested on a variety of novel tasks remains to be determined.
Another issue worthy of exploration is how test conditions could affect the relationship between song complexity and performance on various cognitive tasks. Song sparrows are extremely territorial (Akçay et al., 2009; Searcy, Anderson, & Nowicki, 2006; Templeton, Campbell, & Beecher, 2012), which justifies testing them on cognitive tasks in isolation from conspecifics (as in Boogert, Anderson, et al., 2011; Boogert, Fawcett, et al., 2011 and Sewall et al., 2013); group testing would be ecologically unrealistic and could lead to casualties. Zebra finches on the other hand are extremely gregarious. While they might show some territoriality in the immediate vicinity of their nest, they breed in colonies, travel to foraging and water sites and feed, drink and bathe together in flocks, and spend a lot of their time socializing (Zann, 1996; personal observation). Given that nearly all zebra finch decisions are made in a social context, the cognitive capacity hypothesis should hold true in the more socially and ecologically relevant context of the flock, if it is to be functional in the actual lives of the birds.
Here, we extended the study of Boogert, Giraldeau, et al. (2008) to test the prediction that zebra finch males' song complexity correlates positively with their performance on several novel foraging tasks when presented in the more natural social context of a flock, rather than in isolation. In wild zebra finches, a male's song complexity (syllable number and motif duration) predicts his reproductive success (Woodgate, Mariette, Bennett, Griffith, & Buchanan, 2012), and female domesticated zebra finches prefer larger syllable repertoires (reviewed in Riebel, 2009). We recorded the songs of 51 male zebra finches in three flocks and presented each flock with a lid-flipping task very similar to that used by Boogert, Giraldeau, et al. (2008). We also measured each bird's latency to cross an opaque partition with food at the other side, and its willingness to sample two different unfamiliar foods. We scored the latency with which each bird in each flock solved each task and assessed the relationships between these latencies and measures of song complexity. It seems likely that our problem-solving tasks form a continuum with regard to the cognitive processing required to solve them, with lid flipping probably being the most cognitively demanding task, crossing the opaque partition requiring more exploration, and neophobia probably influencing when birds sampled the novel foods. Indeed, the problem-solving latencies we measured are undoubtedly affected by a variety of factors, including individuals' cognitive capacity (in terms of information acquisition, processing and decision making; Shettleworth, 2010), their (feeding) motivation (David, Auclair, Giraldeau, & Cezilly, 2012; Sanford & Clayton, 2008), their foraging tactics (i.e. producing versus scrounging) that, in turn, are affected by their basal metabolic rate (Mathot, Godde, Careau, Thomas, & Giraldeau, 2009) and body condition (David et al., 2012), their activity and motivation to explore (Beauchamp, 2000), response to novelty (Schielzeth, Bolund, Kempenaers, & Forstmeier, 2011), (stress) hormone levels (Spencer & Verhulst, 2007) and the performance of flockmates. The presence of conspecifics is known to affect zebra finch neophobia (Coleman & Mellgren, 1994) and exploration (Schuett & Dall, 2009), and may induce 'audience effects' (as shown for zebra finch vocal communication: Vignal, Mathevon, & Mottin, 2004). Finally, flockmates can socially enhance each other's exploitation of novel food sources, for example by providing the opportunity to copy others' foraging decisions (Benskin, Mann, Lachlan, & Slater, 2002; Katz & Lachlan, 2003; Riebel, Spierings, Holveck, & Verhulst, 2012), but knowledgeable zebra finches may also slow down their naïve partners' learning about foraging opportunities (Beauchamp & Kacelnik, 1991). While physiological factors and personality traits could also affect performance on novel foraging and learning tasks when presented to test subjects in isolation, other factors will operate solely when tested in a social context. We assume that the additional social effects on performance provided by flockmates, especially in terms of facilitating (or possibly inhibiting) the social acquisition of information (i.e. social learning, itself a cognitive trait), are those that make this test of the cognitive capacity hypothesis relevant to the life history of the study species. Thus, we expected that zebra finch males' song complexity should predict their problem-solving performance in a flock context for it to be an effective signal of cognitive capacity in this species.

Test Subjects
We studied three groups of 17 adult male domesticated zebra finches (i.e. N = 51 individuals). Zebra finch groups were composed of randomly selected adult males from the university zebra finch population. This adult population is composed of zebra finches from various sources (Glasgow University, local pet shops) and their social rearing conditions and song tutoring histories are unknown. Each individual was fitted with a unique combination of one numbered and two coloured plastic split rings (A.C. Hughes, Hampton Hill, U.K.) for individual identification. Each group was housed in a wire-mesh cage (122 × 71 cm and 138 cm high) containing hay bedding, multiple perches, a cuttlefish bone, crushed oyster shells, two water bowls, two water 'hoppers' (water was supplemented with Johnsons vitamin drops for birds), two food bowls and two food 'hoppers' (filled with mixed finch seed), at all times. Fresh greens were provided at least once a week. Birds were maintained at 20 ± 1 °C ambient temperature on a 12:12 h light:dark cycle (lights on at 0700, off at 1900 hours). Lights were full-spectrum daylight. Mixed finch seed was available ad libitum outside experiments. Subject groups were assembled 1 week before testing began and these birds remained together throughout the duration of the experiments. The three groups were all housed in the same room, and housing conditions were as described above for the university population. Each day, before the start of an experiment, the cage of the group under study was wheeled into a separate room with similar ambient conditions where the birds were in visual and acoustic isolation from other zebra finch groups, and then returned to the home room after the experiment. All experiments were conducted in the home cage with the normal social companions. Each trial was filmed using digital video cameras (Panasonic SD80).
We presented four different novel foraging tasks to each group in the same order: lid flipping (19 October–2 November 2011), a barrier task (27 March–6 April 2012) and two novel foods: apple (21–23 August 2012) and peas (28–30 August 2012).

Lid Flipping
This task, slightly modified from that used by Boogert, Giraldeau, et al. (2008) by using a group-testing protocol (see below), shorter food deprivation (1 h instead of overnight) and a more desirable food reward (spinach instead of seed), required zebra finches to flip lids off wells to reach a food reward underneath. Each group was presented with the lid-flipping task for 3 consecutive days. On day 1, we first allowed birds to habituate to the foraging task. We removed food bowls at 0900 hours. At 1000 hours, we put a white cardboard sheet (29.7 × 42 cm) on the bottom of one side of the holding cage. We placed four identical white plastic foraging grids (8 × 12 × 2 cm), each containing 12 wells (2 cm diameter, 1.5 cm deep) on top. Each well contained a small (0.5 × 0.5 cm) piece of fresh spinach leaf. We presented 24 lids, each composed of a yellow cardboard square (2 × 2 cm) with upward-folded corners to which a felt bumper was attached (2 cm diameter, 0.5 cm high), on top of the grid next to the wells. We allowed the test subjects to feed freely on the spinach from the wells for 1 h.
We started the test phase of the experiment at 1100 hours on the same day: we removed the grids and lids from the cage, refilled each well (48 wells in total) with spinach and covered it with a lid, and returned the baited grids to the cage. The zebra finches had to remove the lids from the wells to obtain the spinach underneath. The trial lasted 1 h, after which we removed, rebaited and returned the grids for a final hour of testing, starting at noon. At 1300 hours, we removed the lids, grids and cardboard bottom, returned the cage to the holding room and returned the regular food bowls. On the subsequent 2 test days, we removed the food at 0900 hours, presented the baited and lid-covered grids at 1000 hours, and rebaited and presented them again at 1100 and 1200 hours. Across the 3 test days we thus conducted a total of eight lid-flipping trials with each group. We recorded for all lid removals the latency from the start of the trial and the cumulative latency from the start of the experiment, which bird removed the lid and whether it ate the spinach reward or not. We used each bird's cumulative latency to flip its first lid (i.e. counting from the start of the experiment) as a measure of its 'problem-solving performance'.
As it was impossible to guide each of the birds in our flocks individually through the systematic shaping procedure adopted by Boogert, Giraldeau, et al. (2008), in which the lids were positioned such that they progressively covered more of the wells across four levels of increasing difficulty, we extracted another learning measure: we graphed the latencies of each bird's first four lid flips against lid flip number (i.e. 1–4), and used the slope estimated from a linear regression (not forced through the origin) through these points as a measure of learning. Birds that had properly learned how to flip lids should repeat the behaviour faster than birds whose successful first lid flip was a chance event. We thus assumed faster learners would show shallower learning curves and smaller slopes (i.e. shorter latencies in between consecutive lid flips) than slower learners. Three birds flipped lids only three times in total. Inclusion or exclusion of their learning curve slopes did not change the results.
Finally, we used individuals' latencies to flip their first lid on test day 3 as a proxy of how quickly individuals 'remembered' to flip lids after they had had at least a 21 h break (from 1300 hours on test day 2 to 1000 hours on test day 3) since their last lid flip or since observing others lid flipping. Only individuals that had flipped a lid at least once before test day 3 were included.
We thus extracted three different measures from each bird's performance on this task: (1) problem-solving performance: latency to flip the first lid; (2) learning: the slope of a 'lid flip learning curve' based on the first four lid flips; and (3) 'memory': latency to flip the first lid on test day 3.
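As an illustration of measures (1) and (2), the following minimal Python sketch computes a cumulative latency and a learning-curve slope for one bird. The flip times are hypothetical, the helper names are ours, and we assume that only testing time (eight 1 h trials) counts towards cumulative latency; none of this is taken from the analysis scripts used in the study.

```python
from statistics import mean

TRIAL_MINUTES = 60  # each lid-flipping trial lasted 1 h


def cumulative_latency(trial_index, minutes_into_trial):
    """Testing time (min) elapsed since the start of the experiment.

    trial_index counts the eight trials from 0. Time between trials is
    ignored, which is an assumption of this sketch.
    """
    return trial_index * TRIAL_MINUTES + minutes_into_trial


def learning_slope(latencies):
    """Least-squares slope of latency against lid flip number (1, 2, ...).

    The regression includes an intercept, i.e. it is not forced
    through the origin.
    """
    flip_numbers = list(range(1, len(latencies) + 1))
    x_bar, y_bar = mean(flip_numbers), mean(latencies)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(flip_numbers, latencies))
    sxx = sum((x - x_bar) ** 2 for x in flip_numbers)
    return sxy / sxx


# Hypothetical bird: (trial index, minutes into that trial) for its
# first four lid flips, all in the second trial of the experiment.
flip_times = [(1, 12.0), (1, 20.0), (1, 26.0), (1, 30.0)]
latencies = [cumulative_latency(t, m) for t, m in flip_times]

first_flip_latency = latencies[0]   # 'problem-solving performance'
slope = learning_slope(latencies)   # 'learning'
```

A bird that repeats the behaviour quickly after its first success yields a small slope; a bird whose flips are widely spaced yields a large one.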

Barrier Task
This task required zebra finches to move through a hole in an opaque partition in the middle of their holding cage to reach seed feeders at the other side. We removed food bowls at 0900 hours. We ensured that all birds were on one side of the cage, and divided it in half by inserting an opaque wooden partition that contained two holes (6 cm diameter) at the bottom. The holes were surrounded by yellow cardboard shapes (a triangle and a circle, each 12 cm across) to make them more conspicuous. The holes provided easy access to the other side of the cage where four tubular transparent bird feeders containing mixed finch seed were attached to the upper part of the cage, such that they could not be seen unless a bird peeked through one of the holes. Birds were free to move to the other side of the cage and feed from the feeders for 4 h. We presented this task to each group on 3 days, each interspersed by 2 nontest days. For each bird moving to the baited cage side, we recorded the cumulative latency of its move as counted since the start of the experiment on the first test day.

Novel Food Sources
We tested birds' willingness to forage on two different novel food items. At 0900 hours we removed the regular food bowls and provided each zebra finch group with two piles (ca. one-third cup each) of apple cubes, one positioned on each cage side on white cardboard (21 × 29.7 cm) for 3 h. We analysed only the first 50 min of video recordings as the great majority of birds had sampled the novel food within this time. We scored the latency of each bird's arrival at the food source, the latency to start feeding and the duration of feeding. Although, especially towards the end of the trials, some pieces of apple had been moved throughout the cage, we limited data collection to those birds that stood on the white cardboard and were thus clearly identifiable in the video. We repeated the exact same procedure with a second novel food item, canned peas, 1 week later.

Song Complexity
We recorded each male singing to an unfamiliar female zebra finch between 31 January and 4 February 2012. For recordings, we placed a male in another cage (60 × 44 cm and 39 cm high) located in a room that was padded with anechoic foam to reduce background noise and echo on the recordings. The cage was split lengthwise in two by a wire-mesh partition. We positioned a female at the other side of the cage, such that the birds could hear and see each other, but not copulate. Most males readily sang in the presence of the female, although any male not singing on a particular day was re-recorded on a subsequent day while being presented with a different female. Zebra finch males sing a single stereotyped sequence of elements, a song 'motif', which is repeated a variable number of times, often with linking and introductory notes, to form a song. Song motifs are stereotyped and do not change once their songs have crystallized at around 90 days of age (Williams, 2004). Changing the female audience might change a male's motivation to start singing, but is not known to affect his song structure. We recorded and analysed an average of 10 song phrases per individual. We made all recordings using a Sennheiser ME66/K6 microphone connected to a Marantz PMD660 recorder. Each song was recorded to an uncompressed wav file using a sampling frequency of 44 kHz.
To examine song complexity we followed the procedure described in detail in Boogert, Giraldeau, et al. (2008). We analysed 10 randomly chosen recordings of each male's song motif and averaged these to obtain the final scores. We analysed the motif duration in milliseconds using Syrinx-PC version 2.6h (John Burt; www.syrinxpc.com) and used Avisoft-SASLab Pro v5.2 software (Avisoft Bioacoustics, Berlin, Germany) to score the total number of elements and the number of unique elements in each song motif visually. The waveform amplitude, duration and frequency modulation pattern were used to characterize elements as the same or different. We analysed only complete song motifs, disregarding any repetitions that were truncated and excluding all introductory and linking elements from analyses. Female calls were easily identified in the recordings and excluded from the analyses. To help reduce subjectivity in song classification, two of us scored each motif. We were highly consistent (interindividual correlation: Cronbach's α > 0.98) in our scoring and any minor differences in judgement were resolved by consensus. In addition to the song complexity scores used in Boogert, Giraldeau, et al. (2008), we also measured a number of other song parameters that could be assessed by female listeners (Holveck & Riebel, 2007; Holveck, Vieira de Castro, Lachlan, ten Cate, & Riebel, 2008; Leadbeater, Goller, & Riebel, 2005): rate of unique elements (number unique divided by the song duration), variation in male motif duration (measured as standard deviation from the mean), acoustic density (proportion of sound versus silence for each motif) and variation therein. To measure acoustic density, we followed Leadbeater et al. (2005): we first used Avisoft to high-pass filter each song at 0.5 kHz, then measured the proportion of sound versus silence using a gating function, with gate threshold level 0.05 V and 10 ms classification sections.
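The gating step can be sketched in a few lines of Python. This is not Avisoft's implementation: we assume that a 10 ms section counts as 'sound' when its peak absolute amplitude exceeds the 0.05 threshold, we omit the 0.5 kHz high-pass filter, and the function names and the synthetic 'motif' are ours, chosen purely for illustration.

```python
import math


def acoustic_density(samples, sample_rate, threshold=0.05, section_ms=10):
    """Proportion of fixed-length sections classified as sound.

    A section is 'sound' if its peak absolute amplitude exceeds the gate
    threshold (an assumed rule; the software's gating may differ).
    Any incomplete section at the end of the signal is ignored.
    """
    section_len = int(sample_rate * section_ms / 1000)
    sections = [samples[i:i + section_len]
                for i in range(0, len(samples) - section_len + 1, section_len)]
    n_sound = sum(1 for s in sections if max(abs(v) for v in s) > threshold)
    return n_sound / len(sections)


def unique_element_rate(n_unique, duration_s):
    """Number of unique elements divided by song duration (elements/s)."""
    return n_unique / duration_s


RATE = 44000  # sampling frequency (Hz), as used for the recordings


def tone(ms, amp=0.5, freq=3000):
    """Synthetic sine tone, standing in for a song element."""
    n = int(RATE * ms / 1000)
    return [amp * math.sin(2 * math.pi * freq * k / RATE) for k in range(n)]


def silence(ms):
    """Synthetic gap between elements."""
    return [0.0] * int(RATE * ms / 1000)


# Hypothetical 'motif': 300 ms of sound, a 100 ms gap, 100 ms of sound.
motif = tone(300) + silence(100) + tone(100)
density = acoustic_density(motif, RATE)
rate_unique = unique_element_rate(n_unique=5, duration_s=0.5)
```

For this synthetic signal 40 of the 50 sections contain sound, so the density is 0.8. On real recordings the result depends on how amplitude is calibrated relative to the threshold.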

Data Analyses
We first examined the relationship between the three song complexity measures used by Boogert, Giraldeau, et al. (2008), and then assessed the correlation between males' song complexity and their problem-solving performance. We natural log-transformed the three song measures (average song motif duration, total number of motif elements and number of unique elements) so that each was normally distributed according to a Kolmogorov–Smirnov test of normality (all P > 0.195) and used Pearson correlation tests to assess the relationships between them. As the three song measures were strongly correlated (see Results), we conducted an unrotated principal component analysis (PCA) in SPSS v19 (IBM Corp., Armonk, NY, U.S.A.) and used the first principal component scores, extracted using the regression method, in subsequent analyses as our measure of song complexity (henceforth referred to as 'song complexity score'). Using the three song measures individually instead of this combined song complexity score did not change the results (not shown). To assess the relationship between song complexity scores and performance on the novel foraging tasks, as well as consistency in performance across tasks, we conducted linear mixed-effects models (LME) in R (The R Foundation for Statistical Computing, Vienna, Austria, http://www.r-project.org, package 'nlme') with task-solving latency as the predictor variable and 'group' as a random effect to accommodate the fact that each experiment was conducted in three separate groups of zebra finches. We natural log-transformed latencies for both novel foods (apple and pea) and the barrier tasks and square root-transformed all lid flip measures for analyses. We visually inspected plots of model residuals to ensure homogeneity of variance and normality of errors.
Individuals that did not perform on a task were excluded from analyses of that task to avoid biasing the results by assigning these individuals arbitrary latencies (final sample sizes: lid: N = 36; barrier: N = 18; apple: N = 46; pea: N = 47). We also conducted an unrotated PCA on lid flipping, apple and pea task latencies (excluding barrier task performance owing to the small number of solvers) for each group separately, resulting in each case in a single extracted component (see Appendix Tables). We then tested whether these principal component scores correlated with the song complexity scores and with the additional song parameters measured (i.e. rate of unique elements, variation in male motif duration, acoustic density and variation therein), using the same linear mixed model formulation as described above. Finally, we used data for nonsolvers to compare the song complexity scores of the individuals that solved the lid-flipping and barrier tasks to those that did not, using t tests. As we did not find a significant relationship between the main variables of interest, namely song complexity and lid-flipping performance (see Results), we performed a post hoc power test to assess whether, given our sample size and an effect size similar to that of Boogert, Giraldeau, et al. (2008), we had reasonable statistical power to detect a significant relationship. We used a simulation written by William Hoppitt in R (code available in the Supplementary material) to accommodate the hierarchical structure of our data in the power calculation. For the expected effect size we used −0.569, the slope of the standardized regression between song complexity and lid flip performance, excluding nonsolving zebra finches, from Boogert, Giraldeau, et al. (2008), Boogert, Reader, et al. (2008). This slope differs slightly from the unstandardized regression slope reported in the original paper. We standardized the regression slope of Boogert et al. and each of our own variables for the power calculation to accommodate data that were measured in different units.
We did not conduct post hoc power tests for our other nonsignificant results as no publications exist to inform the expected effect size values.
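The logic of a simulation-based power calculation can be illustrated with the following Python sketch. It is not the R code written by William Hoppitt. As simplifying assumptions of our own, it treats group as a fixed intercept (removed by within-group centring) rather than as a random effect, splits 35 birds into hypothetical groups of 12, 12 and 11, draws group intercepts with an arbitrary standard deviation of 0.5, and hard-codes the two-tailed 5% critical value of t for 31 degrees of freedom.

```python
import math
import random


def simulate_power(slope=-0.569, group_sizes=(12, 12, 11), group_sd=0.5,
                   t_crit=2.0395, n_sims=2000, seed=1):
    """Share of simulated data sets with a significant slope (P < 0.05).

    Predictor and response are standardized, so the residual SD is
    sqrt(1 - slope^2). group_sizes and group_sd are illustrative guesses.
    """
    rng = random.Random(seed)
    resid_sd = math.sqrt(1 - slope ** 2)
    # residual df: birds minus one intercept per group minus the slope
    df = sum(group_sizes) - len(group_sizes) - 1
    hits = 0
    for _ in range(n_sims):
        xc, yc = [], []
        for size in group_sizes:
            intercept = rng.gauss(0, group_sd)  # group-level effect
            xs = [rng.gauss(0, 1) for _ in range(size)]
            ys = [intercept + slope * x + rng.gauss(0, resid_sd) for x in xs]
            mx, my = sum(xs) / size, sum(ys) / size
            # centring within groups removes the group intercepts
            xc += [x - mx for x in xs]
            yc += [y - my for y in ys]
        sxx = sum(x * x for x in xc)
        b = sum(x * y for x, y in zip(xc, yc)) / sxx
        sse = sum((y - b * x) ** 2 for x, y in zip(xc, yc))
        se = math.sqrt(sse / df / sxx)
        if abs(b / se) > t_crit:
            hits += 1
    return hits / n_sims


power = simulate_power()
```

Because the group intercepts are removed by centring, the choice of `group_sd` does not affect the estimate under this simplified design; a mixed-model simulation would not share that property.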

Ethical Note
Zebra finches are highly gregarious. We did not observe any aggressive interactions in our zebra finch groups, apart from the occasional chase, which was extremely rare, brief, entailed no physical contact and was resolved within several seconds by the chased individual moving to a different part of the cage. Birds were kept in the three experimental groups for 1 year. Groups were monitored daily by the researchers and the University's Named Animal Care and Welfare Officer (NACWO) and monthly by the vet to certify that birds maintained good health throughout and after experiments. Four months after the end of experiments, males were paired up with females in spacious breeding cages and bred successfully. All research followed the ASAB/ABS guidelines for use of animals in research and was approved by the University of St Andrews Animal Welfare and Ethics Committee (21/12/09).

Song Complexity
The three song measures were all correlated positively: average motif duration correlated with both the total number of elements in the motif (Pearson correlation: r₄₈ = 0.351, P = 0.012) and the number of unique elements in the motif (r₄₈ = 0.383, P = 0.006); total number of elements in the song motif, in turn, correlated positively with the number of unique elements (r₄₈ = 0.901, P < 0.001). The PCA of the three song measures extracted a single principal component with an eigenvalue of 2.138 and explained 71.25% of the variance in these variables. The unrotated component loadings were 0.607 for motif duration, 0.936 for the total number of elements and 0.945 for the number of unique elements. Individuals' scores on this principal component were termed 'song complexity scores' and used in subsequent analyses.
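As a consistency check, the first principal component can be recomputed from the three pairwise correlations given above. The Python sketch below assumes the PCA was run on the correlation matrix of the log-transformed measures and uses power iteration to find the dominant eigenvalue; it works from rounded correlations, not the raw data.

```python
import math

# Reported Pearson correlations, in the order
# (motif duration, total elements, unique elements).
R = [[1.0,   0.351, 0.383],
     [0.351, 1.0,   0.901],
     [0.383, 0.901, 1.0]]


def first_component(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, v


eigenvalue, vector = first_component(R)
# Unrotated loadings are the eigenvector scaled by sqrt(eigenvalue).
loadings = [math.sqrt(eigenvalue) * x for x in vector]
explained = 100 * eigenvalue / len(R)
```

Under these assumptions the eigenvalue, the percentage of variance explained and the three loadings come out within rounding error of the values reported above.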

Performance Across Novel Problems
Individuals' performance correlated positively across several novel problem-solving tasks (Fig. 1). Individuals' latency to flip their first lid was positively correlated with their latency to solve the barrier task (LME: t₅ = 3.226, P = 0.023). Latency to first lid flip was also positively correlated with latency to eat apple (t₃₀ = 4.291, P < 0.001), and showed a similar trend with latency to eat peas (t₃₀ = 1.845, P = 0.075). Latencies to eat apple and peas were significantly correlated (t₄₂ = 2.586, P = 0.013). Individuals' latency to solve the barrier task was not significantly correlated with their latency to eat apple (t₁₂ = 0.423, P = 0.680) or peas (t₁₂ = −0.796, P = 0.441). The slopes of individuals' lid flip learning curves did not correlate with latencies to flip their first lid on test day 1 (t₂₇ = −1.271, P = 0.215) or test day 3 (t₂₁ = 0.563, P = 0.580), nor with latencies to eat apple (t₂₆ = 0.431, P = 0.670) or peas (t₂₆ = 1.327, P = 0.196). There were only seven barrier task-solvers with lid flip learning curves, precluding analysis of the relationship between these variables.

Song Complexity and Problem-solving Performance
Song complexity scores were not significantly correlated with performance on any of the problem-solving tasks. Specifically, song complexity scores did not correlate with latency to flip the first lid (LME: t₃₁ = −0.744, P = 0.463; Fig. 2). Given our total sample size of 35 birds (across three zebra finch groups) and an expected effect size of −0.569 (i.e. the standardized regression coefficient from Boogert, Giraldeau, et al., 2008), our power to find a significant (P < 0.05) relationship between song complexity and lid flip latency was 96.9%. Figure 2 suggests some covariance between song complexity and group ID owing to unknown factors.

DISCUSSION
This study is the first to test the cognitive capacity hypothesis in the highly gregarious zebra finch in the socially relevant context of the flock. We measured the song complexity of 51 domesticated zebra finch males. We then presented these males in flocks with the same novel foraging task used by Boogert, Giraldeau, et al. (2008), who provided the first evidence in support of the cognitive capacity hypothesis in 27 zebra finches tested in social isolation. However, in contrast to Boogert, Giraldeau, et al. (2008), we found no significant relationship between males' song complexity scores and their latencies to start flipping lids (a novel foraging behaviour) to obtain a food reward, even though we had high statistical power to detect such a relationship. In addition, song complexity scores did not correlate with the rate at which males increased their lid flipping, nor with the latency to flip their first lid after a day's delay since their last exposure to lid flipping, each a different measure of task learning. Furthermore, this lack of a relationship between song complexity and problem-solving performance was not restricted to the lid-flipping task: males' song complexity scores also did not correlate significantly with their performance on a barrier task or with their latencies to sample two novel foods, when tested in a flock context. Males' performance in the flock did correlate positively across the novel foraging tasks: individuals that were faster to start flipping lids were also faster to cross an opaque barrier to find food and faster to start foraging on two novel foods.
If a male's song complexity is to be a meaningful indicator of his general cognitive capacity, then it should signal his performance on other cognitive tasks in an ecologically relevant social context. In the case of the highly gregarious zebra finch, virtually all activities take place within social groups, so the relevant social context is the flock (Zann, 1996). We therefore assumed that the social experience of group testing in all-male flocks in captivity more closely resembles zebra finches' natural social environment than testing them in isolation. The fact that we could not replicate Boogert, Giraldeau, et al. (2008)'s finding in a flock context suggests either that (1) we did not use the right tasks to measure cognitive performance, (2) social testing obscures rather than clarifies the 'true' relationship between song complexity and cognitive ability, or (3) previous evidence for the cognitive capacity hypothesis in domesticated zebra finches is based on an artefact of isolation testing. We address each of these potential reasons for our negative findings in turn.
First, the foraging tasks we chose for this study forced zebra finches to solve problems similar to those they might encounter in a natural environment. The ability to remove obstacles and to navigate around barriers to find hidden foods, and the flexibility to switch to foraging on novel food sources in times of regular food source scarcity, seem ecologically relevant survival skills in the nomadic life of the wild zebra finch living in semiarid Australia (Zann, 1996). Our finding that performance correlated positively across these novel foraging tasks in our domesticated zebra finch males seems to suggest that we obtained relatively robust measures of their tendency to tackle novel foraging problems in a social context. However, precisely what 'cognition' is required to solve these and similar novel foraging problems remains to be determined (Healy, 2012; Thornton & Lukas, 2012).
It is likely that personality traits also played a role in task performance. Recent work in several bird species indicates that personality strongly affects learning and problem-solving performance (Guillette, Reddon, Hurd, & Sturdy, 2009; Titulaer, van Oers, & Naguib, 2012). One could argue that solving the barrier task, for example, may have predominantly relied on boldness and motivation to explore. In addition, using task-solving latency as a cognitive performance measure has been criticized (Healy, Haggis, & Clayton, 2010). One major concern with latencies is that they might capture motivation and/or stress rather than (or in addition to) cognitive performance (Buchanan et al., 2013; Healy et al., 2010). It is plausible that birds' latencies to solve our novel foraging tasks were affected by their motivation to interact with novel items and/or feed. A recent study (David et al., 2012) showed that 41% of the variation in isolated female zebra finches' latencies to feed could be explained by their personality traits, and 19% by their body condition: more active, exploratory, risk-taking and neophilic birds, and those in poorer body condition, were faster to feed from their normal food source after 1 h of food deprivation. However, the relationship between body condition/feeding motivation and task performance is not straightforward. For example, body condition did not explain latencies to solve novel foraging tasks in Zenaida doves, Zenaida aurita (Boogert, Monceau, & Lefebvre, 2010), great tits, Parus major (Cole, Cram, & Quinn, 2011) or blue tits (Aplin et al., 2013), whereas motivation did affect the number of errors zebra finches made in a spatial memory task (Sanford & Clayton, 2008). If ca. 40% of the variance in zebra finch feeding latencies can be explained by differences in personality (David et al., 2012), then perhaps personality differences underlie our finding that solving latencies were correlated across our novel foraging tasks, but showed no significant relationship with song complexity scores. A recent review by Sih and Del Giudice (2012) suggests that bolder, more exploratory and neophilic individuals should be quicker to encounter novel situations and stimuli; these individuals will then be faster to solve novel tasks, not because they have a superior cognitive ability, but because they are less hesitant to take risks and engage with the tasks. Song complexity scores, on the other hand, are more likely to reflect cognitive constraints as imposed by developmental stressors and environmental effects (Spencer & MacDougall-Shackleton, 2011). On balance, we are unable to exclude the possibility that the positive correlation associated with our 'problem-solving' tasks better reflects motivational or personality traits than cognitive ability. Future studies could resolve this by combining isolation and group testing.
Figure 2. Song complexity in relation to performance on a standard foraging task, when assessed in a group context. The song complexity score (PC1) for each male is shown in relation to (a) latency to flip a lid hiding a food reward, a measure of initial problem solving, and (b) the slope of the curve from the first to fourth success in solving the task, a measure of learning. Group identity is indicated by differently coloured circles.
The second potential explanation for our negative findings is that any underlying relationship between song complexity and problem-solving performance was obscured by the social context of our study. Previous studies on various species suggest that a range of cognitive and 'noncognitive' factors might underlie problem-solving performance in social contexts, such as social rank, neophobia, motivation, scrounging, and asocial and social learning processes. In small flocks of starlings, for example, dominant and neophilic individuals were the first to engage in a series of novel foraging tasks, whereas dominant birds, which also turned out to be the fastest asocial learners, were most likely to solve the novel tasks (Boogert, Reader, et al., 2008). In flocks of blue tits presented with a novel foraging task and a conspecific trained to demonstrate the solution, individuals' novel problem-solving success in isolation predicted their latency to adopt the demonstrated foraging task solution in the flock. Sex, age and dominance rank also affected social learning latencies, but neophobia and body condition did not (Aplin et al., 2013). In groups of wild meerkats, Suricata suricatta, presented with a novel foraging task and conspecifics demonstrating the solution, individuals' task-solving behaviour was determined by a mix of asocial and social learning processes, as well as motivation and socially facilitated perseverance (Hoppitt, Samson, Laland, & Thornton, 2012). Together these studies show that various factors affect animals' problem-solving performance when tested in social groups.
Zebra finch performance is particularly likely to be affected by that of flockmates, given that birds of this species have been shown to be less neophobic (Coleman & Mellgren, 1994) and more explorative (Schuett & Dall, 2009) depending on the behaviour of conspecific companions, and will copy each other's feeder (Benskin et al., 2002; Riebel et al., 2012) and food colour (Katz & Lachlan, 2003) choices (although scrounging from a knowledgeable partner has been shown to slow down the acquisition of a signal-reward association in zebra finch pairs; Beauchamp & Kacelnik, 1991). However, such social acquisition of information is a cognitive process in itself, which has been found to covary with asocial learning performance (Aplin et al., 2013; Heyes, 2012), and is important in song learning (Janik & Slater, 2000). Under the cognitive capacity hypothesis, song complexity scores should thus still have correlated with problem-solving performance, even if the latter was mostly a measure of individuals' social learning skills. Unfortunately, our group-testing design does not allow us to disentangle asocial from social learning effects on the birds' problem-solving latencies, and there are likely to have been consistent individual differences in foraging tactic use (Morand-Ferron, Varennes, & Giraldeau, 2011). In addition, the latencies measured for each individual within a given group are, strictly speaking, not independent because of the potential social effects. In conclusion, although our social test context did multiply the number of factors affecting problem-solving performance as compared to testing birds in isolation, it also allowed zebra finches to use their natural tendency to follow and learn from others. It therefore seems unlikely that moving to a more natural, less stressful social test situation would have obscured the 'true' relationship between song complexity and problem-solving performance.
A final possible explanation for the lack of relationship between our measures of problem solving and song complexity is that song complexity is simply not a reliable predictor of cognition. Some previous work has suggested that zebra finch song complexity is an honest indicator of stress experienced during development (Spencer, Buchanan, Goldsmith, & Catchpole, 2003; but see Bolund, Schielzeth, & Forstmeier, 2010). If stress experienced during development makes an individual more (or less) stress-sensitive in adulthood (Monaghan, 2008), and social isolation is a stressful experience for zebra finches (Banerjee & Adkins-Regan, 2011), then perhaps zebra finch performance on learning tasks in social isolation is more affected by stress responsiveness than by cognitive capacity. In other words, perhaps the study by Boogert, Giraldeau, et al. (2008) measured the effects of developmental stress twice, once in terms of its effects on song complexity and once in terms of its effects on responsiveness to social isolation, which would explain why these two measures were correlated. In this way, song may provide an honest signal of an individual's ability to cope with stress later in life. Like cognitive ability, stress responsiveness may be a critical trait for a potential mate to evaluate.
Although developmental stress has been shown to affect song complexity in some zebra finch populations (Spencer et al., 2003; Woodgate et al., 2014), various other studies have found no such effects (see Table IV in Riebel, 2009; Bolund et al., 2010), leading some to suggest that zebra finch song might not be a quality indicator after all (Bolund et al., 2010). Alternatively, perhaps other song parameters such as syntax copying accuracy (Brumm, Zollinger, & Slater, 2009; Holveck et al., 2008) and fundamental frequency (Cynx, Bean, & Rossman, 2005; Perez et al., 2012) might be more robust indicators of developmental conditions and quality. We therefore suggest that future studies of the cognitive capacity hypothesis should (1) test birds that have been developmentally stressed, and whose tutor songs are known (as in Schmidt et al., 2013), (2) measure a wide range of song parameters, including syntax copying accuracy, (3) as far as possible use problem-solving tasks for which the cognitive requirements are determined, and (4) use a combination of isolation and group testing, as appropriate for the species under study. We expect that future research following this approach will be able to disentangle the potentially complex relationship between song, stress responsiveness and cognitive ability.

Components with eigenvalues >1 were used for further analyses and only those are reported here.

Table A3
Results of an unrotated principal component analysis of the lid, apple and pea task-solving latencies: zebra finch group 3 (N = 10)