Numerical cognition in black-handed spider monkeys (Ateles geoffroyi)

We assessed two aspects of numerical cognition in a group of nine captive spider monkeys (Ateles geoffroyi). Petri dishes with varying amounts of food were used to assess relative quantity discrimination, and boxes fitted with dotted cards were used to assess discrete number discrimination with equally-sized dots and various-sized dots, respectively. We found that all animals succeeded in all three tasks and, as a group, reached the learning criterion of 70% correct responses within 110 trials in the quantity discrimination task, 160 trials in the numerosity task with equally-sized dots, and 30 trials in the numerosity task with various-sized dots. In all three tasks, the animals displayed a significant correlation between performance in terms of success rate and task difficulty in terms of numerical similarity of the stimuli and thus a ratio effect. The spider monkeys performed clearly better compared to strepsirrhine, catarrhine, and other platyrrhine primates tested previously on both types of numerical cognition tasks and at the same level as chimpanzees, bonobos, and orangutans. Our results support the notion that ecological traits such as a high degree of frugivory and/or social traits such as a high degree of fission-fusion dynamics may underlie between-species differences in cognitive abilities.


Introduction
Numerical cognition refers to the ability to perceive, represent, and act upon the numerical properties of stimuli (Beran et al., 2015). One aspect of numerical cognition, the ability to judge relative quantities, has been reported in a wide variety of species, e.g. in insects (Pahl et al., 2013), fish (Piffer et al., 2012), birds (Emmerton, 1998), terrestrial mammals (Ferkin et al., 2009; Perdue et al., 2012) and marine mammals (Jaakola et al., 2005). This should not be surprising as such an ability provides a clear fitness benefit to the individual. In a foraging context, for example, it is evident that being able to choose the larger of two quantities contributes to maximizing food intake (Stephens and Krebs, 1986). In a social context, some species have been shown to determine the movements of their social group depending on the relative quantity of individuals heading in certain directions (Strandburg-Peshkin et al., 2015). During inter-group conflicts, the ability to assess the relative size of a rival group allows animals to decide whether to respond aggressively or to retreat from a potentially harmful encounter (Kitchen, 2006).
Another aspect of numerical cognition, the ability to discriminate between discrete numbers of stimuli, has also been reported in a variety of species, including insects (Bortot et al., 2019), fish (Messina et al., 2021), birds (Lorenzi et al., 2021), and mammals (Nieder, 2017). This cognitive ability, sometimes referred to as numerosity, has also been found to provide a fitness benefit to the individual in behavioral contexts such as social conflict, predator avoidance, navigation, and reproduction (Nieder, 2020). Despite the widespread occurrence of numerical cognition, comparative studies suggest that there are clear differences between species in the ability to utilize the numerical properties of stimuli (Beran and Parrish, 2016). This raises the question as to the mechanisms underlying this cognitive ability and the selective pressure(s) which may have promoted its evolution.
Several studies have proposed that the degree of social or ecological complexity may be a key predictor of cognitive abilities (Whiten and van Schaik, 2007; Rosati, 2017). In order to further test this hypothesis, primates appear to be a particularly suitable taxon as they comprise a remarkable variety of social systems, ranging from solitary to pair-living to multimale/multifemale group compositions (Kappeler and van Schaik, 2002) as well as a wide variety in their ecology, e.g. in terms of diets and habitats (Clutton-Brock and Harvey, 1977).
The complex spatiotemporal distribution of fruit, for example, has been suggested to require considerable cognitive abilities in order to remember the location of patchy food sources and to efficiently predict times and places to forage (Milton, 1981). Accordingly, frugivorous species should be expected to display superior cognitive skills compared to e.g. folivorous species. Similarly, high levels of fission-fusion dynamics are thought to require enhanced cognitive skills to enable individuals to track changes in social relationships. Accordingly, species living in fission-fusion societies should be expected to display superior cognitive skills compared to species with less complex social dynamics.
Therefore, the present study assessed numerical cognition in the black-handed spider monkey (Ateles geoffroyi), a highly specialized frugivore (Gonzalez-Zamora et al., 2009), displaying strong fission-fusion dynamics (Campbell, 2008). Though relatively understudied with regard to cognition, spider monkeys have demonstrated route planning skills (Di Fiore and Suarez, 2007), perspective-taking abilities (Amici et al., 2009), tool use in the form of utilizing sticks in an intentionally directed manner (Lindshield and Rodrigues, 2009), and proficiency in using their prehensile tail to solve out-of-reach feeding problems (Nelson and Kendall, 2018). In a variety of problem-solving tasks involving inhibitory control, spider monkeys performed at levels comparable to chimpanzees (Pan troglodytes), bonobos (Pan paniscus) and orangutans (Pongo pygmaeus), and better than gorillas (Gorilla gorilla) (Amici et al., 2008). These results have been mainly attributed to the spider monkeys' socioecological background and thus to their dietary specialization and distinct social organization.
To further corroborate this notion, we assessed the ability of spider monkeys to solve three numerical cognition tasks. More specifically, we assessed the performance of nine captive spider monkeys in a quantity discrimination task and two numerosity tasks.
We hypothesized that the spider monkeys 1) should be able to discriminate between relative quantities of food, 2) should be able to discriminate between discrete numbers of abstract non-food items, and 3) should display a significant correlation between performance in terms of success rate and task difficulty in terms of numerical similarity of the discriminanda in all three tasks.

Subjects
The study was carried out with nine adult black-handed spider monkeys (Ateles geoffroyi). The group consisted of six males and three females, aged between eight and twelve years. The animals were housed at the field station UMA Doña Hilda Ávila de O'Farrill of the Universidad Veracruzana, located in a nature reserve near Catemaco, in the state of Veracruz, Mexico. The animals were housed in a roofed enclosure of 20 × 10 × 8 m which was subdivided into ten equally-sized compartments. The compartments were connected by sliding doors, which were usually kept open for the monkeys to socialize but could be closed when experiments required temporary separation of animals. The animals were provided with fresh fruits and vegetables once per day and occasionally given seeds and edible foliage to supplement their diet. The compartments were equipped with mobile and fixed furnishings, including branches and logs, ropes, tires, perches and sleeping boxes. The experiments were carried out in the morning before feeding and no food deprivation scheme was adopted for this study. The animals had participated in previous studies on their cognitive abilities, namely in memory tasks (Reynoso-Cruz et al., 2021) as well as in sensory discrimination tasks (Pereira et al., 2021) and were thus accustomed to participating in behavioral tests and to temporary separation. All monkeys were tested individually in order to prevent interference from, and distraction by, the other monkeys.

Apparatuses
For the quantity discrimination task, the test apparatus consisted of an opaque white plexiglass board of 14 × 44 cm with two petri dishes (diameter 9 cm) that were attachable to the board at a distance of 15 cm. The apparatus represented a two-alternative choice paradigm in which the animals were allowed to choose between one of the two petri dishes, which contained different quantities of equally-sized food items (pieces of raisins, pieces of cranberries, quarters of Cheerios, or unshelled sunflower seeds, depending on each individual's favored food item).
For the numerosity task with equally-sized dots, the test apparatus consisted of a metal bar of 50 × 6 cm, with two PVC boxes (5 × 5 × 5 cm) attached to the bar at a distance of 22 cm. Laminated white cards (5 × 5 cm) featuring different numbers of equally-sized black filled circles were attached to the boxes' slightly larger metal lids (6 cm × 6.8 cm) using magnetic tape. The sizes of the circles were adjusted as to have the same amount of black and white surface on each card. The apparatus represented a two-alternative choice paradigm in which the animals were allowed to choose between one of the two boxes, where the box fitted with the card bearing the higher number of dots contained a food item (raisin, dried cranberry or half a Cheerio, depending on each individual's favored food item).
For the numerosity task with various-sized dots, the test apparatus, food rewards, and methods were the same as in the numerosity task with equally-sized dots. However, in this experiment, the sizes of the filled circles were modified so that dots of different sizes were present on a given card in order to control for the possibility that the animals would base their decision for one of the two options on the size of the dots rather than on their numerical properties. The overall surfaces covered by the black circles and by the white background were still equal, and the total area covered by the dots was equal on all cards in both numerosity tasks. (See supplemental material for pictures of the apparatuses.)

Procedure
For the quantity discrimination task, a correct response was recorded if the animal chose the larger quantity, and an incorrect response was recorded if the smaller quantity was chosen. Because one animal experienced digestive upset during several days of the quantity discrimination experiment, that individual did not participate in this task. Each animal was presented with the following ratios of food items: 1:2, 1:3, 1:4, 1:5, 2:3, 2:4, 2:5, 3:4, 3:5, 4:5. The position of the petri dishes (e.g. larger quantity on the left and smaller quantity on the right) was pseudorandomized, and both large and small quantities were presented equally often to the left and the right. The larger quantity was presented a maximum of three times in a row on one or the other side. Each of the ten stimulus combinations was presented once per session, and one to two sessions were performed per animal, per day. A total of 300 trials, i.e. 30 sessions, were performed with each individual.
For the numerosity task with equally-sized dots and the numerosity task with various-sized dots, a correct response was recorded if the animal chose the box with the higher number of dots, and an incorrect response was recorded if the box with the lower number of dots was selected. Each animal was presented with the following combinations of dots: 1:2, 1:3, 1:4, 1:5, 2:3, 2:4, 2:5, 3:4, 3:5, 4:5. The position of the numbered cards (e.g. higher number of dots on the left and lower number of dots on the right) was pseudorandomized, and both high and low numbers were presented equally often to the left and the right. The card with the higher number of dots of a given pair was presented a maximum of three times in a row on one or the other side. Each of the ten stimulus combinations was presented once per session, and one to two sessions were performed per animal, per day. A total of 300 trials, i.e. 30 sessions, were performed with each individual in both numerosity tasks. The order of the experiments was the same for all animals, and was set as follows: quantity discrimination, numerosity with equally-sized dots, and numerosity with various-sized dots.
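The pseudorandomization constraints described above can be illustrated with a minimal sketch. It is not the procedure actually used by the authors, which the text does not specify in detail. It assumes that the sides are balanced within each session (five left, five right) and that the limit of three consecutive presentations on the same side is checked within a session only; the names `make_session` and `longest_run` are hypothetical.

```python
import random

# The ten stimulus pairs listed in the Procedure, as (smaller, larger).
PAIRS = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3),
         (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]


def longest_run(sides):
    """Return the length of the longest run of identical consecutive entries."""
    best = 0
    run = 0
    prev = None
    for side in sides:
        run = run + 1 if side == prev else 1
        best = max(best, run)
        prev = side
    return best


def make_session(rng, max_run=3):
    """Return one session as a list of ((smaller, larger), side_of_larger).

    Assumptions (illustrative only): each pair appears once, the larger
    stimulus is on the left in 5 trials and on the right in 5 trials, and
    no side holds the larger stimulus more than `max_run` times in a row.
    """
    pairs = PAIRS[:]
    rng.shuffle(pairs)
    sides = ["L"] * 5 + ["R"] * 5
    while True:
        # Rejection sampling: reshuffle until the run constraint holds.
        rng.shuffle(sides)
        if longest_run(sides) <= max_run:
            return list(zip(pairs, sides))


session = make_session(random.Random(0))
for pair, side in session:
    print(pair, side)
```

Rejection sampling is used here only because it is easy to verify by inspection; any scheme that satisfies the same constraints would serve equally well.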

Data analysis
In each trial of each experiment, the animals had two options: to select the correct, i.e. the rewarded option (or the better-rewarded option in the case of the quantity discrimination), or to select the incorrect, i.e. the non-rewarded option (or the lesser-rewarded option in the case of the quantity discrimination). We set the learning criterion at 70% correct responses over two consecutive sessions, i.e. at least 14 correct responses over 20 trials. Our rationale for this was that, firstly, this corresponds to p < 0.05 according to the two-tailed binomial test, and, secondly, the same criterion has been used in previous studies on cognitive performance in nonhuman primates. The Spearman rank-order correlation test was used to assess possible correlations between the animals' performance and the number of sessions, and to determine whether the group's mean performance (± SD) in a given task improved over time. In all experiments, the Spearman rank-order correlation test was used to assess possible correlations between the animals' performance and the task difficulty, in terms of how similar or different the numerical discriminanda were. A Friedman ANOVA was used to assess potential interindividual differences in performance by calculating the average score that each individual reached across all sessions of the tasks and ranking the animals accordingly.
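The learning criterion and the rank-order correlation described above can be sketched in a few lines. This is a hedged illustration rather than the authors' analysis code: it assumes that the criterion is evaluated on the pooled score of two consecutive sessions (at least 14 of 20 trials correct) and that the criterion session is the second session of the first qualifying pair. The function names and the example scores are hypothetical and are not data from the study.

```python
def first_criterion_session(correct_per_session, min_correct=14):
    """Return the 1-based number of the session in which the criterion is met.

    Assumption: the criterion is met when two consecutive sessions of ten
    trials together contain at least `min_correct` correct responses; the
    second session of that pair is reported. Returns None if never met.
    """
    for i in range(1, len(correct_per_session)):
        pooled = correct_per_session[i - 1] + correct_per_session[i]
        if pooled >= min_correct:
            return i + 1
    return None


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up scores (correct responses out of ten) for six sessions.
example_scores = [5, 6, 8, 7, 7, 9]
criterion_session = first_criterion_session(example_scores)
rho = spearman_rho(list(range(1, 7)), example_scores)
print(criterion_session, round(rho, 2))
```

The sketch returns only the correlation coefficient; the associated p-value would have to be obtained from a statistics package or a table of critical values.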

Quantity discrimination
As a group, the animals reached the learning criterion in session 11 and remained above the learning criterion in all following sessions (Fig. 1). All eight animals succeeded in this task, with the three fastest individuals reaching the learning criterion in session 2 and the slowest individual in session 12.
The group's performance increased significantly across the sessions (p < 0.01, r_s = 0.89, Spearman test) as illustrated by the trendline showing a significant positive slope. This was also true for all eight individuals considered separately.
The monkeys' performance as a group significantly correlated negatively with task difficulty (p < 0.01, r_s = 0.91, Spearman test) (Fig. 2). Trials in which the number of food items differed only by one (Δ1) yielded the lowest mean scores of correct responses and trials in which the number of food items differed by four (Δ4) yielded the highest mean scores. The same was true with all eight individuals considered separately.

Numerosity with equally-sized dots
As a group, the animals reached the learning criterion in session 16 and only dropped below the learning criterion once in the following sessions (Fig. 3). All nine animals succeeded in this task, with the fastest individual reaching the learning criterion in session 2 and the slowest individual in session 12.
The group's performance increased significantly across the sessions (p < 0.01, r_s = 0.84, Spearman test) as illustrated by the trendline showing a significant positive slope. This was also true for all nine individuals considered separately.
The monkeys' performance as a group significantly correlated negatively with task difficulty (p < 0.05, r_s = 0.76, Spearman test) (Fig. 4). Trials in which the number of dots differed only by one (Δ1) yielded the lowest mean scores of correct responses and trials in which the number of dots differed by four (Δ4) yielded the highest mean scores. The same was true with all nine individuals considered separately.

Numerosity with various-sized dots
As a group, the animals reached the learning criterion in session 3 and displayed only occasional drops below the learning criterion in the following sessions (Fig. 5). All nine animals succeeded in this task, with the four fastest individuals reaching the learning criterion in session 2 and the slowest individual in session 13.
It is interesting to note that while the group's performance in the numerosity task with equally-sized dots gradually and significantly increased from around 60% to 80% across the sessions (Fig. 3), the group's performance in the numerosity task with various-sized dots already started between 70% and 80%, and generally remained in that range throughout the sessions (p = 0.58, r_s = −0.11, Spearman test) as illustrated by the trendline showing an almost horizontal and nonsignificant slope.
The monkeys' performance as a group significantly correlated negatively with task difficulty (p < 0.05, r_s = 0.67, Spearman test) (Fig. 6). Trials in which the number of dots differed only by one (Δ1) yielded the lowest mean scores of correct responses and trials in which the number of dots differed by four (Δ4) yielded the highest mean scores. The same was true with all nine individuals considered separately.

Fig. 1.
Mean performance of the spider monkeys (n = 8) in the quantity discrimination task. Each data point represents the mean (± SD) percentage of correct responses in one session of ten trials. The black dot represents the session in which the animals, as a group, reached the learning criterion of two consecutive sessions at or above 70% correct responses. The dotted gray line shows the trendline of the group's performance.

Interindividual variation in performance
Based on their ranks in each task, none of the nine animals performed

Fig. 2.
Mean performance of the spider monkeys (n = 8) in the quantity discrimination task according to the number of food items by which the various quantity combinations differed from each other. Each data point represents the mean (± SD) percentage of correct responses in one session of ten trials. Δ1 refers to the quantity combinations which differed by only one food item (1 vs 2, 2 vs 3, 3 vs 4, and 4 vs 5). Accordingly, Δ2, Δ3 and Δ4 refer to the quantity combinations which differed by two food items (1 vs 3, 2 vs 4, and 3 vs 5), three food items (1 vs 4 and 2 vs 5) and four food items (1 vs 5), respectively. The dotted gray line shows the trendline of the group's performance.

Fig. 3.
Mean performance of the spider monkeys (n = 9) in the numerosity task with equally-sized dots. Each data point represents the mean (± SD) percentage of correct responses in one session of ten trials. The black dot represents the session in which the animals, as a group, reached the learning criterion of two consecutive sessions at or above 70% correct responses. The dotted gray line shows the trendline of the group's performance.

Fig. 4.
Mean performance of the spider monkeys (n = 9) in the numerosity task with equally-sized dots according to the number of dots by which the various number combinations differed from each other. Each data point represents the mean (± SD) percentage of correct responses in one session of ten trials. Δ1 refers to the number combinations which differed by only one dot (1 vs 2, 2 vs 3, 3 vs 4, and 4 vs 5). Accordingly, Δ2, Δ3 and Δ4 refer to the number combinations which differed by two dots (1 vs 3, 2 vs 4, and 3 vs 5), three dots (1 vs 4 and 2 vs 5) and four dots (1 vs 5), respectively. The dotted gray line shows the trendline of the group's performance.

Discussion
The results of the present study show that the spider monkeys were clearly able to discriminate between relative quantities of food as well as to discriminate between discrete numbers of abstract non-food items, irrespective of whether these were of equal size or of different size. Further, the animals displayed a significant correlation between performance in terms of success rate and task difficulty in terms of numerical similarity of the stimuli.

Quantity discrimination
Our finding that the spider monkeys succeeded in a quantity discrimination task with food is in line with our first hypothesis and also with previous studies which showed that members from all major primate taxa succeeded in this type of cognitive task. Hominoid primates (chimpanzees, bonobos, gorillas, orangutans, Hanus and Call, 2007), catarrhine primates (rhesus macaques, Hauser et al., 2000; olive baboons, Barnard et al., 2013), platyrrhine primates (capuchins, Addessi et al., 2008; cotton top tamarins, common marmosets, Stevens et al., 2007), and strepsirrhine primates (various lemur species, Jones and Brannon, 2012) have all been reported to successfully discriminate between relative quantities of food, though at different levels of success and with different limits of task difficulty.
Also in line with previously studied primate species, and in line with our third hypothesis, the spider monkeys' accuracy in quantity discrimination decreased as the ratio between the numerical values in the two sets of stimuli approached 1 (Fig. 2). However, while this ratio effect has repeatedly been observed across taxa, from strepsirrhine (Jones and Brannon, 2012) to hominoid primates (Hanus and Call, 2007), the extent to which performance was affected by this effect varied between species. Lemurs were only able to successfully select the larger quantity of raisins or nuts with a 1:3 ratio, but not with a 1:2 or a 2:3 ratio (Jones and Brannon, 2012). Marmosets and tamarins successfully discriminated between quantities of food pellets differing by a 2:3 or greater ratio, but not a 3:4 or 4:5 ratio (Stevens et al., 2007). Similarly, rhesus macaques succeeded in discriminating between quantities of apple slices differing by a 3:4 or greater ratio but failed with the ratio 4:5 (Hauser et al., 2000). The spider monkeys of the current study, in contrast, still performed above chance level when presented with the most challenging task of discriminating between 4 vs 5 food items. Thus, they performed better than the previously mentioned species and at the same level as chimpanzees, bonobos, and orangutans which also succeeded with this ratio when presented with two sets of food pellets (Hanus and Call, 2007). Interestingly, the success rate of the spider monkeys in terms of percentage of correct decisions was also comparable to that of the great apes, even at those ratios which differed only by one food item (1:2, 2:3, 3:4, and 4:5).

Fig. 5.
Mean performance of the spider monkeys (n = 9) in the numerosity task with various-sized dots. Each data point represents the mean (± SD) percentage of correct responses in one session of ten trials. The black dot represents the session in which the animals, as a group, reached the learning criterion of two consecutive sessions at or above 70% correct responses. The dotted gray line shows the trendline of the group's performance.

Fig. 6.
Mean performance of the spider monkeys (n = 9) in the numerosity task with various-sized dots according to the number of dots by which the various number combinations differed from each other. Each data point represents the mean (± SD) percentage of correct responses in one session of ten trials. Δ1 refers to the number combinations which differed by only one dot (1 vs 2, 2 vs 3, 3 vs 4, and 4 vs 5). Accordingly, Δ2, Δ3 and Δ4 refer to the number combinations which differed by two dots (1 vs 3, 2 vs 4, and 3 vs 5), three dots (1 vs 4 and 2 vs 5) and four dots (1 vs 5), respectively. The dotted gray line shows the trendline of the group's performance.

The ability to correctly evaluate quantities allows for numerical judgements not by counting, but rather by mentally representing the approximate number of items in a set, in an analog format (Cantlon, 2012). From an evolutionary point of view, the ability to evaluate quantities of food, mates, competitors, or predators serves a critical survival function and thus it is not surprising that this cognitive skill is found, though to different degrees, in all primate species tested so far. The ratio effect mentioned above is consistent with the approximate number system (ANS), a hypothetical mechanism for numerical representation that is based on the notion that the ability to discriminate quantities improves as the difference between those quantities increases (Brannon and Merritt, 2011). In other words, performance is limited by the ratio between the quantities rather than their absolute values. An alternative hypothetical mechanism that has been proposed to affect performance in relative quantity discrimination is the object-file system which suggests that low numbers of items can be individually represented with high fidelity, and though this allows for precise discrimination of low values, only three or four so-called object files can be maintained simultaneously (Beran and Parrish, 2016). It should be mentioned that these two hypothetical mechanisms which try to explain both the occurrence of and between-species differences in quantity discrimination are not mutually exclusive but may complement each other.
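The distinction between the numerical difference and the ratio of two quantities can be made concrete with a short sketch. It simply enumerates the ten stimulus pairs used in this study, groups them by their difference (the Δ categories of the figures) and lists the ratio of the smaller to the larger value for each pair. The variable names are illustrative, and no performance data are implied.

```python
from fractions import Fraction
from itertools import combinations

# All pairs (smaller, larger) that can be formed from the values 1 to 5.
pairs = list(combinations(range(1, 6), 2))

# Group the pairs by numerical difference (delta = larger - smaller).
by_delta = {}
for small, large in pairs:
    by_delta.setdefault(large - small, []).append((small, large))

# Ratio of smaller to larger value; values closer to 1 mean more similar
# quantities and, under a ratio effect, a harder discrimination.
ratio = {pair: Fraction(pair[0], pair[1]) for pair in pairs}

for delta in sorted(by_delta):
    described = ", ".join(f"{s} vs {l} (ratio {ratio[(s, l)]})"
                          for s, l in by_delta[delta])
    print(f"delta {delta}: {described}")

hardest = max(pairs, key=lambda pair: ratio[pair])
easiest = min(pairs, key=lambda pair: ratio[pair])
print("ratio closest to 1:", hardest, "ratio furthest from 1:", easiest)
```

Note that pairs within the same Δ category can differ considerably in ratio: 1 vs 2 and 4 vs 5 both differ by one item, but their ratios are 1/2 and 4/5, respectively.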
From an ecological perspective, the notion of a ratio effect appears plausible as choosing a food patch with ten fruits instead of a food patch with two fruits leads to a clear nutritional benefit whereas being able to tell apart ten fruits from nine fruits will not result in a similar benefit since there is less of a relative difference between the two options (Beran et al., 2015). This example illustrates the relevance of quantity discrimination for the highly frugivorous spider monkeys and provides a plausible explanation for the high level of performance of the animals in the present study. Although discriminating between quantities of food is relevant to many primates, spider monkeys, chimpanzees, bonobos and orangutans share another trait in addition to their frugivory which is their fission-fusion organization (Campbell, 2008; Lehmann and Boesch, 2004; van Schaik, 1999). While quantity discrimination abilities may be directly beneficial in terms of comparing different food sources, they may also indirectly aid in adapting group size in order to maximize individual food intake during fission events, which are thought to be based on the scarcity and availability of food (Anderson, 2001). Our finding that the spider monkeys of the present study readily learned and succeeded in the quantity discrimination task at a level comparable to that found in other primate species which are both frugivorous and live in fission-fusion societies is thus in line with the notion that both of these traits may contribute to an enhanced level in this cognitive skill.

Numerosity with equally-sized dots and with various-sized dots
Our finding that the spider monkeys succeeded in discriminating between discrete numbers of abstract non-food items is in line with our second hypothesis. It is commonly agreed that numerosity, i.e. the ability to discriminate between discrete numbers of stimuli, presents a cognitively more challenging task compared to the ability to judge relative quantities due to the abstract nature of the stimuli that are commonly used, e.g. dots or other types of symbols instead of food items which have an intrinsic reward value for animals (Agrillo and Bisazza, 2014).
We decided to use two sets of stimuli in the numerosity tasks employed here (equally-sized dots and various-sized dots, respectively) as this allowed us to control for the possibility that the animals may use non-numerical cues such as surface area, size, shape, or color when discriminating between discrete numbers of stimuli (Brannon and Terrace, 2000). Capuchin monkeys (Judge et al., 2005) and chimpanzees (Tomonaga, 2008), for example, have been reported to rely on surface area rather than on the number of stimuli in their decision making when first exposed to abstract stimuli such as dots.
Due to the abstract nature of the stimuli that are commonly used in numerosity tasks (which do not have an intrinsic reward value), animals are usually trained to associate the stimuli (e.g. dotted cards) with a reward, and their capacity to learn a numerical rule is then interpreted as evidence of their numerical abilities (Agrillo and Bisazza, 2014). As this method inevitably involves a training process, it is not surprising that the spider monkeys of the present study needed a certain number of trials to reach the learning criterion. This is in line with previous studies in nonhuman primates that have also found that the animals' performance progressively improved across sessions in numerosity tasks. However, marked interspecific differences in learning speed for numerical rules have been reported. The spider monkeys of the present study reached the learning criterion of 70% correct responses within 16 sessions, i.e. 160 trials, in the task with equally-sized dots (Fig. 3), and within 3 sessions, i.e. 30 trials, in the task with various-sized dots (Fig. 5). Three species of lemurs were trained to select the numerically larger of two arrays of dots on a touch screen, and while two individuals never reached the learning criterion, the remaining nine animals successfully discriminated between three presented number pairs within 600 trials (Jones et al., 2014). Squirrel monkeys successfully discriminated dotted number cards, one number pair at a time, and needed between 300 and 350 trials to discriminate above chance level between the first stimulus pair (Thomas et al., 1980). Chimpanzees succeeded in discriminating between ten pairs of numbered dot sets on a touch screen and reached the learning criterion within 120 trials (Tomonaga, 2008) and an orangutan needed the same number of trials in a comparable task (Vonk, 2014).
It is interesting to note that the spider monkeys of the present study needed markedly fewer trials to learn the first of the two numerosity tasks employed here than the lemurs and squirrel monkeys, and only slightly more trials than the chimpanzees and orangutans. Here, too, our findings are in line with the notion that a high degree of frugivory and high levels of fission-fusion dynamics (Campbell, 2008; Lehmann and Boesch, 2004; van Schaik, 1999), socioecological traits that spider monkeys share with chimpanzees and orangutans, may have contributed to their superior learning speed in numerosity tasks.
Our finding that the spider monkeys reached the learning criterion in the second numerosity task (with various-sized dots, Fig. 5) significantly faster than in the first numerosity task (with equally-sized dots, Fig. 3) suggests a high degree of intramodal transfer ability to a novel stimulus set in this type of cognitive task and thus an order effect. This notion is also supported by our finding that the average performance in the first numerosity task increased from around 60% to 80% correct responses (Fig. 3) whereas the average performance in the second numerosity task already started between 70% and 80% (Fig. 5). Both findings suggest that the animals did indeed base their decision for one of two simultaneously presented dotted cards on their numerical properties rather than on the size of the dots. Finally, we found a ratio effect in both numerosity tasks (Figs. 4 and 6) which further refutes the possibility that the animals based their decisions on the size of the dots or the patterns of the cards rather than on their numerical properties.
From a socioecological perspective, being able to discriminate between discrete numbers can be beneficial and at least a few studies on free-ranging primates suggest that this cognitive skill is not only found in captive animals that have been trained in numerosity tasks. For instance, baboons use the number of individuals over size-based representations (i.e. total animal mass) to monitor social behavior during collective movements (Piantadosi and Cantlon, 2017). Vervet monkeys use the number of individuals when deciding to participate in communal range defense (Willems et al., 2015), and black howler monkeys use the number of intruder calls in order to assess inter-group fighting ability (Kitchen, 2006). Similarly, Verreaux's sifakas (Koch et al., 2016) and Javan gibbons (Yi et al., 2020) have been reported to use the number of actively participating individuals of the opponent group for deciding on participation in intergroup conflicts.
Although some differences between wild and captive animals have been reported in their levels of performance with certain cognitive tasks (Benson-Amram et al., 2013; Cauchoix et al., 2017), it is commonly agreed that cognitive abilities have evolved throughout a species' evolutionary history. In the case of primates, for example, their physical cognitive skills are thought to have evolved in response to the demands of their dietary specializations (Tomasello and Herrmann, 2010), while their social cognitive skills have been suggested to have evolved in response to the complexity of their social systems (Dunbar, 1998). Although we cannot exclude the possibility that the results of our study may have been affected by the captive setting and thus cannot be generalized to the species level, we are nevertheless confident that our findings may contribute to a better understanding of the mechanisms underlying cognitive abilities and the selective pressure(s) which may have promoted their evolution.
To summarize, the results of the present study support the hypothesis that ecological traits such as dietary specialization and/or social complexity traits such as fission-fusion dynamics may explain between-species differences in cognitive performance.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Ethical note
The experiments reported here comply with the American Society of Primatologists' Principles for the Ethical Treatment of Primates, and also with current Swedish and Mexican laws. The study was performed according to a protocol approved by the Ethical Board of the Federal Government of Mexico's Secretariat of Environmental and Natural Resources (official permits no. 09/GS-2132/05/10).