Abstract
The behavioural sciences are home to controversies that have survived for centuries, notably about the relation between observable behaviour and theoretical constructs addressing out-of-sight processes in the agents’ brains. There is no shared definition for cognition, but the very existence of a thriving journal called Animal Cognition proves that such controversies are still live and help to (a) promote research on the complexity of processes leading to action, and (b) nudge scholars to restrict their cognitive models to those that can be falsified experimentally. Here, we illustrate some of these issues in a limited arena, focusing on the construction and expression of subjective value and choice. Using mainly work from our own laboratory, we show that valuation of alternatives is sensitive to options’ properties, to subject’s state, and to background alternatives. These factors exert their influence at the time the subject learns about individual options, rather than at choice time. We also show that valuation can be experimentally dissociated from the cognitive representation of options’ metrics and argue that experimental animals process options independently at the time of choice, without elaborated comparisons along different dimensions. The findings we report are not consistent with the hypothesis that preference is constructed at the time of choice, a prevalent view in human decision-making research. We argue that animal cognition, viewed as a research program at the crossroads of different behavioural sciences rather than as a debate about properties of mental life, is inspiring and solid, and a progressive and progressing paradigm.
Background
The 25th anniversary of the journal Animal Cognition is a fitting opportunity to reflect on when and how the scientific study of cognition is both justified and pragmatically helpful in understanding animal behaviour. Here, we share some reflections, findings and theoretical ideas related to some of the work on decision-making conducted in our laboratory in the period since the birth of the journal.
Much, perhaps most, adaptive behaviour includes sensitivity to information that is only relevant within individual lifespans. Such information is acquired and processed deploying mechanisms evolved under natural selection across generations. For this reason, articulating research on learning and decision processes with the logic of evolutionary adaptation is at the core of animal cognition research. Through their lives, animals accumulate experience in which their own behaviour is associated with specific outcomes, and make decisions by choosing between feasible actions under the influence of such information. This much is uncontroversial and fits well with diverse aspects of laboratory-based associative learning research, but normative understanding and modelling of how information is acquired and deployed follow different rationales across disciplines. Ecology, evolutionary biology, neuroscience, economics and experimental psychology all have their own theoretical and empirical frameworks to design and test models of information acquisition and decision-making in living organisms, and the science of animal cognition has much to gain by placing itself at the crossroads of these approaches.
As a general framework, rather than modelling the generation of action as directly mapped to physical properties of potential targets, as is the norm in applications of optimality in behavioural ecology, we assume that stimuli identifying reward sources acquire subjective value through learning, and these (“remembered”) values determine action. This has a parsimony cost, because we include in our models cognitive entities such as cognitive representations and putative choice algorithms, that are not directly observable, but it provides a suitable framework to distinguish the learning circumstances from those of the expression of preferences.
The distinction between objective contingencies and their subjective impact can already be found in Daniel Bernoulli’s writing about human preferences (Bernoulli 1954, page 24; see also Stearns 2000). In 1738, he wrote that “The determination of the value of an item must not be based on its price, but rather on the utility it yields. […] utility […] is dependent on the particular circumstances of the person making the estimate […] A gain of a thousand ducats is more significant to a pauper than to a rich man, though both gain the same amount.” He was arguing that when trying to understand preferences, one cannot just use the absolute physical parameters of the available options, but must consider what those properties mean for the person (their utility). This introduces at once many topics that are still with us today in both human and other species research, including issues to which the field of animal cognition can contribute.
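Bernoulli's pauper and rich man can be made concrete with a short calculation. The sketch below assumes a logarithmic utility of wealth, the form Bernoulli himself proposed; the starting wealth figures and the function name are hypothetical and chosen only for illustration.

```python
import math

def utility_gain(wealth, gain):
    """Increase in utility when `gain` is added to `wealth`,
    assuming utility is the logarithm of wealth."""
    return math.log(wealth + gain) - math.log(wealth)

GAIN = 1000             # ducats gained by both agents
PAUPER_WEALTH = 100     # hypothetical starting wealth
RICH_WEALTH = 100_000   # hypothetical starting wealth

pauper_gain = utility_gain(PAUPER_WEALTH, GAIN)
rich_gain = utility_gain(RICH_WEALTH, GAIN)

# The same physical gain yields a much larger utility increment for the pauper.
print(f"pauper: {pauper_gain:.3f}  rich man: {rich_gain:.3f}")
```

Any concave utility function would give the same qualitative result; the logarithm is used here only because it is the simplest such case.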
On the concept of utility
Our first topic is the concept of utility itself, and how it is handled across fields. In microeconomic theory, utility is defined as the function that is maximised by an agent’s preferences. This definition does not attribute a priori significance to substantive variables such as money, food, or reproductive success, nor is it concerned with how an agent may perceive or represent its environment. In fact, “for the purpose of constructing a theory of consumer [rational] choice, not only the measurement of utility, but the very concept itself, is unnecessary. As we have seen, we can base a theory on the concepts of choice and indifference, and nothing more is needed for the theory than the set of indifference curves (or surfaces) with their assumed properties” (Gravelle and Rees 2004, p. 16. Emphasis in the original). This is enough to support a self-consistent theory of rational behaviour, but it does not promote biological understanding of the shape of indifference curves themselves, nor does it address the cognitive operations that generate behaviour.
The closest equivalent of utility in behavioural ecology is fitness, itself a non-trivial theoretical construct (for complexities of the fitness concept see Grafen 2009, 2014). For instance, in Optimal Foraging Theory, researchers make hypotheses about the environment, about the repertoire of available actions, and about the fitness consequences of each of these actions in that environment. Behaviour is then predicted by ranking the actions in the repertoire according to their fitness consequences in that environment (Kacelnik and Cuthill 1987; Stephens and Krebs 1986; Vasconcelos et al. 2017). When data do not fit the predictions, one or more of the hypotheses is revised. Notice that, in contrast with microeconomic models, the equivalent of utility (the maximand of behaviour) is not inferred from revealed preferences, but predicted through their hypothetical trans-generational (evolutionary) consequences.
The relevance for the present discussion is that, in common with microeconomics’ utility, but not with everyday intuition, this version of utility (i.e., of what behaviour maximises, including preference in choices) does not imply a cognitively represented goal; in this research program cognition is a late-coming guest. Information-processing algorithms capable of generating the predicted behaviour are sometimes modelled, but this is done neither by assuming that the subject has fitness as a goal nor by describing its preferences, but by working out rules (strategies) that would produce optimal consequences. For instance, Houston et al. (1982) and McNamara and Houston (1985) modelled learning processes capable of behaving quasi-optimally in idealised foraging situations, not by extrapolating from experimental results or by implementing results of previous research on animal learning, but by testing in silico which rules would converge to optimal behaviour. It is frequently argued that behavioural ecologists are concerned with functional, rather than proximate, accounts of behaviour, but testing the predictive power of functional models is very hard, primarily because of the stochastic complexity of natural environments and of the heredity and development of behaviour. In practice, deviations between theoretical predictions and empirical descriptions are often accommodated by post hoc arguments about assumed cognitive processes, i.e. proximate mechanisms.
It may have become clear so far that in the course of the two and a half decades since the foundation of Animal Cognition we have become inclined to give more weight to research aimed at unravelling the cognitive processes behind observed behaviour. This has led us to develop and test models in which cognitive representations and their interactions play a decisive role. Under this view, leaving aside cognitive mechanisms when modelling behaviour emulates conchology, i.e., the branch of malacology that studies molluscs after discarding their soft parts.
A further important point about Bernoulli’s quote is his reference to state-dependence. If identical physical rewards are not worth the same to a pauper as to a rich man, and paupers and rich men can swap places as a consequence of unstable life contingencies, then to understand choice we need to know the state of each agent. Further, if the consequences of choices are both state-dependent and learned by experience, we may ask whether the state that predicts behaviour is that at the time of learning or at the time the preference is expressed. If learning shapes preferences, then the former is to be expected. This is at odds with the influential view that preferences are constructed (and only exist) at the time of choice, when the agent judges relative, rather than absolute, properties of each alternative (Lichtenstein and Slovic 2006; see also Warren et al. 2011). Of course, by definition “preference” involves more than one alternative, but assignment of value can occur earlier, when options are experienced on their own, rather than in choices. Let us expand on this concept.
One framework is that agents remember the physical parameters of each option and, when more than one option is present, compare them along their dimensions to construct a ranking and make a choice. For instance, an agent may remember that two actions each result in a typical amount and kind of food and, if forced to choose, compare the predicted consequences of the two actions in terms of two dimensions, amount and palatability, to rank them. This framework is often invoked to argue that some violations of economic rationality occur because different dimensions are given different weights when constructing preference at the time of choice (for examples, see Bateson et al. 2003; Nachev et al. 2021; Sánchez-Amaro et al. 2019).
Alternatively, the agent can assign value to each option by combining its attributes whenever it experiences it, even if there is no choice involved. Then, if a choice presents itself, the options’ values can be ranked through fast and simple processes, as we discuss below. Under this hypothesis, the combination of attributes such as amount or palatability occurs at learning time, more likely on sequential encounters wherein only one alternative is available. Preference, even if not expressed because it is a choice-dependent concept, is latent (has been constructed) already before the agent has experienced any choice. In this case, to predicate that preference is constructed at choice time would be unhelpful. It has been said that “kicks in behinds” must have existed in the mind of God before He created kicks and behinds; more modestly, we argue that preferences can exist in latent form in deciding agents before they ever choose, so that when this happens, preferences are not constructed, but just expressed. As we shall illustrate, our studies, chiefly performed using starlings, lead us to believe that the latter is a more accurate account of how experimental animals act towards and choose between alternative opportunities, at least in laboratory contexts.
In summary, our stand is that hypotheses about cognitive processes, although not directly observable, are an essential component of behavioural research, and that a confluence between allied sciences dedicated to the understanding of decision processes is at the same time necessary, fun, and rich in consequences. A research program sensitive to these reflections would include assumptions of rationality or utility maximisation derived from economics, mechanistic discoveries of cognitive and behavioural experimental psychology, and formal analyses of the relation between experience and fitness.
In the next sections we revisit the problem of option valuation in the context of choice, briefly discussing (a) how and when state affects valuation; (b) how context affects valuation; (c) how valuation can be measured; and (d) how values interact at the time of choice.
State-dependent valuation learning
We have presented the hypothesis that preferences in choice depend on the state-dependent utility experienced when an animal becomes acquainted with a potential food supply, regardless of whether this happens through choices or in sequential encounters. We have tested this idea in a diversity of experiments that involved, so far, starlings (Aw et al. 2011; Kacelnik and Marsh 2002; Pompilio and Kacelnik 2005), pigeons (Vasconcelos and Urcuioli 2008), fish (Aw et al. 2009), and grasshoppers (Pompilio et al. 2006), with consistent results across these distant taxa. The protocols were of course adjusted to each species, but the general idea was the same: to first cause sequential (one at a time) encounters with potential food sources, manipulating the subject’s own state so that one option was met when the experienced benefit of the outcome was greater than in encounters with the alternative. State was manipulated either by changing the amount of work required to gain a reward (Aw et al. 2011; Kacelnik and Marsh 2002) or by varying the state of deprivation (Aw et al. 2009; Pompilio and Kacelnik 2005). Once training had occurred, the animals faced choices in either of the states of deprivation (Fig. 1a), or between sources typically associated with different effort (Fig. 1b).
At the time of measuring preference, in most cases, the subjects had not been rewarded for specific choice behaviour. Preference for any option was higher when subjects had been in a leaner state or paid greater work for that option during learning, consistent with Bernoulli’s original statement. In contrast, state at the time of choosing had no effect on the level of preference. We refer to these findings as State-Dependent Valuation Learning, or SDVL, and, in spite of its simplicity, believe that the result is likely to be very widespread and significant, as it argues strongly against the notion that preference is constructed at the time of choice. SDVL leads to paradoxical preferences in many experimental situations, but is likely to be adaptive in natural circumstances, when fitness benefits correlate with hunger or scarcity (hence effort). The enhanced value of costlier items is equivalent to the paradoxical “sunk cost” observations in humans (see for instance Kacelnik and Marsh 2002; Navarro and Fantino 2005). The preference for costlier items in animals has also been named “work ethics” (Clement et al. 2000), but it should be noticed that in these experiments animals do not choose to work harder, but rather to get for free the typical consequences of having invested greater effort, which seems the opposite of expressing a preference for hard work under the belief that effort carries an ethical merit or has a moral value.
The context dependence of value
In the previous section, we showed that the state of decision-takers at the time when they learn can be dissociated from their state at the time of expressing preferences, and that available evidence indicates that the former has greater impact. In this section, we focus on the learning environment, rather than the state of the agent. The experimental protocol in this case (Pompilio and Kacelnik 2010; see also Vasconcelos et al. 2013) was inspired by previous work in pigeons, especially by Belke (1992). As before, the learning context was experimentally dissociated from that in which preference was measured. There were four options, which in the basic protocol were identified as A(5 s), B(10 s), C(10 s) and D(20 s). The capital letters indicate an arbitrary stimulus, such as the colour of a pecking key, and the suffix is the delay for a food reward to be delivered, lapsing from the time of responding at the stimulus. During training, starlings spent time in either of two contexts, [A(5 s)–B(10 s)] or [C(10 s)–D(20 s)], with “context” being defined by the options that could be encountered. However, options at this stage were encountered sequentially, not in pairs, so that subjects did not choose between them. In the subsequent critical preference tests, animals did face pairwise choices. In Test 1, they chose between options with equal delays but different ranking (B(10 s) Vs. C(10 s)); in Test 2, the choice was between options with equal ranking but different delay to food (A(5 s) Vs. C(10 s)); and in Test 3, delay and ranking were counterposed (B(10 s) Vs. C(14 s)), as explained in Fig. 2. When delays were equated (Test 1), starlings preferred the better ranking option; when ranking was equated (Test 2), they preferred the shorter delay; and when both dimensions were counterposed (Test 3), they were indifferent. Sensitivity to both relative and absolute parameters can be explained by the reinforcement impact at the time of learning.
In Test 1, due to the context in which B(10 s) and C(10 s) had typically occurred, the stimulus identifying B(10 s) signalled “bad news” relative to its background, while the opposite was true for the stimulus identifying C(10 s). Thus, even though both stimuli led to the same physical consequences, the hedonic impact of these consequences and hence the attached valuation of the stimuli is likely to have been greater for C(10 s). In this test, the objective consequences of the two stimuli at the time of choice were the same, hence could not have caused the observed preference.
The impact of ranking at the time of learning does not imply, however, that their absolute parameters are not influential: Test 2, in which two options leading to delays of 5 s and 10 s were presented, both having previously been half the delays in their alternatives, shows that when ranking of the options at learning time is not different, their absolute values determine preference at choice time. Further, Test 3 pitched relative Vs absolute values, by offering a choice between an option that was objectively better but had been the worse of a pair at learning time and an alternative that was 40% longer but had been the better option in its learning context. In this test, ranking and absolute values neutralised each other and starlings did not show any preference. These results are consistent with the view that circumstances at the time of learning are influential in the construction of preference, and that circumstances at choice time could not be responsible for the observations (see Fig. 2).
Under no illusions: memory for temporal parameters is independent of context
From the point of view of cognitive processing, preference for an option over another which has equal absolute properties does not necessarily imply that subjects assign option-specific hedonic value, or utility, at the time of learning. It is theoretically possible that there is no valuation, but the subjective representation of the critical metrics of each option is influenced by its learning context. In the experimental examples used to illustrate state and context effects, subjects showed preference between options that were equal in absolute value. However, while options’ absolute values were equal, their subjective representations could have differed. For instance, two options with delays of 10 s could have been remembered as having shorter or longer delays, according to the subjects’ state or context. The contextual effects on preference have been successfully replicated in human subjects, and an interesting parallel has been made by Palminteri and Lebreton (2021) with the so-called “Ebbinghaus illusion”, whereby the apparent relative size of visual stimuli is influenced by context. The possibility of biased representations can be rejected for our starlings’ experiments, because the behaviour of the animals allows us to measure their temporal expectations, which have been shown not to be biased (Fig. 3).
The accuracy of the animal’s knowledge of options’ metrics was measured using the peak procedure (Balcı and Freestone 2020; Catania 1970; Monteiro and Machado 2009; Roberts 1981). Figure 3a makes the case compellingly by displaying the pecking rate of starlings while waiting for the outcome of trials in encounters with the four different options described in Fig. 2a. The critical observation is that pecking rate peaks at the time the rewards would normally be delivered. In particular, pecking in options B(10 s) and C(10 s) peaks at 10 s. Crudely, this can be described by saying that the bird “knows” when food is due in both cases, but still prefers the option that had signalled “good news” in its learning context (see Fig. 2a). This dissociation between interval representation and valuation supports the notion of an indirectly inferred hedonic component. Figure 3b shows similar findings from Monteiro et al. (2020) wherein, for alternatives associated with the same delay but different amounts of food, response rate peaked at the same time, even though in choices starlings preferred the more profitable alternatives (i.e., the ones leading to larger amounts).
The mechanism of choice: what happens at choice time?
The results summarised so far emphasise the intricate relation between behaviour, learning, cognition and normative models of decision-making. We have shown that, in addition to the mnemonic representation of metrics of food sources, animals store information about the hedonic impact experienced at learning time, when options become identifiable. We call this inferred impact “valuation”. This information, we claim, is highly influential when two options are met simultaneously and the animal must behave towards just one of them, namely when it expresses a preference. We argued against the view that preference is constructed at the time of choice by comparison of the remembered parameters of each option, for two main reasons. First, preference at choice time can be predicted by measures of behaviour taken when choices have not yet occurred (see below), and second, strong preference can exist between options whose metrics subjects accurately represent as equal, or even when the more delayed of two equally sized rewards is preferred due to its history (as shown in Figs. 2 and 3).
In this section, we shift our focus to the choosing stage, and explore what happens when an already informed subject (in the sense that it has already learned the properties and assigned value to each option in its environment) encounters two options simultaneously. In brief, we propose that it is possible to detect a measure of the value an animal assigns to an option independently of its preference in choices. This behavioural measure plays a similar role to that of the Willingness-To-Pay protocols in behavioural economics (Slovic 1995), because it offers a window into an agent’s valuation of options in the absence of a choice between alternatives. For the starlings, we use temporal hesitation to take options encountered sequentially, rather than in pairs or multiple sets. This temporal hesitation, latency, or response time, has two properties of significance for the present argument: (a) everything else being equal, latencies are shorter when the option’s objective or relative value is greater; and (b) for a given option, latencies show some distribution of durations between trials. Both properties make sense and have been corroborated repeatedly (e.g., Monteiro et al. 2020; Shapiro et al. 2008). Here, we discuss their consequences for modelling the mechanism of choice. As we shall see, these simple facts make testable predictions which are far from trivial.
The main idea here is that when two or more options are met simultaneously, the processes that generate response times in sequential encounters are deployed independently, namely without interfering with each other. No cognitive comparative evaluation takes place at choice time. This is a parsimonious starting point, and it is worth exploring how far it can take us. Under this assumption, which is the core of the Sequential Choice Model (SCM; Aw et al. 2012; Freidin et al. 2009; Kacelnik et al. 2011; Monteiro et al. 2020; Sasaki et al. 2018; Shapiro et al. 2008; Smith et al. 2018; Vasconcelos et al. 2010, 2013), if and when stimuli corresponding to options previously encountered sequentially are encountered simultaneously, each stimulus elicits a response time by sampling from its own distribution of response times in sequential encounters. The option that in that encounter yields a shorter sample receives the action, and is seen by the observer as being “chosen”. The alternative option does not generate an observable datum on that occasion. Because the distributions of response times have some spread, choice is a stochastic process in which the option whose associated latencies tend to be shorter is chosen in the majority of encounters. Of course, the less-preferred alternative occasionally wins the race, and then it is chosen and the observer records a response time. The response time observed when an option is chosen out of a pair or set of multiple options is then a biased sample from the distribution for the same option in sequential encounters, because samples from the left tail of the distribution of each option are more likely to result in a datum than unconstrained samples. The net result is that response times observed in choices should be shorter than those observed when each option is encountered sequentially (more on this below).
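The race just described can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the implementation used in the cited studies: latencies are assumed to be lognormal, and the medians, spread, number of trials and all function names are hypothetical.

```python
import math
import random

def sample_latency(rng, median, sigma=0.5):
    """Draw one latency (s) from an assumed lognormal distribution."""
    return rng.lognormvariate(math.log(median), sigma)

def mean(values):
    return sum(values) / len(values)

def simulate_scm(median_a, median_b, n_trials=20000, seed=1):
    rng = random.Random(seed)
    # Sequential (no-choice) encounters: every sample is observed.
    seq_a = [sample_latency(rng, median_a) for _ in range(n_trials)]
    seq_b = [sample_latency(rng, median_b) for _ in range(n_trials)]
    # Simultaneous encounters: each option samples independently from its
    # own distribution and only the shorter sample is expressed and recorded.
    chosen_a, chosen_b = [], []
    for _ in range(n_trials):
        t_a = sample_latency(rng, median_a)
        t_b = sample_latency(rng, median_b)
        if t_a < t_b:
            chosen_a.append(t_a)
        else:
            chosen_b.append(t_b)
    return {
        "p_choose_a": len(chosen_a) / n_trials,
        "seq_mean_a": mean(seq_a),
        "choice_mean_a": mean(chosen_a),
        "seq_mean_b": mean(seq_b),
        "choice_mean_b": mean(chosen_b),
    }

# Option A is assumed to have the shorter typical latency (higher value).
result = simulate_scm(median_a=1.0, median_b=2.0)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

In this sketch, option A is chosen on most trials simply because its samples tend to be shorter, and the mean latency recorded in choices is below the sequential mean for both options, with no comparison of option attributes at choice time.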
There are close precedents for this rationale in the psychological literature, where it has been argued that choice is more a methodological resource of researchers than a special process adapted for decision-making and present in the animals themselves. For instance, Herrnstein (1970) argued that even when there is only one measured response (the equivalent of our sequential trials), animals still show some allocation of responding between the response being measured and the environmental background. He argued that the richer the background, the lower the rate of responding to the response being measured. He further argued that when researchers orchestrate a choice by offering more than one possible response, nothing new happens, but the rates of responding compete by matching their relative values. Rate of responding is, of course, a concept that is reciprocally related to inter-response interval, and consequently to latency when one measures delay to emit a single response. Herrnstein explicitly supported the view that nothing else is necessary to understand choice behaviour when more than one option is present. In the cited paper, he writes: “It is hard to see choice as anything more than a way of interrelating one’s observations of behavior, and not a psychological process or a special kind of behavior in its own right.” This is consistent with our stand in this respect, except that we place emphasis on the fact that it is also possible to show that the subjects can remember the true properties of options even when their choice is contrary to the ranking of the parameters.
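Herrnstein's argument is commonly summarised by a hyperbolic relation between response rate and reinforcement rate. The sketch below assumes that form, B = k·r / (r + r_other + r_e), where r_e stands for background reinforcement; the parameter values and the function name are hypothetical and serve only to illustrate the two claims in the text.

```python
def response_rate(r, r_background, r_other=0.0, k=100.0):
    """Assumed hyperbolic form: rate of the measured response as a function of
    its own reinforcement rate `r`, that of a concurrent alternative `r_other`,
    and background reinforcement `r_background`. `k` is the asymptotic rate."""
    return k * r / (r + r_other + r_background)

# Single measured response: a richer background lowers the response rate.
lean_background = response_rate(r=60, r_background=10)
rich_background = response_rate(r=60, r_background=40)

# Two concurrent responses: the same equation applied to each one.
r1, r2, r_e = 60, 20, 10
b1 = response_rate(r1, r_e, r_other=r2)
b2 = response_rate(r2, r_e, r_other=r1)
share_1 = b1 / (b1 + b2)  # relative responding equals r1 / (r1 + r2)

print(lean_background, rich_background, share_1)
```

Under this form, matching in choice follows from the single-response equation without any additional choice-specific process.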
Other authors have developed race models for choice in independent but convergent ways. Logan et al. (2014) expanded models of Stop-Signal-Response-Time (SSRT), so as to relate the study of responding to single options to data on choice. SSRT models (Logan and Cowan 1984) deal with scenarios in which subjects respond to one “Go” signal and one “Stop” one. If the onset of both signals is simultaneous, and they are expressed in behaviour through a drift diffusion process, then the subject will respond or not in each trial according to which of the two processes reaches the response threshold earlier. In the extension by Logan et al. (2014) the concept is applied to choice scenarios, where each option acts as a stop signal for the competing option(s), similarly to the notion of cross-censorship between the distributions of latencies in the SCM.
The SCM originates from the optimal foraging and risk sensitivity tradition. Reboreda and Kacelnik (1991) noticed that response times in sequential encounters were a useful metric of preference, independent of, and complementary to, the proportion of responses in choice scenarios. They were testing the hypothesis that a widespread property of perception (Weber’s Law) may cause a positive relation between the variance in outcomes of a given option and its chance of being chosen in tests in which smaller outcomes were preferable (as in delays to food), and the opposite when larger outcomes were preferable (as in amount of food). They found that choice proportions were consistent with this hypothesis, but only weakly so: experimental starlings were consistently risk prone for delays, but either risk averse or indifferent for amounts (rather than being reliably risk averse). However, when they measured response times in sequential encounters, results were reliably consistent with the hypothesis: latencies to respond in sequential encounters were significantly shorter for greater variance in delays and significantly longer for greater variance in amounts (see Fig. 4a). On this basis, they argued that preference can be measured outside choice trials, and that response time in the absence of choice may be at least as sensitive a metric of relative valuation of options as proportion of choices.
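One way in which Weber's Law could produce this pattern can be sketched numerically. The example assumes that each experienced outcome is remembered as a normal distribution whose standard deviation is proportional to its mean, and that a variable option is a 50:50 mixture of two such memories. The Weber fraction, the outcome values and all function names are hypothetical, and this is an illustration of the hypothesis rather than a reconstruction of the original analysis.

```python
from statistics import NormalDist

WEBER_FRACTION = 0.3  # hypothetical coefficient of variation of memory

def memory(value):
    """Assumed memory for an outcome: normal, with SD proportional to the mean."""
    return NormalDist(mu=value, sigma=WEBER_FRACTION * value)

def p_first_smaller(x, y):
    """P(X < Y) for independent normal variables X and Y."""
    diff_sigma = (x.variance + y.variance) ** 0.5
    return NormalDist(x.mean - y.mean, diff_sigma).cdf(0.0)

def p_variable_recalled_smaller(low, high, fixed):
    """Probability that one sample recalled from a 50:50 variable option
    (low or high) is smaller than one recalled from the fixed option."""
    fixed_memory = memory(fixed)
    return 0.5 * p_first_smaller(memory(low), fixed_memory) + \
           0.5 * p_first_smaller(memory(high), fixed_memory)

# Variable option: 5 or 15 with equal probability; fixed option: 10 (same mean).
p_smaller = p_variable_recalled_smaller(low=5, high=15, fixed=10)
print(f"P(variable recalled as smaller) = {p_smaller:.3f}")
```

Because the smaller outcome is remembered more precisely than the larger one, the variable option is recalled as the smaller of the two more than half of the time. This favours the variable option when smaller is better (delays) and disfavours it when larger is better (amounts).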
As work accumulated, we have come to believe that latency in sequential encounters is in fact a more sensitive and informative index of preference than choice proportion. The Sequential Choice Model in its present form was suggested by Shapiro et al. (2008), who studied starlings’ preferences between food sources that varied across combinations of fixed amounts and delays. They tested the fit of several different models to the data, and found that (a) latencies to respond in sequential trials were shorter the higher the profitability of an option (Profitability = Amount / Delay); (b) latency for each option was longer the higher the profitability of alternative option(s) in the environment (i.e., a context effect); and (c) for each pair of options, the best predictor of choice proportion was the relation between their latencies in no-choice trials (Fig. 4b). They called their model the Sequential Choice Model or SCM, and argued that from a normative perspective, the mechanism underlying the SCM made sense, because sequential encounters are likely to be more prevalent in nature than simultaneous ones, and latency provides a common path to integrate different factors affecting the value of each alternative. As alluded to previously, the SCM makes the counterintuitive prediction that observed response times should be shorter in choices than when options are met singly, the opposite of what should be expected if elaborated cognitive work occurs at choice time. The reason why the SCM predicts that choice latencies should be shorter than sequential ones is cross-censorship between the latency distributions of the alternatives: since the model assumes that the latency distributions of each alternative in sequential encounters are sampled independently and compete for expression, shorter samples have a greater chance of winning the race and being represented in the distribution of choice latencies.
This censorship effect should be more pronounced for the less-preferred option of each pair, because samples from its right tail will almost never win the race and so will go largely unrecorded in choice tests. The interested reader can find a computational description, analytical implementation and numerical simulations of the SCM in the supporting information of Monteiro et al. (2020).
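The race described above can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not the implementation given in Monteiro et al. (2020): latencies are assumed to be lognormally distributed, and the function name `simulate_scm` and the parameters `mu_a`, `mu_b` and `sigma` are hypothetical. Option A is given the shorter typical latency and is therefore the preferred option.

```python
import random
import statistics


def simulate_scm(mu_a, mu_b, sigma=0.5, n_trials=20000, seed=0):
    """Simulate single-option and choice trials under a latency race.

    Latencies are drawn from lognormal distributions (an assumption of
    this sketch); mu_a and mu_b are the means of log-latency for each option.
    """
    rng = random.Random(seed)
    seq_a, seq_b = [], []        # latencies in single-option encounters
    choice_a, choice_b = [], []  # latencies in choice trials, by chosen option
    for _ in range(n_trials):
        # Single-option encounters: every sampled latency is expressed.
        seq_a.append(rng.lognormvariate(mu_a, sigma))
        seq_b.append(rng.lognormvariate(mu_b, sigma))
        # Choice trial: independent samples race; only the shorter one is
        # expressed, so the loser's sample is censored.
        lat_a = rng.lognormvariate(mu_a, sigma)
        lat_b = rng.lognormvariate(mu_b, sigma)
        if lat_a < lat_b:
            choice_a.append(lat_a)
        else:
            choice_b.append(lat_b)
    # Predict choice from single-option latencies alone by pairing them.
    predicted_p_a = sum(a < b for a, b in zip(seq_a, seq_b)) / n_trials
    return {
        "observed_p_a": len(choice_a) / n_trials,
        "predicted_p_a": predicted_p_a,
        "seq_mean_a": statistics.mean(seq_a),
        "seq_mean_b": statistics.mean(seq_b),
        "choice_mean_a": statistics.mean(choice_a),
        "choice_mean_b": statistics.mean(choice_b),
    }


result = simulate_scm(mu_a=0.0, mu_b=0.7)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative parameters, mean latencies in choice trials come out shorter than in single-option trials for both options, the shortening is larger for the less-preferred option B, and the choice proportion predicted from single-option latencies is close to the proportion observed in simulated choices.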
The argument that choice behaviour in simultaneous encounters should be predictable from performance in single option encounters has been supported by experimental data from several labs and subject species. Protocols include risky choice (Aw et al. 2012), context manipulations (Vasconcelos et al. 2013), sub-optimal choice preparations (Macías et al. 2021; Ojeda et al. 2018), select/reject protocols (Freidin et al. 2009), mid-session reversal protocols in pigeons (Smith et al. 2018) and environments composed of multiple options defined either by delay (Vasconcelos et al. 2010) or by profitability (i.e., amount/delay; Monteiro et al. 2020).
Testing the SCM prediction that latencies to respond should be shorter in choices presents practical difficulties. On one hand, when animals strongly value an option, its latency distribution in sequential encounters approaches minimal reaction time, and there is little or no room for a shortening effect. On the other hand, less-preferred options, whose longer mean latencies should make the difference between sequential and choice trials more detectable, are by definition seldom chosen, leading to small sample sizes. Despite these difficulties, every study comparing simultaneous with sequential decisions has documented either shortening or no change, and this is the opposite of the temporal cost predicted if cognitive evaluations occurred at the time of each choice. Shapiro et al. (2008), Monteiro et al. (2020, Fig. 1c), Ojeda et al. (2018), and Macías et al. (2021) all found reliable shortening of latencies, but Aw et al. (2012) and Vasconcelos et al. (2013) found no evidence for either shortening or lengthening. Macías et al. (2021) arranged a particularly apt procedure to test this aspect of the SCM, avoiding the limited sample size problem, and found clear evidence for shortening.
It is thus likely that in typical animal experiments preference is not constructed at choice time, but is the result of valuation of options by the subjects when they learn about the parameters of each alternative. In our view, in animal experiments, there is so far no evidence for the presence of cognitive processes evolved to generate optimal outcomes in simultaneous choices. However, lack of evidence is not the same as evidence for absence, and such evidence may emerge in novel protocols. In humans, in particular, one may intuitively expect that the processes addressed here may be prevalent in situations within the realm of so-called “system 1” (Kahneman 2011), rather than in the more elaborate and slow processes assumed to be addressed by “system 2”. In experimental studies of preference with human subjects, information is frequently provided by description rather than experience (for discussions of the significance of this distinction see Hertwig and Erev 2009; Kahneman and Tversky 2000; Ludvig and Spetch 2011). When options are verbally described, it is to be expected that subjects may need to reason about their properties in order to rank them, leading to a detectable lengthening of decision time relative to single option encounters. Home buyers may indeed compare potential homes’ merits separately in terms of location and quality, and may take longer when more homes or more dimensions are compared. We are not aware of studies directly addressing the dynamics of differences in latency to act in sequential versus choice encounters in humans, but such studies would clearly serve to establish closer bridges between the cognitive processes of choice of humans and other species.
Concluding remarks
In summary, our stand is that hypotheses about cognitive processes, although not directly observable, are an essential component of behavioural research. The field of animal cognition is building a scientific program that answers the epistemological caveats that have historically been highlighted by justified critiques of mentalism. Perhaps the ideals of the cognitive revolution, which sought to replace both radical behaviourism and mentalism through a scientific approach to cognitive processes, can be realised, after all, in the field of animal cognition. As evidenced in many publications in the Animal Cognition journal, whose first quarter of a century we are celebrating, this broadly defined research program is suitable to integrate contributions from a diversity of allied sciences, such as importing operational definitions of rationality derived from economics, mechanistic discoveries of cognitive psychology, rigorous empirical procedures used by behaviourists, and formal models of the relation between experience and evolutionary fitness as developed in behavioural ecology.
Data availability
This article contains no original data. All figure sources are included in the text.
References
Aw J, Holbrook RI, Burt de Perera T, Kacelnik A (2009) State-dependent valuation learning in fish: banded tetras prefer stimuli associated with greater past deprivation. Behav Proc 81(2):333–336
Aw J, Vasconcelos M, Kacelnik A (2011) How costs affect preferences: experiments on state dependence, hedonic state and within-trial contrast in starlings. Anim Behav 81(6):1117–1128
Aw J, Monteiro T, Vasconcelos M, Kacelnik A (2012) Cognitive mechanisms of risky choice: is there an evaluation cost? Behav Proc 89(2):95–103
Balcı F, Freestone D (2020) The peak interval procedure in rodents: a tool for studying the neurobiological basis of interval timing and its alterations in models of human disease. Bio-Protoc 10(17):e3735
Bateson M, Healy SD, Hurly TA (2003) Context-dependent foraging decisions in rufous hummingbirds. Proc Biol Sci R Soc 270(1521):1271–1276
Belke TW (1992) Stimulus preference and the transitivity of preference. Anim Learn Behav 20(4):401–406
Bernoulli D (1954) Exposition of a new theory on the measurement of risk (translation of Bernoulli D 1738 specimen theoriae novae de mensura sortis; Papers Imp. Acad. Sci. St. Petersburg 5 175–192). Econometrica 22(1):23
Catania AC (1970) Reinforcement schedules and psychophysical judgment: a study of some temporal properties of behavior. In: Schoenfeld WN (ed) The theory of reinforcement schedules. Appleton-Century-Crofts, New York, pp 1–42
Clement TS, Feltus JR, Kaiser DH, Zentall TR (2000) “work ethic” in pigeons: reward value is directly related to the effort or time required to obtain the reward. Psychon Bull Rev 7(1):100–106
Freidin E, Aw J, Kacelnik A (2009) Sequential and simultaneous choices: testing the diet selection and sequential choice models. Behav Proc 80(3):218–223
Grafen A (2009) Formalizing Darwinism and inclusive fitness theory. Phil Trans R Soc 364(1533):3135–3141
Grafen A (2014) The formal darwinism project in outline. Biol Philos 29(2):155–174
Gravelle H, Rees R (2004) Microeconomics. Pearson Education, London
Herrnstein RJ (1970) On the law of effect. J Exp Anal Behav 13(2):243–266
Hertwig R, Erev I (2009) The description-experience gap in risky choice. Trends Cogn Sci 13(12):517–523
Houston AI, Kacelnik A, McNamara J (1982) Some learning rules for acquiring information. In: McFarland D (ed) Functional ontogeny. Pitman Advanced Publishing Program, Boston, pp 140–191
Kacelnik A, Cuthill IC (1987) Starlings and optimal foraging theory: modelling in a fractal world. Foraging behavior. Springer, Boston, pp 303–333
Kacelnik A, Marsh B (2002) Cost can increase preference in starlings. Anim Behav 63(2):245–250
Kacelnik A, Vasconcelos M, Monteiro T, Aw J (2011) Darwin’s “tug-of-war” vs. starlings’ “horse-racing”: how adaptations for sequential encounters drive simultaneous choice. Behav Ecol Sociobiol 65(3):547–558
Kahneman D (2011) Thinking, fast and slow. Penguin, London
Kahneman D, Tversky A (2000) Choices, values, and frames. In: MacLean LC, Ziemba WT (eds) Handbook of the fundamentals of financial decision making: Part I. World Scientific Publishing Co, Singapore, pp 269–278
Lichtenstein S, Slovic P (eds) (2006) The construction of preference: an overview. Cambridge University Press, Cambridge, pp 1–40
Logan GD, Cowan WB (1984) On the ability to inhibit thought and action: a theory of an act of control. Psychol Rev 91(3):295–327
Logan GD, Van Zandt T, Verbruggen F, Wagenmakers E-J (2014) On the ability to inhibit thought and action: general and special theories of an act of control. Psychol Rev 121(1):66–95
Ludvig EA, Spetch ML (2011) Of black swans and tossed coins: is the description-experience gap in risky choice limited to rare events? PLoS ONE 6(6):e20262
Macías A, González VV, Machado A, Vasconcelos M (2021) The functional equivalence of two variants of the suboptimal choice task: choice proportion and response latency as measures of value. Anim Cogn 24(1):85–98
McNamara JM, Houston AI (1985) Optimal foraging and learning. J Theor Biol 117(2):231–249
Monteiro T, Machado A (2009) Oscillations following periodic reinforcement. Behav Proc 81(2):170–188
Monteiro T, Vasconcelos M, Kacelnik A (2020) Choosing fast and simply: construction of preferences by starlings through parallel option valuation. PLoS Biol 18(8):e3000841
Nachev V, Rivalan M, Winter Y (2021) Two-dimensional reward evaluation in mice. Anim Cogn 24(5):981–998
Navarro AD, Fantino E (2005) The sunk cost effect in pigeons and humans. J Exp Anal Behav 83(1):1–13
Ojeda A, Murphy RA, Kacelnik A (2018) Paradoxical choice in rats: subjective valuation and mechanism of choice. Behav Proc 152:73–80
Palminteri S, Lebreton M (2021) Context-dependent outcome encoding in human reinforcement learning. Curr Opin Behav Sci 41:144–151
Pompilio L, Kacelnik A (2005) State-dependent learning and suboptimal choice: when starlings prefer long over short delays to food. Anim Behav 70(3):571–578
Pompilio L, Kacelnik A (2010) Context-dependent utility overrides absolute memory as a determinant of choice. Proc Natl Acad Sci USA 107(1):508–512
Pompilio L, Kacelnik A, Behmer ST (2006) State-dependent learned valuation drives choice in an invertebrate. Science 311(5767):1613–1615
Reboreda JC, Kacelnik A (1991) Risk sensitivity in starlings: variability in food amount and food delay. Behav Ecol 2(4):301–308
Roberts S (1981) Isolation of an internal clock. J Exp Psychol Anim Behav Process 7(3):242–268
Sánchez-Amaro A, Altınok N, Heintz C, Call J (2019) Disentangling great apes’ decoy-effect bias in a food choice task. Animal Behav Cognit. https://doi.org/10.26451/abc.06.03.05.2019
Sasaki T, Pratt SC, Kacelnik A (2018) Parallel vs. comparative evaluation of alternative options by colonies and individuals of the ant Temnothorax rugatulus. Sci Rep 8(1):12730
Shapiro MS, Siller S, Kacelnik A (2008) Simultaneous and sequential choice as a function of reward delay and magnitude: normative, descriptive and process-based models tested in the European starling (Sturnus vulgaris). J Exp Psychol 34(1):75–93
Slovic P (1995) The construction of preference. Am Psychol 50(5):364–371
Smith AP, Zentall TR, Kacelnik A (2018) Midsession reversal task with pigeons: Parallel processing of alternatives explains choices. J Exp Psychol 44(3):272–279
Stearns SC (2000) Daniel Bernoulli (1738): evolution and economics under risk. J Biosci 25(3):221–228
Stephens DW, Krebs JR (1986) Foraging theory. Princeton University Press, Princeton
Vasconcelos M, Urcuioli PJ (2008) Deprivation level and choice in pigeons: a test of within-trial contrast. Learn Behav 36(1):12–18
Vasconcelos M, Monteiro T, Aw J, Kacelnik A (2010) Choice in multi-alternative environments: a trial-by-trial implementation of the sequential choice model. Behav Proc 84(1):435–439
Vasconcelos M, Monteiro T, Kacelnik A (2013) Context-dependent preferences in starlings: linking ecology, foraging and choice. PLoS ONE 8(5):e64934
Vasconcelos M, Fortes I, Kacelnik A (2017) On the structure and role of optimality models in the study of behavior. APA handbook of comparative psychology: perception, learning, and cognition. American Psychological Association, Washington D.C., pp 287–307
Warren C, McGraw AP, Van Boven L (2011) Values and preferences: defining preference construction. Wiley Interdiscip Rev Cognit Sci 2(2):193–205
Acknowledgements
We thank former members of the Behavioural Ecology starling laboratory at Oxford Department of Zoology (currently Department of Biology), especially Lorena Pompilio for retrieving her original data for replotting. Stephen Lea and Joah Madden made incisive and helpful comments on a previous version. They corrected errors and pointed out important omissions, and for this we are very grateful. Needless to say, responsibility for remaining flaws is entirely ours. Debbie Kelly was graciously tolerant of our delayed submission.
Funding
AK is grateful for the support of the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy—EXC 2002/1 “Science of Intelligence”—project number 390523135. MV is grateful for the support from the Portuguese Foundation for Science and Technology (UIDB/04810/2020).
Author information
Contributions
AK, MV and TM jointly wrote, edited and reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Ethical approval
This paper does not contain original research to which ethical considerations would apply.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kacelnik, A., Vasconcelos, M. & Monteiro, T. Testing cognitive models of decision-making: selected studies with starlings. Anim Cogn 26, 117–127 (2023). https://doi.org/10.1007/s10071-022-01723-4
DOI: https://doi.org/10.1007/s10071-022-01723-4