The evolution of decision rules in complex environments

Models and experiments on adaptive decision-making typically consider highly simplified environments that bear little resemblance to the complex, heterogeneous world in which animals (including humans) have evolved. These studies reveal an array of so-called cognitive biases and puzzling features of behaviour that seem irrational in the specific situation presented to the decision-maker. Here we review an emerging body of work that highlights spatiotemporal heterogeneity and autocorrelation as key properties of most real-world environments that may help us understand why these biases evolved. Ecologically rational decision rules adapted to such environments can lead to apparently maladaptive behaviour in artificial experimental settings. We encourage researchers to consider environments with greater complexity to understand better how evolution has shaped our cognitive systems.

Abandon the urge to simplify everything, to look for formulas and easy answers, and begin to think multidimensionally . . . appreciate the fact that life is complex (M. Scott Peck [1])

The origins of irrational behaviour
Patterns of decision-making in humans reveal some striking deviations from economically rational expectations [2][3][4]. These include distorted beliefs about external events [5,6], inconsistent preferences that are altered by past experience [7] and current context [8], and apparent violations of the axioms of rational choice theory [9,10]. Such deviations may be caused by cognitive biases [11] (see Glossary); here we focus on the behavioural outcomes (outcome biases [12]) because we make no assumptions about the underlying psychological or physiological mechanisms. Mounting evidence suggests that analogous biases exist in other organisms. For example, slime moulds violate regularity [13], domestic dogs show negative contrast effects [14], and honeybees behave pessimistically when agitated [15]. Far from being uniquely human quirks, our biases appear to have deep evolutionary roots. This observation seems difficult to reconcile with the fundamental biological concept of natural selection as an optimising process. Why would evolution produce such apparently irrational behaviour?
One possible answer is that in many situations the costs of deviating from the optimal, fitness-maximising decision are negligible, and/or that constraints in the mechanisms underlying decision-making prevent natural selection from reaching this optimum. Studies on noisy information processing [16] and polygenic mutation-selection balance [17] have argued for the importance of constraints. Here we summarise an emerging line of research that suggests an alternative explanation: that many surprising features of behaviour, which may at first appear irrational, can in fact be understood as the result of ecologically rational decision rules adapted to exploit environments that vary in space and time. The approach we describe is an extension of standard techniques [18] used in behavioural and evolutionary ecology to investigate the adaptive significance of animal behaviour. This approach does not assume that all behaviour is adaptive or that constraints are unimportant, but instead seeks to identify how natural selection shapes the decision rules underlying behaviour [19,20]. The implications of this work for understanding cognitive systems have been largely overlooked, because theoretical models and laboratory experiments alike have traditionally focused on highly simplified situations that fail to capture some of the important complexities of the environments in which organisms have evolved.

Glossary
Autocorrelation: an association across space or time in the state of the environment. Positive autocorrelation (which is our focus here) implies that environmental conditions tend to be more similar between locations and times that are close together, rather than far apart.
Cognitive bias: a consistent deviation from an accurate perception or judgement of the world. Note that this is a psychological phenomenon that may or may not lead to irrational behaviour.
Contrast effect: a change in the perceptual, physiological, or behavioural response to a given stimulus caused by simultaneous or recent exposure to other stimuli in the same dimension. Here we consider successive contrast effects, in which the response to current conditions is enhanced by previous exposure to worse conditions (a positive contrast effect) or diminished by previous exposure to better conditions (a negative contrast effect). For example, honeybees trained to expect a 50% sucrose solution are more likely to abandon that reward source when it only delivers a 20% solution, compared to honeybees always rewarded with a 20% solution [89].
Decision rule: a description (without specifying the underlying neural mechanisms) of the relationship between an internal or external stimulus and the choices an individual will make.
Ecological rationality: the fit between a particular decision rule and the statistical structure of the environment in which it evolved.
Environmental heterogeneity: variability in (external) environmental conditions over space (spatial heterogeneity) and/or time (temporal heterogeneity).
Independence of irrelevant alternatives (IIA): a principle of rational choice stating that if an individual prefers an option A when given the choice between A and B, then it will also prefer A when given the choice between A, B, and a less attractive (i.e., irrelevant) option C.
Irrational behaviour: acting in a way that is not optimal. In the context of evolutionary theory, rationality, sometimes called biological rationality (B-rationality) to distinguish it from economic rationality (E-rationality) [90][91][92], does not imply conscious consideration of different options, but merely behaving in a way that maximises expected benefit.
Outcome bias: a pattern of decision-making that apparently deviates from the predictions of rational choice theory. Note that this definition makes no assumptions about underlying cognitive processes.
Path-independence: a principle of rational choice stating that the decisions of an individual should only depend on its knowledge about the current state of the world (including itself), not on past states.
Rational choice theory: an economic theory giving an axiomatic definition of (economically) rational behaviour.
Regularity: a principle of rational choice stating that the frequency with which an individual chooses option A, when given a choice between A, B, and C, cannot be higher than the frequency of choosing A when given a choice between only A and B.
Transitivity: a principle of rational choice stating that if an individual prefers option A in a choice between A and B, and option B in a choice between B and C, then it must prefer A in a choice between C and A.
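The regularity and transitivity principles defined in the Glossary can be written as simple checks on choice data. The following Python sketch is illustrative only: the function names and the data layout (choice frequencies as floats, strict preferences as ordered pairs, every pair of options tested) are assumptions made here and are not taken from any of the cited studies.

```python
from itertools import permutations


def violates_regularity(p_a_binary, p_a_ternary):
    """Return True if A is chosen more often from {A, B, C} than from {A, B}."""
    return p_a_ternary > p_a_binary


def is_transitive(prefers):
    """Check a set of strict pairwise preferences for transitivity.

    `prefers` is a set of pairs (x, y) meaning "x is preferred to y".
    Assumes every pair of options has been tested in one direction.
    """
    options = {x for pair in prefers for x in pair}
    for a, b, c in permutations(options, 3):
        # A over B and B over C must imply A over C.
        if (a, b) in prefers and (b, c) in prefers and (a, c) not in prefers:
            return False
    return True


# Hypothetical data: a preference cycle A > B > C > A is intransitive.
cycle = {("A", "B"), ("B", "C"), ("C", "A")}
ordered = {("A", "B"), ("B", "C"), ("A", "C")}
print(is_transitive(cycle), is_transitive(ordered))
```

Checks of this kind describe one-off choices; they say nothing about whether a violation is maladaptive in a sequence of choices.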

The limitations of simple models
Simple mathematical models are of great value in behavioural and evolutionary ecology, where the techniques of game theory and optimisation are used to predict the endpoints of natural selection [21]. This approach has revealed some important general principles of how organisms (including humans) should choose between different options, from food items to potential mates to the age at first reproduction. Most evolutionary models of decision-making consider a highly simplified environment in which the availability of different options is known to the organism and does not change over time. This is of course an unrealistic assumption. In most natural environments, the availability of different options fluctuates in time and space, and the fluctuations are often unpredictable.
That mathematical models simplify and abstract the phenomena they aim to represent is not in itself a problem; indeed, this is precisely what models are designed to do, because a model that was as complex as the real world would be of little use. But there is a danger of oversimplification [22] ('Einstein's razor' [23]): if we simplify things too much, we may fail to capture crucial features of natural environments that are needed to understand the behaviour.

The power of simple experiments
Similarly, laboratory experiments place individuals in artificial situations that are far simpler than most situations encountered in the natural world. In many of the standard laboratory protocols routinely used in behavioural ecology and experimental psychology, subjects are trained and tested using a small number of behavioural options, with straightforward relationships between the available stimuli, the actions of the subject and the resulting consequences [24][25][26][27]. In these artificial situations the experimenter has created a deliberately simplified version of the types of problems the animal might encounter in its natural environment; the aim is to isolate the key variables needed to understand the behaviour. As with the simplified models discussed earlier, there is a risk that such laboratory settings may not reflect the statistical structure of the environment to which the animal is adapted, making it seem as though the animal is making errors [4]. However, if we recognise this problem, then deviations from rational behaviour in simplified laboratory set-ups can be illuminating because they may reveal unexpected biases that arise from rules adapted to the natural environment.

Irrational behaviour from ecologically rational rules
Natural selection will tend to produce decision rules which, although not optimal, perform well in the types of situations the individual normally encounters [19,20,28,29]; that is, they should be ecologically rational [30]. The statistical properties of environments, including the distribution of resources and how that changes over time, favour particular decision rules. For example, noisy miners (a type of bird) change their foraging strategy depending on the resource they are exploiting: they use movement-based rules when searching for invertebrates, which are cryptic and highly mobile, but switch to using spatial memory when searching for nectar, which is found only in fixed, conspicuous locations (flowers) and is quickly depleted [31]. The ecological and evolutionary context is crucial; animals follow decision rules that are adapted to the statistical properties of the resource types commonly encountered during their evolutionary history. In novel experimental contexts lacking this structure, such ecologically rational rules may lead to biased or irrational behaviour.
When seeking to understand how natural selection has shaped decision rules, it can be instructive to use a form of reverse engineering. This process starts with the identification of some bias that is not accounted for by current theory. The next step is to consider which particular aspects of environmental complexity need to be included in the models to predict that bias. The aim is to identify the minimal amount of real-world complexity that is sufficient to account for the observed behaviour, thereby forming a basis for novel predictions that can be used to test the proposed explanation. Models developed in the past few years illustrate the power of this approach and highlight spatiotemporal heterogeneity and autocorrelation as two important factors affecting the psychology of humans and other animals (Figure 1). Incorporating these factors into standard models can explain several biases, listed in Table 1, that appear irrational in more simplified environments.

Spatiotemporal heterogeneity
Conditions in most natural environments are not uniform but vary over time and space. For highly mobile organisms, these two forms of heterogeneity will typically be closely linked; an individual moving through a spatially heterogeneous environment will encounter temporal heterogeneity too. Spatiotemporal heterogeneity has important consequences for behaviour because in a heterogeneous world the optimal response of an individual to current conditions depends on the conditions it expects to encounter in the (near) future [32][33][34][35]. The most basic form of heterogeneity we can consider is where the conditions at any one time or place are independent of those at any other time or place (Box 1). This is only a crude representation of the heterogeneity in most natural environments (see next section), but it can already account for some interesting biases:

The placebo effect
It is a widely reported (though controversial [36,37]) finding that fake treatments such as sugar pills or sham surgery, known as placebos, can lead to improvement in a patient's health [38]. Although health improvement is of course beneficial to the patient, if they are capable of recovering without help it would seem rational to do so immediately, rather than waiting for an external, inert cue. In an environment where conditions change over time, however, a delayed response may be adaptive. If an individual falls sick when conditions are harsh, it may be worth waiting until the environment is perceived to be less challenging, when it will be less costly to mount an immune response. Recent theory [39] has shown that the optimal strategy for recovery depends on the beliefs of the patient about current and future conditions, which affect the relative benefits of investing in recovery now rather than later. From this viewpoint, placebos falsely alter the expectations of the patient regarding the costs and benefits of putting effort into recovery, in some cases triggering an immediate response (i.e., a placebo effect). The placebo effect itself is not adaptive, but a generalised response to external cues may be favoured by natural selection if, on average, those cues reliably indicate a change in environmental conditions.

Pessimism
Natural selection should, in general, produce behaviour that is appropriate for the environmental conditions, giving the impression that individuals 'know' what those conditions are even if they cannot perceive them directly. Sometimes, however, humans and other animals consistently behave in a way that does not maximise their short-term gains, but would maximise their short-term gains if conditions were better than they actually are (an 'optimistic' bias) [40,41] or worse than they actually are (a 'pessimistic' bias) [42][43][44]. Recent theoretical work [45] shows that temporal heterogeneity across generations can select for pessimism: behaviour should be biased towards the response that yields the best results in poor conditions, because it is poor conditions that have the strongest influence on long-term fitness across multiple generations. Other factors, including autocorrelation (see below), may alter the tendency towards optimism or pessimism (Box 2).

[Figure 1: Venn diagram. Labels: Heterogeneity; State-dependence; Pessimism across generations [45]; Area-restricted search [48]; Hot-hand fallacy [53]; Risk allocation [33]; Placebo effect [39]; State-dependent valuation [71]; Optimism about survival [75]; Successive contrast effects [73]; Optimism across generations [45]; Emotions and moods?; State-dependent life histories [18]; environmental states A and B.]
Figure 1. Incorporating spatiotemporal heterogeneity and autocorrelation into standard evolutionary models can account for several cognitive biases and puzzling features of behaviour. The Venn diagram indicates which combination of factors can produce particular outcomes; the phenomena discussed in this paper are shown in bold type. In a heterogeneous world the environmental conditions change over time or space (e.g., between states A and B), with positive autocorrelation implying that conditions are more likely to stay the same (thicker arrows) than change (see also Box 1). Some of the adaptive explanations we discuss are extensions of standard state-dependent models of behaviour [18] (shown in plain font). Some are based on uncertainty about current conditions and/or the pattern of environmental change [93]. Possible directions for future work are shown in italics.

Table 1
| Phenomenon | Description | Why it appears irrational |
| Placebo effect [5] | Medicinally inert substances or fake treatment procedures enhance recovery | Individual who is capable of recovery without external help should do so immediately |
| Optimism [40] and pessimism [42] | Individual behaves as though conditions are better (optimism) or worse (pessimism) than they actually are | Rational decision-maker should base behaviour on an unbiased (Bayesian) estimate of current conditions |
| The 'hot-hand' fallacy [6] | Misinterpretation of a statistically independent sequence of successes as a run of good form | In a sequence of trials known to be independent (e.g., roulette), estimated chance of success should not be influenced by outcome of previous trial |
| Intransitive choice [63] | Individual prefers option A over option B, and option B over option C, but prefers C over A | Inconsistent with absolute valuation of options, which would imply that if A > B, and B > C, then A > B > C |
| Violation of regularity [61] | Preference for one option over another is reversed by presence of a third option | Inconsistent with absolute valuation of options, which would imply that ranking of two options is unaffected by alternative options |
| State-dependent valuation learning [69] | Individual prefers options they previously found to be rewarding when in a state of need | Rational decision-maker should choose whichever option gives greatest benefit, irrespective of past states |
| Successive contrast effects [72] | Response to current conditions depends on whether conditions in the past were better or worse | |
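The argument that poor conditions can dominate long-term fitness across generations may be easier to see with a small numerical sketch. The Python example below is not the model of [45]; it assumes a single lineage whose numbers multiply from one generation to the next, so that long-term growth depends on the mean of log fitness rather than on mean fitness. The payoff functions `w_good` and `w_poor`, the probability `P_GOOD`, and the behavioural variable `x` are all hypothetical.

```python
import math

P_GOOD = 0.5                                # assumed chance of a good generation
GRID = [i / 1000 for i in range(1001)]      # candidate behaviours x in [0, 1]


def w_good(x):
    """Hypothetical fitness in a good generation; rises with x."""
    return 1.0 + x


def w_poor(x):
    """Hypothetical fitness in a poor generation; falls steeply with x."""
    return 1.0 - 0.9 * x


def arithmetic_mean(x):
    """Expected fitness within a single generation."""
    return P_GOOD * w_good(x) + (1 - P_GOOD) * w_poor(x)


def log_growth(x):
    """Long-run growth rate of a lineage under multiplicative growth."""
    return P_GOOD * math.log(w_good(x)) + (1 - P_GOOD) * math.log(w_poor(x))


x_arith = max(GRID, key=arithmetic_mean)    # best for one-generation average
x_geo = max(GRID, key=log_growth)           # best across many generations
print(x_arith, x_geo)
```

Under these assumptions the behaviour that maximises long-run growth sits much closer to the best response for poor conditions (x = 0) than the behaviour that maximises the one-generation average, which is the sense in which it looks pessimistic.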

Spatiotemporal autocorrelation
Environments that are spatiotemporally heterogeneous may also show positive autocorrelation, in that the conditions at a given place and time tend to be similar to those at nearby locations and in the recent past (Box 1). One well-known adaptation to spatial autocorrelation is area-restricted search [46], in which successful discovery of an item prompts intensive local searching [47], thereby promoting efficient exploitation of clumped resources [48]. The impact of temporal autocorrelation is less well appreciated, but may be even more important for understanding cognitive adaptations. In environments that change over time the strength of temporal autocorrelation, and hence the time for which current and future conditions persist, has important consequences for adaptive behaviour [49] and learning [50], and this is reflected in our cognitive systems. When there is temporal autocorrelation, current conditions not only determine the consequences of current decisions but are also informative of future conditions. This important insight can account for several well-known biases:

The 'hot-hand' fallacy
In gambling and sports, there is a widespread but often mistaken belief that players have 'streaks' or 'runs' of success. Basketball players, for example, are perceived to be more likely to shoot successfully if their previous shot hit rather than missed, whereas real data show that the chances of scoring are statistically independent from one shot to the next [51]. This so-called 'hot-hand' belief reveals our tendency to see patterns even when none exists [52]. It has been argued that this tendency represents a broad-purpose cognitive adaptation to a world in which most resources are clumped (i.e., positively autocorrelated) in space and time [4,53,54]. Thus the hot-hand fallacy could result from a generalised decision rule that is unable to distinguish sequences of genuinely independent events from autocorrelated sequences. Experimental evidence from computer-based 'foraging' [53] and gambling [54] tasks largely supports this view, and suggests that human minds have evolved to expect temporal autocorrelation in the world.

Box 1. Modelling environmental heterogeneity and autocorrelation
Incorporating environmental heterogeneity into models of adaptive behaviour requires the inclusion of an environmental state variable. Often we can capture sufficient complexity with only two environmental states A and B, such as high and low food availability, or safe and dangerous. Next, we characterise stochastic transitions between the environmental states. The simplest case is where the probability of transition (per unit time) between states depends only on the current state (Figure Ia), because then we can write the transition probabilities as single values $c_A$ and $c_B$ (the subscripts indicating the current state), with $c_A + c_B < 1$ representing positive temporal autocorrelation. The length of time the environment stays in state $i$ then follows a geometric distribution with mean $t_i = 1/c_i$. We assume that the individual 'knows' (i.e., is adapted to) these probabilities and can directly perceive the current conditions. We then investigate how environmental heterogeneity affects responses to current conditions, such as predation risk [49]. For a finer gradation of states, this approach can be extended to any number of states $n$, with an $n \times n$ matrix of transition probabilities. For some systems, such as gradual changes in the food supply, we set all the probabilities of moving between non-adjacent states to zero.
Individuals will often be uncertain about the transition probabilities, and we may be interested in how they should respond to this uncertainty. A simple representation considers two possible transition matrices (e.g., fast- or slow-changing conditions). The individual may 'know' the transition probabilities of each matrix, but not which matrix currently applies (Figure Ib). If the environment is temporally autocorrelated, then the recent past is informative of the future, and therefore the individual should adjust its behaviour in response to its previous experience of the pattern of change. An optimal decision-maker would learn from past experience using Bayesian updating [93]. We can model this by including a state variable to represent the probability that one particular matrix applies, which can help to explain apparently irrational behaviour such as contrast effects [73].
The above assumes that the individual can accurately perceive whether the environmental state is currently A or B. To explore a situation where the individual knows neither the current conditions nor the transition probabilities with certainty, we can use an additional variable to represent the probability of a given situation. However, note that learning two interdependent probabilities requires three state variables and a very fine grid size; computational limitations may constrain our approach.
We have described the simplest scenario for modelling temporal autocorrelation in a heterogeneous world. Real environments may show more complex patterns of change, but this is a mathematically convenient way to capture some of the statistical structure that could be important for understanding cognitive adaptations.
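The two-state environment described in Box 1 can be simulated in a few lines. The Python sketch below is a minimal illustration, assuming hypothetical switching probabilities (0.1 for state A and 0.2 for state B, so that their sum is below 1); the function names are introduced here for illustration. It checks two properties: the mean time spent in each state is close to one over the switching probability, and the lag-1 autocorrelation of the state sequence is close to one minus the sum of the two switching probabilities, which is a standard result for two-state Markov chains.

```python
import random


def simulate_environment(c_a, c_b, steps, seed=0):
    """Simulate states 'A'/'B'; c_a and c_b are per-step switching probabilities."""
    rng = random.Random(seed)
    state, path = "A", []
    for _ in range(steps):
        path.append(state)
        switch_prob = c_a if state == "A" else c_b
        if rng.random() < switch_prob:
            state = "B" if state == "A" else "A"
    return path


def mean_run_length(path, target):
    """Average length of unbroken runs of `target` in the path."""
    runs, current = [], 0
    for s in path:
        if s == target:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return sum(runs) / len(runs)


def lag1_autocorrelation(path):
    """Correlation between the indicator of state A at times t and t + 1."""
    x = [1.0 if s == "A" else 0.0 for s in path]
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var


path = simulate_environment(c_a=0.1, c_b=0.2, steps=200_000)
print(mean_run_length(path, "A"))    # expected to be near 1 / 0.1
print(mean_run_length(path, "B"))    # expected to be near 1 / 0.2
print(lag1_autocorrelation(path))    # expected to be near 1 - 0.1 - 0.2
```

Setting the two switching probabilities so that they sum to 1 removes the autocorrelation, which gives the independent case treated as the most basic form of heterogeneity.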

Intransitive and irregular preferences
In an autocorrelated world, the possibility that current behavioural options will persist into the future can affect patterns of choice. Rational choice theory holds that the preference for one option over another should be both transitive and independent of irrelevant alternatives (see Glossary); satisfying the axioms of this theory is both necessary and sufficient to maximise expected benefit [55]. Studies of consumer behaviour [56] and experiments on humans [8-10] and a diverse range of other organisms [13,[57][58][59][60][61][62][63] have found evidence for context-dependent preferences that appear to violate these axioms of rational choice (but see [64]). However, empirically observed choices are part of a long sequence of choices that individuals make throughout their lives, whereas the axioms refer to one-off choices (which can be choices between alternative decision rules that specify what to do in every possible situation an individual might encounter in its lifetime). In repeated choices, mathematical models [65,66] show that violations of transitivity and regularity can result from decision rules adapted to heterogeneous, autocorrelated environments, in which currently available options provide information about what options will be available in the future (Box 3).

State-dependent valuation learning
The energetic state of an individual reflects recent foraging conditions, and can therefore inform it about future conditions in an autocorrelated world. Laboratory studies on birds [67], insects [68], and fish [69] have shown that the value animals place on different options depends on the state they were in when they learnt about those options. When given a choice between two food sources, animals consistently choose the one they previously found to be rewarding when they were hungry, despite the alternative having equal [67] or even higher [70] profitability.

Box 2. The evolution of optimism and pessimism
Consider an environment composed of a large collection of discrete patches. Individuals mature on a patch, reproduce, and die. Some of their offspring disperse to other patches. Patches change over time, independently of one another; in some generations conditions are good, in other generations poor. Whether optimal behaviour appears unduly optimistic or pessimistic that conditions are good depends on the degree of dispersal and autocorrelation [45]: (i) When dispersal between patches is low, pessimism is favoured; individuals must behave conservatively in case conditions deteriorate and the whole lineage is wiped out. (ii) When dispersal rates are higher, dispersal acts as an insurance against a local patch deteriorating, spreading the risk between members of the same lineage, such that individuals no longer need to be conservative. If conditions are positively autocorrelated in time there is a 'multiplier effect' [94], with descendant numbers growing rapidly in a patch over successive generations if conditions are good. Individuals should then take a risk and behave optimistically so as to exploit conditions if these turn out to be good, because behaviour in good conditions has a predominant influence on long-term fitness [45]. It can also be optimal to be optimistic about the chances of survival. Imagine an animal that has to survive a given period of T days if it is to reproduce. Suppose that the density of predators varied during the evolutionary history of the population, and that there are no cues that provide direct information on the density on a given day. Then the frequency with which different levels of predation occurred in the past specifies the current probability distribution of predation levels. Do we expect anti-predator traits (e.g., cautious behaviour) to evolve so that individuals maximise their expected daily survival given this distribution? 
It depends [75]: (i) If T = 1 or predator density on successive days is independent, then the answer is yes. (ii) However, if T > 1 and predator density on successive days is positively autocorrelated, then individuals do best to be optimistic about risk. To understand this, consider the extreme case in which T is large and predator density is the same on all days, either always high or always low. If the density is high, the individual will almost certainly die regardless of its anti-predator trait, whereas if it is low the trait value matters. Thus the trait is only really relevant when the density is low, and therefore it should evolve to be optimal given a low density [75]; that is, behaviour should appear to be optimistic about predation risk. Weaker autocorrelation in the predator density across successive days will favour a weaker optimistic bias towards the optimal response for low density.
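The contrast between cases (i) and (ii) in Box 2 can be illustrated with a toy calculation. The Python sketch below is not the model of [75]: the hazard values in `DENSITIES`, the period `T`, the exponential form of daily survival, and the assumption that reproductive payoff falls linearly with caution `x` are all hypothetical choices made here so that an interior optimum exists.

```python
import math

T = 20                                      # days to survive (assumed)
DENSITIES = {"low": 0.01, "high": 0.5}      # hypothetical daily hazard at zero caution
PROB = {"low": 0.5, "high": 0.5}            # assumed frequency of each density
GRID = [i / 100 for i in range(101)]        # caution x in [0, 1]


def daily_survival(x, density):
    """Caution x reduces the daily hazard proportionally."""
    return math.exp(-density * (1.0 - x))


def payoff(x):
    """Assumed cost of caution: reproductive payoff falls with x."""
    return 1.0 - x


def fitness_independent(x):
    """Predator density is drawn afresh each day."""
    mean_daily = sum(PROB[k] * daily_survival(x, DENSITIES[k]) for k in PROB)
    return payoff(x) * mean_daily ** T


def fitness_persistent(x):
    """Extreme autocorrelation: density is the same on all T days."""
    return payoff(x) * sum(
        PROB[k] * daily_survival(x, DENSITIES[k]) ** T for k in PROB
    )


def fitness_low_only(x):
    """Fitness if density were known to be low throughout."""
    return payoff(x) * daily_survival(x, DENSITIES["low"]) ** T


x_indep = max(GRID, key=fitness_independent)
x_persist = max(GRID, key=fitness_persistent)
x_low = max(GRID, key=fitness_low_only)
print(x_indep, x_persist, x_low)
```

With these numbers the best level of caution is high when days are independent, but when density persists it coincides with the best response to low density, because individuals facing persistently high density rarely survive whatever they do.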

Box 3. Violations of regularity and transitivity
A central tenet of studies of decision-making is that in the absence of constraints or costs, decisions should be transitive and regular (see Glossary) in sequences of choices (cf in one-off choices, as required by rational choice theory). In an autocorrelated world, this is not necessarily true. Foragers often face a choice between options that differ in both the expected rate of energy gain and the risk of predation, which may be positively related. What is the strategy that maximises long-term survival? At high reserves, they should choose options with a low predation risk; at low reserves, to avoid starvation they should choose options with a high probability of energy gain. For intermediate reserve levels, the best option depends not only on the immediate danger but on the longer-term risk of starvation. If options persist into the future, this risk depends on which other options are currently available; options that are not currently chosen may still affect optimal decisions because they can act as insurance against an energetic shortfall in the future. For example, a dangerous but high-gain option should be avoided when the individual is well fed, but can be relied on in an emergency if reserves drop to critically low values. In the absence of this insurance option, the individual may be forced to choose riskier foraging options than it would do otherwise, to keep its energy reserves at a safe level. The value of a given option is therefore affected by the presence of other options, which can lead to violations of regularity [65] and transitivity [66] under optimal behaviour. Recent models predict that violations may occur even in cases without state-dependence, where the animal is simply maximising its rate of energy gain [95].
Without autocorrelation, the presence of one option would not affect the value of another. Waksberg et al. [96] argued that irregular choice could outcompete rational behaviour in a model with no autocorrelation, but they considered a restricted set of decision rules that did not allow the choice of an individual to depend on its current energy reserves [97]. This set does not include the optimal decision rule. In evolutionary models of decision-making that account for heterogeneity, it is important that the best-performing decision rule is optimal over some sufficiently long timescale, otherwise we cannot argue that it would have evolved [76].

Review
Trends in Cognitive Sciences, March 2014, Vol. 18, No. 3

Evolutionary simulations [71] have shown that, although this biased valuation appears irrational, it can make sense in particular types of environments that fluctuate slowly between rich and poor conditions. If the best option differs between rich and poor conditions, but individuals cannot perceive the conditions directly, state-dependent valuation learning is expected to evolve: food rewards should be more strongly reinforcing when an individual has low energy reserves, which are indicative of poor conditions. Selection favours this bias in the learning rule because making the correct choice under poor conditions is particularly important for fitness [71].
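A learning rule in which rewards are more strongly reinforcing at low reserves can be sketched as follows. This Python example is purely illustrative and is not the rule evolved in [71]: the update form, the scaling of reinforcement by `1 + need`, the learning rate `alpha`, and the training data are assumptions introduced here.

```python
def learn_values(experiences, alpha=0.2):
    """Learn option values from (option, reward, reserves) experiences.

    `reserves` lies in [0, 1]; lower reserves mean greater need.
    Reinforcement is scaled up by need (an assumed functional form).
    """
    values = {}
    for option, reward, reserves in experiences:
        need = 1.0 - reserves
        reinforcement = reward * (1.0 + need)
        old = values.get(option, 0.0)
        # Move the stored value a fraction alpha towards the reinforcement.
        values[option] = old + alpha * (reinforcement - old)
    return values


# Hypothetical training: equal rewards, but X is met when hungry, Y when sated.
training = [("X", 1.0, 0.2)] * 20 + [("Y", 1.0, 0.9)] * 20
values = learn_values(training)
print(values)
```

Although both options deliver the same reward, the option encountered at low reserves ends up with the higher learned value, matching the direction of the preference reported in the laboratory studies.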

Successive contrast effects
If an individual is uncertain about the temporal pattern of change in conditions, future expectations may also be influenced by conditions experienced in the past. Standard theories of rational choice posit that optimal behaviour is path-independent, in that it depends on the current state of the world but not on how that state was reached. If we equate current state with current environmental conditions, this view cannot account for successive contrast effects, in which the response of an individual to current conditions depends on whether conditions were previously better (a negative contrast effect) or worse (a positive contrast effect) [72]. Such sensitivity to change can be understood by recognising that many animals have evolved in an environment where conditions fluctuate over time in an unpredictable way. Assuming the pattern of change is sufficiently stable, the conditions experienced in the past then provide potentially valuable information about the likely pattern of change in the future, which in turn affects optimal behaviour (see Box 1). This dependence of optimal behaviour on past experiences can produce positive and negative contrast effects in the artificial situations used in laboratory studies [73]. Similar effects could result from an optimal trade-off between exploration and exploitation in heterogeneous, autocorrelated environments [74].
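The Bayesian updating described in Box 1, in which the individual tracks the probability that one of two transition matrices applies, can be sketched in a few lines of Python. The example is a simplification made here for illustration: each regime is summarised by a single switching probability that is the same in both states, and the values `c_slow` and `c_fast` are hypothetical.

```python
def update_belief(p_slow, changed, c_slow=0.05, c_fast=0.4):
    """Posterior probability that the slow-changing regime applies.

    `p_slow` is the prior probability of the slow regime; `changed` is True
    if the environmental state switched during the last time step.
    """
    like_slow = c_slow if changed else 1.0 - c_slow
    like_fast = c_fast if changed else 1.0 - c_fast
    numerator = p_slow * like_slow
    return numerator / (numerator + (1.0 - p_slow) * like_fast)


# Hypothetical history: three steps without change, one change, then stability.
belief = 0.5
for changed in [False, False, False, True, False]:
    belief = update_belief(belief, changed)
    print(round(belief, 3))
```

Because the belief carries information about past conditions, two individuals facing identical current conditions can hold different expectations about the future, which is the kind of path dependence that can appear as a contrast effect.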

Optimism
Temporal autocorrelation across generations may also be important. If there is spatial heterogeneity in environmental conditions, and those conditions persist over multiple generations (i.e., temporal autocorrelation is sufficiently high), optimistic behaviour is favoured [45] (cf. pessimism when temporal autocorrelation is weak; see previous section). Alternatively, uncertainty about an external, autocorrelated mortality risk can favour optimism [75] (Box 2). Such cognitive biases may appear irrational, but they arise from a strategy that maximises fitness over a longer timescale [76].
As these examples illustrate, some apparently maladaptive behaviours observed in artificial laboratory situations can be seen as ecologically rational if we recognise that organisms are adapted to stochastically fluctuating conditions that are autocorrelated in time and space. By interacting with this rich statistical structure, organisms have evolved to exploit their natural environments efficiently using a range of simple decision rules that need not require complex computation [77,78]. It is important to recognise that such rules may lead to outcome biases in environments that lack this statistical structure. For example, standard laboratory procedures for demonstrating successive contrast effects eliminate any correlation between past and future conditions; an ecologically rational decision rule adapted to exploit this correlation will produce apparently irrational behaviour [73]. Similarly, in tests of context-dependent choice the current options do not predict which options will be available in the future, but the animal may be responding as if they do [65,66] (see Box 3).

From 'just-so' stories to predictions and empirical tests
In the approach we have outlined the aim is to build evolutionary models with the minimal amount of real-world complexity to account for observed patterns of decision-making. Nevertheless, identifying one potential adaptive explanation does not rule out the existence of other explanations that may account for the observed bias equally well. To move beyond adaptive story-telling, models should generate testable predictions as well as explanations. In particular, evolutionary models of biases in decision-making should identify which factors affect the magnitude of the bias, and therefore the organisms and circumstances in which the bias should be most pronounced.
Although the evolutionary roots of many biases appear to run deep, there is evidence of considerable variation between species. For example, studies have found evidence of successive contrast effects in honeybees, bumblebees, starlings, and a variety of mammals, but not in goldfish, toads, pond turtles, chickens, or pigeons [79]. This variation could reflect phylogenetic inertia [80] in the underlying neuroendocrine mechanisms that constrain behaviour [81], or ecological differences between species that select for different decision rules [82]. A general expectation of the theories we have reviewed here is that many biases will be most pronounced in species adapted to strongly fluctuating environments, where the fluctuations have a big impact on optimal behaviour. We might therefore expect some biases to be stronger in animals reliant on tightly clumped, ephemeral food sources (e.g., specialist frugivores and nectarivores) than in those adapted to stable, widely available resources (e.g., grazing herbivores). To test such broad-scale, comparative predictions we need quantitative data on variation in biases across species (controlling for selective reporting [83]) and detailed information on the spatiotemporal structure of natural environments (including social dynamics, for which 'reality mining' techniques [84] hold great promise). Differences in feeding ecology have been proposed to explain variation in impulsive behaviour across primates [85]; a more in-depth approach using detailed ecological data might help in understanding the taxonomic distribution of other behaviours that at first appear irrational.
Another exciting possibility is to test the evolutionary predictions experimentally by manipulating the pattern of environmental change. Taking the simplest case of two environmental states (e.g., high versus low food availability), exposing different experimental groups to different transition probabilities (see Box 1) could potentially generate different biases in decision-making, provided that the study organism can adapt behaviourally to the pattern of change. Many of the examples we have discussed involve adaptation over an evolutionary rather than a behavioural timescale, but even then it might be possible to test hypotheses using experimental evolution in Drosophila, nematodes, or other organisms with a short generation time. We hope researchers using these systems will take up this challenge.
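The two-state manipulation described above can be sketched as a simple Markov chain in which each experimental group receives its own pair of transition probabilities. The function names, the seeds, and the particular probabilities below are illustrative assumptions rather than a proposed protocol. For a stationary two-state chain the lag-1 autocorrelation equals one minus the sum of the two switch probabilities, which gives a convenient check on the simulated treatments.

```python
import random


def simulate_environment(p_high_to_low, p_low_to_high, steps, seed=0):
    """Simulate food availability (1 = high, 0 = low) as a two-state Markov chain."""
    rng = random.Random(seed)
    state = 1
    states = []
    for _ in range(steps):
        states.append(state)
        switch_probability = p_high_to_low if state == 1 else p_low_to_high
        if rng.random() < switch_probability:
            state = 1 - state
    return states


def lag1_autocorrelation(states):
    """Empirical correlation between the state at one step and at the next."""
    n = len(states)
    mean = sum(states) / n
    variance = sum((s - mean) ** 2 for s in states) / n
    covariance = sum(
        (a - mean) * (b - mean) for a, b in zip(states, states[1:])
    ) / (n - 1)
    return covariance / variance


# Hypothetical treatments: a persistent environment and an unpredictable one.
persistent = simulate_environment(0.1, 0.1, steps=20000, seed=1)
volatile = simulate_environment(0.5, 0.5, steps=20000, seed=2)

print(lag1_autocorrelation(persistent))  # theory: 1 - 0.1 - 0.1 = 0.8
print(lag1_autocorrelation(volatile))    # theory: 1 - 0.5 - 0.5 = 0.0
```

Both treatments spend about half of the time in each state, so they differ only in how informative current conditions are about the near future.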

Concluding remarks
The evolutionary explanations we have highlighted here represent only one of several possible approaches to understanding biases in decision-making; it is important to compare this framework with alternative approaches based on genetic [17] or cognitive [16] constraints. Nonetheless, we believe that insights from evolutionary studies can make an important contribution to this issue by considering how organisms adapt to richer environments. The simple models and experiments routinely used to study decision-making may misrepresent key features of the environment of selection, leading to incorrect predictions and regular reports of seemingly irrational behaviour. The real world can be complex, variable, and autocorrelated, and we should expect cognitive and perceptual systems to have evolved to exploit its statistical structure. By considering environments with sufficient richness we can generate novel, testable explanations for many puzzling behavioural and psychological phenomena, which can be meaningfully tested even in simplified laboratory settings. Much exciting work lies ahead (Box 4). A better understanding of the statistical structure of real-world environments may help us to understand the workings of the mind [86][87][88].

Box 4. Outstanding questions
A major theme of the recent theoretical work discussed here is that, in a temporally autocorrelated world, current or past options may be informative about the future. This general principle may shed light on decisions in a range of other situations, such as a choice between risky options (i.e., options for which the outcome is variable). Prospect theory is a highly influential descriptive model of human decision-making that captures several interesting features of our attitudes to risk [98], such as our tendency to focus more on changes in state (e.g., wealth) than the states themselves. Could this pattern of decision-making be ecologically rational in an autocorrelated world (see Box 1)? If conditions fluctuate over time, organisms may need to take into account the pattern of change to decide whether it is worth gambling on a risky but potentially highly rewarding option.
How does natural selection shape the mechanisms involved in decision-making? Most models of adaptive decision-making focus on behaviour, ignoring the psychological and physiological mechanisms that produce it. Even so, observed behaviour may be consistently associated with particular psychological and/or physiological states, and therefore to understand decision-making properly we need to model the evolution of these mechanisms explicitly [19]. This can be technically challenging and typically involves computationally intensive methods such as genetic algorithms (e.g., see [99]), but modern computing power is beginning to bring these approaches within reach.
Studies of the evolution of psychological mechanisms may hold the key to unravelling some of the most enduring mysteries of the human mind, such as why we have emotions and moods. Do affective states enhance or constrain decision-making? One idea is that mood states are an efficient way of summarising recent experiences and can be used to adjust decision thresholds, which might be adaptive in a stochastically changing, autocorrelated environment [100][101][102] (see Box 1). Whether emotions and moods are closely linked to brain mechanisms that promote survival and other fitness components is unclear [103], but this remains a promising direction for future research.
One of the key challenges of a comparative, evolutionary approach to cognitive biases is how to identify analogous outcome biases in nonhuman organisms. To allow valid comparisons, behavioural measures need to be both ecologically relevant and applicable to a wide range of taxa. Tests have been devised for impulsive behaviour [104,105] and for optimistic and pessimistic biases [15,106], but what are the behavioural indicators of affective states such as anxiety, depression, or disappointment? Researchers are beginning to tackle this difficult problem [44,107,108], but much remains to be done.