Multi-Attribute Decision by Sampling: An account of the attraction, compromise and similarity effects

options. The of a choice option of sampled comparators the dominates. We specify conditions on the sampling distribution that are sufficient for MADS to predict the three context effects. We test the model using a novel experimental design with 1200 online participants. In the first experiment, prior to making a choice, participants were shown a selection of market options designed to change their beliefs about the market distribution. Participants' subsequent choices were affected as predicted. The effect was strong enough to impact the size of two of the three classic context effects significantly. In the second experiment, we elicited individuals' estimates of distributions of market options and found the estimates to be systematically influenced by the choice set as predicted by the model. It is concluded that MADS, a model based on simple binary ordinal comparisons, is sufficient to account for the three classic context effects.


Introduction
A well-established challenge to the standard utility model is given by the existence of context effects in consumer choice. Context effects occur when the relative frequency with which one option is chosen over another depends on the other options in the choice set. In this paper we consider the three most-studied context effects found in multi-attribute choice experiments: the similarity effect (Tversky, 1972), the attraction effect (Huber, Payne, & Puto, 1982) and the compromise effect (Simonson, 1989). These three context effects have been replicated many times in a variety of domains (e.g., Doyle, O'Connor, Reynolds, & Bottomley, 1999;Huber et al., 1982), and within a single study (Berkowitsch, Scheibehenne, & Rieskamp, 2014;Noguchi & Stewart, 2014). Moreover, the fit of discrete-choice models can be improved by adding estimable parameters for each context effect and some of their interactions (Rooderkerk, Van Heerde, & Bijmolt, 2011). The classical utility paradigm built on the assumption of rational preference orderings renders choice invariant to the introduction of seemingly irrelevant alternatives, and hence is not able to explain these phenomena without substantial modification.
In this paper we offer a concise account based on a simple cognitive mechanism, binary ordinal comparison, which is motivated by a large body of independent psychological evidence. We term the model Multi-Attribute Decision by Sampling (MADS). It contrasts with previous accounts provided in both economics and psychology. For example, it has been shown that the compromise effect can result as equilibrium behavior in markets under uncertainty where the choice set provides information for the decision-maker (e.g., Kamenica, 2008; Wernerfelt, 1995). However, these accounts of context effects do not explain well why the effects are found in domains where it is less plausible that the options carry information regarding decision-relevant attributes such as quality (e.g., consumer choices over gifts of coupons and cash: Tversky & Simonson, 1993; or choices over lotteries: Wedell, 1991). Furthermore, Trueblood et al. (2013) show that the 'big three' context effects appear when individuals judge psychophysical stimuli, suggesting that the mechanism underlying the effects is a more fundamental component of the human decision-making process. In economics, existing accounts of some of the effects have been based on psychological factors such as dimensional weighting (Bushong, Rabin, & Schwartzstein, 2015), salience (Bordalo, Gennaioli, & Shleifer, 2013), limited attention (Manzini & Mariotti, 2014; Masatlioglu, Nakajima, & Ozbay, 2012) and reference points (Ok, Ortoleva, & Riella, 2015). Some have also been predicted by the solution to an intra-personal bargaining problem (de Clippel & Eliaz, 2012).
... of violations of the regularity principle. In studies since, it has been common to define the context effects via comparisons of the probability of an alternative being chosen from two ternary choice sets (see Table 1 of Trueblood, Brown, Heathcote, & Busemeyer, 2013). Throughout this paper, we also define context effects via comparisons of an alternative's choice probabilities from ternary choice sets.
In psychology, there are models of choice that account for all three of the major context effects (e.g., Bhatia, 2013;Roe, Busemeyer, & Townsend, 2001;Trueblood, Brown, & Heathcote, 2014;Usher & McClelland, 2004). However, none capture the three effects with one psychological mechanism, instead resorting to arguably ad-hoc parametrizations. Furthermore, most of these models are complex and can only be estimated numerically. In contrast, we offer a novel account of the three consumer choice context effects based on sampling and binary ordinal comparison, while maintaining analytic expressibility. Our argument is one of sufficiency, not necessity: We suggest that simple binary dominance relations, combined with an assumption that samples are drawn from a distribution that is influenced by the choice set, are all that is needed to account for the three context effects. We do not present data that exclude more complex accounts (e.g., accounts based on better-than-ordinal dominance relations).
Our model instantiates three key assumptions. The first assumption is that individuals evaluate choice options by comparing them to a limited sample of other items. The idea that judgments and choices are based on a process of sampling comparator items from memory and/or the immediate choice environment is ubiquitous in psychology (e.g., Fiedler, 2000; Fiedler & Juslin, 2006; Hertwig & Pleskac, 2010) and is strongly supported by the existence of context effects of the type discussed in the present paper. Related ideas are found in several recent economic models (e.g., Bordalo, Gennaioli, & Shleifer, 2012a; Gennaioli & Shleifer, 2010; Kőszegi & Szeidl, 2013) and neuroscience (Bornstein, Khaw, Shohamy, & Daw, 2017; Shadlen & Shohamy, 2016).
The second assumption is that the sampling process is systematically influenced by the choice set. We assume that a given choice set will be taken by subjects to suggest the presence of unobserved market options which the subject may therefore include in the sample they generate. More specifically, in our model people behave as if they infer a distribution over the whole marketplace of options on the basis of the choice set that they face, and sample from that distribution. This assumption resonates with much existing literature. First, Kamenica (2008) presents a model in which choosers infer that choice options reflect the preferences of the population, and thereby explains choice overload effects. In consumer psychology it has also been suggested that people treat choice options as informative about the marketplace, as when a medium-height person will rationally choose a sweatshirt size near the middle of the available range of size options (Prelec, Wernerfelt, and Zettelmeyer, 1997; Simonson, 2008; Wernerfelt, 1995). A further claim, found in cognitive psychology, is that people update their estimates about quantities such as market prices on the basis of experimentally-provided options, particularly when initial uncertainty is high (Brown, Sanborn, Aldrovandi, and Wood, 2015; Shenoy and Yu, 2013; Sher and McKenzie, 2014). Our claim is of this latter type: we assume that people update prior beliefs about market distributions on the basis of sets of choice options.
The third assumption is that the probability of choosing an alternative is determined via dominance relations between items in the mental sample. This assumption is consistent with and motivated by a large body of research in psychology that suggests that subjective valuation involves a series of simple ordinal comparisons between pairs of items (e.g. Stewart, Chater, and Brown, 2006; see also Kornienko, 2013). For example, the Decision by Sampling model (DbS: Stewart et al., 2006) assumes that subjective values are determined by (a) retrieving a small sample of comparison items drawn from memory and the environment, (b) tallying via binary ordinal comparisons the number of comparison attribute values that are smaller than the target attribute value, (c) tallying the number of comparison attribute values that are larger than the target attribute value, and (d) computing the relative ranked position of the target attribute value within the comparison context provided by the comparison sample. Stewart et al. use these assumptions to explain the form of, inter alia, the value and probability weighting functions in Prospect Theory.
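Steps (a) to (d) of the DbS account can be illustrated with a short sketch. The function name, the example prices and the convention used when no informative comparison exists are our own assumptions for illustration; they are not taken from Stewart et al. (2006).

```python
import random


def dbs_relative_rank(target, comparison_sample):
    """Relative rank of `target` within a sample, using only ordinal comparisons."""
    # Step (b): tally comparison values that are smaller than the target.
    smaller = sum(1 for c in comparison_sample if c < target)
    # Step (c): tally comparison values that are larger than the target.
    larger = sum(1 for c in comparison_sample if c > target)
    if smaller + larger == 0:
        # Assumption: with no informative comparisons, treat the target as mid-ranked.
        return 0.5
    # Step (d): relative ranked position within the comparison context.
    return smaller / (smaller + larger)


rng = random.Random(0)
# Step (a): a small, hypothetical comparison sample of prices.
sample = [rng.uniform(50.0, 300.0) for _ in range(8)]
print(dbs_relative_rank(150.0, sample))
```

Only the ordering of the target relative to each comparator enters the computation, so any monotonic transformation of the attribute scale leaves the relative rank unchanged.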
The key psychological assumption that MADS inherits from DbS is the idea that purely ordinal comparisons are involved in the construction of subjective values. A considerable amount of evidence within both economic and psychological domains finds that subjective valuations are affected by the relative ranked position of attribute values within a comparison context. Such findings are consistent with the suggestion that (in process terms) valuations are constructed through a series of ordinal comparisons, and we are not aware of cardinal models that can capture the relevant data. An initial strand of research that examined people's judgments of the subjective magnitudes of simple psychophysical quantities such as size and weight found such judgments to be determined partly by the relative ranked position they occupy within a comparison context (e.g., Parducci, Calfee, Marshall, & Davidson, 1960; Parducci & Perrett, 1971; Riskey, Parducci, & Beauchamp, 1979). Subsequent work found that quantities as diverse as prices (Niedrich, Sharma, & Wedell, 2001; Niedrich, Weathers, Hill, & Bell, 2009), personality (Wood, Brown, Maltby, & Watkinson, 2012), fairness (Mellers, 1982), body perception (Wedell, Santoyo, & Pettibone, 2005) and alcohol consumption, as well as many others, are judged at least partly in terms of their relative ranked position within a comparison context. Students' attitudes to anticipated graduation debt are determined partly by the ranked position of their anticipated debt relative to the assumed debt of others (Aldrovandi, Wood, Maltby, & Brown, 2015). Rank of income, rather than income per se, determines satisfaction with that income (Boyce, Brown, & Moore, 2010; Smith, Diener, & Wedell, 1989), and people's anticipated and experienced satisfaction with a wage are both related to how the wage ranks within a comparison context (Brown, Gardner, Oswald, & Qian, 2008).
Moreover, some neuro-imaging evidence is consistent with rank-based coding of value in the brain (Mullett & Tunney, 2013). There is therefore a considerable body of evidence consistent with the idea that cardinal valuations result from a process of binary ordinal comparisons, and the dominance relations that the present account assumes are of this binary ordinal type.
To link our model with the classical utility paradigm, we note here that features of the classical utility approach are obtained as a limiting case of MADS: If individuals' sampling distributions do not depend on the choice sets they face and the number of items sampled approaches infinity then choices are deterministic, consistent across contexts and context effects are not predicted.
The rest of the paper is organized as follows: In Section 2 we present an intuitive description of how MADS explains the three context effects, followed by a formal model description and specification of sufficient conditions on the sampling distribution for MADS to predict the effects. In Sections 3 and 4 we report the experimental design and results. Section 5 provides a discussion of the model, relates it to other approaches within economics and psychology, and concludes.

Informal illustration
We now provide an intuitive introduction to the model and illustrate how it accounts for the attraction, compromise and similarity effects. Faced with a choice set, an individual draws a finite sample of n items to mind. The distribution from which the n items are drawn is assumed to be influenced by the set of choice options.
Each choice alternative is then compared to the other choice alternatives and to the items in the sample, accruing a point for every one that it dominates. The alternative with the highest score is then selected. Consider the choice set {A, B} in Fig. 2. The shaded area represents the distribution over the whole marketplace of options from which the individual draws a sample of size n. Each of the n comparison items increases the probability of choosing A, the probability of choosing B, or neither. The effect of a comparator will depend on where in the price × quality space it falls. Consider the regions marked R_A, R_B, and R_AB. R_B is the dominance region exclusive to option B, which we refer to as B's solo-dominance region. Any item that falls within R_B is more expensive than both A and B, lower quality than B, and higher quality than A. Thus B dominates any item in R_B. Items in R_B are more expensive but also higher quality than option A, so they are not dominated by A. Similarly, comparison items that fall in R_A are dominated exclusively by A. Finally, items that fall in R_AB are dominated by both A and B, so R_AB is referred to as a joint-dominance region. MADS assumes that choice is determined by how many comparison items fall into each of these regions. The distribution from which comparison items are drawn will therefore affect whether A or B is chosen. In Fig. 2, a larger portion of the shaded area falls into R_A than R_B, meaning that A is more likely to be chosen. Note that because of the probabilistic nature of the sampling process, A will not always be chosen, especially if the comparison sample is small, leading to a stochastic element of choice in our model. MADS assumes that an alternative accrues a point when an item falls in its solo-dominance region. For items in a joint-dominance region, a point accrues to any of the dominating alternatives' scores with equal probability.
Therefore, if there are no dominance relations between the alternatives in the choice set, the alternative with the highest probability of accumulating a point is also the most likely to be chosen. The model provides analytic expressions for the probabilities of choosing each alternative from a choice set; these expressions are provided in the Appendix.
Fig. 2. Multi-attribute decision by sampling. A two-alternative example of how MADS operates. For ease of exposition, assume that the shaded area represents the support of a uniform sampling distribution. As lower prices and higher quality are preferred, a point is dominated if it lies to the south-east of another.
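The scoring procedure described above can be sketched as a small Monte Carlo simulation. This is a minimal illustration under stated assumptions, not the analytic expressions of the Appendix: the attribute values of A and B, the rectangular uniform sampling distribution, the sample size n = 5 and all function names are hypothetical.

```python
import random


def dominates(x, r):
    """x dominates r if x is no more expensive and no lower in quality (and differs from r)."""
    return x != r and x[0] <= r[0] and x[1] >= r[1]


def mads_choice(choice_set, sample, rng):
    """One simulated decision: score the undominated alternatives and pick the top scorer."""
    undominated = [x for x in choice_set
                   if not any(dominates(y, x) for y in choice_set)]
    scores = {x: 0 for x in undominated}
    for r in list(choice_set) + list(sample):
        dominators = [x for x in undominated if dominates(x, r)]
        if dominators:
            # Joint-dominance: the point goes to one dominator chosen at random.
            scores[rng.choice(dominators)] += 1
    best = max(scores.values())
    # Ties in the total score are broken with equal probability.
    return rng.choice([x for x in undominated if scores[x] == best])


def choice_probabilities(choice_set, draw, n, trials=20000, seed=0):
    """Estimate choice probabilities by repeating the sampling and scoring stages."""
    rng = random.Random(seed)
    counts = {x: 0 for x in choice_set}
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        counts[mads_choice(choice_set, sample, rng)] += 1
    return {x: c / trials for x, c in counts.items()}


# Hypothetical alternatives as (price, quality) pairs.
A = (100.0, 3.0)
B = (200.0, 4.0)


def draw(rng):
    """Hypothetical uniform sampling distribution over a price x quality rectangle."""
    return (rng.uniform(50.0, 250.0), rng.uniform(2.0, 5.0))


probs = choice_probabilities([A, B], draw, n=5)
print(probs)
```

With these illustrative numbers, A's solo-dominance region has twice the area of B's, so A should be chosen more often than B, although not always, which reflects the stochastic element of choice noted above.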
We now provide an intuition for how MADS accounts for each of the three context effects using example sampling distributions, shown in Figs. 3-5. These distributions are chosen for illustrative purposes only. More specifically, they show examples of sufficient conditions rather than necessary configurations of the sampling  distributions required for MADS to produce the effects. For general, formal statements of the conditions required of the sampling distributions to produce each context effect, see Propositions 1-3 in Section 2.2.
An illustration of how the sampling distribution is hypothesized to depend on the attraction effect choice sets is given by Fig. 3. Here, the triangular nature of the attraction effect choice set pulls more of the density of the sampling distribution to the dominance region of the target (A in the left panel, B in the right). This increases the probability that comparison items are drawn from the target's dominance region. This will tend to increase the score accumulated by the target relative to the non-target alternative, and hence will increase the probability of it being chosen. Furthermore, attraction effect choice sets include a dominated alternative which gives the target a head-start in the accumulation of points.
The intuition behind the explanation of the compromise effect is illustrated by Fig. 4. When the shift in the choice set is accompanied by a corresponding shift of the sampling distribution as shown, the central alternative has the solo-dominance region with the most density. In addition, the target (compromise) alternative enjoys a joint-dominance region with each of the other alternatives. These two facts combined imply that in both panels, the central alternative has the highest chance of accruing a point, and hence the highest chance of being chosen: the compromise effect.
The intuition behind our account of the similarity effect is illustrated by Fig. 5. The effect is driven by the fact that the nontarget alternative (B in the left panel, A in the right) is forced to share its solo-dominance region with the decoy. On the other hand, the target alternative is left with a relatively large solo-dominance region, increasing the probability of it being chosen: the similarity effect occurs.

Formal description
Each x ∈ X is referred to as a choice alternative, and X the choice set. Each choice alternative is J-dimensional, where x = (x_1, ..., x_J) describes the level of each attribute of alternative x.
Given X, the individual samples from a J-dimensional distribution or sampling distribution over the product space, with CDF denoted F_X. There are n > 0 draws made from F_X. For the purposes of this paper we assume draws are independently and identically distributed and that F_X contains no mass points. MADS describes two stages of cognitive processing. In the first stage, a sample is generated. The set of draws sampled is denoted W, where a typical element is w = (w_1, ..., w_J) ∈ W.
In the second stage, a score for each choice alternative is determined and a choice is made. The score of an alternative x is constructed via ordinal binary comparisons of its attribute levels against those of other items in the reference set X ∪ W, with typical element r = (r_1, ..., r_J). Elements of this set are referred to as comparators or comparison items. Choice alternatives accrue points when they are compared to comparison items that they dominate. Where more than one choice alternative dominates a comparison item, one of the choice alternatives is selected at random to accrue a point. Where there are dominance relations within the choice set, we make the simplifying assumption that the dominated choice alternative does not accrue any points, as described formally below. The choice alternative with the highest total score is then chosen.
To represent the process explicitly, let ≿_j be the rational binary preference relation which an individual has over levels of attribute j, for j = 1, ..., J, over any two items. Then x dominates y if x ≿_j y for j = 1, ..., J. In the case of hotels, where the attributes are price (p) and rating (q), both preference relations are assumed to be monotonic: lower prices and higher ratings are preferred. The choice correspondence c then assigns to the choice set X the undominated alternative(s) with the highest total score. If c(X) is a singleton, then this item is chosen. If it contains more than one element, then each element of the set is chosen with equal probability. In this notation, the set of undominated alternatives in the choice set is written X̂ here, D_X̂(r) is the set of choice alternatives in X̂ that dominate r (excluding comparison with itself), A(r) is the alternative that accumulates a point from comparison item r, and s(x) is the total score accumulated for each choice alternative.
We now provide sufficient conditions on the sampling distributions required for MADS to predict each of the three context effects.
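For reference, the scoring and choice rule described verbally in this section can be written compactly as follows. This is a reconstruction from the verbal definitions only: the hat on the undominated set, the uniform tie-breaking notation and the indicator function are our notational assumptions and may differ from the typeset original.

```latex
\begin{align*}
p \succsim_p p' &\iff p \le p', \qquad q \succsim_q q' \iff q \ge q', \\
\hat{X} &= \{\, x \in X : \nexists\, y \in X \setminus \{x\} \text{ with } y \succsim_j x \ \text{for all } j \,\}, \\
D_{\hat{X}}(r) &= \{\, x \in \hat{X} \setminus \{r\} : x \succsim_j r \ \text{for all } j = 1, \dots, J \,\}, \\
A(r) &\sim \mathrm{Uniform}\big(D_{\hat{X}}(r)\big) \quad \text{whenever } D_{\hat{X}}(r) \neq \emptyset, \\
s(x) &= \sum_{r \in X \cup W} \mathbf{1}\{A(r) = x\}, \\
c(X) &= \operatorname*{arg\,max}_{x \in \hat{X}} \, s(x).
\end{align*}
```

Under this reading, a dominated alternative never belongs to the undominated set and so accrues no points, and any tie in c(X) is resolved with equal probability, as stated in the text.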
We provide sufficient conditions under an assumption of symmetry between the sampling distributions across the two choice sets of each context effect, which provides a clean statement for how the model can generate the effects. The symmetry conditions state that the probability of sampling an item which affects the expected scores of A and B by the same amount, across the different choice sets, is the same. If these conditions are satisfied, we can focus solely on the probabilities of items falling in regions that affect the difference in the scores.
More general conditions, which allow for a relaxation of symmetry, are still expressible analytically, but no longer have the simplicity of those in Propositions 1-3 as they require conditions quantifying the asymmetry and on n. Where data do not satisfy symmetry, the full expressions for choice probabilities can be used directly to check when the context effects are expected. These expressions are provided in Appendix.
With symmetry assumed, we now reveal the simple conditions driving our intuition that are required for the sampling distributions to produce the effects. Figs. 3-5, the counterparts of the Propositions below, display distributions that satisfy the conditions; for ease of reference, one can suppose that the shaded areas represent a uniform density, integrating to one. Proofs are relegated to the Appendix.
Denote the probability of a sampled comparison item being dominated by an alternative x, given choice set X, as F_X(x). Furthermore, let

(ii)
Condition (ii) states that the probability of an item falling in the solo-dominance region of the target is greater than the probability of an item falling in the solo-dominance region of the non-target. This implies it is more likely that the target accumulates a point. Because the item with the highest score is chosen, the attraction effect results.
Proposition 2 (Compromise). The model produces the compromise effect if the following are satisfied: (ii) larger when it is the target it helps to produce the effect. Notice also, that when B is the target, it has two joint-dominance regions, but it has only one when it is not the target. This accounts for the presence of an extra joint-dominance region in the numerator of (ii). A similar argument can be made for A.
Proposition 3 (Similarity). The model produces the similarity effect if the following are satisfied: the similarity effect. By inspection of Fig. 5 one can see that B(B) is likely to be greater than A(B) due to the configuration of the similarity effect choice sets. Although A(B, S_A) is also likely to be greater than B(B, S_B), these joint-dominance regions only add to B's score with probability 1/2. A similar argument can be made for A.
The symmetry conditions of Propositions 1-3 suppose that individuals' sampling distributions will depend on the relative position of the choice set's alternatives to each other, but will otherwise be the same. Assuming symmetry allows for a clear exposition, permitting explanations to rely on a ratio consisting of a few areas of the sampling distribution's density. We consider symmetry a natural benchmark case, especially for markets where individuals have had no prior experience. In practice, when individuals are evaluating items, they will draw not only on the choice set presented to them, but also on their prior experience or knowledge of the product. This can also be expected to affect their sampling distribution. For hotels, for example, if individuals have predominantly had exposure to cheaper, lower quality hotels than those in the choice sets offered, their sampling distributions are likely to place more weight on this end of the market. This would cause the distributions to be asymmetric. MADS provides analytically expressible choice probabilities for any sampling distribution, i.e., regardless of symmetry. However, the symmetric case provides tractable statements that carry the intuition for the explanation of the context effects.

Design
Choice alternatives for both experiments were Manhattan hotel stays, for which there are two main attributes: 'average rating' and 'price'. Data pertaining to real hotels were taken from Hotels.com on 23 June 2014. We recorded the price and average rating of the cheapest 200 hotel stays for a one-night stay for one adult in one room, for a stay on 12 November 2014. Fig. 6 provides a plot of the hotel data recorded. 'Average rating' refers to the rating given by members of Hotels.com who had previously stayed at the hotel.⁵ We presented the score rounded to one decimal place, as it is presented on the website itself. This served as our proxy for quality, so that we could present data across the price-quality domain, as in classical context-effect experiments. Given the familiarity of such sites to internet users, we referred to 'average rating' rather than 'quality' throughout the experiment. Twelve distinct hotels were selected from these data to form the six choice sets of the three context effects, shown in Table 1. As with most studies showing the presence of these context effects, participants' hotel choices were hypothetical.
⁵ Each reviewer submits a score of 1, 2, 3, 4 or 5, where higher numbers correspond to a better experience. All the hotels we recorded had at least 25 reviews.
Table 1. The choice sets used in the experiments. To match the terminology used in the text and Fig. 1, relabel the decoys respectively for the attraction, compromise and similarity effect choice sets.
1304 Amazon Mechanical Turk (AMT) workers were recruited in July 2014. It was decided in advance that 1200 participants would be tested; 1300 were requested from AMT in order to be able to remove some if there were those who had completed a related pilot, and 1304 were received. There is considerable variation in the size of context effects in the literature, and they are of course not always found (Trueblood, Brown, & Heathcote, 2015). Estimates from studies in consumer choice find sizes ranging from about 0.15 (see Table 1 of Trueblood et al., 2013) to over 0.3 (Noguchi & Stewart, 2014). We chose to ensure 100 participants per condition; this gives a power of 0.8 to detect a difference in choice proportion of 0.2 when comparing two conditions with each other.
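The stated power can be checked approximately with a normal-approximation calculation for a two-sided two-proportion z-test. The baseline proportions of 0.4 and 0.6 below are our assumption for illustration, since the text fixes only the difference of 0.2 and the 100 participants per condition; the function names are likewise hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def power_two_proportions(p1, p2, n_per_group, z_crit=1.959964):
    """Approximate power of a two-sided two-proportion z-test at the 5% level."""
    p_bar = (p1 + p2) / 2.0
    # Standard error under the null (pooled) and under the alternative (unpooled).
    se_null = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_group)
    se_alt = sqrt(p1 * (1.0 - p1) / n_per_group + p2 * (1.0 - p2) / n_per_group)
    diff = abs(p1 - p2)
    upper = normal_cdf((diff - z_crit * se_null) / se_alt)
    lower = normal_cdf((-diff - z_crit * se_null) / se_alt)
    return upper + lower


# Assumed baseline: choice proportions of 0.4 versus 0.6, 100 participants per condition.
power = power_two_proportions(0.4, 0.6, 100)
print(round(power, 2))
```

The result depends on the assumed baseline: proportions further from 0.5 have smaller variance, so the same difference of 0.2 would be detected with somewhat higher power.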
We excluded 68 participants from the analysis because they had previously completed a related pilot study; one was removed because they did not complete the experiment. This left data from 1235 participants for analysis. Average completion time was 14 minutes 27 seconds. Participants were compensated with a participation fee of $1.50, which corresponds to an average hourly wage of $6.23. Participants were randomly assigned to one of 12 conditions. Participants in the treatment condition each saw data relating to ten hotels taken from the dataset before selecting an alternative from one of the six choice sets. They were shown the price and average rating of each of these ten hotels one at a time, and for each one were asked to indicate on a seven-point scale how likely they would be to stay at that hotel. The purpose of asking this was to ensure some amount of engagement with the hotels presented, such that they would affect the hotels available in the participants' comparison sample. An example screen-shot is provided in Fig. 7. Following the treatment, participants faced one of the choice sets in Table 1 and answered the question "Which hotel would you be most likely to choose?". Participants in the control condition simply chose without seeing any other hotels first. Participants were later asked to indicate how they divided their attention when considering hotel choices using a seven-point scale where 1 meant "considered solely prices", 4 meant "both attributes equally", and 7 "solely ratings". Before finishing, participants faced a series of questions for another experiment. Basic demographic questions followed on the final screen.
The ten hotels shown in the treatment condition were chosen from our data shown in Fig. 6 such that they were dominated by alternative B (the more expensive, higher-quality alternative), but not dominated by alternative A (the cheaper, lower-quality alternative), in the choice set they faced afterwards. Because the hotels in each context effect choice set were different, a different set of hotels was used as a manipulation for each choice set. Where there were more than ten candidate hotels fitting this description, ten were chosen at random. Every participant in the same choice set saw the same ten hotels, but in a random order. The manipulation is illustrated by Fig. 8. Notice that in the attraction and compromise choice sets, only alternative B is promoted. In the similarity set including a decoy close to B, i.e., {A, B, S_A}, both B and the decoy are promoted, because they are close together. We now refer to the promoted alternatives as manipulation targets.
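The selection rule for the manipulation hotels can be sketched as follows. The market data, the attribute values of A and B and the function names are all hypothetical; only the rule itself (dominated by B, not dominated by A, at most ten chosen at random) is taken from the description above.

```python
import random


def dominates(x, y):
    """x dominates y: no higher price, no lower rating, and not the same item."""
    return x != y and x[0] <= y[0] and x[1] >= y[1]


def manipulation_hotels(market, a, b, k=10, seed=0):
    """Hotels dominated by b but not by a; if more than k qualify, k are drawn at random."""
    candidates = [h for h in market if dominates(b, h) and not dominates(a, h)]
    if len(candidates) <= k:
        return candidates
    return random.Random(seed).sample(candidates, k)


# Hypothetical market of 200 hotels as (price, average rating) pairs.
rng = random.Random(1)
market = [(round(rng.uniform(80.0, 400.0)), round(rng.uniform(2.0, 5.0), 1))
          for _ in range(200)]

A = (120, 3.5)   # hypothetical cheaper, lower-rated alternative
B = (250, 4.5)   # hypothetical more expensive, higher-rated alternative

shown = manipulation_hotels(market, A, B)
print(len(shown))
```

Every hotel returned lies in B's solo-dominance region, so under MADS exposure to these hotels should raise B's expected score relative to A's.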

Choice and context effect manipulation
The treatment effect is the difference in the proportion of participants choosing the manipulation targets in the treatment and control conditions. Overall, the proportion of times participants chose the manipulation targets was 22% higher following the treatment (.32 in the control vs. .39 in the treatment; p = .012).
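A two-proportion z-test of the kind used for these comparisons can be sketched as below. The group sizes of 600 per arm are hypothetical, chosen only to match the reported proportions of .32 and .39; the reported p-value rests on the actual counts, which are not given in this section.

```python
from math import erf, sqrt


def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Two-sided two-proportion z-test with a pooled standard error."""
    p1 = successes_1 / n_1
    p2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n_1 + 1.0 / n_2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))
    return z, p_value


# Hypothetical counts: 192 of 600 in the control (.32), 234 of 600 in the treatment (.39).
z, p = two_proportion_z_test(192, 600, 234, 600)
print(round(z, 2), round(p, 3))
```

The relative increase quoted in the text is the ratio of the two proportions minus one, here .39 / .32 - 1, or about 22%.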
Our theory supposes that this manipulation will be successful because the hotels that participants were shown are dominated by the manipulation targets on both attributes, but by the other alternatives in the choice set on only one attribute. Therefore, we expect the manipulation to have the most impact when individuals pay attention to both attributes. Table 2 shows large differences in the manipulation effect depending on whether or not participants paid attention to both attributes equally. The majority of participants paid attention to both attributes equally and of these, the treated participants chose the manipulation targets .17 more (p < .001), corresponding to a relative increase of 47% in the proportion choosing the targets. There was no effect of the manipulation on the choices of participants who paid attention unequally.
MADS is intentionally as simple as possible, and implicitly assumes that participants pay equal attention to both dimensions.⁶ We therefore continue the analysis using data from participants who reported paying equal attention to both dimensions.⁷ Using the data of these participants, Table 3 shows the effect of the manipulation broken down by which context effect choice set the participant was assigned to. Within all three context effects' choice sets, the manipulation effect was in the predicted direction (i.e., positive), although only significantly so for the attraction and similarity effects. Our design permits us to attempt to counter and enhance the three context effects. For example, the attraction effect says that alternative B will be chosen more often from {A, B, T_B} than {A, B, T_A}, with no manipulation. When participants choosing from {A, B, T_A} are instead in the treatment condition, we predict that B will be more popular than if they were not. Therefore, when we compare choices from participants who faced {A, B, T_B} in the control and {A, B, T_A} in the treatment group, we predict that the attraction effect will be reduced. This example corresponds to the first column in Fig. 9. The remaining columns describe which data from which conditions are compared to investigate the impact on the context effects.
⁶ While extensions to the model to account for differential weighting of dimensions are possible, these involve adding additional parameters and compromise the analytic tractability and conciseness of the present approach.
⁷ Subsidiary analyses of the choices made by participants who reported paying unequal attention to the two dimensions were as expected: 73% (56%) of those who paid more attention to price (rating) chose the cheapest (highest-rated) of the three options.
Table 2. Attribute attention: sliding scale of integers {1, ..., 7} where 1 = only considered price, 4 = considered both attributes equally, 7 = only considered quality. Control and Treatment report the proportions of participants choosing manipulation targets in the control and treatment groups respectively. The only difference between the control and the treatment is that those treated first observed and rated ten additional hotels, as described in Section 3.1. Which alternative(s) were manipulation targets depends on which of the choice sets (listed in Table 1) an individual was allocated to. In the attraction and compromise choice sets along with the similarity effect choice set {A, B, S_B}, B was the sole manipulation target. In the similarity effect choice set {A, B, S_A}, both B and the decoy S_A were manipulation targets. Manipulation effect: the difference in the proportion of participants choosing the manipulation targets in the treatment and control conditions. P-values are from two-proportion z-tests against the null of no effect. * indicates p < .05.
Table 3. The manipulation effect of Table 2, broken down by the choice sets that participants were assigned to (there are two choice sets per context effect). Data are from participants who paid equal attention to both attributes. P-values are from two-proportion z-tests against the null of no effect. * indicates p < .05.
Fig. 9. Manipulating context effects: method. Price is on the x-axis, average rating on the y-axis. The treatment was to expose participants to ten hotels (the crosses) prior to them facing one of the choice sets (the solid circles). The treatment was intended to increase the choice share of alternative B within the ensuing choice set, which dominated all these ten hotels. The treated (untreated) participants' experience is represented by the top (bottom) row of diagrams. We predict that the context effects will be countered or enhanced due to this treatment. As an example, consider countering the attraction effect by comparing the choice shares of B of the participants whose experience is represented by (i) the top-leftmost panel, to (ii) the bottom-leftmost panel. Similar comparisons within each of the six columns of the figure allow an assessment of whether the context effects were countered or enhanced.
We know from Table 2 that the manipulations had a significant impact on choices. Fig. 10 shows the resulting impact on the size of each context effect. In the neutral comparisons, the attraction and compromise effects were replicated, but the similarity effect was not, with an effect insignificantly different from zero. Our theory dictates that the presence of context effects is probabilistic, and we are not the first study to find that the similarity effect is the weakest (e.g., Noguchi & Stewart, 2014). 9 More importantly, our manipulations did have an effect. Countering the similarity effect pushed the size down to -.15, which is marginally significantly different from zero (p = .079). When we enhanced the effect, the size became .20, which is significantly different from zero.

Fig. 10. Context effect sizes for the comparisons described in Fig. 9; (N) denotes 'neutral', which refers to our attempt to replicate the context effects. Standard error bars are given. Solid circles refer to a significant context effect, i.e., different from zero at the 5% level in a two-proportion z-test against the null of no effect. Hollow circles refer to no significant difference from zero, and hence no context effect.
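The two-proportion z-tests behind the reported p-values can be sketched as follows. This is a minimal illustration, assuming the pooled-variance form of the test with a two-sided alternative; neither detail is stated in the text, and the function name and the counts in the usage line are hypothetical rather than taken from our data.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test of H0: the two choice proportions are equal.

    Assumption (not stated in the paper): the standard error uses the
    pooled proportion, as is conventional under the null of no effect.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 60 of 100 treated vs. 45 of 100 control participants
# choosing the manipulation target.
z_stat, p_val = two_proportion_z_test(60, 100, 45, 100)
```

A manipulation effect in the sense used above is simply `p_a - p_b`; the test asks whether that difference is larger than sampling noise would produce.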

Design
The data generated by 607 Amazon Mechanical Turk workers were used. The participants of Experiment 2 were also the control group for Experiment 1. Following their choice, and hence exposure to a choice set, we elicited what participants inferred about the rest of the hotel market. It would have been infeasible to ask them for their best guess of all the other 197 hotels in our dataset. Instead, we asked them to estimate the price and quality of 20 of these hotels. They were told that 20 hotels had been randomly selected from our dataset and that they had to estimate where these hotels lay. These 20 did not include the three they had already seen in their choice set. Participants were able to see their choice set throughout the elicitation process, but not to change their choice.
To elicit their estimation of the market distribution, participants completed two screens. On the second screen, we asked participants to plot where they thought these 20 hotels lay in price × rating space, based on the choice set they had seen. That is, they were shown their choice set plotted on a pair of axes and required to plot where they thought the 20 additional hotels were located. An example screen-capture is provided in Fig. 11. The example participant shown chose from a similarity effect choice set, {A, B, S_A}, which typically promotes the cheapest option A (labeled in the figure for the benefit of the reader). The participant has placed nine plots so far. The choice set was displayed in red; plots in green. To avoid possible anchoring effects, we did not include grid lines, axis ticks or axis-tick labels. To scale the axes, we asked for their best guess of the minimum and maximum price and average rating of the 23 hotels (including the choice set they faced). The minimum and maximum value of each axis was determined by participants themselves on the first screen, which they could not return to. Illogical answers, e.g., that the minimum was higher than the maximum, were not allowed. The on-screen size of the plotter was fixed to be a square; participants only determined the scale of the axes. Participants were shown the coordinates where their mouse was hovering, were able to remove the points they had plotted and start again by 'resetting' the graph, and were provided with a counter telling them how many points they still had to place.

9 Note that when we looked at individuals' sampling distributions we did not significantly predict choice in the similarity effect conditions, but did in the attraction and compromise effect conditions (see Table 4).
We did not allow participants to go back to change the minimum and maximum values for the axes due to concerns that participants would tweak their answers to move their choice set around the plotter. We removed the data of participants who entered values so extreme that the choice set would be shown bunched into a corner of the screen. 10 We chose the cut-off to be a price of $800; anyone entering this value or higher was excluded. This removed 35 participants, leaving 572 for analysis.
As an incentive payment, participants were told that the five who plotted closest to the 20 hotels would be paid $5 as a bonus. The procedure we used to determine who was the closest was the modified-Hausdorff metric, as advocated by Dubuisson and Jain (1994). This metric provides a distance based on the average minimum pairwise distances between two sets of coordinates. In our case, these two sets were the participant's plotted data, and the 20 hotels randomly chosen. Participants were not told the details of the metric; they were simply told that the five participants whose plots were 'closest' to the 20 we had would be paid.
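For concreteness, the modified Hausdorff distance can be sketched as below. This is a minimal illustration, assuming Euclidean distance on raw (price, rating) coordinates; how the two attributes were scaled before computing distances is not specified in the text, and the function name is hypothetical.

```python
import numpy as np


def modified_hausdorff(points_a, points_b):
    """Modified Hausdorff distance between two 2-D point sets.

    Each directed distance is the mean, over the points of one set, of the
    distance to the nearest point in the other set; the metric is the
    larger of the two directed distances.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    d_ab = dists.min(axis=1).mean()  # average nearest-neighbour distance, a -> b
    d_ba = dists.min(axis=0).mean()  # average nearest-neighbour distance, b -> a
    return float(max(d_ab, d_ba))
```

Because price (in dollars) and rating vary on very different scales, any practical use would need to standardize the axes first; that step is an assumption left out of the sketch.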

Aggregate distribution elicitation
Each panel of Fig. 12 presents the pooled plots placed by all ≈100 participants per choice set, meaning there are roughly 2000 plots in each panel. We provide the proportions of plots contained within the crucial areas identified by theory. We draw on patterns in these aggregate data which illustrate how choice sets affect distributions and that these distributions move in ways compatible with our theory to produce the context effects. We discuss features of this aggregate distribution data as if it were in fact the distribution of all individuals in order to demonstrate how we consider context effects can arise. The individual-level data are explored in the next subsection.
Recall that MADS supposes that movements in the sampling distributions change the probabilities of alternatives being included in the comparison set, and hence affect choice probabilities. First, we test whether in fact there has been any difference in the distributions elicited across the choice sets for each context effect. Casual inspection of the heat-maps suggests pronounced movement of the density of the plotted points between choice sets. Using a multi-dimensional version of the Kolmogorov-Smirnov test, as proposed by Fasano and Franceschini (1987), we found that within each of the three context effects the distributions elicited from participants differed significantly (p < .001) depending on which choice set they had seen. We now turn to see whether the movements of these aggregate data coincide with MADS's account of the context effects.
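The logic of the two-sample, two-dimensional test can be sketched as follows. The statistic follows the quadrant-counting idea of Fasano and Franceschini (1987); the p-value, however, is obtained here by a permutation procedure swapped in for simplicity, not by the approximate significance formula of the original test. Function names, the number of permutations and the seed are all hypothetical.

```python
import numpy as np


def _max_quadrant_gap(origins, a, b):
    """Largest gap between the two samples' quadrant fractions over all origins."""
    gap = 0.0
    for x0, y0 in origins:
        for sx in (1, -1):
            for sy in (1, -1):
                in_a = (sx * (a[:, 0] - x0) > 0) & (sy * (a[:, 1] - y0) > 0)
                in_b = (sx * (b[:, 0] - x0) > 0) & (sy * (b[:, 1] - y0) > 0)
                gap = max(gap, abs(in_a.mean() - in_b.mean()))
    return gap


def ff_statistic(sample_a, sample_b):
    """Two-sample 2-D KS-type statistic: average of the maximal quadrant
    gaps obtained using each sample's points in turn as quadrant origins."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    return 0.5 * (_max_quadrant_gap(a, a, b) + _max_quadrant_gap(b, a, b))


def permutation_p_value(sample_a, sample_b, n_perm=199, seed=0):
    """Permutation p-value (an assumption of this sketch, not the paper's method)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    observed = ff_statistic(a, b)
    pooled = np.vstack([a, b])
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        stat = ff_statistic(pooled[idx[: len(a)]], pooled[idx[len(a):]])
        exceed += stat >= observed
    return (exceed + 1) / (n_perm + 1)
```

Applied to our setting, `sample_a` and `sample_b` would be the pooled (price, rating) plots elicited under the two choice sets of a given context effect.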
The attraction effect panels show the density of the two solo-dominance regions in each condition that determine the choice

Individual-level estimation results
Finally, we examined whether it was possible to predict individuals' choices. Our model imposes no assumptions or parameters on an individual's sampling distribution, but with only 20 plots per participant, empirical distributions at the individual level are too coarse to reasonably enable prediction. Therefore, we selected the multivariate distribution that best fitted each participant's plots. Various copulas (Gaussian, t, Frank, Gumbel and Clayton) were fitted to each participant's estimate of the distribution of 23 points (20 plotted and 3 from their choice set). 11 Copula selection for each participant was determined by the Akaike Information Criterion. For each participant we then identified the choice that was most likely on the basis of that participant's estimated sampling distribution. The proportion of correct predictions is reported in Table 4. It can be seen that the choices were reasonably well predicted for participants in the attraction and compromise conditions but not in the similarity conditions. These findings are congruent with the fact that we replicated the attraction and compromise effects but not the similarity effect, as shown in Fig. 10.

Fig. 12. Plotting data by choice set and aggregate-PDF values. Aggregate estimated sampling distributions. A lighter tone refers to a higher density of plots. The numbers refer to the proportion of points plotted in that region, i.e., the empirical density. Graphics are cropped at 2/5 for quality, and $500 for price; over 95% of the plotting data is in this range.

Table 4 notes. The null hypotheses are those implied by random prediction between all nondominated choice alternatives. P-values are from two-sided binomial tests.
To compute the estimates of Table 4, it was necessary to estimate the MADS parameter n, which specifies the number of comparison items brought to mind from individuals' sampling distributions. We calculated the probability of participants' choice data for different values of n and found the maximum likelihood estimate to be 4. 12 The estimate is reasonably precise in the sense that it differs significantly from 3 (LR test, p = .021) and marginally significantly from 5 (LR test, p = .091). We emphasize that n = 4 is a psychologically realistic value for working memory capacity (see Cowan, 2001), consistent with the idea that comparison samples are held in working memory during choice.

Discussion and conclusions
MADS offers a concise model of the attraction, compromise and similarity effects. Two experiments tested the assumptions of the model. In Experiment 1, prior exposure to a selection of market options altered subsequent choices in ways predicted by the model and allowed us to reduce and enhance two of the 'big three' context effects documented in consumer choice. Experiment 2 demonstrated that individuals' sampling distributions of market options depend on the choice set in ways required for the model to produce the effects. These results, consistent with recent psychological studies of risky choice (e.g., Harris, 2015; Ungemach, Stewart, and Reimers, 2011) but here pertaining to the classic context effects, suggest that assumptions about background distributions, combined with choices made on the basis of simple dominance relations, are sufficient to give rise to context effects. Moreover, the maximum likelihood estimator of the model's central parameter for the number of comparators sampled took on a psychologically realistic value for the capacity of human working memory.
We note that we have remained theoretically neutral regarding the psychological mechanisms that participants use when inferring background distributions from a set of presented options. Rather, we have simply used the data provided by participants and selected the multivariate distribution that best fits those data. Our account is compatible with a number of ways in which participants might infer distributions (e.g., the Bayesian approaches described by Natenzon, 2016 and Shenoy and Yu, 2013) but (a) we regard the available data as insufficiently constraining to enable selection of a particular mechanism, and (b) we believe that the present account is best served by our avoidance of the additional degrees of freedom that would be provided if we made an arbitrary choice.

11 Copulas are succinct descriptions of the correlation between two variables.

12 Three participants chose the decoy alternative from attraction effect choice sets and were excluded from the estimation. Our model has no additional error term and so predicts such behavior with probability zero. Inclusion of these participants' data would prevent the log-likelihood from being well defined.
Our data were generated by members of Amazon's online platform, Amazon Mechanical Turk (AMT). AMT's participant population has been shown to have the advantages of being more demographically diverse, and producing data of a comparable quality to more traditional participant methods (Chandler, Mueller, and Paolacci, 2014). This has been shown through many studies replicating classic experiments in various domains including cognitive psychology (e.g., Paolacci, Chandler, and Ipeirotis, 2010; Goodman, Cryder, and Cheema, 2013) and economics (e.g., Horton, Rand, and Zeckhauser, 2011). Some have expressed concern that AMT participants may not pay sufficient attention to the choice alternatives, which may lead to a failure to find the attraction effect (Simonson, 2014). However, we have demonstrated that the classic context effects in consumer choice can be found with such samples.
Our approach is rooted in and brings together various approaches within cognitive science, consumer psychology and economics. In economics, some recent theoretical approaches have been developed to show how anomalous choice behaviors can be explained by cognitive limitations such as binary ordinal comparison (e.g., Kornienko, 2013), memory limitations in forecasting (Mullainathan, 2002), psychological salience (Bordalo, Gennaioli, & Shleifer, 2012b) or as optimal responses to noise (e.g., Howes, Warren, Farmer, El-Deredy, & Lewis, 2016; Robson, 2001; Steiner & Stewart, 2016). Our approach falls within this tradition and also draws on information-sampling models of judgment, most of which assume that judgments are typically made on the basis of limited samples (e.g., Fiedler, 2000; Fiedler & Juslin, 2006; Fiedler & Kareev, 2006; Lindskog, Winman, & Juslin, 2013). Relevant alternatives are assumed typically to be retrieved from memory as well as, or instead of, being sampled from the choice context. 13 More specifically, our model can be seen as an extension of rank-based models such as DbS (Stewart et al., 2006) and sampling models that assume options are evaluated relative to an assumed background distribution (see also Kornienko, 2013). Related to suggestions in economics (Kamenica, 2008), marketing (Wernerfelt, 1995) and cognitive psychology (Shenoy & Yu, 2013; Sher & McKenzie, 2014), MADS assumes that people's inferences about the relevant background distributions are influenced by the context of choice options. MADS, however, both extends DbS to the multi-dimensional case (see also Stewart & Simpson, 2008), and specifies the role of choice options in causing the background distribution to be updated.
More generally, MADS makes the same predictions as the classical utility paradigm as a limiting case: If we allow the number of items sampled to approach infinity and assume individuals' sampling distributions do not depend on the choice sets they face, then choices become deterministic and context independent, hence context effects are not predicted.
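This limiting behaviour can be illustrated with a small Monte Carlo sketch. It rests on several assumptions that are ours for illustration only: both attributes are coded so that higher is better (price would enter negated), an option's score is the number of sampled comparators it dominates, the highest score is chosen with ties broken uniformly at random, and the sampling distribution is a fixed uniform distribution on the unit square. All names and parameter values are hypothetical.

```python
import numpy as np


def mads_choice_shares(options, sampler, n_comparators, n_trials=2000, seed=0):
    """Simulated choice shares under a dominance-counting rule.

    options: array-like of shape (k, 2), higher is better on both attributes.
    sampler: callable (rng, n) -> array of shape (n, 2) of comparators.
    """
    rng = np.random.default_rng(seed)
    opts = np.asarray(options, dtype=float)
    wins = np.zeros(len(opts))
    for _ in range(n_trials):
        sample = sampler(rng, n_comparators)
        # Option i dominates comparator j if it is at least as good on both
        # attributes and strictly better on at least one.
        at_least = (opts[:, None, :] >= sample[None, :, :]).all(axis=2)
        strictly = (opts[:, None, :] > sample[None, :, :]).any(axis=2)
        scores = (at_least & strictly).sum(axis=1)
        best = np.flatnonzero(scores == scores.max())
        wins[rng.choice(best)] += 1  # uniform tie-breaking (assumption)
    return wins / n_trials


def uniform_market(rng, n):
    """Fixed, context-independent sampling distribution (assumption)."""
    return rng.uniform(0.0, 1.0, size=(n, 2))


# Two non-dominated options; B dominates a slightly larger share of the market.
A, B = (0.8, 0.3), (0.4, 0.7)
shares_small_n = mads_choice_shares([A, B], uniform_market, n_comparators=4)
shares_large_n = mads_choice_shares([A, B], uniform_market, n_comparators=2000,
                                    n_trials=500)
```

With few comparators, choice is close to an even split; as the number of comparators grows, the option with the larger dominated region is chosen almost always, which is the deterministic, context-independent limit described above.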
Our approach differs from those found in both economics and psychology. Initial explanations within psychology focused on one or two of the context effects at a time, making reference to decision strategies such as elimination-by-aspects (Tversky, 1972) or concepts such as loss aversion (Simonson & Tversky, 1992; Tversky & Simonson, 1993). Since then, process models have typically had difficulty in accounting for all three effects within a unifying framework without resorting to arguably ad hoc parameters or separate mechanisms. Simonson and Tversky (1992) and Tversky and Simonson (1993) propose two concepts. A tradeoff contrast operates via either the local context (choice set) or the background context. The introduction of a dominated alternative then enhances the relative tradeoff of attributes of the dominating alternative, leading to the attraction effect. Extremeness aversion specifies that the absolute advantages and disadvantages of a choice option (relative to the other options) are weighed by a loss-averse decision maker. A compromise (middle) alternative would then notch up smaller losses through comparison to the other choice options, whereas extreme alternatives suffer from larger losses, leading to the compromise effect. In other models, attraction and compromise effects are attributed to loss aversion, e.g., in the Leaky Competing Accumulator model (Usher & McClelland, 2004), or to attention switching and mutual inhibition occurring between choice options, as in Multi-alternative Decision Field Theory (Roe et al., 2001). Bhatia (2013) proposes a model in which the accessibility of attributes is determined by the attributes' associations with objects of potential choice. More accessible attributes in turn carry higher weight in an evidence accumulation process. These models, among others, can all account for the three key context effects.
However, in each case the three effects cannot be explained in terms of a single mechanism. In Bhatia's model for example, the three effects can occur simultaneously but will not do so under all parameter settings. In the Multiattribute Linear Ballistic Accumulator model (Trueblood et al., 2014), the attraction and compromise effects arise because objects that are closer to each other receive larger attention weightings, whereas the similarity effect occurs when confirmatory evidence is given more weight than disconfirmatory evidence. Finally, in work regarding judgment rather than consumer choice tasks, Bhatia (2014) investigates the role of confirmatory search processes in explaining the attraction effect. He finds increased retrieval of cues favoring the target option when a decoy is present, and that the attraction effect can be removed by manipulating the availability of relevant cues. Our model is similar in spirit in that the proportion of 'cues' (or 'sampled items' in our case) that favor the target alternative is influenced by the choice set. However, MADS differs in that it is built to model choice rather than judgment, and in that it is more general e.g., applying also to the compromise and similarity effects.
The models developed within economics also do not generally offer an account of all three effects. However, in the models that offer explanations of the attraction (and some also the compromise) effect, there is a recurring emphasis on dominance comparisons, which resonates with our approach. In a model of limited attention, Masatlioglu et al. (2012) suggest that the attraction effect reveals that the target alternative only enters the consideration set when the dominated decoy is present (for a related analysis under stochastic choice, see also Manzini and Mariotti, 2014). Ok et al. (2015) show how dominated choice alternatives can endogenously act as reference points and constrain the consideration set to include only the dominating alternatives, leading to the attraction effect. de Clippel and Eliaz (2012) model the cooperative bargaining problem of an individual with two selves, where each has a preference ordering over one attribute: Consider a choice alternative receiving two scores obtained by counting its number of favorable ordinal comparisons within each attribute and define the 'minimal score' for the alternative as the lower of these two numbers. The authors show that the solution to the bargaining problem is to select the choice alternative with the highest minimal score. The approach predicts the attraction and compromise effects, but not the similarity effect. More broadly, in these frameworks choice is deterministic. In MADS, stochastic choice follows directly from the presence of the sampling process. Furthermore, it is unclear how these other models could explain the effects of alternatives not present in the choice set e.g., the 'phantom decoy' effect, which can be thought of as working in the opposite direction to the attraction effect. 
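The minimal-score rule attributed above to de Clippel and Eliaz (2012) can be sketched in a few lines. This is an illustration under our own assumptions: attributes are coded so that higher is better, a 'favorable ordinal comparison' is read as strictly beating another alternative in the choice set on that attribute, and all maximizers are returned when there is a tie. The function names and attribute values are hypothetical.

```python
def minimal_scores(alternatives):
    """Map each alternative to the lower of its two within-attribute scores.

    alternatives: dict name -> (attribute_1, attribute_2), higher is better.
    A within-attribute score counts the other alternatives that the
    alternative strictly beats on that attribute.
    """
    scores = {}
    for name, values in alternatives.items():
        per_attribute = [
            sum(values[k] > other[k]
                for other_name, other in alternatives.items()
                if other_name != name)
            for k in range(2)
        ]
        scores[name] = min(per_attribute)
    return scores


def bargaining_choice(alternatives):
    """Alternatives with the highest minimal score (all of them if tied)."""
    scores = minimal_scores(alternatives)
    best = max(scores.values())
    return sorted(name for name, s in scores.items() if s == best)


# Attraction: the decoy T_B is dominated by B only, so B is selected.
attraction = {"A": (1, 3), "B": (3, 1), "T_B": (2.5, 0.5)}
# Compromise: C lies between A and B on both attributes, so C is selected.
compromise = {"A": (1, 3), "C": (2, 2), "B": (3, 1)}
```

Here `bargaining_choice(attraction)` returns `['B']` and `bargaining_choice(compromise)` returns `['C']`, while the binary set containing only A and B yields a tie. The rule is deterministic given the choice set, which is the contrast with MADS drawn in the text.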
Taking a Bayesian approach, Natenzon (2016) supposes that decision-makers receive information about their preferences when they inspect the alternatives in the choice set, forming posterior beliefs over various possible underlying stable preference orderings. The author posits that dominance relations make for a simple comparison and so emphasize dominant options, leading to the attraction effect. Furthermore, Natenzon shows that if the precision of signals about the decision-maker's utility is sufficiently low, the compromise effect is predicted. The model we develop drops the classical economic assumption of a stable underlying preference ordering, nesting it as an extreme case. Instead, MADS relies on simple binary dominance relations, limited sampling and systematic changes in sampling distribution in order to allow the context to determine choice. 14 As a result of its possession of these features, MADS contrasts strongly with other models of context effects developed in economics. In particular, a number of economic models assign a key role to the differential weighting of consumption dimensions that may result from changes in the choice context. For example, Bushong et al. (2015) assume that a dimension is weighted less when the range of values on that dimension (from worst to best) increases. The focusing model of Kőszegi and Szeidl (2013) assumes in contrast that greater weight is assigned to dimensions on which choice options exhibit more variation. As Bushong et al. note, range-based models have no natural way to incorporate compromise effects without augmentation. Moreover, other more general considerations may be thought to militate against range-based dimensional weighting models.
First, Wedell has argued in a number of papers (e.g., Wedell, 1991;Wedell, 1998;Wedell and Pettibone, 1996) against the idea that simple context effects of the type discussed here reflect changes in the weighting of relevant dimensions (although we note Bushong et al.'s observation that weighting may become increasingly important as the number of potentially relevant dimensions increases beyond the two that we consider here). More generally, psychophysical research on the subjective judgment of magnitudes has shown that the perceived magnitude of an attribute value (and hence the perceived difference between attribute values) is influenced not just by the range of contextual stimuli, but also by the relative ranked position that each attribute value occupies within a comparison context. Moreover, there have been recent suggestions that apparent range effects may really be 'rank effects in disguise' (Brown and Matthews, 2011;Brown, Wood, Ogden, and Maltby, 2015). MADS, with its assumption that binary ordinal comparisons form the basis of choice, aligns closely to this tradition of research. Indeed, the ordinal comparisons that MADS assumes are precisely the same as those that are assumed to underpin judgment and choice in psychological process models such as DbS (Stewart et al., 2006). We therefore view MADS as aligning more closely with a range of psychological evidence than do range-based models.
We have focused on possibly the three most widely discussed context effects in the literature. Finally, we note here that MADS is able to capture more. Firstly, another documented context effect is the 'phantom decoy' effect (Pratkanis & Farquhar, 1992; Pettibone & Wedell, 2000, 2007). There, the decoy option (which is unavailable for choice) dominates one alternative (the target) but not the other (non-target). The effect is present when the target's choice share is higher than the non-target's. MADS predicts the effect when the decoy changes the sampling distribution over the product space in such a way that the density in the target's solo-dominance region increases by more than the density in the non-target's solo-dominance region. As the decoy is typically close to the target, we consider it plausible that this area of the sampling distribution would be inflated, leading to an increase in the choice share of the target relative to the non-target. Secondly, more distant decoys may lead to larger context effects (e.g., Soltani, De Martino, & Camerer, 2012). For the case of the attraction effect, consider moving T_B in Fig. 3 (right panel) further away from the target, B (but such that T_B is still dominated only by B). MADS will predict a stronger attraction effect with this more distant decoy if the sampling distribution becomes further stretched out (as a result of the decoy becoming more distant) and hence more density moves into B's solo-dominance region. This movement will increase the probability of B being chosen and hence strengthen the attraction effect. Thirdly, Teppan and Felfernig (2009) find that the attraction effect can be offset by introducing two decoys (rather than one) to a binary choice set {A, B}, one dominated by A only, one by B only. MADS naturally predicts this to happen: the sampling distribution would plausibly be pulled down approximately equally by both decoys, giving A and B a more equal share of the density in their solo-dominance regions compared to a less equal share following an attraction effect choice set, when only one of these decoys is present.

14 We also note that there is work in economics on sampling at the intersection of industrial organization and bounded rationality. Spiegler (2006b) examines the consequences for a market when it is assumed that consumers sample one item from past experience, and in Spiegler (2006a), one attribute of a complex product. More recent work has focused on equilibrium in markets when consumers exhibit some degree of trade-off aversion and employ some heuristic (e.g., Bachi and Spiegler, 2015; Papi, 2014). In particular, Bachi and Spiegler (2015) study two-attribute goods and assume that a dominant alternative is chosen when it exists.
Denote $\underline{p}$ and $\overline{p}$ as a lower bound for $p(A \mid ABT_A)$ and an upper bound for $p(A \mid ABT_B)$ respectively; hence $\underline{p} > \overline{p} \Rightarrow p(A \mid ABT_A) > p(A \mid ABT_B)$, the attraction effect.