The geometry of distributional preferences and a non-parametric identification approach

This paper proposes a geometric delineation of distributional preference types and a non-parametric approach for their identification in a two-person context. It starts with a small set of assumptions on preferences and shows that this set (i) naturally results in a taxonomy of distributional archetypes that nests all empirically relevant types considered in previous work in economics and social psychology; and (ii) gives rise to a clean experimental test design that discriminates between archetypes according to core features of preferences rather than properties of specific modelling variants. As a byproduct, the test yields a two-dimensional index of preference intensity.
JEL Classifications: C90, D63, D64, C81, B41

This is correct, of course. As will become clear below, the main innovation of the present paper in this regard is to derive the number and core properties of types from a small set of primitive assumptions on preferences. This is in contrast to previous studies which either start with a given list of types or a specific model of preferences.
6 From the papers mentioned in the main text, Charness and Rabin (2002), Engelmann and Strobel (2004), Cabrales et al. (2010), Blanco et al. (2011) and Iriberri and Rey-Biel (2013) are examples of the former track (starting with a given set of types and designing tests to discriminate between the members of the set; or starting with a functional form and estimating the free parameters), while Andreoni and Miller (2002) and Fisman et al. (2007) are examples of the latter track (starting with a test without specifying a priori which types are tested for).
7 Standard references are Griesinger and Livingston (1973) and Liebrand (1984).
8 See, for instance, Offerman et al. (1996), Sonnemans et al. (1998), van Dijk et al. (2002), Brosig (2002)
Blanco et al. (2011) and Iriberri and Rey-Biel (2013) employ identification procedures based on the piecewise linear model originally introduced by Fehr and Schmidt (1999) as a description of self-centered inequality aversion and later extended by Charness and Rabin (2002) to allow for other forms of distributional concerns and thereby assume piecewise linearity; and Andreoni and Miller (2002), Fisman et al. (2007) and Cox and Sadiraj (2012) check consistency with, and estimate parameters of, standard or modified constant elasticity of substitution (CES) utility functions.
Summing up the above discussion, we conclude (i) that there is agreement in the literature neither on what the relevant set of basic distributional motivations (defined as the manner in which people care about the (material) well-being of others) is, nor on how to delimit distributional types; and (ii) that existing studies employ identification procedures that rely on strong structural assumptions such as linearity, piecewise linearity, or standard or modified CES forms. By using a systematic approach based on a small set of primitive assumptions on preferences, the present paper offers an improvement in both dimensions. It shows (i) that this set of assumptions naturally results in a well-delineated, mutually exclusive and comprehensive distinction between nine archetypes of distributional concerns; and (ii) that this set gives rise to a simple non-parametric experimental test that discriminates between the archetypes according to core features of preferences rather than properties of specific modeling variants or functional forms. As a byproduct, the test yields a two-dimensional index of preference intensity.
While the primary purpose of this paper is methodological, the experimental results obtained in an implementation of the test also produce some substantive insights. One is the result that, consistent with the theoretically appealing assumption that distributional preferences are convex, about 95% of the subjects reveal (weakly) more benevolent (less malevolent) preferences in the domain of advantageous than in the domain of disadvantageous inequality. A second interesting detail is that, beyond selfish subjects, the empirically most frequent distributional archetypes are those that exhibit (at least weakly) positive attitudes towards others in both domains (i.e., altruism and maximin), while archetypes that imply a negative attitude in at least one of the domains are empirically far less important (the behavior of less than a fourth of the subjects is consistent with any form of inequality aversion, for instance, and the choices of less than 7% of the subject population are consistent with spite). 9
9 The finding that there are very few malevolent people in variants of the dictator game is in line with previous studies; see Charness and Rabin (2002) or Engelmann and Strobel (2004), for instance.
The rest of the paper is organized as follows: Section 2 starts by introducing the assumptions on which the analysis is based and argues that those assumptions are fulfilled by all major modeling variants of distributional preferences discussed in the economic literature. It continues by highlighting the core features of different distributional preference types discussed in the literature and shows that delimiting types according to the proposed assumptions naturally leads to a distinction between nine mutually exclusive and comprehensive archetypes of distributional concerns. Section 3 introduces the identification procedure (the "XY-test"). It starts with the intuition behind the proposed identification approach (Subsection 3.1) and then presents the basic version of the test (Subsection 3.2). Furthermore, a two-dimensional index for identifying the archetype and characterizing the intensity of distributional concerns, the (x, y)-score, is introduced, and a graphical representation of the type-intensity distribution is proposed (in Subsection 3.3). The section ends with a discussion of several extensions and modifications of the test (in Subsection 3.4). Section 4 highlights the main differences between the current approach and other tests proposed in the literature. Section 5 relates the (x, y)-score to other measures of type and intensity of distributional concerns and discusses the pros and cons of replacing it by a cardinal metric (like the willingness to pay for changes in the income of the other). Section 6 illustrates the working of the identification procedure by reporting experimental results generated with the symmetric basic version of the test, and Section 7 concludes.
Implementation issues for the case where the test is used as a tool in experimental economics (to address research questions in which distributional preferences play an important role, to control for subject pool effects, or to help to interpret data from other unrelated experiments) are discussed in Appendix A, while Appendix B contains a more detailed discussion of the ring-test, probably the closest competitor to the current approach when intended as an add-on to other experiments. Appendix C contains the instructions of the experiment reported in Section 6.

The Three Basic Assumptions
Let a = (m, o) denote an income allocation that gives material payoff m (for "my") to the decision maker (DM or "agent") and material payoff o (for "other") to the other person. In terms of axioms on preferences, Assumption 1 requires ordering (completeness and transitivity) and continuity. While ordering is important for the arguments below (as it is for substantial parts of economic theory), continuity is not. 10 The second assumption requires that, holding the material payoff of the other person constant, the DM's utility is strictly increasing in her own material payoff:
Assumption 2 (strict m-monotonicity): Holding o constant, the agent's well-being is strictly increasing in m. That is, ∂u/∂m > 0 for all (m, o) ∈ ℝ².
Strict m-monotonicity is quite a natural assumption. It is violated, for instance, if the DM is willing to burn her own monetary payoff because she feels bad whenever she has (much) more than the other person. Such behavior is essentially never observed in experiments. The third assumption (piecewise o-monotonicity) requires that the DM's general attitude towards the other person (i.e., whether she is benevolent, neutral, or malevolent to the other) depends only on whether the other person has more or less monetary payoff than the DM herself.
Piecewise o-monotonicity is both permissive and restrictive, depending on the perspective. It is permissive because it allows for all major variants of distributional preferences that have been discussed in the economic literature (see the discussion at the end of this and in the next subsection). Piecewise o-monotonicity is also restrictive because it implies (i) that preferences only depend on outcomes, not on the way they are achieved (this is the defining feature of distributional preferences); and (ii) that the reference point for the evaluation of allocations (if one is used) is an equal-material-payoffs allocation.
Ad (i): The implication that preferences only depend on outcomes is likely to be violated [...] Furthermore, in a richer environment, where agents have more information on each other, beliefs about the other-regarding concerns of the other person may play a role (as in the literature on type-based models cited in Footnote 4). Finally, features of the situation (such as context, entitlements, properties of the outcome generating process, etc.) or the DM (such as a code of conduct, or a preference for honesty) might shape behaviour. Knowing that all those factors might be behaviourally relevant in a richer environment, it seems important that distributional preferences are identified in a non-strategic setting and a neutral frame to avoid confounds. This is not to say that distributional preferences are unimportant in richer environments, of course, but rather that they cannot be unambiguously identified there.
10 Continuity simplifies the presentation of the assumptions and the description of the core features of different archetypes of distributional preferences but is neither needed nor used in any other part of the paper. Continuity is not needed because the identification procedure proposed here uses information on the shape of revealed upper and lower contour sets to derive bounds on indifference sets. So, from a theoretical point of view, the procedure works even if indifference sets are singletons (as is the case for "lexself" preferences discussed by Fisman et al. 2007, for instance).
Ad (ii): Some distributional archetypes discussed in real life and in the literature (most importantly, inequality aversion and egalitarian motives; maximin, Rawlsian and Leontief preferences; and envy) are inevitably defined in terms of a "reference location" (an interval, a point, or whatever) where the DM's general attitude towards the other changes sign. In theory, this reference location can be anything, of course, and it can differ among individuals.
In existing models of reference-dependent distributional concerns, the reference location is a point, and the point is the egalitarian one for all individuals (see, for instance, Bolton 1991, Mui 1995, Fehr and Schmidt 1999, Bolton and Ockenfels 2000, or Charness and Rabin 2002). While Assumption 3 is more agnostic than existing models of reference-dependent distributional concerns, it is still restrictive. 11 For instance, there might exist individuals who consider it fair to get 20% more than others but unfair to get 30% more. Assumption 3 does not allow for this. While it would be feasible, in principle, to generalize Assumption 3 (and the test relying on it) so as to allow for heterogeneous reference points, this would seriously impair simplicity and transparency: Ultimately, the aim of the paper is to propose a classification of subjects into distributional preference types that is helpful in organizing experimental data. For that purpose we need some kind of clustering and not a different distributional type for each single individual. Stated differently, like any model, the approach proposed here is by design an abstraction of reality, and hence is deliberately constructed so as to not explain some behavior, in return for parsimony.
While parsimony calls for a unique reference point, it does not suggest equality as the reference point. Equality is suggested (a) by normative considerations and (b) by empirical evidence.
Regarding the normative basis, Konow (1993, p. 1194) argues in his review of theories of justice that "[t]he most primitive, and probably oldest, notion of justice associates equity with equality." And indeed, two of the most prominent theories of distributional justice, Egalitarianism and Rawlsianism, have explicitly been formulated with reference to equality. Rawls (1971), for instance, claims that two equity principles would be chosen in the 'original position' (that is, 'behind a veil of ignorance'), first the equality principle asking for equal rights and opportunities, and second the difference principle demanding that goods are distributed equally unless an unequal distribution is to the advantage of the least favoured. This is nothing else than a maximin rule for the distribution of resources that has equality as a benchmark. One might be inclined to ask what the Rawlsian original position has to do with the distributional concerns of subjects in the lab, or more generally, with the fairness perceptions of actual people. Binmore (2005, p. 18) answers this question as follows: "All fairness norms in actual use share the deep structure of Rawls' original position. This deep structure is biologically determined, and hence universal in the human species [....]." And on another occasion (on p. 14) he argues: "Why do we care about fairness? ... [W]e care because fairness is evolution's solution to the equilibrium selection problem for our ancestral game of life. What evidence is there for this conjecture? All the societies studied by anthropologists that survived into modern times with a pure hunter-gathering economy had similar social contracts with a similar deep structure. [...] They tolerate no bosses, and they share on a very egalitarian basis." The statement that equality is evolution's solution to the equilibrium selection problem is debatable, of course; the statement that equality is suggested by normative considerations seems safe, though.
11 What Assumption 3 essentially requires is that the egalitarian outcome is somehow focal among those subjects who change their general attitude (i.e., whether they are benevolent, neutral or malevolent) towards others at some point. It does not require the attitude to change, though. In other words, while existing models of reference-dependent distributional concerns assume that "something special happens at equality", Assumption 3 "only" requires that "if something special happens with preferences then it happens around equality".
Regarding empirical evidence, Andreoni and Bernheim (2009, p. 1607f) cite several studies showing that equal sharing is common in the context of joint ventures among business firms, partnerships among professionals, share tenancy in agriculture, and bequests to children. They also provide evidence indicating that equality is a frequent outcome of negotiation and conventional arbitration in the field. In lab experiments, the assumption that the egalitarian outcome is somehow focal among subjects who change their general attitude towards others at some point seems even more natural than in the field: Subjects enter the laboratory as equals, their roles are assigned randomly and they have absolutely no information about each other. It seems therefore quite plausible that those subjects who attribute special meaning to an allocation (again, nothing in Assumption 3 requires them to do so) do this to the egalitarian one. And there is indeed considerable support for this assumption in existing experimental data.
For instance, one of the stylized facts in standard dictator games is precisely that a sizeable fraction of the subject population voluntarily cedes exactly half of the pie to the recipient, and that very few subjects cede more (Camerer 1997). This result survives even in experiments where the action space is continuous and where the price for giving is quite high (see Andreoni and Miller 2002, for instance). The frequency of equal divisions is even higher in ultimatum games, where expectations about the "reference point" of the recipient enter the picture (see Camerer 2003). While all this evidence indicates that the egalitarian outcome is special for a substantial fraction of subjects, it does not tell us anything about the exact fraction of subjects for whom this is the case. 12 But this is exactly (one of) the question(s) the proposed test tries to address.
As is easily checked, almost all (modeling) variants of distributional preferences discussed in the economics literature satisfy Assumptions 1-3. Notable exceptions are lexself preferences (discussed by Fisman et al. 2007), which in a strict interpretation violate the continuity part of Assumption 1, and maximin (or Rawlsian, or Leontief) preferences (discussed by Andreoni and Miller 2002, Charness and Rabin 2002, and Engelmann and Strobel 2004), which in their purest form (but not in the form typically discussed in the literature) violate strict m-monotonicity. 13

Nine Archetypes of Distributional Preferences and their Core Features
This subsection highlights the core features of different distributional preference types discussed in the literature and shows that delimiting types according to the three basic assumptions introduced in the previous subsection (the ordering part of Assumption 1, strict m-monotonicity and piecewise o-monotonicity) naturally leads to a distinction between nine mutually exclusive and comprehensive archetypes of distributional concerns.
• First consider selfish or own-money-maximizing preferences. They can be considered a degenerate version of distributional preferences where an agent's well-being neither increases nor decreases in the monetary payoffs of other agents. Thus, the core property of selfish preferences in a two-person context is that indifference curves in (m, o) space are vertical (see Table 1 for the mathematical statement and Figure 1 for the geometric representation).
• The well-being of an altruistic agent increases in the monetary or utility payoffs of other agents (Becker 1974, Andreoni and Miller 2002); the well-being of an efficiency loving or surplus maximizing agent (Engelmann and Strobel 2004), the well-being of an agent with perfect substitutes preferences (Andreoni and Miller 2002) and the well-being of an agent with social welfare preferences (Charness and Rabin 2002, Fisman et al. 2007) increases in the (weighted or unweighted) sum of payoffs. In all cases, well-being increases in o everywhere; thus, indifference curves in (m, o) space are negatively sloped everywhere (if o increases, m has to decrease to hold the agent indifferent).
12 Here note that an egalitarian subject, according to the definition given in the next subsection, does not necessarily decide for an egalitarian allocation in a dictator game: If her preferences are smooth and satisfy strict m-monotonicity, she will rather accept some advantageous inequality, as this increases the utility derived from the own money component at a low cost in terms of the second component; similar arguments hold for other reference-dependent motives and other game forms.
13 The ERC model by Bolton and Ockenfels (2000) permits even violations of weak m-monotonicity. The same is true for (models of) some "social value orientation types" (the synonym for distributional preference types used by social psychologists), most notably "martyrdom", "masochism" and "sadomasochism" (see Appendix B for details). It is important to note, however, that even in the social psychology literature violations of m-monotonicity are empirically irrelevant (I know of no study finding more than 5% of subjects in the mentioned categories).
• An agent is spiteful (Levine 1998), or competitive (Charness and Rabin 2002), or status seeking or interested in relative income (Duesenberry 1949), if her well-being decreases in the payoffs of others everywhere; so the core property of such preferences is positively sloped indifference curves in (m, o) space.
• The well-being of an envious or grudging agent decreases in the payoffs of agents who have more but is unaffected by the payoffs of agents who have less (the role of envy has been emphasized by Bolton 1991 and Mui 1995, for instance); thus, the core property of envious preferences is positively sloped indifference curves in the domain of disadvantageous inequality and vertical indifference curves in the domain of advantageous inequality. 14
• The well-being of an agent with maximin preferences (Engelmann and Strobel 2004), Rawlsian preferences (Charness and Rabin 2002), or Leontief preferences (Andreoni and Miller 2002, Fisman et al. 2007) increases in the lowest of all agents' payoffs. Thus, its defining feature in a two-person context is that indifference curves in (m, o) space are negatively sloped if inequality is advantageous and vertical otherwise.
• An agent is inequity or inequality averse (Fehr and Schmidt 1999, Bolton and Ockenfels 2000), or difference averse (Charness and Rabin 2002, Fisman et al. 2007), or egalitarian (Dawes et al. 2007, Fehr et al. 2008) if she incurs a disutility when other agents have either higher or lower payoffs (as in the model by Fehr and Schmidt 1999), or when the agent's payoff differs from the average payoff of all agents (as in Bolton and Ockenfels 2000). Consequently, the defining feature of inequality averse or egalitarian preferences in a two-person context is negatively sloped indifference curves in the domain of advantageous and positively sloped indifference curves in the domain of disadvantageous inequality.
• The opposite constellation, benevolence in the domain of disadvantageous inequality combined with malevolence in the domain of advantageous inequality, is referred to as equality aversion (by Hennig-Schmidt 2002, for instance), or as equity aversion (e.g. by Charness and Rabin 2002 and by Fershtman et al. 2012). Its defining feature in a two-person context is that indifference curves in (m, o) space are positively sloped below and negatively sloped above the 45° line.
14 Envy has also been discussed by Kirchsteiger (1994). His definition of envy corresponds to the current definition of "spite", though.
Note that the nine types listed in Table 1 and displayed in Figure 1 are well delimited, mutually exclusive and comprehensive. Also note how the three basic assumptions introduced earlier enter the picture: ordering and continuity translate into existence and uniqueness of indifference curves through any point in (m, o) space; strict m-monotonicity means that upper contour sets are to the right of an indifference curve (the arrows in Figure 1); and piecewise o-monotonicity requires that the sign of the slope of an indifference curve (i.e., whether the agent is benevolent, neutral or malevolent) changes at most once, when crossing the equal-material-payoff line. Thus, Assumptions 1-3 together naturally result in a distinction between the nine mutually exclusive and comprehensive archetypes listed in Table 1 and displayed in Figure 1, meaning that qualitatively there is no room left for additional types.
15 A basic disposition related to our "kick-down" preferences has recently been discussed (by Kuziemko et al. 2011, for instance) under the heading "last-place aversion". A last-place averse individual has a psychological disgust against being "last", which creates a propensity for low-income individuals to punish individuals slightly below or above themselves, in the hope of keeping at least one agent below them.
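The three-by-three logic behind the nine archetypes can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function names, the evaluation points and the example utility functions are hypothetical, and the placement of "kiss-up" and "kick-down" in the grid is inferred by elimination from the other seven types rather than taken from Table 1. The sketch determines the sign of ∂u/∂o once in each domain (which suffices under piecewise o-monotonicity) and looks up the archetype.

```python
from typing import Callable

Utility = Callable[[float, float], float]

# Keys: (attitude when behind, i.e. o > m; attitude when ahead, i.e. o < m),
# with +1 = benevolent, 0 = neutral, -1 = malevolent.
# The "kiss-up" and "kick-down" cells are inferred by elimination (assumption).
ARCHETYPES = {
    (+1, +1): "altruistic",
    (0, +1): "maximin",
    (-1, +1): "inequality averse",
    (+1, 0): "kiss-up",
    (0, 0): "selfish",
    (-1, 0): "envious",
    (+1, -1): "equality averse",
    (0, -1): "kick-down",
    (-1, -1): "spiteful",
}


def attitude(u: Utility, m: float, o: float, h: float = 1e-4, tol: float = 1e-6) -> int:
    """Sign of du/do at (m, o), estimated by a central difference."""
    slope = (u(m, o + h) - u(m, o - h)) / (2 * h)
    return (slope > tol) - (slope < -tol)


def archetype(u: Utility, e: float = 10.0, d: float = 2.0) -> str:
    """Classify u by its attitude at one point in each inequality domain."""
    behind = attitude(u, e, e + d)  # disadvantageous inequality (o > m)
    ahead = attitude(u, e, e - d)   # advantageous inequality (o < m)
    return ARCHETYPES[(behind, ahead)]


def piecewise_linear(w_behind: float, w_ahead: float) -> Utility:
    """Hypothetical example: weight on o depends on who is ahead."""
    def u(m: float, o: float) -> float:
        w = w_behind if o > m else w_ahead if o < m else 0.0
        return (1 - w) * m + w * o
    return u


print(archetype(piecewise_linear(-0.5, 0.3)))  # inequality averse
print(archetype(lambda m, o: m**0.7 * o**0.3))  # altruistic
```

Because the classification only uses the sign of the attitude in each domain, a smooth utility function and a piecewise linear one are treated in exactly the same way.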

Figure 1: Typical Indifference Curves of the Nine Archetypes of Distributional Concerns
Arrows → indicate the locus of upper contour sets.
There is room left for discussions on names, of course (see Footnote 14 for an example). And there is room left for discrimination within a given class; for instance, it might be interesting and important to discriminate between altruism and efficiency (or cake-size) concerns (both imply negatively sloped indifference curves in both domains), or between the Fehr and Schmidt (1999) and the Bolton and Ockenfels (2000) model of inequality aversion (both imply positively sloped indifference curves above and negatively sloped indifference curves below the 45° line). Although discrimination within a given class is not the main focus of the paper, Subsection 3.4 discusses a test version that might turn out to be helpful for this task as well.
Before proceeding, it seems important to address two potential critiques. One is that the nine archetypes defined above are not really new. This is correct, of course. The main contribution of the present paper is not to introduce new preference types; one of the goals is rather to derive the number and core properties of preference types from a small set of primitive assumptions on preferences. This stands in contrast to previous studies which either start with a given list of types or a specific model of preferences. A second, related critique is that a list of archetypes similar to the one presented in Table 1 could also be obtained by working off the possible sign combinations of the two parameters in the piecewise linear model originally introduced by Fehr and Schmidt (1999) as a description of self-centred inequality aversion and later extended by Charness and Rabin (2002) to allow for other forms of distributional concerns. If one is willing to assume that subjects have preferences of this very specific form, then this critique is justified. But a major point in the current paper is exactly that there is no need to impose such a tight structure. 16 This is true both for the type delineation and for the elicitation procedure. Stated differently, all modelling variants of distributional preferences satisfying the three assumptions introduced in Subsection 2.1 and all distributional archetypes tested for in previous experiments fall into one of the nine categories defined here. This is also true for the Charness and Rabin model, of course. On the other hand, there are many models of distributional preferences in the economic literature that do not fit into the piecewise linear framework of Charness and Rabin: the altruism models by Andreoni and Miller (2002) and Cox and Sadiraj (2007, 2012), the envy model by Bolton (1991), and the inequality aversion model by Bolton and Ockenfels (2000) are prominent examples.
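For reference, the piecewise linear model alluded to in the second critique can be written in the present notation roughly as follows. This is a sketch, not a formula taken from the paper: the symbols σ and ρ (the weights on the other's payoff when the DM is behind and ahead, respectively) are borrowed from the Charness and Rabin (2002) parameterization without its reciprocity term, and the Fehr and Schmidt (1999) model corresponds to σ = -α ≤ 0 and ρ = β ≥ 0.

```latex
% Piecewise linear utility over allocations a = (m, o)
u(m, o) =
\begin{cases}
(1 - \sigma)\, m + \sigma\, o & \text{if } o > m \quad \text{(disadvantageous inequality)} \\
(1 - \rho)\, m + \rho\, o     & \text{if } o \le m \quad \text{(advantageous inequality)}
\end{cases}

% Slope of an indifference curve in (m, o) space for a nonzero weight w
\left.\frac{do}{dm}\right|_{u = \text{const}} = -\frac{1 - w}{w},
\qquad w \in \{\sigma, \rho\}
```

Strict m-monotonicity requires σ < 1 and ρ < 1. Under that restriction the indifference curve is negatively sloped (benevolence) when the relevant weight is positive, positively sloped (malevolence) when it is negative, and vertical when it is zero, so the sign pairs of (σ, ρ) reproduce the nine archetypes as a special case.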

Idea of the XY-Test
As mentioned earlier, the three basic assumptions introduced in Subsection 2.1 (the ordering part of Assumption 1, strict m-monotonicity and piecewise o-monotonicity) not only naturally result in a mutually exclusive and comprehensive distinction between nine archetypes of distributional concerns, but also give rise to a clean identification procedure (a "test") that does not rely on unnecessary structural assumptions. This subsection explains how the test works and where the "XY" in the name of the test comes from.
16 Using the piecewise linear model in empirical work is not the same as assuming piecewise linear preferences, of course: In experimental work where stakes tend to be small, one might argue that the parameter estimates correspond to a piecewise linear approximation of the real preferences. We discuss this point further in Section 5.
[...] DM is either altruistic, or equality averse, or kiss-up (because those are the only archetypes that imply negatively sloped indifference curves above the 45° line; see Figure 1). The other six possibilities listed in Table 1 and displayed in Figure 1 are inconsistent with her behavior. Adding to this the DM's choices in the domain of advantageous inequality (the Y-List) discriminates between the remaining three possibilities. 18 Basically, the behavior of the DM on the X-List (representing the domain of disadvantageous inequality) tells us which column of Figure 1 her type is in, while her behavior on the Y-List tells us the row. Also note that the switch points on the two lists not only reveal the archetype of distributional preferences but also give information on preference intensities. For instance, an individual who switches between black and red at point B is more benevolent in the domain of disadvantageous inequality than an individual who switches at point A. As will be shown in Subsection 3.3, this information can be used to construct a two-dimensional index representing both archetype and preference intensity.

The Symmetric Basic Version of the XY-Test
Although feasible in principle, asking subjects to color a subset of points (those preferred to the given equal-monetary-payoff allocation) out of a larger set (the X-List or the Y-List) might be too demanding a task in a lab experiment. Also, the choice sets themselves (in Figure [...]
17 If the whole X-List line is red, we can infer from this that the upper contour set in the domain of disadvantageous inequality extends at least to point X_L; if the whole X-List line is black, then we can infer that the lower contour set in this domain extends at least to point X_H.
18 The only case where some ambiguity is left on the exact sign of the slope of the indifference curve in the domain of disadvantageous inequality (advantageous inequality, respectively) is when the border between black and red points is exactly above (below, respectively) the point (m, o) = (e, e). Below, a DM is referred to as weakly benevolent (malevolent) in the domain of disadvantageous inequality if she decides such that (i) the border between black and red points is exactly above the point (m, o) = (e, e) and (ii) the point exactly above (m, o) = (e, e) is colored red (black, respectively). A similar convention is used for the domain of advantageous inequality (where a red point exactly below the reference point is associated with malevolence and black with benevolence). Here note that a weakly benevolent DM reveals benevolence only in the impartial decision where no own money is at stake. Also note that without explicitly asking for indifferences, vertical indifference curves can never be identified for sure; but they can be identified with arbitrary precision (see the discussion in the next subsection).
(ii) g is a "gap" variable characterizing the vertical distance between (e, e) and the two lists (see Figure 3); in order to avoid zero or negative monetary payoffs it seems sensible to restrict g to be strictly smaller than e;
(iii) s is a "step size" variable characterizing the horizontal distance between two adjacent points on a list (a restriction on s is imposed in the next point);
(iv) t ≥ 1 is a "test size" variable determining the number of steps (of size s) which are made to the left and to the right starting from the point just above or below (m, o) = (e, e); in order to preserve advantageous and disadvantageous inequality it seems sensible to impose the restriction t ≤ g/s.
In total, the test consists of 4t + 2 binary decision problems. In each decision problem the subject is asked to decide between two alternatives (named Left and Right), each involving a [...]
An important feature of the test is that within each of the two blocks the material payoff of the passive person in the asymmetric allocation is held constant, while the material payoff of the DM increases monotonically from one choice to the next. Together with the fact that the symmetric allocation remains the same in all choices, this design feature guarantees that strict m-monotonicity is enough to make sure that when facing the choice between Left and Right within a given block, each individual switches at most once from Right to Left (and never in the other direction). Here note that deciding for Left in a given row of Table 2 (Table 3, respectively) is equivalent to coloring red the corresponding point on the X-List line (Y-List line) in Figure 3.
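The construction of the two blocks from the design parameters can be sketched as follows. This is a minimal illustration, assuming that e denotes the payoff in the symmetric allocation (e, e), that Left is the asymmetric and Right the symmetric allocation, and that rows are ordered by increasing m; the function name and the numerical values in the usage line are hypothetical.

```python
def build_xy_test(e, g, s, t):
    """Return (x_list, y_list); each row is (left, right) = (asymmetric, symmetric).

    Allocations are (m, o) pairs. The X-List holds o fixed at e + g
    (disadvantageous inequality), the Y-List at e - g (advantageous inequality).
    """
    if not 0 < g < e:
        raise ValueError("need 0 < g < e to keep all payoffs positive")
    if s <= 0 or t < 1 or t * s > g:
        raise ValueError("need s > 0 and 1 <= t <= g/s")
    symmetric = (e, e)
    steps = range(-t, t + 1)  # t steps to the left and t to the right of m = e
    x_list = [((e + k * s, e + g), symmetric) for k in steps]
    y_list = [((e + k * s, e - g), symmetric) for k in steps]
    return x_list, y_list


x_list, y_list = build_xy_test(e=10, g=3, s=1, t=2)
print(len(x_list) + len(y_list))  # 4t + 2 = 10 decision problems
for left, right in x_list:
    print("Left:", left, " Right:", right)
```

Within each block only the DM's own payoff in the Left allocation changes from row to row, which is what makes a single switch from Right to Left the only pattern consistent with strict m-monotonicity.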
The minimum test size has t = 1, yielding 6 binary decision problems. If classifying subjects into one of the nine archetypes is the main aim of a study, then there is no need to use a larger test design, since observing the behavior of a DM in 6 binary choices already allows discriminating between the nine archetypes with arbitrary precision. More specifically, the researcher needs to define when an agent should be considered as egoistic in a particular domain (this is the meaning of arbitrary precision). Suppose we define an agent to be egoistic in a particular domain if she is not willing to give up c cents in order to change the material payoff of the passive person by $1. Then the appropriate 6-item test has to be such that c = 100s/g ⇔ s = cg/100, meaning that we can choose two of the remaining three parameters (after having fixed t at 1) freely.
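The relation c = 100s/g can be turned into a small helper for choosing the step size. This is only a sketch under the definition given above; the function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def step_size(c_cents, g):
    """Step size s such that c = 100 * s / g (hypothetical helper)."""
    s = Fraction(c_cents) * Fraction(g) / 100
    # With t = 1, the restriction t <= g/s requires 0 < s <= g.
    if not 0 < s <= Fraction(g):
        raise ValueError("with t = 1 the design requires 0 < s <= g")
    return s


def implied_threshold(s, g):
    """Cents of own payoff per unit change in the other's payoff implied by (s, g)."""
    return 100 * Fraction(s) / Fraction(g)


if __name__ == "__main__":
    print(step_size(10, 3))         # step size for c = 10 and g = 3
    print(implied_threshold(1, 3))  # threshold implied by s = 1 and g = 3
```

For example, with g = 3 a threshold of c = 10 gives s = 3/10, whereas a design with s = 1 and g = 3 corresponds to a threshold of 100/3 (about 33) hundredths of a currency unit.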

Identifying Archetype and Characterizing Intensity of Distributional Concerns: The (x, y)-Score
This subsection describes a method to identify archetype and to characterize intensity of distributional preferences at the individual level and a procedure to represent the type-intensity distribution graphically.
Step 1 (Consistency Check): As argued above an individual whose preferences satisfy strict m-monotonicity has at most one switch from Right to Left (and no switch in the other direction) in each of the two tables.
Step 1 is to eliminate all subjects that fail this basic consistency check (in an implementation of the symmetric basic version of the test, described in Section 6, fewer than 5% of the subjects failed the consistency check).
Step 2 (Defining Scores): Represent each subject with consistent behavior by an (x, y) tuple defined as follows. The variable x (x-score) summarizes the behavior of the individual in the disadvantageous-inequality related block (X-List) and is defined as (t + 1.5) points minus the row number in which the individual decides for the first time for the asymmetric allocation (that is, for the payoff vector on the left hand side). If an individual always decides for the symmetric (or egalitarian) allocation, we take the convention that she decides for the first time for the asymmetric allocation in the (2t + 2)th row, so that she gets an x-score of -(t + 0.5). For instance, if in the test version displayed in Figure 3 (where t = 2) an individual decides for the symmetric allocation in the first row of the X-List and for the asymmetric allocation in the second (and in all other) row(s), then she gets an x-score of 3.5 - 2 = 1.5. The variable y (y-score) summarizes the behavior of the subject in the advantageous-inequality related block (the Y-List) and is defined as the row number in which the individual decides for the first time for the asymmetric allocation minus (t + 1.5) points. If an individual always decides for the symmetric allocation, we again take the convention that she decides for the first time for the asymmetric allocation in the (2t + 2)th row; she then gets a y-score of t + 0.5.
Note that the definition of the two scores implies that each of them can take on 2(t + 1) different values (see Table 4); thus, the proposed test allows for 4(t + 1)² different (x, y)-scores.
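Steps 1 and 2 can be sketched in a few lines. This is a minimal illustration rather than the author's code: the function names and the encoding of choices as the strings "L" (asymmetric allocation) and "R" (symmetric allocation) are assumptions made for the example.

```python
from fractions import Fraction


def first_left_row(choices):
    """Row (1-based) of the first 'L'; 2t + 2 by convention if there is none.

    Raises ValueError if the subject switches back from 'L' to 'R',
    i.e. fails the consistency check of Step 1.
    """
    if any(ch not in ("L", "R") for ch in choices):
        raise ValueError("choices must be 'L' or 'R'")
    row = next((i + 1 for i, ch in enumerate(choices) if ch == "L"),
               len(choices) + 1)
    if any(ch != "L" for ch in choices[row - 1:]):
        raise ValueError("inconsistent: switch from Left back to Right")
    return row


def xy_score(x_choices, y_choices):
    """(x, y)-score from the 2t + 1 choices in each list (Step 2)."""
    if len(x_choices) != len(y_choices) or len(x_choices) % 2 == 0:
        raise ValueError("each list must contain 2t + 1 choices")
    t = (len(x_choices) - 1) // 2
    pivot = t + Fraction(3, 2)                 # t + 1.5
    x = pivot - first_left_row(x_choices)      # X-List: pivot minus row
    y = first_left_row(y_choices) - pivot      # Y-List: row minus pivot
    return x, y


if __name__ == "__main__":
    # The worked example in the text: t = 2, first Left choice in row 2 of the X-List.
    print(xy_score(list("RLLLL"), list("RRRRR")))
```

Using exact fractions keeps the half-integer scores free of floating-point noise, which is convenient when scores are later used as dictionary keys or plotted on a grid.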
Also note that the sign of the x-score corresponds to the sign of ∂u/∂o in the domain of disadvantageous inequality, while the sign of the y-score corresponds to the sign of ∂u/∂o in the domain of advantageous inequality. Furthermore, the magnitude of the x-score (y-score, respectively) is an ordinal index of the intensity of distributional preferences in the domain of disadvantageous inequality (advantageous inequality, respectively). 19 Step 3

Extending and Refining the Test
This subsection proposes three modifications of the symmetric basic version of the test that might help to shed light on more specific research questions. The first modification replaces the symmetric step-size in the basic version by an asymmetric one, the second modification extends the X-List to the left and the Y-List to the right and the third modification adds more lists.
19 Also note that a score of +0.5 indicates weak benevolence, while a score of -0.5 indicates weak malevolence.
In the (symmetric) basic version of the test, the DM's trade-off between her own and the passive person's material payoff depends on the relation between the horizontal distance between two neighboring dots in Figure 3 ("step size", s) and the vertical distance between those dots and the equal-material-payoff point ("gap size", g). By increasing g, keeping the rest of the test as it is (or by decreasing s, keeping the rest as it is), the power of the test to discriminate between selfish and different variants of non-selfish behavior is increased (remember the discussion on "identification with arbitrary precision" in Subsection 3.2).
However, if the test size is held constant this comes at a cost, since the discriminatory power of the test at the borders is decreased (too many subjects end up with extreme values of the x- or y-score). Here an asymmetric step-size version of the test, where step size is small at the centre but grows larger when moving away from the centre, might be a good compromise. See Figure 5 for an illustration.

Figure 5: Asymmetric Step-and Test-Size
For some research questions it might be interesting to know whether there are subjects who (in the relevant range) put more weight on the material payoff of the passive person than on their own material payoff (that is, whether there are subjects for whom ∂u/∂o > ∂u/∂m each list (starting from the symmetric basic version of the test). The test now consists of a series of 4t + 2a + 2 decision problems divided into two blocks (X-List and Y-List), each containing 2t + a + 1 pairs of alternatives. The alternatives are exactly as displayed in Tables 2 and 3 with the exception that the X-List starts with m = e - ts - as (as before, it ends with m = e + ts) and that the Y-List ends with m = e + ts + as (as before, it starts with m = e - ts). An important feature of the asymmetric test-size version is that it contains decision problems for which strong altruism dictates a different choice than strong surplus-maximization concerns.
For instance, in the first decision problem on the X-List of the test version displayed in Figure 5, a surplus maximizer has to choose the point (e, e) (because this option maximizes both m and m+o), while a strong altruist might be inclined to color red the leftmost point on the X-List (Becker 1974, Andreoni and Miller 2002).

Figure 6: A Multi-List Variant of the XY-Test
For some purposes (for instance, to discriminate between different parametric forms within a given class, such as between the Fehr and Schmidt 1999 and the Bolton and Ockenfels 2000 models of inequality aversion, both of which imply indifference curves that have negative slope below and positive slope above the 45° line) it might be interesting to gain more insights on the exact shape of indifference curves in (m, o)-space. With a multi-list version of the test, where subjects are asked to complete two or more X- and Y-lists distinguished by the size of the gap variable g (see Figure 6 for an example), indifference curves can be elicited with arbitrary precision. 21

Discussion of the Test in View of the Literature
The current paper contributes to the growing literature on identifying distributional preference types in the lab. This section highlights the main differences between the current approach and other approaches proposed in the literature.

Aggregate Level Studies in Search for "the Universal Utility Function"
Charness and Rabin (2002), Engelmann and Strobel (2004), Cox, Friedman and Gjerstad (2007) and Cox and Sadiraj (2012) 22 For those more interested in the heterogeneity of preferences and behavior, an identification approach such as the one proposed here seems more useful. It has the advantage that subjects who do not fit the average preference type (most importantly, inequality averse, envious, and spiteful agents) can be identified and their behavior can be interpreted correctly. This is important because, depending on the design of the game, the preferences and behavior of even a small fraction of agents might have a large impact on the outcome. For instance, as experimental economists have successfully shown, a small fraction of inequality averse players in a public good game with punishment is sufficient to credibly threaten that free riders will be punished, inducing even selfish agents to contribute to the public good (see Fehr and Gächter 2000b and 2002, or Fehr and Fischbacher 2003, for instance). Similarly, as shown by Tyran and Sausgruber (2004), "a little fairness" (i.e., a small fraction of inequality averse agents) may induce a lot of redistribution in democracy. Also, as shown by Malmendier and Szeidl (2008), a small fraction of spiteful agents can have a large impact on the outcome of an auction because the auction format "fishes for fools" (i.e., disproportionately many buyers with an overbidding bias end up as winners). In such circumstances a utility or motivational function that fits well on average is not likely to be a good predictor for the outcome of the game (because the preferences of a minority have a disproportionate impact on the outcome).
21 Of course, with several X- and Y-lists, inconsistent choices (more than one switch in one of the lists or switches in the wrong direction) might become more important than they are in the basic version, and econometric techniques (such as finite mixture models) might be needed to get meaningful results. We discuss this point further in Section 5.
22 Charness and Rabin (2002) propose a "quasi-maximin model" in which utility increases in the lowest of all agents' monetary payoffs (the maximin property) and the total of all agents' payoffs (the surplus maximization property). Engelmann and Strobel (2004) present evidence that supports the view that the behavioral impact of both efficiency and maximin concerns is stronger than that of inequality aversion in the aggregate. Cox and Sadiraj (2012) present experimental evidence that is inconsistent with inequality aversion and quasi-maximin models and use this evidence to motivate their "egocentric altruism model" in which preferences are positively monotonic (i.e., utility is increasing in the material payoffs of all agents), strictly convex (utility is strictly quasi-concave in all its arguments), and "egocentric" (if a player has the choice between two (m, o) allocations where the m part in one allocation is equal to the o part in the other and vice versa, then he chooses the one with the higher m). Cox, Friedman and Gjerstad (2007) propose a similar CES utility function but allow an agent's distributional preferences to depend on relative status, on previous behavior of others (reciprocity) and on the set of alternative actions available to others.

Continuous Dictator Game Studies Looking at Heterogeneity at the Individual Level
Andreoni and Miller (2002) investigating consistency with GARP (the authors find that most subjects' choices nearly satisfy GARP), a task for which the test proposed here is neither suited nor intended to be suited. Also, in one of their treatments DMs are exposed to choices that have consequences for three persons: the DM and two other subjects. This allows them to investigate the relationship between 'preferences for giving' (trade-offs between the DM's own payoff and the payoffs of others) and 'social preferences' (trade-offs between the payoffs of others).
quadrant of Figure 4). Also, while the parameter of aversion to advantageous inequality is
Note that this figure is remarkably close to the one we obtain in the experiments reported in Section 6 (roughly 10% of the subjects have a negative y-score; see Figure 7).
26 This should not be read as a critique of their approach. After all, the Fehr and Schmidt model has been designed to explain behavior in strategic games, and the "calibration" of parameters with ultimatum game data has been suggested by the inventors themselves. A possible advantage of their approach (in terms of explaining behavior in strategic games) is that it captures not only distributional preferences but also other forms of other-regarding preferences (such as reciprocity motives, for instance).
designed to yield data for the estimation of the parameters of a given model. 27 A related difference is that their identification procedure relies on tight structural assumptions, while the non-parametric approach proposed here does not rely on any structural assumptions. 28
The ring-test has been developed by social psychologists (standard references are Griesinger and Livingston 1973 and Liebrand 1984) to assess "social value orientations" at the individual level, and it has been employed by economists in a variety of different experiments to identify type and measure intensity of distributional preferences (see, for instance, Offerman et al. 1996, Sonnemans et al. 1998, van Dijk et al. 2002, Brosig 2002, Brandts et al. 2009, or Sutter et al. 2010). In its simplest implementation (often called the "circle-test") the ring-test asks subjects to choose their most preferred point on a circle in the (m, o) space. The circle has its centre at the origin of the (m, o) plane and has a radius of r, say. Depending on their choice, subjects are then classified either into the 5 "social value orientation" (SVO) types "altruistic", "cooperative", "individualistic", "competitive" and "aggressive", or into 8 SVO types: the 5 listed types plus "martyrdom", "masochism" and "sadomasochism". Although interesting from an aesthetic point of view, the ring-test has serious theoretical shortcomings, the most important one being that the test assumes linear preferences (see Appendix B for details) and thereby systematically misclassifies those archetypes that are inconsistent with linearity (e.g., because they imply a sign change in ∂u/∂o). For instance, inequality averse subjects are typically classified by the ring-test, depending on the intensity of preferences, either as altruistic ("cooperative" in the language of the ring-test) or as selfish ("individualistic" in the language of the ring-test), while inequality loving subjects are classified either as competitive (here the ring-test language corresponds to our language) or as selfish. Furthermore, the ring-test cannot distinguish between maximin and egoistic preferences (both are classified as "individualistic" types), or between envious and competitive preferences (both are classified as competitive types). Finally, archetypes that are consistent with linearity in principle might be misclassified if observed choices are only consistent with nonlinear indifference curves (as is the case with convex altruism, for instance). 29 How serious those problems are depends on the context, of course. After all, an inequality averse subject, for instance, behaves as an altruistic one in the domain of advantageous inequality but as a competitive one in the domain of disadvantageous inequality. The same is true of other inappropriately classified types. 30
Summing up, many interesting approaches to identify archetypes and to measure intensity of distributional preferences have been proposed in the literature, but most of them require more time and effort from a subject to produce a score and therefore seem less suited as a tool in experimental economics to control for subject pool effects, or to help to interpret data from other (unrelated) experiments. Furthermore, the identification procedures typically discriminate between a somewhat arbitrary set of preference types, or they rely on far more demanding assumptions on preferences than the approach proposed here.
27 The same is true for the test used by Cabrales et al. (2010). There, subjects are exposed to 24 choices among four (m, o) allocations drawn at random (but not uniformly) from a subset of the positive orthant, and evidence from these choices is then used to estimate subjects' distributional preference parameters within the realm of Charness and Rabin's piecewise linear basic model (without reciprocity).
28 The econometric approaches employed by Cabrales et al. (2010) and Iriberri and Rey-Biel (2013) have other virtues. For instance, by exposing subjects to a large number of tasks they allow for a formal statistical framework within which the degree of consistency of observed choices with preferences represented by a given functional form can be tested. By contrast, the main strength of the approach proposed here is to start with a small set of basic assumptions on preferences and to devise a procedure that assigns archetypes to people in a non-parametric way, given that their preferences satisfy the assumptions. That is, our procedure does not test, and is not intended to test, whether subjects' preferences satisfy those assumptions (the multiple-list version of the test proposed in Subsection 3.4 would allow for such tests, though). In that sense our approach is similar in spirit to the ones proposed by Holt and Laury (2002) and Dohmen et al. (2010, 2011).

The 'More Altruistic Than' Relation by Cox et al. (2008)
In their 2008 paper Cox, Friedman and Sadiraj formalize the binary relation "more altruistic than" between two different preference orderings over income allocation vectors. For a given domain D ⊆ ℝ²₊ this binary relation induces a partial ordering on admissible preferences over (m, o) tuples. For the case of negatively sloped indifference curves (altruism, taste for efficiency), preference ordering A is more altruistic than preference ordering B if A has shallower indifference curves than B in (m, o)-space, so A indicates a higher willingness to give up units of m for a unit increase in o than does B. The same holds for the other eight archetypes of distributional preferences (for the case of positively sloped indifference curves the binary relation "more altruistic than" translates to "less malevolent than"). As is easily verified, if D is the entire positive orthant then a necessary condition for the preference ordering of DM i to be more altruistic than the preference ordering of DM j ≠ i is that x_i ≥ x_j and y_i ≥ y_j. 31
29 Misclassifications are especially likely in the most widely used implementation of the test, where subjects are not asked to choose the most preferred point on the circle but are rather exposed to a series of binary choices between adjacent (m, o) alternatives on the circle. In this case the standard identification procedure heavily uses the fact that with linear indifference curves the most preferred point on the circle lies exactly opposite to the least preferred one (see Appendix B for a discussion), a property that is typically violated if indifference curves are nonlinear.
30 Besides the conceptual problems inherent in the test itself, there are also some issues in the typical implementation that seem problematic. For instance, many researchers (i) use the double role assignment protocol, (ii) pay out each single decision and (iii) use fixed pairings throughout the whole classification procedure. Thus, the payoff received by a subject is determined by all the decisions she makes in the test and by all the decisions made by her "partner". There are at least two potential problems with this. First, paying out all decisions means that in each binary decision a subject might not decide for the most preferred allocation but might want to implement her most preferred final allocation. Second, in the fixed pairings design not only preferences but also beliefs play a role in the decisions (suppose, for instance, a subject is inequality averse; if she expects the partner to behave selfishly, she will decide selfishly too in order to implement a fair overall allocation; by contrast, if she expects the partner to act altruistically, she might have an incentive to act altruistically too, since altruistic acts then lead to a more egalitarian overall allocation).

Parameter Ranges in Piecewise Linear Model and Willingness to Pay
The (x, y)-score as defined in Subsection 3.3 is an ordinal index of preference intensity (a higher x means a higher weight on the other's payoff when the DM is behind, while a higher y means a higher weight on the other's payoff when the DM is ahead) and as such is not normalized with respect to the four design parameters (e, g, s, t). This makes it difficult to compare the results of studies which use different sets of design parameters. This might be regarded as a drawback, as the proposed test design is per se well suited for measuring distributional preferences in experiments and representative surveys with large samples. To make the results of different studies comparable (even if they use different sets of design parameters) it might be advisable to replace the (x, y)-score by a cardinal metric that is equally easy to compute and has a similar intuitive interpretation. One way to get to such a metric is to translate the (x, y)-score into parameter ranges in structured models frequently used in the literature. The most widely used functional form in the empirical literature (see, for instance, Cabrales et al. 2010, Blanco et al. 2011 and Iriberri and Rey-Biel 2013) is the piecewise linear model introduced by Fehr and Schmidt (1999) as a description of self-centered inequality aversion and extended by Charness and Rabin (2002) to allow for other forms of distributional concerns. In the reciprocity-free version the Charness and Rabin (CR) representation of preferences takes the form
u(m, o) = (1 - σ I_{o>m} - γ I_{m>o}) m + (σ I_{o>m} + γ I_{m>o}) o,
where γ and σ are parameters assumed to satisfy σ ≤ γ < 1 and where I is an indicator variable that takes the value of one if the condition in the subscript is met and the value of zero otherwise. This formulation says that the DM's utility is a linear combination of her own material payoff and the other person's material payoff and that the (otherwise constant) weight the DM puts on the other's payoff might depend on whether the other is ahead or behind.
If one is willing to assume that subjects' preferences can be approximated by this form, how do (x, y)-scores translate into parameter ranges in this model? This question is easily answered. Consider the X-List first. In this domain a DM with CR-preferences weakly prefers LEFT to RIGHT in row r ∈ {1, ..., 2t+1} iff (1 - σ)[e + (r - t - 1)s] + σ(e + g) ≥ e. Thus, assuming that a DM who is indifferent decides for LEFT, the relationship between x-score and parameter range of σ in the piecewise linear model is as shown in Table 5. Using the same tie-breaking rule (an indifferent DM decides for LEFT) for the Y-List we get a similar table (not shown) with x-score replaced by y-score, σ replaced by γ, and strict inequalities replaced by weak ones (and vice versa).
31 For restricted domains of preferences (for instance, if only piecewise linear utility functions are admitted as in Fehr and Schmidt 1999 or in Charness and Rabin 2002, or if only specific kinds of CES functions are considered as in Andreoni and Miller 2002, in Fisman et al. 2007, or in Cox and Sadiraj 2007) combined with a test design with high resolution (large gap variable g combined with small step size s) and for restricted domains of income allocation vectors (adapted to the binary decisions in the test) the two notions are equivalent to each other.
• The structural model could also be applied in the context of a finite mixture specification. This would make it possible to identify the prevalent social preference types and to endogenously classify each subject into the type that fits her behavior best.
• The parameters of the piecewise linear model can also be estimated when the test is applied in its multi-list variant, where the (x, y)-score is no longer available.
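The translation from x-scores to ranges of σ discussed above can be worked out directly from the X-List condition. Rearranging (1 - σ)[e + (r - t - 1)s] + σ(e + g) ≥ e with k = r - t - 1 gives ks + σ(g - ks) ≥ 0, i.e. σ ≥ -ks/(g - ks) whenever g > ks. The sketch below is an illustration derived from that inequality, not a reproduction of Table 5; the function names are hypothetical, and it assumes the tie-breaking rule stated in the text (an indifferent DM decides for LEFT).

```python
from fractions import Fraction


def sigma_threshold(row, g, s, t):
    """Smallest sigma at which LEFT is weakly preferred in X-List row `row`."""
    k = row - t - 1                          # horizontal offset of the row, in steps
    denom = Fraction(g) - k * Fraction(s)
    if denom <= 0:                           # happens only if t * s == g in the last row
        raise ValueError("need g > (row - t - 1) * s")
    return -k * Fraction(s) / denom


def sigma_range(x_score, g, s, t):
    """Return (lower, upper) with lower <= sigma < upper; None means unbounded."""
    row = t + Fraction(3, 2) - Fraction(x_score)   # row of the first LEFT choice
    if row.denominator != 1 or not 1 <= row <= 2 * t + 2:
        raise ValueError("x-score not attainable for this test size")
    row = int(row)
    # LEFT in `row` bounds sigma from below; RIGHT in the previous row bounds it from above.
    lower = sigma_threshold(row, g, s, t) if row <= 2 * t + 1 else None
    upper = sigma_threshold(row - 1, g, s, t) if row >= 2 else None
    return lower, upper


if __name__ == "__main__":
    for two_x in range(5, -6, -2):           # x = 2.5, 1.5, ..., -2.5
        x = Fraction(two_x, 2)
        print(x, sigma_range(x, g=3, s=1, t=2))
```

The thresholds fall as the row number rises, so each attainable x-score maps to one interval and the intervals do not overlap. The restriction σ ≤ γ < 1 from the model is not imposed in this sketch.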

Experimental Results Based on the Symmetric Basic Version of the Test
Here the data from a paper-and-pen experiment based on the symmetric basic version of the test are reported. The experiment was conducted in paper-and-pen (and several other design features reported below were applied) to convince subjects that neither other experimental subjects nor the experimenters could identify the person who has made any particular decision. This was done in an attempt to minimize the impact of experimenter demand and audience effects. See List (2007) for a discussion on experimenter demand effects and Hoffman et al. (1994), Andreoni and Petrie (2004), and Andreoni and Bernheim (2009) for experimental evidence indicating that, depending on the experimental design, audience effects might have a large impact on subjects' behavior in dictator-game like situations. 33
Experimental Procedures: Five experimental sessions were conducted manually (i.e., in pen-and-paper) at the University of Innsbruck in autumn 2009. Forty subjects who had not participated in similar experiments in the past were invited to each session using the ORSEE recruiting system (Greiner 2004). Since not all subjects showed up in time, 192 (instead of the invited 200) subjects from various academic backgrounds participated in total, and each subject participated in one session only. After arrival, subjects assembled in one of the two laboratories and individually drew cards with ID numbers (which remained unknown to other participants and the experimenters). Then instructions were distributed and read aloud. 34 Instructions informed subjects (i) that there are two roles in the experiment, the role of an 'active person' and the role of a 'passive person'; (ii) that there is exactly the same number of active and passive subjects in the experiment and that roles are assigned randomly; (iii) that each active person is matched with exactly one passive person and vice versa, and that at no point in time a participant will get to know anything regarding the identity of the person she/he is matched with; (iv) that active persons are called to make a series of ten binary decisions that determine not only their own earnings from the experiment but also the earnings of the passive person they are matched with; (v) that passive persons do not have a decision to make in the experiment and that their earnings will depend exclusively on the decisions of the active person they are matched with; (vi) that only one of the ten choice problems of each active person will be relevant for cash payments; and (vii) that cash payments could be collected the day after the experiment from one of the secretaries, who also handles the cash payments for other experiments (to ensure that the amount a subject earns cannot be linked to her/his decisions). Then subjects were randomly assigned to one of the two roles; active persons stayed in the same room while passive persons were escorted to the adjacent laboratory.
In both rooms subjects were seated at widely separated computer terminals (computers were switched off) with sliding walls. Active persons were handed a form consisting of two pages (an empty cover sheet and a decision sheet as described in the next paragraph) and were asked to fill out the decision sheet in private. Passive persons received a form consisting of three pages (an empty cover sheet and a two-page questionnaire unrelated to the experiment) and were asked to complete the questionnaire in private. After the tasks in both rooms had been completed, for each active person one of the choice problems was randomly selected via a manual device (a bingo ball cage handled by the active person) for the purpose of cash-payment generation. The payoff-relevant decision problem was written on the cover page of the active person, and the person was given the opportunity to take a look, in private, at her/his choice in the payoff-relevant decision problem. Subjects in both rooms were then asked to label (in private) the cover sheet of their document with their ID number. Then participants in both rooms were called to put their documents (again in private) in boxes before leaving the room.
Anonymous cash payments started the next day, giving experimenters the opportunity to manually match active with passive persons in the meantime. Participants presented the card with their ID number to an admin staff person who did not know who did what for which purpose, nor how cash payments were generated, and they got their earnings in exchange (the fact that cash payments would be made that way was clearly indicated in the instructions). On average subjects earned approximately 11 Euros plus a show-up fee of 4 Euros.
32 I thank an anonymous colleague for suggesting this interesting discussion.
33 See Hoffman et al. (1994) and Cox and Sadiraj (2012) for (almost double blind) experimental designs similar to the one employed here. Appendix A discusses alternative designs more suitable for the case where the XY-test is used as a tool in computer-supported experiments to address research questions in which distributional preferences play an important role, to control for subject pool effects, or to help to interpret data from other (unrelated) experiments. Those alternative protocols might be regarded as more problematic in terms of experimenter demand and audience effects but have advantages in other dimensions (an important one being that subjects are motivated to think about their decisions carefully and make choices that reflect their true preferences even when the preference elicitation procedure is only one of several tasks in an experiment).
34 The instructions (not intended for publication) are in Appendix C.

Experimental Design:
The symmetric basic version of the test was implemented with e = 10, g = 3, s = 1, t = 2 and with experimental currency units corresponding to Euros. Thus, each active person (96 in total) was exposed to 10 binary decision problems with (10, 10) as the recurring equal-material-payoff allocation. The decision problems were presented in two tables, 5 in the X-Table (disadvantageous inequality) and 5 in the Y-Table (advantageous inequality).
The design of the two tables was similar to that of Tables 2 and 3.
Experimental Results: Of the 96 active subjects 4 (i.e., less than 5%) were eliminated in Step 1 of the procedure described in Subsection 3.3. The (x, y)-scores of the remaining 92 subjects were distributed as shown in Figure 7. 35 It is worth noting that more than half of the 36 points in the (x, y)-plane where a subject could potentially sit remain unoccupied, and only nine points are occupied by more than one subject. Thus, there is a sizeable amount of endogenous clustering. Also note that almost all subjects (87/92 = 95% of the population) reveal (weakly) more benevolent (less malevolent) preferences in the domain of advantageous than in the domain of disadvantageous inequality (i.e., their y-score is at least as large as their x-score). Taken together, those two pieces of evidence (endogenous clustering of subjects and decisions consistent with convex preferences) indicate that subjects understand the binary choices presented to them and that the results reported here are driven by well-behaved distributional preferences and not by noise. The second piece of evidence also implies that non-convex types (most importantly, kick down and equality averse) are empirically irrelevant. Turning to convex types (convexity refers to the shape of indifference curves here), it is interesting to note that the behaviour of about two thirds of the subjects (those in the positive quadrant; 61/92 = 66.30% of the subject population) is consistent with altruistic preferences (there are only 2 subjects who reveal non-convex altruism), while the behaviour of (only) about one fourth of the participants (those in the N/W quadrant; 22/92 = 23.91% of the subjects) is consistent with (any form of) inequality aversion. 36 Spiteful subjects (negative quadrant) exist, but they account for less than 7% of our population (and even spiteful subjects' scores are consistent with convex preferences).
35 It is important to note that the data points in Figure 7 are jittered (to make each single point visible). For instance, the 29 observations scattered around the point (½, ½) all belong to the point (½, ½).

Figure 7: Absolute Frequency of (x, y)-Scores in Experiments Based on Basic Test Version (96 active persons; 4 revealed inconsistencies; the figure is based on the remaining 92 subjects)
It is also interesting to observe that the behaviour of types at the border between altruism and inequality aversion (x ∈ {-½, ½} and y > 0; 59/92 = 64.13% of the subject population) is consistent with maximin, while the behaviour of types at the border between inequality aversion and spite (x < 0 and y ∈ {-½, ½}; 18/92 = 19.57% of the population) is consistent with envy. Finally, it is interesting to observe that the behavior of almost 50% of the population (those subjects with x and y in {-½, ½}; 45/92 = 48.91% of the population) is consistent with selfish preferences. Here note that the test assigns selfish subjects to one of the four quadrants in Figure 4 (Figure 7, respectively) according to their 'impartial distribution preferences' expressed in their choice behavior in the (t+1)th row of the two lists (where the DM decides between two allocations that differ only in the payoff of the passive person). For instance, a subject that is weakly benevolent in both domains gets (x, y) = (½, ½), while a subject that is …
36 Given that differences in the distribution of types between subject pools are likely to be large, this piece of evidence should not be interpreted as indicating that egalitarian motives are important only for a minority of subjects. See  for experimental evidence indicating that students (especially students of economics) are less egalitarian and more efficiency oriented than the rest of the population. See also the response by Engelmann and Strobel (2006) in the same issue.
Looking at Figure 7 we see that the choices of a majority (but by far not all) of those subjects whose behavior is consistent with selfish preferences are also consistent with 'lexself' as defined by Fisman et al. (2007). 37
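The population shares quoted in this section can be recomputed from the reported counts. The following is a minimal sketch in Python; the dictionary labels and the helper name `share` are illustrative assumptions, only the counts and the total of 92 classified subjects are taken from the text. Note that the groups overlap (for instance, subjects whose behavior is consistent with selfish preferences are also assigned to a quadrant), so the shares are not meant to sum to 100%.

```python
# Counts reported in the text for the 92 classified subjects.
N_CLASSIFIED = 92

counts = {
    "y-score weakly above x-score": 87,
    "altruistic (positive quadrant)": 61,
    "inequality averse (N/W quadrant)": 22,
    "consistent with maximin": 59,
    "consistent with envy": 18,
    "consistent with selfish preferences": 45,
    "score of (1/2, 1/2)": 29,
}


def share(count, total=N_CLASSIFIED):
    """Return count/total as a percentage, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {label: share(n) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]}/{N_CLASSIFIED} = {pct}%")
```

Rounded to two decimals, 22/92 comes out as 23.91% and 29/92 as 31.52%, the latter matching the "about one third" quoted for subjects with a score of (½, ½).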

Conclusions
This paper has proposed a geometric delineation of distributional preference types and a nonparametric approach for their identification in a two-person context. Major advantages of the proposed test over previous ones are (i) that it is simple and short as subjects' task is to make a small set of diagnostic choices without feedback; (ii) that it is parsimonious as it relies on a small set of primitive assumptions; (iii) that it is general as it directly tests the core features of different types of distributional preferences rather than concrete models or functional forms; (iv) that it is flexible as test size and test design can easily be fine-tuned to the research question of interest; (v) that it is precise as it identifies the archetypes of distributional concerns with arbitrary precision and also gives an index of preference intensity; and (vi) that it minimizes experimenter demand effects as subjects are asked to make binary decisions in a neutral frame and do not have the option to do nothing. 38 Those features together suggest that the proposed test might be suitable as a tool in experimental economics to disentangle the impact of distributional preferences from that of other factors, thereby helping to interpret data from other (unrelated) experiments (similar to the choice list tests used to elicit risk attitudes; see Holt and Laury 2002, or Dohmen et al. 2010). 39 That the proposed test is indeed suitable for that purpose has been shown in two recent studies: Balafoutas et al. (2012) investigate in a standard lab experiment the relationship between distributional preferences and competitive behavior and find (a) that distributional archetypes (as assigned by the proposed test) differ systematically, and in an intuitively plausible way, in their response to competitive pressure, in their performance in a competitive environment and in their willingness to compete; and (b) that controlling for the effects of distributional preferences, as well as for risk attitudes and some other factors, closes the large gender gap in competitive behavior found in earlier studies (by Niederle and Vesterlund 2007 and 2010, for instance). This is an important finding because it indicates that the gender gap in competitiveness is largely driven by mediating factors (potentially accessible to policy intervention) and not by gender per se. Hedegaard et al. (2011) examine in a large-scale internet experiment the impact of distributional concerns on the contribution behavior in a standard (linear) public goods game and find (a) that distributional archetypes differ systematically, and in an intuitively plausible way, in their contribution behavior; and (b) that accounting for the differences explains roughly half of the gap between actual behaviour of subjects in the lab and the theoretical benchmark derived under the assumption that players are rational and selfish (and that this fact is common knowledge). Again, this is an important finding because it helps to disentangle the impact of distributional concerns on the behavior of subjects in social dilemma games from that of other factors, such as beliefs about others' behavior or intentions. Together the findings in those studies clearly indicate that associating subjects with one of the proposed archetypes of distributional concerns has explanatory value and that the proposed test is indeed a valid control instrument in experimental economics.
37 Here note that in the XY-test a subject with lexself preferences necessarily ends up with an (x, y)-score of (½, ½), independently of the parameterization of the test. As can be seen in Figure 7 about one third of the participants (specifically, 29 of the 92 classified subjects) ended up with such a score.
38 This is in contrast to the standard dictator game which gives the DM a windfall gain and then the option to share. This makes it pretty clear what would be considered decent behavior by the experimenter.
39 Some implementation issues relevant for this purpose are discussed in Appendix A.
Beyond its potential to act as a control instrument in experimental economics, other potentially fruitful applications of the test include (a) investigating the stability of distributional preferences over different domains (for instance, a potential shortcoming of the approach proposed here is its focus on the two-agents case; investigating whether the preferences revealed in that context carry over to a richer environment is surely an important issue); 40 (b) investigating possible links between distributional preferences and other forms of other-regarding preferences (for instance, "Are altruists more or less likely to be motivated by positive or negative reciprocity?", "Do altruism and altruistic rewarding (or altruistic punishment) go together or are they mutually exclusive ways to reach the same goal: promoting private provision of public goods?" 41 , or "Is the test-based classification of subjects in distributional-preference types somehow correlated with the propensity to be motivated by trust?"); 42 and (c) applying the proposed test (together with tests for risk and time preferences and for personality traits) in experiments with large demographic variation (age, gender, income, education) or with a representative sample of the population to detect patterns and correlations (for instance, "Are distributional preferences and risk attitudes or time preferences somehow related?", "Are there gender differences in the distribution of archetypes?" 43 or "What is the impact of age and income on distributional preferences?"). Beyond economics the proposed test might help to address important research questions in biology and psychology as, for instance, "What determines human altruism (or spite)?" or, "What drives altruistic punishments and rewards?".
40 In this regard the XY-test is very similar to the standard risk-attitude elicitation procedures that ask subjects to compare a binary lottery either to another binary lottery (as in Holt and Laury 2002) or to a risk-free alternative (as in Dohmen et al. 2010 and …). In both domains, the risk-preference domain and the distributional-preference domain, the hope is that the preference type revealed in the binary environment is informative of the attitude of the decision maker in richer environments.
41 Altruistic rewarding (punishment) is the propensity to give rewards to (impose sanctions on) others for 'norm-abiding behavior' ('norm-violating behavior') even if rewards (sanctions) are costly for the rewarder (punisher) and yield no private material benefits whatsoever. See Gächter (2000b and …), or Fehr and Fischbacher (2004), for studies investigating the power of altruistic rewards and punishments.
42 At first blush this avenue of research seems to be closely related to that of Blanco et al. (2011). It is not that close, however, since the research question proposed here is not whether the behavior of players in strategic games can be explained by distributional preferences (alone) but rather whether there is a link between distributional preferences and other forms of other-regarding preferences.
For those and many other interesting research questions identification of distributional preference types in a "clean" environment appears to be a natural first step. The proposed nonparametric approach seems to be well suited for this purpose. Turning back to the quote at the start of the paper the hope is that it turns out to be "as simple as possible, but not one bit simpler".

Appendix (For Online Publication)
This appendix consists of three parts. Parts A and B discuss two themes that seem important when the test is used as a tool in experimental economics to answer specific research questions in which distributional preferences play an important role, to control for subject pool effects, or to help to interpret data from other (unrelated) experiments: While Appendix A is devoted to a discussion of implementation issues for this case, Appendix B contains a more detailed discussion of the ring-test, probably the most important contender in this domain. Appendix C contains the instructions to the experiments reported in Section 6 of the paper.

Appendix A: Implementation Issues
While the non-parametric identification approach proposed in the body of the paper seems in principle well suited as a tool in experimental economics to be added to arbitrary (other) experiments, there are several practical issues that need to be addressed.
Role Assignment: At least 3 different protocols regarding role assignment have been used in the literature on elicitation of distributional preferences: fixed role assignment, where roles (active DM and passive person) are assigned ex ante, and only active DMs decide while passive persons do nothing (see, e.g., Sadiraj 2012); role uncertainty, where each subject decides in the role of the active DM, and only later subjects get to know whether their decision is relevant, i.e., whether they have been chosen as DM or as passive person (this procedure was used by Strobel 2004 and by Blanco, Engelmann, and Normann 2011); and double role assignment, where each subject decides, and each subject gets two payoffs, one as an active DM and one as a passive person (as in Andreoni and Miller 2002, Andreoni and Vesterlund 2001, and in Fisman, Kariv, and Markovits 2007). 44 While the fixed role assignment (this protocol has been used to produce the data presented in the body of the paper) seems to be the cleanest procedure from a theoretical point of view, it is not practicable when the test is intended as a tool to be added to arbitrary other experiments (since it would imply inviting twice as many subjects as needed for the main treatments). Each of the other two protocols seems to have some drawbacks. Consider the role uncertainty protocol first. Since it introduces an element of randomness in the determination of the payoff allocation resulting from a decision it raises theoretical questions related to the issue of process fairness vs. outcome fairness (see Andreoni and Bernheim 2009 for evidence that some subjects care for process fairness). Secondly (and related to the first point), expectations about the behavior of the passive person in the counterfactual situation where she is the active DM might influence choices (provided process fairness matters; if not then not). 45 The double role assignment protocol seems to be better in the former dimension, but it might be worse in the latter as a subject's expectations about what she gets as a passive person and about what her passive person gets as the active DM are even more likely to enter the picture. In sum, both the double role assignment and the role uncertainty protocol have their own problems and it is ultimately an empirical question which one performs better (in predicting the decisions in other distributional tasks, for instance). Here, promising evidence in support of the role uncertainty protocol is provided by Hedegaard et al. (2011). In their large-scale internet experiment they employ both the fixed role assignment and the role uncertainty protocol in a between-subjects design and they show that the two protocols yield results that are statistically indistinguishable, both regarding the distribution of archetypes they yield and regarding the ability to predict behavior in other games. 46
44 Actually, the double role assignment protocol comes in two varieties; while in version 1 (the version discussed in the main text) the computer program makes sure that a subject's active DM is a different participant than a subject's passive person (and instructions are very explicit about this), version 2 (often used in the implementation of the ring-test and in related tests designed by social psychologists) has fixed pairs, meaning that a subject's active DM is the same participant as her passive person. Version 2 seems theoretically problematic and is therefore ignored in the discussion in the main text.

Presentation of the Binary Decision Problems:
In the paper-and-pen experiments reported in the body of the paper the binary decision tasks were presented to the subjects in ordered lists or tables (similar to the lists often used in risk-attitude elicitation tasks). In computer-aided experiments (using z-Tree developed by Fischbacher 2007, for instance) presenting the binary decisions one-at-a-time in random order (i.e., each binary decision on its own screen) might be an attractive alternative. Experience with both presentation techniques in computer-based experiments (where the test was added as a control at the end of the main experiments) suggests that the randomized test version produces (slightly) more inconsistencies (more than one switch in at least one of the two lists; or switches in the wrong direction) but might (slightly) increase the predictive power of the test for the classified subjects. This indicates that especially when the test is added as a control at the end of other experiments the presentation of the binary decision tasks might be critical in recovering reliable data on distributional preferences. The simplest procedure (presenting the binary decision tasks in a table) might not produce the most reliable results then, the main reason being that subjects do not think carefully enough about the tasks when they are presented all at once in a table. 47 A "third way" to implement the test in the lab is to present the binary choices first in a totally randomized way (i.e., also randomized across blocks and in the presentation of the recurring alternative on the left or the right hand side), and to show subjects the ordered lists (with their decisions in the different choice tasks) when they are done with all choices. They can then revise their choices if they like. 48 The latter procedure forces players to rethink their decisions; this might help to get fewer inconsistencies than in the random order design and might at the same time yield higher predictive power than the list versions. Again, it is ultimately an empirical question to sort out what the best way is to present the choice problems to the subjects when the test is used as a tool in experimental economics.
45 If subjects have distributional preferences in the textbook variety (only outcomes matter) then expectations should not shape decisions in the role uncertainty protocol: with some probability a (= ½) the other person is the active DM and your expectations about her/his behavior influence what you expect to get in that case; given your preferences and expectations this yields a fixed "utility" that you get with probability a; to maximize your overall (expected) utility you still have an incentive to maximize that part of your "utility" that realizes with probability 1 - a. No expectations enter in that part of your overall expected utility. The story changes if (some) subjects are not (only) concerned with the "fairness" of outcomes but (also) with the "fairness" of lotteries. This is beyond pure distributional preferences, though.
46 This is in line with earlier evidence provided by Engelmann and Strobel (2004) who find relatively small and insignificant differences in the choices of subjects between their main treatment with role uncertainty and their control with fixed role assignment.
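The notion of an inconsistency used above (more than one switch in a list, or a switch in the wrong direction) can be made operational once the randomized choices are put back into list order. The following is a minimal sketch under stated assumptions: the coding of a list as booleans, the ordering of rows, and the function names are hypothetical and not taken from the paper. Each list is coded as a sequence of booleans in row order, where `True` means that the non-recurring alternative was chosen, and rows are assumed to be ordered such that the non-recurring alternative becomes more attractive for the DM from one row to the next. Under this coding the only admissible switch is from `False` to `True`.

```python
def count_switches(choices):
    """Number of rows at which the choice differs from the previous row."""
    return sum(1 for prev, cur in zip(choices, choices[1:]) if prev != cur)


def is_consistent(choices):
    """True if the list has no switch in the wrong direction.

    Assumed coding: True = non-recurring alternative chosen, rows ordered so
    that this alternative becomes more attractive for the DM down the list.
    A switch from True back to False is then a switch in the wrong direction,
    and ruling it out also rules out more than one switch.
    """
    return all(not (prev and not cur) for prev, cur in zip(choices, choices[1:]))


def subject_is_classifiable(x_list, y_list):
    """A subject is classified only if both lists are consistent."""
    return is_consistent(x_list) and is_consistent(y_list)


# Hypothetical example with 5 rows per list, as in the basic test version.
x_choices = [False, False, True, True, True]   # one switch, right direction
y_choices = [False, True, False, True, True]   # three switches
print(subject_is_classifiable(x_choices, y_choices))
```

With a different row ordering or coding the admissible direction of the switch flips; the check itself stays the same.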

Clustering of Subjects:
Clustering means dividing subjects into groups (clusters) so that members of one group are somehow similar to each other and dissimilar from members of other groups. There are many ways in which clustering might be performed after having assigned to each subject an (x, y)-score. An obvious one is to cluster subjects into 9 groups corresponding to the 9 archetypes described in the body of the paper. This will often not meet the needs of the experimenter, though. The preferred way to group subjects will rather depend on at least three factors: (i) on the number of subjects taking part in the experiment; (ii) on the test version used; and (iii) on the research question under investigation.

Number of Subjects:
If the number of subjects taking part in a study is small it does not make much sense to group them into too many clusters. Several approaches to divide subjects into 2-4 clusters spring to mind and the preferred one will, in general, depend on the other two dimensions discussed below. For instance, when a rough test version (high value of the quotient s/g; see the discussion in the next paragraph) is used a natural approach to divide subjects into 3-4 clusters is to use the sign of the two scores as the discriminator (both scores positive: altruistic; x-score negative, y-score positive: inequality averse; both scores negative: spiteful; x-score positive, y-score negative: equality averse; the latter class will be almost empty, though), while a finer test version would suggest that subjects with x- and y-score in {-0.5, +0.5} should be grouped in their own cluster (egoistic). On the other extreme, if the number of subjects taking part in the study is large there might be no need for exogenous clustering at all. For instance, in the experiments reported in Section 6 of the paper, of the 36 points in the (x, y)-plane where subjects can potentially sit, more than half remained unoccupied, and only on nine points there was more than one subject sitting. Here it might make sense to work with nine clusters or less and to assign subjects who sit alone (or almost alone) at a point to one of the more frequented adjacent points according to some distance measure.
47 Misclassification of subjects who do not think carefully enough about the alternatives in a binary decision task seems less of an issue in the experiments reported in the body of the paper because subjects' task was merely to make 10 binary decisions there. We therefore opted for a design that is cleaner in terms of experimenter demand and audience effects.
48 Hedegaard et al. (2011) employ this protocol in their internet experiments.
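The idea of assigning subjects who sit alone at a point to a more frequented point according to some distance measure can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the paper: Euclidean distance is chosen as the distance measure, "frequented" is taken to mean at least `min_count` subjects, the search is not restricted to adjacent points, and ties are broken in favour of the more frequented point. The function name and the example scores are hypothetical.

```python
from collections import Counter
from math import hypot


def reassign_isolated(scores, min_count=2):
    """Move each score held by fewer than `min_count` subjects to the nearest
    frequented score (Euclidean distance; ties go to the more frequented point)."""
    freq = Counter(scores)
    hubs = [point for point, n in freq.items() if n >= min_count]
    if not hubs:
        # No frequented point exists, so nothing can be reassigned.
        return list(scores)

    def nearest_hub(point):
        return min(
            hubs,
            key=lambda hub: (hypot(point[0] - hub[0], point[1] - hub[1]),
                             -freq[hub], hub),
        )

    return [p if freq[p] >= min_count else nearest_hub(p) for p in scores]


# Hypothetical scores: two frequented points and one isolated subject.
scores = [(0.5, 0.5)] * 3 + [(-1.5, 2.5)] * 2 + [(1.5, 0.5)]
clustered = reassign_isolated(scores)
print(Counter(clustered))
```

In the example the isolated subject at (1.5, 0.5) is moved to (0.5, 0.5), which is at distance 1, rather than to (-1.5, 2.5). Whether such a reassignment is sensible depends on the research question, since it can move a subject across the border between two archetypes.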
Test Version: One important design decision in the symmetric basic version of the test is the choice of the quotient s/g since this quotient determines the precision with which egoistic subjects are identified. If egoistic subjects are identified with high precision it might make sense to work with the following 5 clusters (where the last one is empty with high probability): x ∈ {-0.5, +0.5} and y ∈ {-0.5, +0.5}: egoistic; x ≥ 0.5 and y ≥ 0.5 and at least one inequality strict: altruistic (or efficiency loving); x ≤ -0.5 and y ≥ 0.5 and at least one inequality strict: inequality averse; x ≤ -0.5 and y ≤ -0.5 and at least one inequality strict: spiteful; and x ≥ 0.5 and y ≤ -0.5 and at least one inequality strict: equality averse.
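The five-cluster rule can be written down as a small classification function. The sketch below rests on two assumptions that should be checked against the test version actually used: scores take half-integer values with absolute value of at least 0.5 (as in the 6 x 6 grid of the basic test version), and the thresholds for negative scores are read as -0.5, so that the five clusters do not overlap. The function name `classify` and the grid constant are hypothetical.

```python
from collections import Counter


def classify(x, y):
    """Assign an (x, y)-score to one of the five clusters.

    Assumes half-integer scores with |x| >= 0.5 and |y| >= 0.5, and reads
    the thresholds for negative scores as -0.5.
    """
    if abs(x) == 0.5 and abs(y) == 0.5:
        return "egoistic"
    if x >= 0.5 and y >= 0.5:
        return "altruistic"
    if x <= -0.5 and y >= 0.5:
        return "inequality averse"
    if x <= -0.5 and y <= -0.5:
        return "spiteful"
    if x >= 0.5 and y <= -0.5:
        return "equality averse"
    raise ValueError(f"score ({x}, {y}) lies outside the assumed grid")


# The 36 points of the basic test version (assumed grid of possible scores).
GRID = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
cluster_sizes = Counter(classify(x, y) for x in GRID for y in GRID)
print(cluster_sizes)
```

Because the egoistic cluster is checked first, the "at least one inequality strict" condition of the other four clusters is satisfied automatically. On the assumed grid every point receives exactly one label: 4 points are egoistic and 8 points fall into each of the other clusters.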
Research Question: The research question under investigation is important for clustering, of course. For instance, for predicting the behavior of a subject in a standard dictator game, only the y-score should be important. So, for predicting behavior in dictator game like situations it might be sensible to divide subjects only according to this dimension (for instance, into 3 clusters, one with y<0, the second with y=0.5 and the third with y>0.5; or, depending on the test design, into 4 or more clusters by splitting up the y>0.5 group in subgroups).

Appendix B: The Ring-Test
The ring-test has been developed by social psychologists (standard references are Griesinger and Livingston 1973 and Liebrand 1984) to assess "social value orientations" (SVO) at the individual level. In the standard implementation of this test subjects are asked to make 4k … Social psychologists have developed a standard procedure to identify distributional type and intensity of distributional concerns on the basis of observed choices in this test. The procedure is based on the assumption that preferences of subjects can be represented by a utility or motivational function of the form
U(m, o) = µ·m + λ·o   (rt)
where µ and λ are two a priori unrestricted parameters (that is, no assumption guaranteeing that µ + λ = 1, or that at least one of the parameters is strictly positive, or whatsoever, is made; more on this below), which are assumed to be constant. Given a motivational or utility function of the form (rt) a subject has a unique most …
In other words, linear preferences as described by equation (rt) have the property that the consumer's entire preference relation can be deduced from a single indifference set (or "indifference curve").
51 For instance, social psychologists allow µ to be negative and λ to be positive, yielding indifference curves that have positive slope (similar to the ones the body of the paper has attributed to spiteful agents) but where the upper contour sets are to the left of the indifference curves (if µ is not too negative an individual with such preferences is classified by the ring-test as a "SVO altruist"; see the discussion below and Figure A1 in this appendix for typical indifference curves). They also allow both parameters to be negative, resulting in negatively sloped indifference curves (similar to the ones the body of the paper has attributed to altruists) but have the upper contour set again to the left (in a certain range, such individuals are classified by the ring-test as "SVO aggressors"; again, see Figure A1 in this appendix).
… depending on whether λ > 0 (in this case points above a given indifference set, which is a horizontal line as said, are strictly preferred to points in the indifference set) or λ < 0 (in this case points below a given indifference set, which is again a horizontal line, are strictly preferred to points in the indifference set). Furthermore, in the choice between two adjacent points on the circle subjects deciding such as to maximize a motivational or utility function of the form (rt) always select the alternative closest in distance to the point (m*, o*). This is so because the alternative which is closer in distance to (m*, o*) is necessarily on a higher indifference curve (remember that indifference curves are parallel displacements of each other).
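The claim that a subject with linear preferences always picks, among two adjacent points on the circle, the one closer to the most preferred point (m*, o*) can be checked numerically. The sketch below makes several assumptions that are not taken from the paper: the linear form U(m, o) = µ·m + λ·o is used for (rt), with m the DM's own payoff and o the payoff of the other person; the circle is centred at the origin; and the number of equally spaced points (`n_points`) and all function names are hypothetical.

```python
from math import cos, sin, hypot, pi


def utility(point, mu, lam):
    """Assumed linear motivational function U(m, o) = mu*m + lam*o."""
    m, o = point
    return mu * m + lam * o


def most_preferred_point(mu, lam, radius=1.0):
    """Point on the circle that maximizes the linear function: (mu, lam) scaled to the radius."""
    norm = hypot(mu, lam)
    if norm == 0:
        raise ValueError("mu and lam must not both be zero")
    return (radius * mu / norm, radius * lam / norm)


def ring(n_points, radius=1.0):
    """Equally spaced points on a circle centred at the origin."""
    return [(radius * cos(2 * pi * i / n_points),
             radius * sin(2 * pi * i / n_points)) for i in range(n_points)]


def distance(p, q):
    return hypot(p[0] - q[0], p[1] - q[1])


def adjacent_choices_match_distance(mu, lam, n_points=16, radius=1.0):
    """Check that utility maximization and 'closest to (m*, o*)' pick the same
    alternative in every choice between two adjacent points on the circle."""
    target = most_preferred_point(mu, lam, radius)
    points = ring(n_points, radius)
    for i, p in enumerate(points):
        q = points[(i + 1) % n_points]
        by_utility = p if utility(p, mu, lam) > utility(q, mu, lam) else q
        by_distance = p if distance(p, target) < distance(q, target) else q
        if by_utility != by_distance:
            return False
    return True


print(adjacent_choices_match_distance(mu=1.0, lam=0.3))
```

The reason the two rules agree is that, on a circle of radius r, both the utility of a point and its distance to (m*, o*) depend only on the angle between the point and (m*, o*): utility falls and distance rises as that angle grows. Exact ties (two adjacent points equidistant from (m*, o*)) are not treated separately in the sketch.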
Consider a perfectly rational subject with a motivational or utility function of the form … Subjects are then either classified into the 5 "social value orientation" (SVO) types altruistic, cooperative, individualistic, competitive and aggressive, or into 8 clusters, the 5 listed plus sadomasochistic, masochistic and martyrdom (see Figure A1 at the end of this appendix).
52 As is easily verified, the described "technique" for finding the point (m*, o*) from an indifference curve map can easily be reversed: simply take the tangent to the circle at the point (m*, o*) to get the relevant indifference curve; then solve for the associated utility function along the lines described in Footnote 50 above.
… with most preferred point between -90° and -112.5° fall outside the considered range of preferences and are perhaps again more interesting for psychologists than for economists (in terms of the motivational function (rt) such subjects decide as if µ<0 and λ<0). See Figure A1 below for a characterization of the types "SVO sadomasochist", "SVO masochist" and "SVO martyr".
53 When discussing angles of rays the reference ("the 0°") is always the horizontal line corresponding to vertical indifference curves with upper contour sets to the right. Here note that indifference curves associated with a given ray are orthogonal to the ray.
Welcome and thank you for participating! You are taking part in an economic experiment on decision making. A research foundation has provided the funds for conducting the experiment. You can earn a considerable amount of money by participating. The text below will tell you how the amount you earn will be determined.

Anonymity
You will never be asked to reveal your identity to anyone during the decision-making part of the experiment. Neither the experimenters nor the other subjects will be able to link you to any of your decisions. In order to keep your decisions private, please do not reveal your choices to any other participant. The following means help to guarantee anonymity:

Non-Computerized Experiment and Private Code
The task you have to complete during the experiment is conducted in private on a printed form; that is, the experiment is not computerized. You have drawn a small sealed envelope from a box upon entering the room. PLEASE DO NOT OPEN YOUR ENVELOPE BEFORE THE EXPERIMENT STARTS.
Your envelope contains your participation number. We will refer to it as "your private code" in the following. Your private code is the only identification used during the experiment and you will also need it to collect your cash payments.
When you have completed your task in the experiment you will be asked to write your private code on the front page of your form, to put the form in a new (larger) envelope, to seal the envelope, and to put it in a box located at the front door of the room you are sitting in. It is important that you do not write anything on the envelope, it should be left blank. It is also important that you keep the card with your private code: you need it to collect your earnings!

Cash Payments
Cash payments can be collected from tomorrow onwards in room w.4.36 on the fourth floor (South/West) of this building. You will present your private code to an admin staff person (Mr. ...) and you will receive your cash payment in exchange. The admin staff person will not know who has done what and why, nor how payments were generated. No experimenter will be present in the room when you collect your money. Also, the private codes of this experiment will be mixed up with the codes of other experiments. This will again help to guarantee that the amount you earn cannot be linked to your decisions. Mr. ... is available from Monday to Friday between 9 a.m. and noon and between 2 p.m. and …

Detailed Instructions

No Talking Allowed
Please read this document carefully and do not talk to any other participant until the experiment is over.
If there is anything that you don't understand, please raise your hand. An experimenter will approach you and clarify your questions in private. In about ten minutes this document (the front page included) will also be read aloud (by an experimenter).

Two Groups and Two Different Tasks
Before the experiment starts, the participants in this room will be randomly divided into two groups of equal size (see the text on the next page for details). The groups are called Group A and Group B.
Members of Group A will be seated in this room, members of Group B will be seated in the adjacent laboratory. Each member of Group A will be asked to make a series of ten decisions that affect not only her or his own earnings but also the earnings of a member of Group B. The members of Group B do not have a decision to make in this experiment; their earnings will depend on the decisions of Group A members alone. Members of Group B will be asked to fill out a questionnaire. This is their only task in this experiment.

Matching
After randomly assigning roles (member of Group A, or member of Group B) to participants, each member of Group A is anonymously paired with a member of Group B. The matching is 1:1; that is, each member of Group A is matched with exactly one member of Group B and vice versa. You will never learn the identity of the member of the other group you are paired with. In the same way, the member of the other group you are paired with will not learn your identity. In the following we call the member of the other group you are matched with the other person.

Task of Members of Group A
If you become a member of Group A you will be asked to make ten decisions. In each of the ten decision problems you are asked to decide between two alternatives which are called LEFT and RIGHT. Each alternative implies earnings for you and the other person. The ten decision problems will be presented as rows in a table. Note that only one of the ten decisions will be taken into consideration for the payoff determination -more on this below. Each decision problem will look like this: Group A member faces. The form members of Group A will receive will contain exactly two pages, the first page is an empty cover page, the second page contains the table on the last page of the current instructions (and nothing else)!

Task of Members of Group B
If you become a member of Group B you will be asked to fill out a two-page questionnaire. The form members of Group B will receive will contain exactly three pages, the first page is an empty cover page, the other two pages contain the questionnaire.

Show-Up Fee
Each participant in this experiment will receive a show-up fee of 4 Euros for participating. In addition, each participant receives earnings as specified in the next two paragraphs. That is, the final payoff of a participant is the sum of two parts, the show-up fee plus the earnings in the experiment (as specified below).

Your Earnings if You Are a Member of Group A
If you become a member of Group A your earnings and the earnings of the other person are determined as follows: At the end of the experiment (after you have made the ten choices in private), one of the 10 decision problems will be randomly selected as the payoff-relevant one. For this purpose an experimenter with a bingo cage containing ten balls numbered 1-10 will go from one member of Group A to the next starting on the leftmost cubicle of the first row. Please make sure that your completed form is closed when the experimenter approaches you. The experimenter will ask you to draw one of the balls with the device designated for that purpose. The number on the ball gives the decision task that will be used to determine your earning and that of the other person. Your actual earnings and those of the other person correspond exactly to the payoffs in the alternative (LEFT or RIGHT) you have chosen in that specific decision problem. You will be asked by the experimenter to write the number of the payoff-relevant decision problem on the cover page of your form. You (but no one else) will then be given the opportunity to take in private a look at your choice in the payoff-relevant decision problem.
Then you will be asked to label (in private) the cover sheet of the form with your private code and to seal the form in the envelope.

Your Earnings if You Are a Member of Group B
In addition to the 4 Euros show-up fee each member of Group B will receive the earnings as described in the previous paragraph from exactly one member of Group A.

Role Assignment and Start of the Experiment
After the instructions at hand have been read aloud and all questions have been answered you (and all other participants in this room) will be asked to open the sealed envelope you drew from the box when entering this room. The envelope contains a card with your private code. The code ends with a number.
If this number is even, you are a member of Group A, if it is odd, you are a member of Group B.
Members of Group A are asked to take a seat at one of the computer terminals with sliding walls in this room. Members of Group B will be escorted to the adjacent room and asked to take a seat at one of the computer terminals with sliding walls in that room. In both rooms computers are (and will remain) switched off. An experimenter will then distribute the forms in each room. Members of Group A will receive a form that contains an empty cover page and a page containing the decision tasks displayed on the next page, members of Group B will receive a form that contains an empty cover page and a two-page questionnaire.

The End of the Experiment
After you have completed your task you will be asked to write your private code on the empty cover page of your form. PLEASE WAIT UNTIL YOU ARE ASKED BEFORE WRITING THE CODE ON THE COVER. Then put the form in the envelope and seal it. Upon leaving the room you are asked to put the envelope in the box located near the front door of the room you are sitting in.