The Importance of Peers for Compliance with Norms of Fair Sharing

A burgeoning literature in economics has started examining the role of social norms in explaining economic behavior. Surprisingly, the vast majority of this literature has studied social norms in asocial decision settings, where individuals are observed to act in isolation from each other. In this paper we use a large-scale dictator game experiment (N = 850) to show that “peers” can have a profound influence on individuals’ perceptions of norms of fair sharing, which we elicit in an incentive compatible way. However, in contrast to these strong peer effects in social norms of fair sharing, we find limited evidence of the influence of norms and peers on actual sharing behavior. We discuss how these results can be explained by heterogeneity in normative views as well as in willingness to comply with norms.


INTRODUCTION
We study the driving forces underlying one of the fundamental principles of human social behavior: fair sharing. While earlier explanations have focused on the role of other-regarding preferences and preferences for equality (see, e.g., Camerer, 2003, Chap. 2), we investigate a more recent account of fair sharing that relies on the concept of norm compliance: many people have an intrinsic preference to conform to what is collectively perceived as "socially appropriate" and are willing to sacrifice material gain in order to comply with such norms. 1 In fact, social norms are thought to drive behavior in a variety of social contexts (e.g., Elster, 1989; Bicchieri, 2006; López-Pérez, 2008; Krupka and Weber, 2013). A number of recent experimental studies use a norm compliance framework to explain behavior across several settings, including dictator games (Krupka and Weber, 2013; Krupka et al., 2016; Kimbrough and Vostroknutov, 2016), third-party allocator games (Barr et al., 2015), gift-exchange games, oligopoly games (Krupka et al., 2016), and public good, trust and ultimatum games (Kimbrough and Vostroknutov, 2016).
However, nearly all of these studies of social norms focus on tightly controlled, but surprisingly asocial decision environments, where individuals face neutral and abstract decision situations, under full anonymity, and in complete isolation from other decision-makers. While the use of contextually sterile decision environments is one of the hallmarks of experimental control, we also notice that contextual variables, from the framing of the decision task to the presence and behavior of other decision-makers in the decision setting, play a crucial role in nearly every conceptual account of social norms. Minimal variations in the context can profoundly change individuals' perception of the nature of the decision situation and the underlying norms of conduct (Bicchieri, 2006). This highlights the importance of studying the interaction between contextual variables and norm compliance. In this paper we take a step in this direction by systematically studying the influence on norm compliance in fair sharing of one specific contextual variable: the presence of "peers", i.e. other decision-makers, in the decision setting faced by an individual.
1 Another class of explanations for fair sharing and giving focuses on the role of self- or social-image concerns whereby individuals care about being perceived as fair (e.g., Andreoni and Bernheim, 2009; Ellingsen and Johannesson, 2011; Grossman and van der Weele, 2017). In our view this approach is complementary to the social norms approach in the sense that theories of image concerns often assume the existence of a norm of acceptable behavior (e.g. equal sharing) that individuals strive to adhere to in order to boost their image.
We believe that understanding the influence of peers on individual decision-making is important for a number of reasons. First, information about peer behavior is typically available in many natural social settings, where individuals do not act in social isolation. On the contrary, people often have the opportunity to interact with others and observe their choices before making a decision. Thus, studying the influence of peers on individual decision-making is inherently relevant for understanding the general dynamics of human social interactions.
Second, the study of peer influence is of theoretical interest because peers are an important determinant of norm-driven behavior in most conceptual accounts of norm compliance across the social sciences. For instance, in economics, Sugden (1998) argues that observing instances of norm-compliance or norm-breaking can reinforce or weaken the expectations that the norm ought to be followed. In social psychology, Cialdini et al. (1990) contend that the behavior of peers exerts normative influence on individual behavior by shaping what individuals perceive as typical or normal behavior in a given situation (the "descriptive norm"). In philosophy, Bicchieri (2006) proposes that whether or not a norm will be followed depends partly on "normative expectations" (whether the individual expects that sufficiently many others expect him or her to comply), and partly on "empirical expectations" (whether the individual expects that sufficiently many others will comply). Sociologists Lindenberg and Steg (2013) argue that the behavior of others can shift the weights that individuals place on the normative-goal (following social norms) relative to the more self-centered hedonic and gain goals (need satisfaction and resource accumulation).
Despite the large theoretical literature on the importance of peers for norm-driven behavior, the empirical evidence is scant. In many of the settings where peer effects have been documented empirically (e.g., Keizer et al., 2008; Shang and Croson, 2009; Bicchieri and Xiao, 2009; Krupka and Weber, 2009; Gächter et al., 2012; Falk et al., 2013; Thöni and Gächter, 2015), other behavioral forces may explain the correlations between individuals' and peers' actions observed in the experiments. 2 Even in settings where the observed data patterns are difficult to reconcile with alternative explanations (e.g. McDonald et al., 2013) and results are strongly suggestive that the presence of peers affects norms, the lack of direct data on how peers affect normative considerations makes it difficult to identify whether the observed impact of peers' actions on behavior is mediated by corresponding shifts in the normative evaluation of actions.
In this paper we present a new set of dictator game experiments that measure the influence of peers on both actual sharing and norms of sharing using the incentive-compatible norm-elicitation task by Krupka and Weber (2013). 3 Our experiments set us apart from the existing literature on peer effects mentioned above, in that we are able to explicitly identify the linkages between peers' actions, normative views, and individual sharing behavior. In this aspect our paper is related to Gächter et al. (2013), who, however, study peer effects in norms and behavior in a gift exchange game. They find that peer effects in norms do not explain the observed peer effects in actual gift exchange. While these results cast some doubt on the importance of norms for peer effects, it would be premature to base judgment on the importance of norm following solely on the study of one specific decision setting and one specific social norm. It is indeed unclear whether the results from the gift exchange game may also extend to other settings and norms, as it may be the case that the influence of peer behavior is more decisive for norms of fair sharing than for reciprocal gift exchange.
Moreover, all the experiments reported in Gächter et al. (2013) are based on gift exchange games where the decision-makers observe the decisions of a peer before making their own choices. In this sense, it is not obvious that their experiments allow assessing the causal impact that the presence of peers may have on norms and behavior, because their study lacks a treatment without peers. In this paper, we study settings where the decision-maker is exposed to the influence of a peer as well as settings where the decision-maker acts in isolation from peers. This allows us to examine the causal influence that peers have on norms and behavior.
Specifically, in our PEER treatment subjects play a sequential three-person dictator game, where two dictators can transfer money to one recipient. The dictators move sequentially and thus the second dictator can observe the transfer made by the first dictator (the "peer") before making her own transfer decision. In contrast, our NOPEER treatment is based on a two-person dictator game where there is no peer and her role is replaced with Nature: in this game, Nature moves first and randomly determines an endowment for the recipient; the dictator observes this endowment and then transfers money to the recipient. The crucial difference between the two treatments is thus that, while in the PEER treatment the recipient's wealth (prior to the dictator's transfer) is determined by a peer, in the NOPEER treatment it is determined by chance and there is no decision-maker other than the dictator present in the decision context. Furthermore, to systematically investigate the extent to which the influence of peers on normative considerations and behavior depends on the nature of the underlying norms, our study examines two payoff-equivalent, but differently framed, versions of the dictator game. In one version the dictator can give money to another player, while in the other version the dictator can also take money from the other player. Krupka and Weber (2013) have used similar versions of the dictator game to measure the influence of norms on dictator's behavior. 4 They have shown that these "give" and "take" versions of the dictator game produce stark differences in the amounts of money that dictators share with recipients. Moreover, they explain these differences by the fact that the norm that governs behavior in the "give" version of the game is substantially different from the norm that applies to the "take" game. Hence, we use give/take framing to study the extent to which the influence of peers depends on the nature of the norm (norm of giving vs. norm of taking).
To summarize, our study is based on four treatments, using a 2x2 factorial design where we vary the frame of the game (GIVE vs. TAKE) and whether a peer is present or absent (PEER vs. NOPEER). For each treatment, we conduct two types of experiments, a norm-elicitation experiment and a behavioral experiment. In the norm-elicitation experiment, we follow Krupka and Weber (2013) and measure in an incentive compatible way the extent to which the peer's behavior affects the perception of what constitutes socially appropriate behavior. In the behavioral experiment, we check how these variations in perceptions of social appropriateness translate into actual decisions. A total of 850 subjects participated in our experiments.
Our norm-elicitation experiments reveal that the presence of peers has a systematic and strong influence on the perceptions of social appropriateness. In the PEER treatment, ungenerous monetary transfers to the recipient are viewed as relatively more appropriate when the peer is also ungenerous towards the recipient. However, when the same levels of recipient's wealth have been determined by chance (NOPEER treatment), the relation between recipient's wealth and appropriateness is reversed: ungenerous transfers are viewed as relatively more appropriate when the recipient is wealthier (i.e. when the recipient has randomly received a larger endowment).
4 However, in all games studied by Krupka and Weber (2013) there is only one dictator matched with one recipient and so they cannot study peer effects in fair sharing. See also List (2007) and Bardsley (2008), who compare a standard dictator game with a game where the dictator's choice set includes the option to take money from the recipient, and Goerg and Walkowitz (2010), who compare public good game experiments framed with positive externalities to those framed with negative externalities.
Interestingly, we also find that the strength of these effects varies considerably across our two versions of the dictator game. The norm that governs behavior in the TAKE game is much more stable and resilient to peer influence than the norm in the GIVE game.
Based on the results of the norm-elicitation experiment, we should expect to observe systematic differences in the influence of peers' actions (and hence recipient's wealth) on dictator's actual behavior across our experimental conditions. In particular, we should expect a positive relation between dictator transfers and recipient wealth in the PEER treatment, while a negative relation should emerge in the NOPEER treatment. Moreover, these treatment differences should be more pronounced in the GIVE than in the TAKE game.
The results of our behavioral experiments are only partially in line with these expectations. While we observe that dictators in the NOPEER treatment significantly reduce their transfers when the recipient possesses larger endowments, there is, on average, no relation between dictator and peer transfers in the PEER treatment. Moreover, we do not detect any differences in the magnitude of these effects between the GIVE and TAKE conditions. The absence of a peer effect in the PEER treatment is consistent with the findings reported by Panchanathan et al. (2013). They also conduct a three-person dictator game experiment where two dictators decide sequentially how much to give to a recipient. They find that, on average, the amount given by the first dictator does not affect the second dictator's giving. At the individual level, they observe substantial heterogeneity in the second dictator's responses: while some dictators increase their giving in the amount given by the peer, others give less when the peer gives more, and others do not vary their giving with the peer's giving. We observe similar heterogeneity in our experiment. This suggests that a potential explanation for the limited support of the norm compliance model in our experiments may lie in the existence of conflicting views about what constitutes a norm in our setting. In section 5 we examine this possibility in detail and show that there is considerable heterogeneity in the extent to which participants agree on what a norm is in our experiments as well as in the extent to which they are prepared to comply with it.

THEORETICAL FRAMEWORK
To illustrate our empirical strategy to identify the importance of peers for norms of fair sharing, we start by sketching a simple theoretical framework based on the social norms model introduced by Krupka and Weber (2013, hereafter KW). We assume that decision-makers are motivated by both material self-interest and a preference for conforming to norms, i.e. collectively recognized rules of behavior that define which actions are viewed as socially appropriate (Elster, 1989; Ostrom, 2000). Thus, decision-maker i's utility function is given by:

$$u_i(a_i, a_{-i}) = V(\pi_i(a_i, a_{-i})) + \gamma_i N(a_i \mid a_{-i})$$

where $a_i$ and $a_{-i}$ are the actions undertaken by the decision-maker and by others, respectively, and $\pi_i$ represents the decision-maker's material payoff. The second term of the utility function captures the preference for norm compliance. The parameter $\gamma_i$ measures the extent to which the decision-maker cares about conforming to norms. The social norms function $N(\cdot)$ describes the mapping between utility and the collectively-recognized social appropriateness of the actions available to the decision-maker. Decision-makers who care about norm compliance ($\gamma_i > 0$) enjoy a positive utility by selecting actions that are viewed as socially appropriate (i.e., actions whereby $N(\cdot) > 0$), whereas they suffer a disutility from actions that are inappropriate ($N(\cdot) < 0$). Note that we do not specify, at this stage, what norms individuals may follow in their decision-making. Following KW, we instead measure these norms empirically, as we describe in detail in the next section. 5 Our only assumption regarding the norms function at this stage is that what constitutes appropriate behavior depends on social and contextual influences. In particular, we assume that the social appropriateness of an action is influenced by $a_{-i}$, the actions of other decision-makers that i can observe. 6
Our empirical strategy relies on two types of experiments: a norm-elicitation experiment that we use to measure the social norms function $N(\cdot)$, and a standard behavioral experiment to examine how changes in the norms function translate into actual decisions. To explore the role of social influences, we systematically vary whether decision-makers observe the actions of another decision-maker (a "peer") before making a choice, or whether they instead observe a random "choice" made by Nature. We thus study how the norm function $N(a_i \mid a_{-i})$ varies when the action $a_{-i}$ observed by the decision-maker is taken by a peer or by Nature. To explore the role of contextual influences, we study two distinct decision settings that are economically equivalent (i.e. in both settings the same actions produce the same material payoffs $\pi_i$), but differ in how actions are framed and thus in the norms $N(\cdot)$ that potentially apply to each setting. The next section describes each experiment and each experimental condition in detail.

EXPERIMENTAL DESIGN AND PROCEDURES
[...] where the role of D1 is replaced with Nature. Thus, the NOPEER treatment is based on a two-person dictator game, where one dictator is matched with one recipient. In the GIVE version of the game, the dictator receives an endowment of £12 while the recipient's endowment, E ∈ {£0, £1, £2, £3, £4}, is randomly determined by Nature. After observing the value of the recipient's endowment, the dictator transfers an amount g ∈ {£0, £1, £2, £3, £4} to the recipient. Payoffs are computed as π_D = £12 - g for the dictator, and π_R = E + g for the recipient.
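To make the link between the norm-compliance utility and the NOPEER/GIVE game concrete, the following minimal sketch evaluates a dictator's utility for each feasible transfer. It is an illustration under stated assumptions, not the paper's estimation procedure: the value of money is assumed to be linear, the appropriateness scores in `hypothetical_norm` are invented for the example, and the function names are ours.

```python
# Illustrative sketch only: the value of money is assumed linear and the
# appropriateness scores below are invented, not elicited in the experiment.

def material_payoff(g):
    """Dictator's payoff in pounds in the GIVE game: 12 - g."""
    return 12 - g


def utility(g, gamma, norm):
    """Material payoff plus gamma times the appropriateness of transfer g."""
    return material_payoff(g) + gamma * norm[g]


def best_transfer(gamma, norm):
    """Transfer that maximizes utility; ties go to the smallest transfer."""
    return max(sorted(norm), key=lambda g: utility(g, gamma, norm))


# Hypothetical appropriateness scores on the [-1, 1] scale, one per transfer.
hypothetical_norm = {0: -0.8, 1: -0.4, 2: 0.0, 3: 0.4, 4: 0.8}

for gamma in (0.0, 2.0, 5.0):
    print(f"gamma = {gamma}: chosen transfer = £{best_transfer(gamma, hypothetical_norm)}")
```

With these made-up scores, a dictator who places little weight on norm compliance keeps the whole endowment, while a sufficiently norm-sensitive dictator gives the maximum. Different weights on norm compliance can therefore produce different behavior under the same norm.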
Note that in both treatments we observe decisions by dictators facing the same five possible situations, each corresponding to a different level of initial wealth of the recipient (£0, £1, £2, £3, or £4). The difference between the two treatments is that in the PEER treatment the recipient's wealth (prior to the dictator's transfer) is determined by the donation of another dictator, whereas in NOPEER the peer is absent and the recipient's wealth is determined at random. 8
The corresponding TAKE versions of the games are analogously defined, except that the initial distributions of endowments differ relative to the GIVE version. In the PEER/TAKE game, D1 and D2 are endowed with £9 each, while the recipient is endowed with £6. Each dictator i ∈ {D1, D2} can give/take an amount t_i ∈ {-£3, -£2, -£1, £0, £1} to/from the recipient. Payoffs are computed as π_i = £9 - t_i for a dictator, and π_R = £6 + t_D1 + t_D2 for the recipient. Analogously, in the NOPEER/TAKE game the dictator is endowed with £9, while the recipient's endowment is randomly determined from the set E = {£3, £4, £5, £6, £7}. The dictator transfers an amount t ∈ {-£3, -£2, -£1, £0, £1} to the recipient, and payoffs are computed as π_D = £9 - t for the dictator, and π_R = E + t for the recipient. Thus, in both the GIVE and TAKE version of the games, dictators can implement exactly the same final payoff allocations between themselves and recipients.
However, the GIVE and TAKE games differ in whether these allocations can be obtained through "giving to" or "taking from" the recipient.
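The payoff equivalence of the two frames can be checked by enumeration. The sketch below does this for the NOPEER games, reading the TAKE payoff of the dictator as £9 minus the (possibly negative) transfer; the function and variable names are illustrative and not taken from the experimental software.

```python
from itertools import product

GIVE_TRANSFERS = range(0, 5)     # g in {0, 1, 2, 3, 4}
TAKE_TRANSFERS = range(-3, 2)    # t in {-3, -2, -1, 0, 1}
GIVE_ENDOWMENTS = range(0, 5)    # recipient endowment E in the GIVE game
TAKE_ENDOWMENTS = range(3, 8)    # recipient endowment E in the TAKE game


def nopeer_give(endowment, g):
    """(dictator payoff, recipient payoff) in the NOPEER/GIVE game."""
    return (12 - g, endowment + g)


def nopeer_take(endowment, t):
    """(dictator payoff, recipient payoff) in the NOPEER/TAKE game."""
    return (9 - t, endowment + t)


def allocations(game, endowments, transfers):
    """Set of all final allocations reachable in a game."""
    return {game(e, x) for e, x in product(endowments, transfers)}


give_set = allocations(nopeer_give, GIVE_ENDOWMENTS, GIVE_TRANSFERS)
take_set = allocations(nopeer_take, TAKE_ENDOWMENTS, TAKE_TRANSFERS)
print(give_set == take_set)
```

The same correspondence holds situation by situation: a GIVE situation with endowment E matches the TAKE situation with endowment E + 3, and a transfer g matches t = g - 3.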
For each treatment and each version of the game, we conducted two types of experiments: a norm-elicitation experiment and a behavioral experiment. The norm-elicitation experiment is based on the task introduced by KW. Subjects were given a description of the five possible [...]
8 Note that our focus is on comparing situations where the dictator can be affected by a peer with situations where the peer cannot by construction exert any influence on the dictator's choices. Thus, in our NOPEER treatment we remove the peer from the decision setting and transform the three-person dictator game used in the PEER treatment into a two-person dictator game. An implication of this is that, in principle, the two treatments differ along more than one dimension (whether or not the dictator can observe the choice of a peer and whether the situation is a two-person or three-person game). An alternative treatment to control for this would be one where a passive dictator is added to the NOPEER game. We did not run this additional control treatment because doing so while keeping the design balanced would have required an additional 500 subjects and we do not expect behavior in this treatment to differ from that observed in our NOPEER treatment. In section 5.3 we discuss possible implications of comparing treatments that involve three-player interaction with treatments involving two-player interaction, with particular reference to payoff comparison considerations.
We conducted the behavioral experiments with subjects who had not participated in the norm-elicitation task. Subjects were randomly assigned to either the PEER or NOPEER treatment. In each treatment, half of the subjects participated in the GIVE game, and the other half in the TAKE game. In all cases, we paid subjects a £2 show-up fee in addition to any earnings made in the experiment. 13 At the beginning of the experiment we matched subjects randomly into groups and assigned a role. In the PEER treatment subjects were matched in three-person groups and assigned the role of D1, D2, or Recipient. In the NOPEER treatment, subjects were matched in two-person groups and assigned either the role of dictator or recipient. Subjects then played a one-shot version of the dictator game, either in the GIVE or TAKE frame. We elicited subjects' choices using the strategy method (Selten, 1967). That is, dictators in the role of D2 in the PEER treatment and dictators in the NOPEER treatment were asked to make one decision for each of the five possible sub-games of the game, corresponding to situations where D1 or Nature had endowed the recipient with £0, £1, £2, £3, or £4 (£3, £4, £5, £6, or £7 in the TAKE game). 14
In total, we conducted 44 sessions with 850 subjects, recruited using ORSEE (Greiner, 2015). All sessions were conducted at the University of Nottingham using z-Tree (Fischbacher, 2007). Sessions lasted between 40 and 60 minutes. Table 1 summarizes the experimental design and reports the number of subjects who participated in each treatment and version of the game.
13 Note that the show-up fee for the behavioral experiments is lower than the show-up fee used in the norm-elicitation experiments. These values of the show-up fees were chosen to ensure that average hourly earnings were approximately £10 in each experiment.
14 Most of the experimental literature directly comparing choices elicited with the strategy method and the direct response method finds that the two elicitation methods do not lead to qualitatively different results. See Brandts and Charness (2011) for a review.

RESULTS
We start by presenting the data from the norm-elicitation experiments, to examine whether the behavior of peers influences the norms of fair sharing in our setting. We then turn to the behavioral data, and examine whether any differences in norms across conditions translate into differences in sharing behavior.
Several interesting patterns can be observed. First, in all five situations and in all treatments and versions of the game, the appropriateness of transfers increases in their generosity: sharing the highest amount available ("give £4" in GIVE; "give £1" in TAKE) is always considered the most appropriate option. Similarly, in all cases, the least appropriate choice is the level of sharing that maximizes the dictator's payoff ("give £0" in GIVE; "take £3" in TAKE). 16 Second, the level of the recipient's wealth generally influences the perception of what constitutes an appropriate level of sharing. These differences are, however, much more marked in the GIVE than in the TAKE game. Thus, the norms of fair sharing in the GIVE game seem much more malleable than the corresponding norms in the TAKE game.

Figure 1: Elicited norms (social appropriateness) across treatments
Notes: We transformed subjects' appropriateness ratings into numerical scores using the following scale: very socially inappropriate = -1; socially inappropriate = -0.6; somewhat socially inappropriate = -0.2; somewhat socially appropriate = 0.2; socially appropriate = 0.6; very socially appropriate = 1.
Third, and most importantly, the levels of the recipient's wealth influence ratings of appropriateness differently depending on whether these levels have been determined by the transfers of another dictator (PEER treatment) or by chance (NOPEER treatment). In the PEER treatment giving little to the recipient is generally viewed as less appropriate when the recipient's wealth is large (i.e., when the peer has been generous) than when a recipient's wealth is small (i.e., when the peer has also given little). 17 However, in the NOPEER treatment the relation between appropriateness and recipient's wealth is reversed: giving little to the recipient is viewed as more appropriate when the recipient's wealth is large (i.e. when Nature selects a large endowment) than when it is small. 18
We examine these patterns more formally using OLS regressions, reported in Table 2. In Model I we use data from the PEER treatment only, whereas in Model II we use data from the NOPEER treatment only. In both regressions, the dependent variable measures the appropriateness of the dictator's transfers in the five different situations. We regress this on the amount that the dictator transfers to the recipient ("Amount transferred by Dictator"), the amount that the peer (PEER treatment) or Nature (NOPEER treatment) transfers to the recipient ("Amount transferred by Peer/Nature"), and an interaction between these two variables. Moreover, to gauge the extent to which the influence of peers varies across the GIVE and TAKE games, we also include a dummy variable taking value 1 for observations in the TAKE game, and an interaction between the TAKE dummy and the "Amount transferred by Peer/Nature" variable.
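The regression just described can be written compactly as follows. The notation is ours and is only meant as a reading aid: $A$ is the appropriateness score, $D$ the "Amount transferred by Dictator", $P$ the "Amount transferred by Peer/Nature", and the $\beta$ labels are hypothetical names for the coefficients reported in Table 2.

```latex
\begin{equation*}
A = \beta_0 + \beta_1 D + \beta_2 P + \beta_3 (D \times P)
    + \beta_4\,\mathit{TAKE} + \beta_5 (P \times \mathit{TAKE}) + \varepsilon
\end{equation*}
% Effect of a one-unit increase in the dictator's transfer on appropriateness:
\begin{equation*}
\frac{\partial A}{\partial D} = \beta_1 + \beta_3 P
\end{equation*}
```

Under this reading, the effect of the dictator's own transfer on appropriateness depends on the peer's or Nature's transfer through $\beta_3$, while $\beta_5$ captures how the influence of recipient wealth differs in the TAKE game.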
17 For example, in the GIVE game (top-left panel of Figure 1), giving £2 to the recipient is viewed as socially inappropriate (an average rating of -0.36) when the peer gives £4 to the recipient (dashed red line), but as socially appropriate (an average rating of 0.14) when the peer gives £0 to the recipient (solid blue line). Wilcoxon signed rank test result: p < 0.001.
18 For example, in the GIVE game (top-right panel of Figure 1), giving £2 to the recipient is viewed as socially appropriate (an average rating of 0.28) when the recipient receives an endowment of £4 (dashed red line), but as socially inappropriate (an average rating of -0.04) when the recipient receives an endowment of £0 (solid blue line). Wilcoxon signed rank test result: p < 0.001.
The regressions reveal that in both the PEER and the NOPEER treatments more generous transfers by the dictator are viewed as more appropriate than ungenerous transfers. The effect of increasing the dictator's transfer on its evaluation of appropriateness is 0.359 + 0.019 * "Amount transferred by Peer" in the PEER treatment and 0.411 - 0.009 * "Amount transferred by Nature" in the NOPEER treatment. In both cases, the effect is positive for any possible amount transferred by the peer or Nature.
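The claim that the effect is positive over the whole range can be verified directly from the two expressions above. The short sketch below evaluates them for each possible amount (£0 to £4) transferred by the peer or Nature; the function name and data layout are ours.

```python
def marginal_effect(base, interaction, amount):
    """Effect of a £1 increase in the dictator's transfer on appropriateness."""
    return base + interaction * amount


# (coefficient on dictator's transfer, coefficient on the interaction term)
COEFFICIENTS = {
    "PEER": (0.359, 0.019),
    "NOPEER": (0.411, -0.009),
}

effects = {
    treatment: [marginal_effect(base, inter, amount) for amount in range(5)]
    for treatment, (base, inter) in COEFFICIENTS.items()
}

for treatment, values in effects.items():
    print(treatment, [round(v, 3) for v in values])
```

In the PEER treatment the effect grows with the peer's transfer, whereas in the NOPEER treatment it shrinks with Nature's transfer, but it stays well above zero in both cases.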
To gauge how changes in the recipient's wealth affect the judgments of appropriateness of the dictator's transfers, we need to inspect the coefficients of the variable "Amount transferred by Peer/Nature" and the interaction term "Amount transferred by Dictator * Amount transferred by Peer/Nature" (as well as the interaction with the TAKE dummy, for the TAKE game). In the PEER treatment, the peer's generosity negatively influences the judgments of appropriateness of the dictator's transfers. This effect is particularly marked for ungenerous dictator's transfers, while the influence of peers wanes for more generous dictator's transfers, as indicated by the positive and significant coefficient of the interaction term between the "Amount transferred by Dictator" and "Amount transferred by Peer/Nature" variables. In contrast, in the NOPEER treatment the judgments of appropriateness of the dictator's transfers become more lenient the higher the endowment that Nature transfers to the recipient. Again, this effect is particularly marked for ungenerous dictator transfers and it diminishes as dictators transfer more money to the recipient, as indicated by the negative and significant coefficient of the interaction term.
Finally, in both treatments, the impact of the recipient's wealth on norms is significantly weaker in the TAKE than in the GIVE game. This can be seen by noticing that, in both the PEER and the NOPEER treatments, the coefficient of the interaction term "Amount transferred by Peer/Nature * TAKE" takes an opposite sign relative to the "Amount transferred by Peer/Nature" variable. In both cases the effect is significant at least at the 5% level.
To account for the ordinal nature of the norms data, we ran additional ordinal probit regressions. The results are similar to those reported in Table 2. Moreover, we complement the regression analysis [...] from the peer. Moreover, we would expect these effects to be stronger in the GIVE than in the TAKE version of the game. We summarize these behavioral predictions as follows:

Hypothesis 1: In the NOPEER treatment, dictator's transfers correlate negatively with the recipient's initial wealth.

Hypothesis 2: In the PEER treatment, dictator's transfers correlate positively with the amount that the recipient received from the peer.

Hypothesis 3: These effects are stronger in GIVE than in TAKE games.
In the next sub-section we present the data from our behavioral experiments to examine the extent to which the observed variations in social appropriateness of transfers translate into differences in behavior.

Behavioral experiments: The influence of peers on sharing behavior
[...] versions of the games. In the TAKE game, transfers have been rescaled to give a score between £0 and £4, to ease comparability with the GIVE game. 19
The figure shows that there is on average no clear relation between the dictator's transfers and the recipient's wealth in the PEER treatment, both in the GIVE and TAKE versions of the games. Thus, whether or not the peer is generous with the recipient does not seem to affect the dictator's sharing decisions. In contrast, a negative relation between dictator's sharing and recipient's wealth seems to emerge in the NOPEER treatment, in both versions of the game. Thus, dictators seem to behave less generously towards recipients that have randomly received larger endowments.

Figure 2: Dictator's transfers across treatments
Notes: Bars indicate 95% confidence intervals.
These results are only partially in line with the results of the norm-elicitation experiment.
The negative relation between recipient's wealth and dictator's transfers in the NOPEER treatment is consistent with Hypothesis 1. However, the results of the norm-elicitation experiment also suggest that we should observe a positive relation between recipient's wealth and dictator's transfers in the PEER treatment (Hypothesis 2). Our data do not support this conjecture.
Moreover, the norm-elicitation experiment suggests that the norm of fair sharing may be more malleable in the giving than taking setting (Hypothesis 3). However, we do not observe any difference between GIVE and TAKE games in the extent to which the recipient's wealth affects dictator's sharing. More generally, we see only small differences in dictator's behavior between the GIVE and TAKE games, and only in some subgames of the PEER treatment. This is interesting because KW have shown that using give/take frames in dictator games can produce strong differences in behavior. However, we cannot replicate this result: in our NOPEER treatment, which is most similar to the games used by KW, we do not observe any difference in dictator sharing between GIVE and TAKE games, despite the existence of differences in the norms that apply to these games (see Online Appendix C for further detail).

EXPLAINING THE EXPERIMENTAL DATA
What can explain the observed discrepancies between the norm-elicitation and behavioral experiments? One striking aspect of the behavioral data is that we observe substantial heterogeneity at the individual level in the extent to which dictators are influenced by the level of wealth of recipients (see Online Appendix D for more details). About half of the dictators are not affected by the recipient's wealth and opt for the same monetary transfer across all five subgames. A third of dictators reduce their transfer as the recipient's wealth increases, whereas about a tenth of dictators respond positively to increases in the recipient's wealth. Our findings are similar to those reported by Panchanathan et al. (2013) in a three-person dictator game that is closely related to our PEER/GIVE treatment. They find that about half of dictators do not respond to variations in the peer's behavior, a third give more when the peer gives less, and thirteen percent give more when the peer gives more.
This suggests that there may be substantial heterogeneity in the extent to which dictators are willing to comply with norms of fair sharing, or in the extent to which they recognize these norms as applicable. Alternatively, (at least some) dictators may be driven by other types of considerations (e.g., inequity aversion or guilt aversion) that may conflict with normative considerations and pull behavior away from compliance with norms of fair sharing. The next sub-sections investigate these potential explanations.

To examine this, we take a closer look at the norms data. Recall that in the norm-elicitation experiment subjects could rate the appropriateness of actions on a scale with three levels of "inappropriateness" (very inappropriate, somewhat inappropriate, inappropriate) and three levels of "appropriateness" (very appropriate, somewhat appropriate, appropriate). Figure 3 shows the percentage of subjects disagreeing with the majority view about the appropriateness of each action across the various situations that they rated. 20 The light (red) bars indicate that there is a minority of subjects assigning one of the three levels of "inappropriateness" to an action, while the majority rated the action as appropriate. The dark (blue) bars show disagreement in the opposite direction (the majority view the action as inappropriate and a minority rates it as appropriate). For instance, the first dark bar in the top left panel of the figure shows that in the PEER/GIVE treatment 12% of subjects rated the action "give £0" as appropriate in the scenario where the peer also gives £0, indicating that the remaining 88% of subjects rated it as inappropriate. 21

20 We say that a majority of subjects rate an action as appropriate (inappropriate) if the sum of the relative frequencies of the ratings "very appropriate", "somewhat appropriate" and "appropriate" is greater (lower) than 50%.
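The majority rule and the disagreement shares described above can be illustrated with a minimal Python sketch. The rating labels follow the six-point scale of the norm-elicitation experiment; the function name `disagreement_share` and the example ratings are hypothetical and are not taken from the experimental data files.

```python
# Labels on the "appropriate" side of the six-point scale.
APPROPRIATE = {"very appropriate", "somewhat appropriate", "appropriate"}


def disagreement_share(ratings):
    """Return (majority_view, share_of_subjects_disagreeing_with_it).

    An action counts as 'appropriate' for the majority if more than 50%
    of ratings fall on the appropriate side, and as 'inappropriate' if
    fewer than 50% do. An exact 50/50 split is reported as a 'tie'.
    """
    n = len(ratings)
    if n == 0:
        raise ValueError("at least one rating is required")
    n_app = sum(1 for r in ratings if r in APPROPRIATE)
    if 2 * n_app > n:
        # Majority says appropriate; the minority rated it inappropriate.
        return "appropriate", (n - n_app) / n
    if 2 * n_app < n:
        # Majority says inappropriate; the minority rated it appropriate.
        return "inappropriate", n_app / n
    return "tie", 0.5


# Hypothetical ratings: 12 subjects on the appropriate side, 88 on the
# inappropriate side of the scale.
example = ["somewhat appropriate"] * 12 + ["very inappropriate"] * 88
majority, share = disagreement_share(example)
```

Under this rule, a tall bar in Figure 3 corresponds to a minority share approaching one half, that is, to an action on which subjects are close to evenly split.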

Norm ambiguity
To assess the presence of norm ambiguity, consider first the NOPEER/GIVE treatment (top right panel). In most cases, relatively few subjects (less than 20%) disagree on the social appropriateness of actions. The main source of disagreement among subjects is the action "give £2", which between one-fifth and one-half of subjects view as inappropriate in contrast with the majoritarian view that the action is appropriate. Nevertheless, apart from this action, the general picture emerging from the NOPEER/GIVE treatment is that there is a reasonably low degree of ambiguity about the social norm in this setting.

Consider now the PEER/GIVE treatment (top left panel). As in the NOPEER/GIVE treatment, there is little disagreement about the actions "give £0" and "give £4". Also as in NOPEER/GIVE, subjects tend to disagree on how to rate the action "give £2". However, relative to the NOPEER/GIVE treatment, subjects also disagree more on how to rate the actions "give £1" and "give £3". For both actions there are at least some scenarios where about 40% of subjects disagree with the majority view. Moreover, the source of disagreement seems to be related to the behavior of the peer. For example, when the peer gives £1 most subjects view the dictator action "give £1" as appropriate, presumably following the principle of social proof. However, 41% of subjects disagree and rate it as inappropriate, presumably following a Rawlsian norm similar to the one that subjects recognize in the NOPEER treatment. As another example, the dictator action "give £3" is generally viewed as appropriate by a majority of subjects. However, when the peer gives £4, 42% of subjects rate this action as inappropriate, again presumably because this action compares unfavorably with the peer's action. Overall, the observed patterns of disagreement suggest that observing what a peer has decided to do may introduce some ambiguity about the social norm.
Finally, Figure 3 corroborates our previous observation that the norm in the TAKE treatment (bottom panels) is substantially less malleable than the norm in GIVE. For all actions and in both the PEER and NOPEER condition, very few subjects disagree with the majoritarian view about the appropriateness or inappropriateness of actions. The degree of agreement seems somewhat stronger in the NOPEER condition, but again the differences are small.
To summarize, this qualitative analysis suggests that disagreement among subjects about what constitutes a norm of appropriate behavior can go some way in explaining the lack of support for the norm model in our experiments.

Heterogeneity in norm compliance
Another explanation for our experimental results is that there may be heterogeneity in preferences for norm compliance in the population of dictators we sampled for our experiment. Thus, even if norms of fair sharing were prominent and clear in the population, not all dictators would be willing to follow these norms. Moreover, the dictators' willingness to follow norms may itself vary across treatment conditions. To explore these possibilities, we follow the econometric methodology used by KW and related papers and investigate the extent to which elicited norms can predict actual behavior in our experiments. Unlike previous papers, we use a mixed logit model (see, e.g., Train, 2003) that allows for heterogeneity in the concerns for norm compliance and allows us to estimate, for each treatment, the share of dictators that are in fact guided by a desire to follow social norms.

In order to do so we follow the theoretical framework introduced in section 2 and assume that the utility that dictator i derives from choosing a monetary transfer k in situation s depends on the material payoff implied by the transfer and the social appropriateness of the transfer. We also assume that dictators are heterogeneous in their concerns for norm compliance. Thus, dictator i's utility takes the form:

$$U_{iks} = \beta \pi_{iks} + \gamma_i N_{ks} + \varepsilon_{iks},$$

where $\pi_{iks}$ is dictator i's material payoff associated with transfer k in situation s, and $N_{ks}$ is the average appropriateness rating of the transfer, as measured in the norm-elicitation experiment. The parameter $\beta$ measures the weight that dictators place on monetary payoffs, while $\gamma_i$ is an individual-specific parameter measuring the extent to which the dictator cares about norm compliance. Note that we are assuming homogeneous preferences for money across subjects, but we allow for heterogeneous preferences for norm compliance. The term $\varepsilon_{iks}$ is a random error term, assumed to be i.i.d. extreme value distributed.

Conditional on $\gamma_i$, the probability that dictator i chooses monetary transfer k in situation s depends on the utility associated with that choice, $\beta \pi_{iks} + \gamma_i N_{ks}$, relative to the utility associated with the other alternatives:

$$L_{iks}(\gamma_i) = \frac{\exp(\beta \pi_{iks} + \gamma_i N_{ks})}{\sum_{j} \exp(\beta \pi_{ijs} + \gamma_i N_{js})}.$$

The unconditional choice probability integrates this expression over the distribution of $\gamma_i$:

$$P_{iks} = \int L_{iks}(\gamma) \, f(\gamma \mid \theta) \, d\gamma,$$

where $f(\gamma \mid \theta)$ is the density of $\gamma$ and $\theta$ are the parameters of the distribution. We assume that $\gamma$ follows a normal distribution with mean g and standard deviation h, $\gamma \sim N(g, h)$, and we estimate the parameters of the distribution using maximum simulated likelihood (Hole, 2007).

Table 4 presents the results of the estimation. We estimate four different models, one for each treatment/game combination. 22 In all models, the coefficient on own payoff is positive and highly significant, indicating that dictators are more likely to choose transfers that yield higher own payoffs.

Table 4
                 Model I    Model II   Model III  Model IV
Log-likelihood   -395.212   -370.768   -381.325   -398.907
Notes: Mixed logit regressions. The dependent variable takes value 1 for the monetary transfer that was chosen by a dictator in a given sub-game, and value 0 for the other transfers that were not chosen. Standard errors in parentheses. Significance levels: *** p < 0.01; ** p < 0.05; * p < 0.1.
Turning to norm compliance, Table 4 reports the mean and standard deviation of the norm rating coefficients. Looking first at the estimates of the mean, the regressions confirm the limited success of the norms compliance model in explaining the behavioral data. In the PEER treatment (Models I and II) the average effect of norm ratings on the choice of monetary transfers is not significantly different from zero: on average, dictators do not choose transfers that are deemed more socially appropriate more often. In the NOPEER treatment (Models III and IV) the effect is positive and significant in the GIVE game, indicating that the average dictator is more likely to choose transfers that are more socially appropriate. The effect is, however, not significantly different from zero in the TAKE game.

22 In Online Appendix D we report additional analyses of norm compliance where i) we perform the analysis using the median rather than the mean of the distribution of ratings in order to reduce the influence of outliers (normative disagreement) that we have discussed in the previous sub-section; ii) we address the issue of collinearity between the own payoff and average norm rating variable following an econometric approach suggested by Thomsson and Vostroknutov (2016); and iii) we estimate one model of norm compliance pooling data from the four different treatments. The results of this additional analysis support the conclusions discussed in the main text.
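To make the mixed logit logic more concrete, the following Python sketch simulates choice probabilities when the weight on norm compliance is drawn from a normal distribution with mean g and standard deviation h, and the weight on own payoff is common to all dictators. It is only an illustration under stated assumptions: the payoffs, norm ratings and parameter values are hypothetical, the function names are ours, and the actual estimates in the paper are obtained by maximum simulated likelihood rather than by this code.

```python
import math
import random


def logit_probs(payoffs, norms, money_weight, norm_weight):
    """Conditional logit probabilities for one dictator in one situation."""
    utils = [money_weight * p + norm_weight * n for p, n in zip(payoffs, norms)]
    top = max(utils)  # subtract the maximum for numerical stability
    expu = [math.exp(u - top) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]


def simulated_probs(payoffs, norms, money_weight, g, h, draws=2000, seed=0):
    """Average the logit probabilities over draws of the norm weight ~ N(g, h)."""
    rng = random.Random(seed)
    acc = [0.0] * len(payoffs)
    for _ in range(draws):
        norm_weight = rng.gauss(g, h)
        for k, p in enumerate(logit_probs(payoffs, norms, money_weight, norm_weight)):
            acc[k] += p
    return [a / draws for a in acc]


def share_positive(g, h):
    """Share of the N(g, h) distribution above zero, i.e. Phi(g / h); needs h > 0."""
    return 0.5 * (1.0 + math.erf((g / h) / math.sqrt(2.0)))


# Hypothetical inputs: own payoff and average norm rating for transfers 0..4.
payoffs = [12, 11, 10, 9, 8]
norms = [-0.8, -0.3, 0.2, 0.6, 0.9]
probs = simulated_probs(payoffs, norms, money_weight=0.5, g=1.0, h=2.0)
```

With a mean weight close to zero and a large standard deviation, a sizeable share of simulated dictators place a negative weight on the norm rating, which is one way an insignificant average effect can coexist with substantial individual heterogeneity.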
Lastly, note that in all models the standard deviations of the norm coefficients are positive and highly significant, confirming that there is substantial heterogeneity in preferences for norm compliance in our sample. We can use the estimated means and standard deviations of the coefficients to make inferences on the share of dictators that place a positive weight on norm compliance. In particular, the share of dictators placing a positive weight on norm compliance is given by $\Phi(\hat{g}/\hat{h})$, where $\Phi$ is the cumulative standard normal distribution, and $\hat{g}$ and $\hat{h}$ are the estimated mean and standard deviation of the norm ratings coefficients (Hole, 2007).

In order to explore the extent to which payoff comparisons may explain our experimental results, we apply the Fehr and Schmidt (1999) model of inequity aversion to our games. In this model, the decision-maker i's utility is given by:

$$U_i = x_i - \frac{\alpha_i}{n-1} \sum_{j \neq i} \max(x_j - x_i, 0) - \frac{\beta_i}{n-1} \sum_{j \neq i} \max(x_i - x_j, 0),$$

where $x_i$ is the player's material payoff from the game and $n$ is the number of players in the game (2 in NOPEER and 3 in PEER). The parameter $\alpha_i$ measures her aversion to disadvantageous payoff inequality, and the parameter $\beta_i$ measures her aversion to advantageous payoff inequality.
Can the Fehr and Schmidt model explain the patterns of choices in the behavioral experiments? It turns out that the model does not predict behavior in either of the treatments. In the NOPEER treatment, the model predicts no relation between the recipient's wealth and dictator's giving. This is because in our games the dictator is always at least as well off as the recipient, at all levels of the recipient's wealth and for all the actions available to the dictator. Each pound transferred therefore lowers the dictator's payoff by £1 while narrowing the payoff gap by £2. This implies that the model predicts that the dictator either gives nothing (if her aversion to advantageous inequality, $\beta_i$, is below 1/2) or gives £4 (if $\beta_i \geq 1/2$), regardless of the wealth of the recipient. In contrast with this prediction, our data from the NOPEER treatments show that dictators reduce their giving as the recipient's wealth increases.
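The corner prediction for the two-person games can be checked numerically. The Python sketch below implements the Fehr and Schmidt (1999) utility function for an arbitrary number of players and searches over transfers of £0 to £4. The dictator's endowment of £12 and the recipient wealth levels of £0 to £4 are hypothetical values chosen only so that the dictator is never worse off than the recipient; they are not the parameters of the experiment, and the function names are ours.

```python
def fs_utility(payoffs, i, alpha, beta):
    """Fehr-Schmidt utility of player i given the payoffs of all players."""
    own = payoffs[i]
    others = [x for j, x in enumerate(payoffs) if j != i]
    n = len(payoffs)
    # Disadvantageous inequality: others earning more than player i.
    behind = sum(max(x - own, 0) for x in others)
    # Advantageous inequality: player i earning more than others.
    ahead = sum(max(own - x, 0) for x in others)
    return own - alpha * behind / (n - 1) - beta * ahead / (n - 1)


def best_transfer_two_player(recipient_wealth, alpha, beta,
                             endowment=12, transfers=range(5)):
    """Transfer maximizing the dictator's utility in a two-person game.

    `endowment` is a hypothetical dictator endowment (an assumption).
    """
    def utility(t):
        return fs_utility([endowment - t, recipient_wealth + t], 0, alpha, beta)
    return max(transfers, key=utility)


# Predicted transfers across hypothetical recipient wealth levels.
low_beta = [best_transfer_two_player(w, alpha=1.0, beta=0.3) for w in range(5)]
high_beta = [best_transfer_two_player(w, alpha=1.0, beta=0.7) for w in range(5)]
```

In this parameterization the predicted transfer does not vary with the recipient's wealth for either type of dictator, which is the sense in which the model cannot generate the negative relation observed in the data. The same `fs_utility` function accepts three payoffs and can be used to explore the three-person case.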
As for the PEER treatment, the Fehr and Schmidt model predicts that, if D2 gives any money to the recipient (which occurs when her aversion to advantageous inequality, $\beta_i$, is at least 2/3), the amount given is positively correlated with the peer's giving. This is because, in the three-person PEER games, D2 compares her payoff not only with the recipient but also with the peer. Thus, because of disadvantageous inequality aversion, D2 is willing to give money to the recipient only to the extent that the peer also gives money, so that her payoff does not fall behind the peer's payoff. Our data do not support this prediction and show no relation between the two dictators' actions in the PEER treatment. 23

A second potential motive that may play a role in our setting is guilt aversion (e.g., Charness and Dufwenberg, 2006). Guilt-averse dictators suffer a disutility if they leave the recipient with less money than what the recipient expects to receive. Guilt aversion may predict differences in behavior between our treatments because in the PEER treatments dictators may adjust their beliefs about what the recipient expects to receive based on the giving of their peer. For instance, observing that the peer gives £4 to the recipient may induce dictators to adjust their beliefs upwards as they may interpret the peer's actions as a signal that the peer thinks that the recipient expects £4 from a dictator. This signal is instead unavailable to dictators in the NOPEER treatment.
In order to test models of guilt aversion, one needs second-order beliefs of dictators about what recipients expect to receive. This is particularly important if one wishes to test whether these models are observationally different from models of norm compliance (see, for example, Krupka et al., 2016). Because our design is already quite complex, we have not elicited beliefs and so we cannot perform a formal test of guilt aversion as a potentially distinct explanation of our data.
Nevertheless, if one plausibly assumes a positive correlation between dictators' second-order beliefs and peer's giving (along the lines discussed above), then a positive relation between peer's and dictator's giving should emerge in the PEER treatments. At the aggregate level our data do not support this prediction. In this sense, we think that guilt aversion is an unlikely explanation of our behavioral results.

CONCLUSION
Our study shows that the behavior of others can have important effects on the way individuals perceive what constitutes socially appropriate behavior in a given situation. In our dictator game experiments, whether or not an action is viewed as socially appropriate partly depends on the extent to which another dictator (the "peer") is willing to take it. These strong effects of peer behavior on norms do not translate, however, into corresponding effects in actual behavior in the aggregate. In particular, we do not observe a positive correlation between the dictator's and peer's generosity in the treatment where dictators receive information about peer behavior. Thus, generous peers do not breed more generosity, despite the strong impact of peer behavior on the average social acceptability of generous and ungenerous behavior. 24

[...] (MAO) decrease in the payoff paid to a passive third party. However, when the third party's payoff is too low, responders disregard the comparison and their MAO are similar to those in a game without third parties.

We discuss a number of possible explanations for the discrepancies between normative considerations and actual behavior observed in our experiments. We find evidence of heterogeneity in normative views that is related to the presence of peers: the peer's behavior introduces normative cues that are in contrast with the notion of fair sharing that subjects seem to hold when peers are absent (see McDonald et al. 2013 for related evidence). This conflict in normative views can explain why we find a large fraction of subjects unwilling to comply with the average view of appropriateness and why dictators fail to follow the example of peers. Thus, our results suggest that the extent to which peers reinforce or counteract pre-existing notions of appropriateness may be an important determinant of the strength of peer effects.
Our results raise a number of interesting questions regarding the existing approaches to norm compliance (e.g., Krupka and Weber, 2013).

Another interesting question relates to the role of sanctions for norm compliance. Recent research has shown that individuals are willing to use direct and indirect punishment to enforce social norms at a cost to themselves even in one-shot interaction with strangers, and this can help explain why norms are adhered to (Balafoutas and Nikiforakis, 2012; Balafoutas et al., 2014).
Punishment opportunities may also play a role in resolving norm heterogeneity, for instance if