Mean Markets or Kind Commerce?

Abstract: Does market interaction influence morality? We study a particular angle of this classic question theoretically and experimentally. The novelty of our approach is to posit that people are motivated by reciprocity, an urge many argue affects humans. While many have suggested that market interactions make people more selfish, our reciprocity-based theory allows that market interaction on the contrary induces more prosociality. Our experiment provides a test of the empirical relevance of such an effect, in some highly stylized settings. The results are broadly (but not completely) supportive. They may shed light on the development of morality and prosocial behavior over time, with respect to episodes in history where the nature of commerce was


Introduction
Economists typically expect people's market behavior to be guided largely by self-interest. As famously expressed by Adam Smith in The Wealth of Nations (1776): It is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest. We address ourselves not to their humanity but to their self-love, and never talk to them of our own necessities, but of their advantages.
Since that time, there is no doubt that markets have been instrumental in contributing to spectacular improvements in material living conditions. Yet, our well-being does not depend on market outcomes alone; factors such as equality of opportunities, trust in institutions, and the level of crime play an important role as well. For example, most of us prefer to live in a society where people are kind to each other, and where they are hence neither violent nor trying to take advantage of others. Indeed, Smith himself, not least in his Theory of Moral Sentiments (1759), emphasized the importance of morality and social norms for understanding behavior and well-being. 1 A natural question is whether markets and trade make us more or less selfish, and more or less prosocial, toward those we trade with and others.
The answer is not obvious. While, as noted in the next section, many have argued that market interaction makes people more selfish and/or more immoral, there are also arguments that markets induce more prosocial behavior. We add to this discussion by theoretically articulating, and experimentally exploring, a novel perspective: We consider the implications of people being motivated by reciprocity, i.e., they desire to be kind to those they deem kind and unkind to those they deem unkind. Building on the reciprocity model developed by Dufwenberg & Kirchsteiger (2004) (D&K), we present a theory that generates the following optimistic prediction: People who successfully trade will be kind toward one another, and consequently they will treat each other well outside the trading institution. Several related predictions are then tested in the lab.
Our theory describes four players who interact across two stages. They first engage in one of three trading institutions, labeled AUTARKY, BARTER, and MARKET. They are then, unexpectedly, invited to play a four-player dictator game, representing interaction outside the trading institution.
1 Adam Smith had, more generally, a broad perspective on human behavior and motivations and discussed most psychological, and some sociological, mechanisms that are now analyzed within behavioral economics (see, e.g., Ashraf et al. (2005) and Smith & Wilson (2019)), which is in sharp contrast to the narrow-minded and purely selfish homo economicus caricature that is at times attributed to him.

unkind to the unkind, which may also be a potential mechanism by which market integration is associated with prosociality.
To test our theory, we conducted a carefully controlled experiment with 524 student participants from various fields of study at the University of Innsbruck. We introduced three treatments, intended to capture the essence of the three societies/games described in our theory. Our design allowed us to hold payoffs (or income) fixed between different treatments, and thus disregard any income effect that might be associated with the efficiency-enhancing effects of markets. The experiment had two stages. In the first stage (Stage 1), the participants were randomly allocated to three treatments, reflecting economic transactions in AUTARKY, BARTER, and MARKET, respectively.
In the second stage (Stage 2), which was identical across treatments, each subject could distribute money between themselves and others in two versions of a dictator game, d-game-1 and d-game-2: In d-game-1, one of the four individuals in each group was selected to be the dictator and was then asked to divide a fixed amount of money between themselves and the other group members (the same amount to each of the others); the amount given to others was doubled by the experimenter. d-game-2 worked like d-game-1, except that the dictators could give varying amounts to the other group members.
Regarding external validity, we acknowledge that our Stage 1 trading games are highly stylized and that there are many other experimental studies that use richer trading institutions, e.g., "double auctions." The main reason for our modelling choice is that we want to achieve as much trade as possible, as well as to ensure minimal variations in pay across participants, to provide clean tests on how individuals distribute money to the other participants in the dictator games based on the trading institution ceteris paribus. This would be difficult to obtain with, e.g., a double auction design.
Exploring designs using more complicated institutions would still be interesting, but we leave that for future research.
The experimental results are broadly (but not completely) consistent with the theoretical predictions. As predicted, we find statistically significantly higher average contributions in the MARKET treatment than in the AUTARKY and BARTER treatments in d-game-1 and d-game-2. However, contrary to our predictions, we find no differences between AUTARKY and BARTER in either version of the dictator game. Finally, as predicted, in BARTER, we find statistically significantly higher contributions to former trading partners in Stage 1 than to non-trading partners in d-game-2.
In our theory as well as our experiment, we obviously provide a rather optimistic picture of the market, where one's market interactions induce only positive implications for oneself as well as others, in terms of gains from trade. Real-world markets are of course more complex and for example also include situations where agents fail to reach an agreement, engage in cheating, and sell lemons, and where the outcomes become highly inequitable. However, we do not explore the theoretical or empirical relevance of such possibilities but focus solely on the kindness induced by gains from trade and explore what happens subsequently outside the trading institution. We choose not to focus on negative effects for two reasons: First, even without ignoring them, there is hardly any doubt that market interactions in terms of gains from trade have overall contributed, and continue to contribute, to material well-being for most people. Second, attempting to experimentally disentangle various mechanisms from market interactions simultaneously would make it difficult to identify any of them.
We view these challenges as potentially interesting avenues for further research and will return to this topic in the concluding section.
Overall, while several papers have analyzed effects of market interactions on prosociality (see the next section), our contribution appears to be the first to analyze, theoretically and experimentally, direct spillover effects of positive reciprocity induced by market interaction. As such, it certainly does not aim to provide a complete picture of the effects of markets and market interaction on prosocial behavior, but to suggest and test a potentially important mechanism. This mechanism, in turn, seems to have been largely overlooked in the more recent literature, even though positive indirect effects of market interactions were discussed already by, e.g., Adam Smith, Montesquieu, and David Hume.
The remainder of the paper is organized as follows: Section 2 briefly reviews related literature and Section 3 presents the game forms we use to represent our three stylized societies, where D&K's theory is then applied to generate testable predictions. Section 4 outlines the experimental design, while Section 5 presents the results and Section 6 provides some concluding remarks.

Related Literature
Over the centuries, many, including philosophers, politicians, and religious authorities, have argued that market interaction tends to make people more selfish and/or more immoral, and that this will have spillover effects outside of markets. For example, St. Augustine considered lust for money and possessions to be one of three deadly sins (see Deane, 1963, pp. 44-56), whereas Karl Marx (e.g., Marx, 1844) explicitly or implicitly claimed that capitalism and markets cause many ills, such as dishonesty. 3 Others on the contrary have argued that markets enhance morality and induce prosocial behavior. For example, Albert Hirschman (1977), in his The Passions and the Interests: Political Arguments for Capitalism before Its Triumph, shows that many thinkers contemporary with Adam Smith, such as Montesquieu, Hume, Turgot, and Condorcet, largely argued in favor of market capitalism because of its supposed civilizing effects, effects that would reduce conflict and violence when passions were largely replaced with material interests. As expressed by Montesquieu (1748, Book 20, Chapter 1): 4 Commerce is a cure for the most destructive prejudices; for it is almost a general rule, that wherever we find agreeable manners, there commerce flourishes; and that wherever there is commerce, there we meet with agreeable manners.
Consistent with this, Steven Pinker (2011) has, in a much discussed and partly controversial contribution, argued that the amount of human violence has in most periods of humanity decreased over time. He explains how this decrease has been particularly dramatic during certain time periods, notably during the transition from a hunter-gatherer to an agricultural society roughly 10,000 years ago, and during the transition from an agricultural to an industrial society. In Europe, this second transition took place around the time when Adam Smith wrote The Wealth of Nations. In terms of trade, the first transition largely also implied a transition from an AUTARKY to a BARTER society, whereas the latter implied a further transition to a MARKET society. There is also cross-cultural experimental evidence indicating that market interaction tends to make people more prosocial; see Henrich et al. (2001, 2004, 2005), who compared 15 small-scale societies with quite different institutions, as well as the follow-up studies by Henrich et al. (2006, 2010) and Ensminger & Henrich (2014). Relatedly, while McCloskey (2006, 2010), in her trilogy Bourgeois Virtues, Bourgeois Dignity, and Bourgeois Equality, emphasized the importance of held values for the industrial revolution and the birth of the modern market economy, Mokyr (2016) adds to this picture the importance of institutions, including markets, and how these influence values.
We suggest that the human tendency to reciprocate, i.e., the desire to be kind to those deemed kind and unkind to those deemed unkind, can be an important mechanism behind this pattern. This should be compelling insofar as scientists from many fields, as well as many other authors, have forcefully argued that reciprocity constitutes a basic form of human motivation; see Mauss (1954), Goranson & Berkowitz (1966), Trivers (1971), and Akerlof (1982) for early influential work in anthropology, social psychology, biology, and economics, respectively, and Fehr & Gächter (2000) and Sobel (2005) for some critical surveys by economists. 5 Modern theory on reciprocity, including Rabin (1993), D&K, and Falk & Fischbacher (2006), uses tools of so-called psychological game theory (see Geanakoplos, Pearce & Stacchetti, 1989 for a pioneering contribution and Battigalli & Dufwenberg, 2020 for an overview).
Yet, even if we accept the description of the historical episodes, and the comparison of different societies, causation via varying levels of market integration and incentives to reciprocate does not follow from correlation. There are many potential mechanisms behind the historical patterns. For example, one could argue that increased prosociality, and decreased violence over time, largely result from income effects.

Falk & Szech (2013) pioneered the use of lab experiments for shedding light on the market-and-morals debate. They show that a smaller share of participants is willing to forsake money for preventing the death of a mouse when they are bargaining over the life of the mouse in double auction markets than when they are deciding individually. Their interpretation is that the market interactions undermine moral values. However, follow-up studies have called the robustness, and even the interpretation, of their finding into question, and provided alternative designs; see, e.g., Bartling et al. (2015, 2020) and Kirchler et al. (2016). For example, Kirchler et al. (2016) show that immoral behavior in the setting of Falk & Szech (2013) is robust to various nudges but can be reduced with monetary punishment, whereas Bartling et al. (2020) replicate the main treatment effect of Falk & Szech and include additional treatments, leading them to conclude that repeated play rather than market interaction seems to cause the erosion of moral values.

5 As regards older history, morality based on the idea of reciprocal justice, e.g., an eye for an eye, is very old as reflected, for instance, in the Hebrew Bible and the Quran. Fehr & Gächter (2000, p. 159) quote from The Edda from the 13th century that "A man ought to be a friend to his friend and repay gift with gift. People should meet smiles with smiles and lies with treachery." For more modern examples (though with an emphasis on negative reciprocity) from literature, film, business, as well as lab experiments, see Dufwenberg, Smith & Van Essen (2013, Section III).
Choi & Storr (2021) experimentally investigated whether there is an influence of market interactions on prosociality outside the market. The authors set up experimental goods markets in the first stage and trust games in the second stage of their experiment. They find that positive relationships based on previous market interactions, which are personal rather than impersonal in nature, lead to more trust (higher first-mover transfers in the trust game) and trigger trustworthiness (higher second-mover transfers in the trust game) compared with negative relationships based on previous market interactions. Although trust and reciprocity are undoubtedly related, we apply dictator games to determine whether markets trigger reciprocity directly and thus independently of trust.
As far as we know, our study is novel in theoretically exploring implications of reciprocity on prosocial behavior in different trading regimes, based on modern theory on reciprocity, as well as in providing corresponding direct experimental tests.

Theory
We consider three different societies (treatments) that reflect different degrees of commercial activity:
AUTARKY: a hunter-gatherer society where everyone is economically independent.
BARTER: a society where pairs of individuals meet and engage in bilateral exchange.
MARKET: a monetary economy, where elaborate trading cycles occur.
To arrive at simple special cases of these societies that are amenable to experimental testing, we assume hereafter that each of the three societies corresponds to a game form with four interacting players, labeled A, B, C, and D. Within each society, the framing of the game involves locks and keys, presented as potential carriers of value. Each participant is endowed with two unique keys (K) and two unique locks (L). The players (or participants) in all societies are informed that locks have values only when they are paired with the corresponding key, and vice versa, which follows intuition.
In AUTARKY, we mimic a society without any trade or exchange possibilities. In BARTER, bilateral exchange agreements are expected; and in MARKET each player is assumed to sell and buy keys in the induced market-like setting. When all decisions are made, the expectation is that each player will have the same endowment within as well as between societies. The details of the setup are as follows:

Game forms:
AUTARKY: Each player is endowed with two numbered "locks" and two numbered "keys." Let L1, L2, … be "lock #1," "lock #2," etc., with K1, K2, … defined analogously. A is given L1, L3, K1, K3; B is given L2, L4, K2, K4; C is given L5, L7, K5, K7; and D is given L6, L8, K6, K8. The players are told that each matching pair, i.e., (Ln, Kn') such that n = n', is worth 50 tokens and that unmatched locks or keys are worth nothing. Players make no choices and, thus, there are no trade or exchange possibilities. Since each of them already holds two matching pairs, each receives 100 tokens.
BARTER: Each player is endowed with two locks and two keys. A is given L1, L3, K1, K4; B is given L2, L4, K2, K3; C is given L5, L8, K5, K7; and D is given L6, L7, K6, K8. The players are told that each matching pair is worth 55 tokens and that unmatched locks or keys are worth nothing. The players are also told that to get a second matching pair (they already have one each), they may say Yes or No to a bilateral trade agreement with the player holding the key with the number that matches their own unmatched lock. In exchange for that key, they would give to that player the key with the number that matches that player's unmatched lock. Saying Yes costs a player 10 tokens, regardless of what the other players do. The real-world analogy might be a time or transportation cost for bringing goods or services to a market. A trade occurs if and only if both players involved in a trade say Yes. If all players say Yes, so that A trades with B and C trades with D, then each player will in total obtain 100 (= 55+55-10) tokens.
MARKET: Each player is given two locks and two keys, as follows: One is given L1, L5, K2, K3, another L2, L6, K4, K7, the third L3, L7, K1, K8, and the fourth L4, L8, K5, K6. Each player knows about the four key-lock packages and their own locks and keys, but not the distribution for the other players. 6 Players are told that each matching pair is worth 55 tokens and that unmatched locks or keys are worth nothing. Players are told that there is an opportunity to sell their keys and to buy keys that match their locks. Each such transaction involves a price of 15 tokens, paid from buyer to seller.
However, the number of transactions that will occur is decided as follows: Each player must make a single Yes-or-No choice regarding whether they are willing to sell (all) the keys they are endowed with as well as to buy (all) the keys that would match the locks they hold. Choosing Yes in this fashion costs 10 tokens (interpreted as in BARTER) regardless of what other players do. If a player says Yes, the numbers of sales and purchases this player subsequently will be involved in depends on what the other players choose. If all players say Yes, so that all feasible trades occur, then each player will in total obtain 100 (=55+55-15-15+15+15-10) tokens. Note that with this outcome, participants trade in a cycle mimicking the nature of a market economy. If it is not the case that all players say Yes, then some of their payoffs will be lower, with details depending on who chooses No (some calculations are presented below).

Reciprocity, maximal trade, and players' kindnesses:
Suppose the players are motivated by reciprocity; they desire to be kind to those deemed to be kind and unkind toward those deemed to be unkind, specifically, as in D&K's theory. We focus our analysis on the kindness of players in what we shall call a "maximal-trade outcome," meaning the strategy profile where all players choose Yes in BARTER and MARKET, and the automatically generated outcome (without trade!) in AUTARKY. Using D&K's theory, the maximal-trade outcome is an equilibrium in any of the three game forms described in this section. 7 In D&K's (as in Rabin's) theory, kindnesses can range from negative to positive, and while the former case breeds hostility, the latter breeds generosity. It turns out that in our game forms, and in a maximal-trade outcome, negative reciprocity is never an issue. Therefore, our analysis to follow will only concern how positive (or at least non-negative) kindness breeds generosity.
We have parameterized our game forms such that each player's material payoff will be 100 in a maximal-trade outcome. However, a player's "kindness" to others, a notion that is central in reciprocity theory, differs between the game forms. We will not describe all the details about D&K's theory here, but merely explain how to calculate players' kindnesses in our games. Namely, i's kindness to j in a maximal-trade outcome, labeled κij, equals half of the difference between what j gets with maximal-trade (=100) and what j would get if i did whatever is feasible to block trade. We calculate the kindness κij for each of our three game forms:
Explanation: In AUTARKY, maximal-trade involves no trade. There is nothing to block. Trivially, since i has no choice, there is no difference between what j gets with maximal-trade (=100) and what j would get if i did whatever is feasible to block trade (=100). We get κij = ½×(100-100) = 0.
To verify the calculations, consider first AB. Accordingly, in the all-choose-Yes equilibrium, we get AB = ½×(100-60) = 20. DA, BC, and CD are calculated analogously. The next four kindnesses (AD = BA = CB = DC = 7.5) concern a player who can deny another player a sale including a payment of 15, and hence reduce that player's income to 100-15 = 85 tokens. Kindness in these cases equals ½×(100-85) = 7.5 tokens. The final four kindnesses (AC = CA = BD = DB = 27.5) concern players interacting via both a sale and a purchase.
Thus, we can "sum up" the results of the two calculations just described, which results in a kindness equal to 20+7.5 = 27.5 tokens.
9 To visualize the interdependencies between the players in the all-choose-Yes equilibrium, the reader may find it helpful to draw a flowchart of indicated trades using arrows.
However, in MARKET, the players are not given information about co-players' identity, i.e., their ID (A, B, C, D), and hence cannot perform the just stated calculations and associate them with particular others. The reasonable way to calculate the kindness with respect to any other player, in an all-choose-Yes equilibrium, is to take expected values. Hence, perceived kindness is in equilibrium equal to the average kindness, so we get κij = (20+7.5+27.5)/3 = 18.33.
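The kindness values discussed in this subsection can be recomputed directly from the game forms. The following is a minimal sketch, not the authors' code: it assumes that a key changes hands if and only if both the buyer and the seller choose Yes, that the four MARKET packages belong to A, B, C, and D in the order listed (the text does not name their holders), and that AUTARKY can be represented with a Yes cost of zero. The names GAMES, payoff, kindness, and average_kindness are hypothetical.

```python
# Game forms: value of a matching pair, price per key, cost of choosing Yes,
# and endowments as player -> (locks, keys).
# ASSUMPTION: the MARKET packages are assigned to A, B, C, D in listed order.
GAMES = {
    "AUTARKY": {"pair": 50, "price": 0, "yes_cost": 0, "endow": {
        "A": ({1, 3}, {1, 3}), "B": ({2, 4}, {2, 4}),
        "C": ({5, 7}, {5, 7}), "D": ({6, 8}, {6, 8})}},
    "BARTER": {"pair": 55, "price": 0, "yes_cost": 10, "endow": {
        "A": ({1, 3}, {1, 4}), "B": ({2, 4}, {2, 3}),
        "C": ({5, 8}, {5, 7}), "D": ({6, 7}, {6, 8})}},
    "MARKET": {"pair": 55, "price": 15, "yes_cost": 10, "endow": {
        "A": ({1, 5}, {2, 3}), "B": ({2, 6}, {4, 7}),
        "C": ({3, 7}, {1, 8}), "D": ({4, 8}, {5, 6})}},
}
PLAYERS = "ABCD"


def _holder(endow, number, index):
    """Player holding lock `number` (index 0) or key `number` (index 1)."""
    return next(p for p, items in endow.items() if number in items[index])


def payoff(game, player, yes):
    """Tokens earned by `player` when exactly the players in `yes` choose Yes."""
    g = GAMES[game]
    endow = g["endow"]
    locks, keys = endow[player]
    active = player in yes
    pairs = purchases = sales = 0
    for n in locks:
        if n in keys:
            pairs += 1  # pair already held at the start
        elif active and _holder(endow, n, 1) in yes:
            pairs += 1  # missing key obtained from its holder
            purchases += 1
    for k in keys - locks:
        if active and _holder(endow, k, 0) in yes:
            sales += 1  # key handed to the player holding the matching lock
    return (g["pair"] * pairs + g["price"] * (sales - purchases)
            - g["yes_cost"] * active)


def kindness(game, i, j):
    """Half of what j loses, relative to maximal trade, if i blocks trade."""
    everyone = set(PLAYERS)
    return 0.5 * (payoff(game, j, everyone) - payoff(game, j, everyone - {i}))


def average_kindness(game, i):
    """Mean kindness of i toward the three other players."""
    return sum(kindness(game, i, j) for j in PLAYERS if j != i) / 3


if __name__ == "__main__":
    for game in GAMES:
        print(game, round(average_kindness(game, "A"), 3))
```

Under these assumptions the sketch reproduces the values in the text for MARKET (20, 7.5, and 27.5, averaging 18.33) and yields 27.5 toward the trading partner and 0 toward the others in BARTER, which averages to the 9.166 used in the predictions.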

Predictions for Stage 2:
Imagine that individuals in AUTARKY, BARTER, or MARKET unexpectedly run into someone toward whom, at some cost, they can be generous in Stage 2 of the experiment. Will they give anything, and if so, how much? We propose that the kindness generated in the preceding societal activity, whose magnitude depends only on the treatment allocation, may now, so to speak, "spill over." Namely, in the spirit of kindness-based reciprocity, if i runs into j, then the higher κji was in the preceding societal activity, the more i will give to j.
Specifically, envisage that the unexpected opportunity to be generous appears as a version of the dictator game. For testing purposes, we shall consider two varieties:
d-game-1: One individual (A, B, C, or D) per group is selected to be the dictator. This person receives 90 tokens and is asked to divide this amount between him- or herself and the other group members. The dictator must give the same amount to each of the others. Finally, whatever is given to another will be doubled (by the experimenter). 10
d-game-2: Works like d-game-1, except that the dictators can give individual amounts to the other group members. 11
Next, refer to Table 2, which summarizes the players' kindnesses, in each society, with maximal trade:
This is different from our setting where no players, at the time that they interact in an initial society (i.e., AUTARKY, BARTER, or MARKET), are aware of the dictator game to come. Nevertheless, we generate predictions based on the spirit of players reciprocating kindness by appeal to the intuitive principle that the kinder j has been to i, the more inclined i will subsequently be to give to j. More precisely, and with reference to Tables 2 and 3, we hypothesize as follows.
In words, average contributions in MARKET will be higher than average contributions in BARTER, which will be higher than average contributions in AUTARKY. The prediction is based on the inequalities 0 < 9.166 < 18.33 in Table 3. Thus, the key reason for the different theoretical predictions for MARKET, BARTER, and AUTARKY in Stage 2 arises from the higher average perceived kindness in equilibrium in MARKET compared with BARTER and AUTARKY. Intuitively, and ignoring again potential negative effects of markets on prosociality, higher average perceived kindness would generally be an expected feature also of a real-world market economy compared with an exchange economy or an autarky society. The reason for this is that markets simply provide more opportunities to be kind to each other through shared gains from trade, and many more people relate to each other in the process. Thus, if people were motivated by positive reciprocity, we would expect, ceteris paribus, a higher average level of prosocial behavior in market economies. 12
Let %[xi(t) > 0] be the percentage of strictly positive xi(t) choices. We will also test:
The motivation is in part analogous to that for H1.1-3. However, we now focus on the frequency with which participants give positive amounts rather than on how much they give. The justification relates to the intuitive idea that participants may be heterogeneous as regards whether and how much reciprocity matters to them. This may lead to a difference between the effect of reciprocity on how many participants give at all compared with the effect of reciprocity on the magnitude of contributions conditional on individuals' willingness to give a positive amount.
We get:
H3.1: zi(1) < zi(2)
Table 2, analogous reasoning motivates the following hypotheses concerning how participants will discriminate between their trading partner 13 and the others. Again, we distinguish between the amount participants give (H5) and the frequency with which participants give positive amounts (H6):

The Experiment
The 19 experimental sessions (including a pre-test) were carried out in the EconLab at the University of Innsbruck, in the summer and autumn of 2019. We recruited 524 student participants across all academic domains (faculties). We randomly and anonymously allocated participants to groups of four, i.e., everyone was assigned an ID letter (A, B, C, or D), and no further information about the other participants was given. The experiment proceeded in two stages. In Stage 1 of our experiment, we introduced the three between-subject treatments (AUTARKY, BARTER, and MARKET), where each group of four was randomly allocated to one of these treatments. The framing of the treatments and the possible actions to be taken were described in Section 3. Thus, each player in each treatment, reflecting the different societies, was told about the locks and keys as carriers of value and that locks have values only when they are paired with the corresponding key, and vice versa. The experiment was designed such that the expectation was that each player would have the same payoff within as well as between treatments after Stage 1 to rule out possible income effects in the upcoming dictator game.
The second stage involved the two dictator games (d-game-1 and d-game-2) within each treatment and group, corresponding to the ones described in Section 3. We used a variation of the strategy method such that we elicited all participants' behavior should they become the dictator in each of the two versions. The benefit of this design was that we generated 2×4 = 8 times as many observations as we would have obtained had we selected one version and one designated dictator per society a priori.
The reason we implemented only the decision of one individual in each group and why we had no revelation of non-randomly selected individuals' decisions was to maintain the spirit, as far as possible, of a dictator game, where the co-players (receivers) of the dictator are inactive. The reason we allocated 90 tokens (rather than, e.g., 100 tokens) is that 90 is divisible by 3, so it is easy to give it all away in equal amounts while sticking to integers. Recall also that the amounts given by the dictators were doubled and that the dictators kept the remainder of what they did not send to the other group members.
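The Stage 2 payoff rule can be written down in a few lines. This is a minimal sketch under stated assumptions: the function names d_game_payoffs and d_game_1 are hypothetical, gifts are treated as non-negative integers, and only Stage 2 tokens are computed (how Stage 1 earnings and the show-up fee enter the final payout is not modeled here).

```python
ENDOWMENT = 90   # tokens given to the dictator
MULTIPLIER = 2   # each token given is doubled by the experimenter


def d_game_payoffs(gifts):
    """d-game-2 rule: `gifts` maps each receiver to the tokens given.

    Returns (tokens kept by the dictator, tokens received per receiver).
    """
    total = sum(gifts.values())
    if any(g < 0 for g in gifts.values()) or total > ENDOWMENT:
        raise ValueError("gifts must be non-negative and sum to at most 90")
    received = {r: MULTIPLIER * g for r, g in gifts.items()}
    return ENDOWMENT - total, received


def d_game_1(gift_each, receivers=("B", "C", "D")):
    """d-game-1 rule: the same amount goes to each of the three others."""
    return d_game_payoffs({r: gift_each for r in receivers})


if __name__ == "__main__":
    print(d_game_1(30))                               # giving everything away
    print(d_game_payoffs({"B": 20, "C": 0, "D": 5}))  # unequal gifts
```

Because 90 is divisible by 3, d_game_1(30) leaves the dictator with nothing and each receiver with 60 tokens, which illustrates the integer-friendly choice of endowment described above.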
In each session, all three treatments were run simultaneously. We programmed the experiment using z-Tree [3.6.7]. Moreover, we implemented a show-up fee of 4 EURO, protecting participants from making negative payoffs overall regardless of their choices. We set the tokens-to-EURO exchange rate to 15:1. The average duration of the experiment was 10.53 minutes (SD 1.47 minutes) and the average payout (including show-up fee) was 12.89 EURO (SD 1.84 EURO).

Stage 1 and Descriptive Overview of Results in Stage 2
A prerequisite for the experimental examination of the predictions (based on reciprocity theory established in Section 3) is a design in Stage 1 of the experiment that ensures that almost all groups in BARTER and MARKET arrive at the Pareto-optimal "all-choose-Yes" equilibrium. This means that the groups exclusively consist of participants who have agreed to exchange/trade. Therefore, before we present the treatment results, we check whether the design applied in Stage 1 of the experiment meets this requirement.
Of the 348 participants in the BARTER and MARKET treatments, only eight (1.72%) held fewer than two matching key-lock pairs at the end of Stage 1. Most participants thus agreed with the trade agreements, which means that the design worked as intended. Consequently, for BARTER and MARKET, we only include the groups that have arrived at the "all-choose-Yes" equilibrium to test the predictions developed in Section 3. In doing so, we also ensure that there are no differences in participants' income prior to Stage 2, which rules out confounding income effects in the dictator game. Therefore, we arrive at a total of 516 participants (distributed as 176, 168, and 172 participants between the AUTARKY, BARTER, and MARKET treatments, respectively) for the econometric tests of the hypotheses. 14 We follow Benjamin et al. (2018) and apply a 5% and a 0.5% significance level in all statistical tests in the paper.
First, we present Table 4, which shows descriptive statistics on contributions for all three treatments. Recall that in d-game-1 [d-game-2], the contributions to others must [need not] be equal; Table 1 reports data concerning the contributions to each of the other three group members. Note that Figure A1 in Section A1 in the Appendix shows a heatmap of contributions in d-game-1 and d-game-2 (Spearman's rho = 0.76, p < 0.005, N = 516). We find that behavior is consistent across the two versions of the dictator game.

Result 1: Average contributions by dictators in the MARKET treatment are significantly higher than average contributions by dictators in AUTARKY and BARTER in d-game-1 and d-game-2. Nevertheless, we find no difference between contributions in dictator decisions in AUTARKY and BARTER under both versions of the dictator game.
Support: We test the hypotheses derived from reciprocity theory in Section 3 and start with hypotheses H1 to H4. In Section 3, we established the predictions that the average contributions in d-game-1 and the aggregated average contributions in d-game-2 will be highest in the dictator game in MARKET, followed by BARTER, and finally AUTARKY (H1 and H3). Figure 1 visually compares the average contributions under both versions of the dictator game between the three treatments. The visual impression suggests that there is no difference between the average contributions in AUTARKY and BARTER, but that average contributions in the MARKET treatment appear to be higher than in both other treatments.
Figure 1: Average contributions under both versions of the dictator game, by treatment. In AUTARKY, no interaction between agents takes place. In BARTER agents face a bilateral exchange setting, and in MARKET agents face a multilateral market setting. The whiskers indicate the 95% confidence intervals.
First, we statistically test for treatment differences in d-game-1; compare hypotheses H1.1-H1.3. We apply non-parametric, pairwise Mann-Whitney-U tests and report the results in Table 5 (we refer to one-sided p-values). Specifically, we do not find statistically significantly higher dictator contributions in BARTER than in AUTARKY. Nevertheless, participants who engaged in market interactions in Stage 1 of the experiment in MARKET contribute statistically significantly higher amounts to the other group members than participants in AUTARKY and BARTER, respectively. Considering the magnitude of these effects, we find that they are not only statistically but also economically significant. Table 4 shows that participants in MARKET and d-game-1 contribute on average 2.60 (2.63) more tokens to the other group members than participants in AUTARKY (BARTER). This corresponds to 21.35% (21.64%) higher contributions in the MARKET treatment than in AUTARKY (BARTER) and is therefore sizeable.
Furthermore, as a robustness check, we re-test H1 by applying OLS regressions. The results, which are reported in Table A1 in Section A1 in the Appendix, remain qualitatively robust. We also apply a multivariate model controlling for gender and political preferences (see footnote 15). We find that participants who self-report being more right-wing contribute lower amounts in the dictator game. Additionally, we find that female participants are more likely than male participants to make positive contributions. If we compare the magnitudes of the statistically significant effects on contributions in the dictator game between the variable representing political attitudes (-1.258) and, for example, the treatment dummy MARKET in Table A1, we find that the treatment effect is about 82% larger in absolute terms than the effect of political attitudes.
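To make the pairwise test concrete, the following Python sketch computes a one-sided Mann-Whitney-U test using the normal approximation. It is an illustration only, not the code used for the analyses in this paper: the tie correction, the omission of a continuity correction, the function names, and the two contribution vectors are assumptions introduced here.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def mann_whitney_greater(x, y):
    """One-sided test of the alternative 'x tends to be larger than y'.

    Returns (U statistic of x, z score, one-sided p-value) based on the
    normal approximation with tie correction (assumed conventions).
    """
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical: no evidence either way
        return u1, 0.0, 0.5
    z = (u1 - n1 * n2 / 2) / sqrt(var_u)
    return u1, z, 1 - NormalDist().cdf(z)


# Hypothetical token contributions, for illustration only (not experimental data).
market_example = [15, 12, 20, 10, 18, 14]
autarky_example = [10, 12, 8, 15, 0, 11]
u_stat, z_score, p_one_sided = mann_whitney_greater(market_example, autarky_example)
print(u_stat, round(z_score, 3), round(p_one_sided, 3))
```

The one-sided form mirrors the directional hypotheses (for example, higher contributions in MARKET than in AUTARKY); the resulting p-value would then be compared with the 5% and 0.5% thresholds.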
Based on these results, we can only support parts of hypothesis H1. In contrast to the predictions, compared with AUTARKY, we only find higher contributions in MARKET, and not in BARTER. Furthermore, to examine H2, we test for pairwise differences regarding the share of positive contributions between treatments in d-game-1; compare hypotheses H2.1-H2.3 in Table 5. Specifically, we expect the share of positive contributions in the dictator game to be highest in MARKET, followed by BARTER and finally AUTARKY. Therefore, we apply pairwise Fisher's exact tests and report the one-sided p-values in the second half of Table 5.
Footnote 15: We checked randomization to ensure that the randomization procedure worked and that there are no differences in the distribution of the personal characteristics we collected (gender and political preferences). The results are presented in Table A7 in Section A1 of the Appendix. We observe that the distributions of the variables are not always statistically indistinguishable between treatments. Therefore, we proceed cautiously and, as a robustness check, retest all hypotheses in the paper using multivariate regression analyses.
Table 5: Non-parametric test statistics for treatment differences in the contributions in d-game-1 (equal contributions to all group members). In AUTARKY (1), no interaction between agents takes place. In BARTER (2) agents face a bilateral exchange setting, and in MARKET (3) agents face a multilateral market setting. To test our unidirectional hypotheses, we refer to one-sided p-values from normal approximation in the Mann-Whitney-U tests. For Fisher's exact test, we report one-sided p-values. H1.1 to H2.3 indicate the hypotheses on contributions in d-game-1 across treatments made in Section 3. A, B, C, and D indicate the IDs of the four group members and x_i the equal amount participant i contributed to each of the other three group members.

We do not find supporting evidence for hypothesis H2, as we do not find any statistically significant difference in the share of positive contributions between any of the three treatments.
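For readers less familiar with the test on shares, the sketch below computes a one-sided Fisher's exact p-value for a 2x2 table of positive versus zero contributions in two treatments. The function name and the counts are hypothetical and serve only to illustrate the mechanics; they are not the experimental data.

```python
from math import comb


def fisher_exact_greater(pos_a, zero_a, pos_b, zero_b):
    """One-sided Fisher's exact test for a 2x2 table.

    Rows are two treatments, columns are (positive, zero) contributions.
    Returns the probability, under the hypergeometric null with fixed
    margins, of observing at least pos_a positive contributions in the
    first treatment.
    """
    row_a = pos_a + zero_a
    row_b = pos_b + zero_b
    total_pos = pos_a + pos_b
    denom = comb(row_a + row_b, total_pos)
    upper = min(row_a, total_pos)
    tail = sum(
        comb(row_a, k) * comb(row_b, total_pos - k)
        for k in range(pos_a, upper + 1)
    )
    return tail / denom


# Hypothetical counts, for illustration only (not experimental data):
# 150 of 172 dictators give a positive amount in one treatment,
# 148 of 176 in the other.
p_share = fisher_exact_greater(150, 22, 148, 28)
print(round(p_share, 3))
```

A large one-sided p-value in such a comparison would correspond to the pattern reported for H2: the share of dictators giving anything at all does not differ detectably between treatments, even where the amounts given do.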
Furthermore, as a robustness check, we re-test H2 with the regressions reported in Tables A3 and A4 (see Section A1 in the Appendix). Again, the results remain qualitatively robust. From an experimental perspective, these treatment comparisons must be interpreted cautiously, as treatments differ in more than one aspect, making causal inference difficult. Nevertheless, we consider the analyses informative, as they represent a direct test of the theoretical predictions.
Next, we proceed by econometrically examining hypotheses H3 and H4, which deal with the results in d-game-2, where participants could send differing amounts to the three other group members. Specifically, in Section 3 we established the same predictions for d-game-2 as for d-game-1. Therefore, we replicate the analyses reported in Table 5 for average contributions in d-game-2 and show the results in Table 6. Similar to the results for d-game-1 and the associated hypotheses H2.1-H2.3, we do not find a statistically significantly higher share of positive contributions in MARKET than in BARTER or AUTARKY, nor do we find a higher share in BARTER than in AUTARKY. We also apply regression models as robustness checks for H3 (Table A2 with univariate and multivariate OLS regressions) and H4 (Table A4 with univariate and multivariate Logit regressions) in Appendix A1. We find that the results remain qualitatively robust. This suggests that reciprocity concerns influence the magnitude of contributions, conditional on the willingness to give a positive amount, but not the willingness to give any amount per se.
Table 6: Non-parametric test statistics for treatment differences in the contributions in d-game-2 (deviating amounts to all group members were possible). In AUTARKY (1), no interaction between agents takes place. In BARTER (2) agents face a bilateral exchange setting, and in MARKET (3) agents face a multilateral market setting. To test our unidirectional hypotheses, we refer to one-sided p-values from normal approximation in the Mann-Whitney-U tests. For Fisher's exact test, we report one-sided p-values. H3.1 to H4.3 indicate the hypotheses on contributions in d-game-2 across treatments made in Section 3. A, B, C, and D indicate the IDs of the four group members and z_i the average amount participant i contributed to each of the other three group members.
In the last step, we examine the BARTER treatment in more detail and test for a causal effect of reciprocity in Stage 1 on prosociality in our dictator games. To determine this effect, each participant in the group was informed of the decisions and outcomes of all group members to eliminate a possible confounding factor due to information differences. We apply Wilcoxon signed-rank tests to statistically test for differences and show the results in Table 7 (we refer to one-sided p-values). We infer that the contributions to former exchange partners are statistically significantly higher than contributions to former non-exchange partners (see hypothesis H5 in Table 7). This means that participants share more with group members with whom they engaged in a barter in Stage 1 of the experiment than with group members with whom they did not. This result supports hypothesis H5 and suggests that participants in our experiment gain utility by reciprocating intended kind behavior (gains through trade) of matched participants in Stage 1 by contributing more to these participants in the dictator games. Table A5 in Section A1 of the Appendix shows results of univariate and multivariate Logit regressions with contributions in the BARTER treatment and d-game-2 as the dependent variable. Again, the results are qualitatively in line with the non-parametric tests in Table 7.
In Table 7, y_ij(2) indicates average contributions to former exchange partners in BARTER, while y_i1(2) represents average transfers to the first non-exchange partner and y_i2(2) to the second non-exchange partner in BARTER.
Overall, we find that positive contributions by dictators are statistically significantly more frequent in dictator decisions where participants were matched with group members who were exchange partners in Stage 1 than in decisions where they were matched with group members who were not.
This is in line with hypothesis H5 and further supports the notion that reciprocity concerns do matter for participants. It further suggests that reciprocity affects not only the magnitude of giving but also the willingness to give any positive amount. In Table A6 in Section A1 of the Appendix, we show results of univariate and multivariate OLS regressions with a dummy variable that equals 1 for positive and 0 for non-positive contributions in BARTER and d-game-2 as the dependent variable. We find qualitatively similar results to the non-parametric tests in Table 7, but the effects remain borderline insignificant.
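The within-subject comparison for H5 can be illustrated with a one-sided Wilcoxon signed-rank test on paired contributions, where each BARTER participant contributes one pair: the amount given to the former exchange partner and the amount given to a former non-exchange partner. The sketch below is an illustration under stated assumptions, not the code used in the paper: zero differences are dropped, ties receive average ranks, the normal approximation is used without continuity correction, and the data and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def wilcoxon_signed_rank_greater(first, second):
    """One-sided test of the alternative 'first tends to exceed second'.

    Returns (sum of positive ranks, z score, one-sided p-value).
    """
    diffs = [a - b for a, b in zip(first, second) if a != b]
    n = len(diffs)
    if n == 0:  # no non-zero differences: no evidence either way
        return 0.0, 0.0, 0.5
    abs_diffs = [abs(d) for d in diffs]
    ranks = average_ranks(abs_diffs)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    counts = {}
    for v in abs_diffs:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - n * (n + 1) / 4) / sqrt(var_w)
    return w_plus, z, 1 - NormalDist().cdf(z)


# Hypothetical paired contributions, for illustration only (not experimental data).
to_exchange_partner = [20, 15, 10, 18, 12, 25]
to_non_exchange_partner = [15, 15, 8, 10, 12, 20]
w_stat, z_score, p_one_sided = wilcoxon_signed_rank_greater(
    to_exchange_partner, to_non_exchange_partner
)
print(w_stat, round(z_score, 3), round(p_one_sided, 3))
```

Because every participant serves as their own control, this paired design holds individual generosity fixed and isolates the difference between giving to exchange partners and giving to non-exchange partners.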

Conclusion
In this paper, we contributed to the old question of whether and how market interactions influence moral behavior. We approached this issue both theoretically and experimentally. We introduced three market institutions: AUTARKY, where no interaction between participants took place; BARTER, where participants faced a bilateral exchange setting; and MARKET, where decision-makers faced a multilateral market setting. In our theoretical contribution, we built on the D&K reciprocity theory to derive predictions about the effect of market interaction on subsequent prosociality.
We first showed theoretically that if people are motivated by reciprocity, then whether people are prosocial depends on the structure of preceding trade and on whether we consider a trading partner or someone else. Under AUTARKY, people will not be inclined to be kind to others. In MARKET, reflecting a modern economy where all individuals trade with each other (via chains of exchange mediated by monetary payments), people will generally tend to be kind to others. In the intermediate case, BARTER, where there is 1-on-1 exchange between some individuals but not others, people will be inclined to be kind to their trading partners but not to others. Our theoretical insights harmonize well with some prominent thoughts about key transitions that occurred through economic history mentioned earlier, such as those provided by Pinker (2011) and McCloskey (2006, 2010), and they are also consistent with cross-cultural experimental findings of Henrich et al. (2001, 2004, 2005).
Yet, none of these studies could of course identify causal links and there are several potential mechanisms behind the observed patterns, e.g., related to the rapid income increase resulting from the development of market economies.
Therefore, we also conducted a lab experiment in which income is held fixed to provide simple tests of the derived theoretical hypotheses. We found, in line with the theoretical predictions, higher prosociality following market interactions than following barter interactions or in the autarky setting. We showed that dictator contributions in the market setting (MARKET) were significantly higher than those in AUTARKY and BARTER in d-game-1 (where contributions had to be the same to each other player) and in AUTARKY in d-game-2 (where differing contributions were possible). In contrast to our predictions, we did not find any differences in prosocial behavior between participants in AUTARKY and BARTER. Nevertheless, we found that people gave significantly more to exchange partners than to non-exchange partners in BARTER, also in line with the theoretical predictions.
All in all, our theory seems to stand up fairly well to our experimental tests. The tests support some, but not all, of the positions adopted by the philosophers and other thinkers we cited in Section 2. Our experimental results also harmonize with those of Choi & Storr (2021), but in our case independently of trust. As discussed in the introduction, we do not claim that our theory and experiment reflect all relevant aspects of how market interactions affect individual prosociality. Our paper merely offers a complementary and novel way to think about those positions and patterns, where market interaction may make people behave more prosocially because of their inclination to reciprocate. Correspondingly, historical episodes that involve the use of markets may then promote prosocial choices outside those markets. We encourage future theoretical as well as experimental research on how different mechanisms of market interaction directly affect prosociality, including instances when trade may make people view each other as unkind (e.g., in situations where fraud or embezzlement occurs), thereby increasing external validity. Another extension would concern cases where the market interaction is based on agents with large differences in initial endowments, and where the gains from trade are highly inequitable. We plan to return to these issues in future research.
Table A8: Non-parametric test statistics for treatment differences in the contributions in d-game-1 (equal contributions to all group members) with the full sample. In AUTARKY (1), no interaction between agents takes place. In BARTER (2) agents face a bilateral exchange setting, and in MARKET (3) agents face a multilateral market setting. To test our unidirectional hypotheses, we refer to one-sided p-values from normal approximation in the Mann-Whitney-U tests. For Fisher's exact test, we report one-sided p-values. H1.1 to H2.3 indicate the hypotheses on contributions in d-game-1 across treatments made in Section 3. A, B, C, and D indicate the IDs of the four group members and x_i indicates the equal amount participant i contributed to each of the other three group members.
Table A9: Non-parametric test statistics for treatment differences in the contributions in d-game-2 (deviating amounts to all group members were possible) with the full sample. In AUTARKY (1), no interaction between agents takes place. In BARTER (2) agents face a bilateral exchange setting, and in MARKET (3) agents face a multilateral market setting. To test our unidirectional hypotheses, we refer to one-sided p-values from normal approximation in the Mann-Whitney-U tests. For Fisher's exact test, we report one-sided p-values. H3.1 to H4.3 indicate the hypotheses on contributions in d-game-2 across treatments made in Section 3. A, B, C, and D indicate the IDs of the four group members and z_i indicates the average amount participant i contributed to each of the other three group members.