Learning in the WTO/DDA Negotiations?: An Experimental Study

The purpose of this paper is to identify learning in games in experimental economic settings and to apply the results to real multilateral trade negotiations, such as the Doha Development Agenda (DDA) in the World Trade Organization (WTO). This paper argues that the structure of games including a veto player (Veto games) resembles the WTO/DDA negotiations in that the players do not possess identical power. The paper's main contribution to the literature is to show that, in Veto games, learning about power dominates learning from simple repetition. Additionally, it shows that players in Veto games are concerned about how much they gained in previous games, although their memories generally do not last beyond the next game, and that they tend to be more selfish the smaller their shares. Based on these results, the distribution of benefits could be made more generous by allowing players without veto power to retain special rights so that they are not totally powerless. The paper also shows the necessity of having "respite" in the process of negotiations and suggests policy options for choosing partners for winning coalitions.

This paper argues that the DDA negotiations are, in reality, Veto games, which include a veto player; thus, not all participants have equal rights in the process of the negotiations, and the inequality between players is (mostly) generically determined. This paper shows that learning from simple repetition, meaning learning about the rules of the games or their expected consequences, is observed in bargaining games among identical players in the Control games; however, it finds a different type of learning in games with idiosyncratic players, such as the Veto games, which dominates learning from simple repetition. In addition, this paper shows short-memory dependency in the Veto games, which is not clearly identified in the Control games with no veto players. Based on the results, it suggests policy implications for dealing with problems in the DDA negotiations.
There are several previous studies on bargaining and learning in the experimental economics literature. With respect to bargaining, Frechette, Kagel, and Lehrer (2003) and Frechette, Kagel, and Morelli (2005a, 2005b, 2005c) discussed legislative bargaining based on Baron and Ferejohn (1989), demand bargaining based on Morelli (1999), and weighted voting based on Gamson (1961). Those studies mostly show qualitative similarity between theory and experiments by demonstrating proposal power, but in their details there are differences on which theory may be silent. Kagel, Sung, and Winter (2010) and Sung (2012) discussed Veto games based on Baron and Ferejohn (1989) and Winter (1996), and showed the strength of veto power and proposal power and how to restrain these powers. In particular, comparing the veto right with multiple votes, Sung (2012) showed that veto power is superior to multiple votes in bargaining. This paper is similar to Sung (2012) in that both deal with veto power and apply the results to the DDA negotiations; however, this paper observes the power of veto from the perspective of learning by analyzing the players' proposing data.
Learning has been an important topic in experimental and behavioral economics because it is considered the link between economic theories and applications, where experimental and behavioral economics play a crucial role. As for the literature on learning, Erev (1995, 1998) and Camerer and Ho (1999) suggested models for learning in game settings. Using the findings of Camerer and Ho (1999), Frechette (2009) argued that learning can be affected by long and short memory, tested legislative bargaining games experimentally, and suggested methods to fix the problem of the variance-covariance matrix. Whereas Frechette (2009) tested the efficiency of estimators of learning from Camerer and Ho (1999), this paper focuses on veto power in the learning of games and suggests policy implications for real trade negotiations such as the DDA.
The rest of the paper is organized as follows. Section 2 introduces a theory of multilateral bargaining. Section 3 explains the experimental designs of the games. Section 4 reports the experimental results, and the policy implications of these results are discussed in Section 5. Section 6 states the conclusions.

II. THEORY
The multilateral bargaining theories in this paper are from Baron and Ferejohn (1989) and Winter (1996), which discussed "divide the dollar" games. For the multilateral setting of this paper, I assumed three players playing infinitely repeated stage games, which discounts the available total benefit, as the game moves via stages. While all three players have identical power in Control games, consistent with Baron and Ferejohn (1989), one out of the three players in the Veto games from Winter (1996) is a veto player who can defeat the proposals that it finds unsatisfactory, whereas the other two players are non-veto players who have no veto power although their voting right is same as the veto players.
At the first stage, player i is selected at random from the three players as the proposer of the distribution of the benefit, where a proposal is an allocation (x_1, x_2, x_3) of the single unit of benefit among the three players, i.e., x_i ≥ 0 and ∑_i x_i = 1. The proposal is then voted up or down by all players. If the proposal is accepted and a winning coalition is formed, it passes and the game is over, with each player granted the proposed payoff. In Control games, a winning coalition can be formed by the approval of at least two of the three players; in Veto games, a winning coalition is a majority that includes the veto player. If the proposal does not gain majority approval, in other words, if it is rejected, the game moves on to the second stage with the benefit discounted. At the second stage, a proposer is again selected at random, and the process repeats until a winning coalition is formed. In the end, player i receives the payoff x_i δ^(t-1), where δ is the common discount factor, if the game ends at stage t. I adopted the stationary subgame perfect equilibrium (hereinafter, SSPE) of the game as the theoretical benchmark; hence, two basic predictions of the SSPE are examined: the shares proposed by players and the length of the game. For Veto games, the ex-ante expected payoffs of the players in an SSPE must satisfy the following Eq. 1 and Eq. 2:

v_V = (1/3)(1 − δv_N) + (2/3)δv_V    (1)

v_N = (1/3)(1 − δv_V) + (1/6)δv_N    (2)

where v_V is the payoff of the veto player, v_N is the payoff of a non-veto player, and δ is the discount factor. Eq. 1 reflects that the veto player proposes with probability 1/3, buying one non-veto partner at its continuation value δv_N, and is otherwise included in every winning coalition at δv_V; Eq. 2 reflects that a non-veto player proposes with probability 1/3, buying the veto player at δv_V, and is chosen as the veto proposer's partner with probability 1/6. Solving gives v_V = (2 − δ)/(6 − 5δ) and v_N = 2(1 − δ)/(6 − 5δ). The theory also predicts that the game is over in the first stage; thus, as the ex-post expected payoff, a veto proposer takes 1 − δv_N and a non-veto proposer gets 1 − δv_V. For Control games, the ex-ante expected payoff of each player is 1/3, and as the ex-post expected payoff, the proposer gets 1 − δ/3.

ⓒ 2015 Journal of East Asian Economic Integration
More importantly for this paper, the theory predicts minimal winning coalitions (MWCs), which consist of only two of the three players.
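The SSPE benchmark can be computed directly. The following is a minimal sketch (function names are illustrative, not from the paper) that solves the two fixed-point conditions of the Veto-game SSPE in closed form and reports the predicted proposer shares:

```python
from fractions import Fraction

def veto_sspe(delta):
    """Ex-ante SSPE payoffs in the three-player Veto game.

    Solves the fixed-point conditions
        v = (1/3)(1 - d*n) + (2/3)*d*v   # veto player
        n = (1/3)(1 - d*v) + (1/6)*d*n   # each non-veto player
    whose closed-form solution is v = (2-d)/(6-5d), n = 2(1-d)/(6-5d).
    """
    d = Fraction(delta).limit_denominator(1000)
    v = (2 - d) / (6 - 5 * d)
    n = 2 * (1 - d) / (6 - 5 * d)
    return v, n

def proposer_shares(delta):
    """Ex-post shares kept by each type of stage-1 proposer."""
    d = Fraction(delta).limit_denominator(1000)
    v, n = veto_sspe(delta)
    # A proposer offers a coalition partner exactly its continuation value.
    return 1 - d * n, 1 - d * v   # (veto proposer, non-veto proposer)

v, n = veto_sspe(Fraction(1, 2))
print(v, n, v + 2 * n)   # 3/7 2/7 1
```

For δ = 0.95 the veto player's ex-ante share rises to 21/25 = 0.84, illustrating how a patient veto player captures almost the whole benefit.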

III. EXPERIMENTAL DESIGNS
Subjects were recruited through e-mail solicitations from a set of students enrolled in undergraduate economics classes at The Ohio State University during the then-current and previous academic quarters. Three subjects in a group had to divide $30 among themselves over a total of 10 games. Between 12 and 18 subjects were recruited for each experimental session, so that between 4 and 6 groups bargained simultaneously in each session. After each game was over, subjects were randomly rearranged into different groups, with the restriction that in the Veto sessions each group contained a single veto player. Veto players were selected randomly at the beginning of each Veto session, with their roles as veto players remaining fixed throughout the session.
The procedures for each game were as follows. Departing from theories, all subjects first entered a proposal on how to allocate the $30 among the three subjects in their group. Then, one proposal was picked randomly to be the standing one. 5 This proposal was posted on the subjects' screens, giving the amounts allocated to each player by its subject number. If the proposal was accepted, then the proposed payoff was implemented and the game ended. But, if it was rejected, then the process repeated itself, with the amount of money available reduced by the relevant discount factor. Complete voting results were posted on the subjects' screens, giving the amount allocated by subject number, whether that subject voted for or against the proposal, and whether the proposal passed or not. 6 In Veto sessions, the veto players were clearly distinguished on everyone's computer screen throughout the entire session.
For each treatment, there were two inexperienced subject sessions and one experienced subject session. Experienced subjects all had prior experience with exactly the same treatment for which they were recruited. 7 However, since not everyone chose or was able to return, this study could not attempt to hold the subject type constant between the inexperienced and experienced subject sessions. 8 A total of 10 games were held in each experimental session, with one of the games, selected at random, determining the subjects' payoffs. In addition, each subject received a participation fee of $8. For sessions with inexperienced subjects, these cash games were preceded by a bargaining round in which subjects were "walked through" the contingencies resulting from either rejecting or accepting an offer. The inexperienced subject sessions lasted approximately 1.5 hours; the experienced subject sessions lasted approximately 1 hour, as summary instructions were employed and the subjects were familiar with the tasks. Although each game could potentially last indefinitely, there was never any need for intervention by the experimenters to ensure completing a session within the maximum time frame (2 hours) for which the subjects were recruited.

Footnotes:
5. Since all proposals have an identical probability of being selected, the experimental setting has virtually no differences from the theory.
6. Screens also displayed the proposed shares and votes for the last three games, as well as the proposed shares and votes for up to the past three stages of the current game.
7. All subjects were invited back for the experienced subject sessions. In case an uneven number of subjects returned, we randomly determined who would be sent home. The experiments were executed in 2003.
8. Thus, in the experienced sessions, a subject who had been a veto player as an inexperienced subject could be a non-veto player, and vice versa.
Table 1 lists the number of sessions and the number of subjects in each treatment condition, as well as the theoretical predictions of the shares the players take.

IV. EXPERIMENTAL RESULTS

Random effects models are estimated to answer the questions below, which is a typical way to assess the behavior of proposers or voters in this kind of panel analysis. The experimental results are reported by answering the following questions. 1) As players repeated the games, did they become more likely to propose MWCs? Was there any learning from the repetition? 2) Did players care about the shares they acquired in the previous game when they proposed the allocation of benefits (i.e., learning from a previous outcome)? In addition, did they depend on short memory or long memory?
The frequencies with which players proposed MWCs increased as they repeated the game, mostly through players proposing larger shares to themselves (as can be seen in Figs. 1, 2, and 3). In particular, the veto players proposed MWCs at higher frequencies. This is also confirmed by the following regressions.
Equations 3 and 4 are random effects probit models whose dependent variable MWC_it takes the value 1 if the proposal is an MWC, and 0 otherwise. In theory, proposing MWCs should be most preferred, but it is not desirable for communities in that it excludes a player. Thus, analyzing MWCs can provide policy implications for reducing such selfish distributions, as well as insight into the players' learning about theory and reality.
This paper considers two estimation approaches, a probit model and the two-step method of Rivers and Vuong (1988), because of concerns about endogeneity between the proposers' shares and MWCs. It turns out that the equations for the Veto sessions are free from an endogeneity problem, whereas those for the Control sessions are better fitted with the two-step method of Rivers and Vuong (1988). Therefore, Equations 3 and 4 are used for the Veto and Control games, respectively. 9 In Equations 3 and 4, A_V, B_V, A_C, and B_C are vectors of coefficients for the "share" variables and the "learning" variables; the remaining terms are unobserved individual components and idiosyncratic errors. 10

Footnotes:
9. Other results are available in Appendix A.
10. The Residuals_C,it in Eq. 4 are obtained from the first-step regression residuals e_i + u_it. The detailed equations are reported in Appendix B.
The explanations of the variables used in the analyses are provided in Table 2. The "share" variables explain the dependent variable more than the other variables and are of greater interest to the players, because obtaining a greater share means a larger prize. In the Veto sessions, the share variables are as follows: DV_it, which is 1 if the player is a veto player, and 0 otherwise; PS_it, which is the share player i proposes for itself at time t; the interaction term between DV_it and PS_it; and Urg_it, which is 1 if the discount factor in the experiment is 0.5, and 0 if it is 0.95. The "learning" variables explain the learning behavior of players when they make a proposal. The equations include two types of learning variables: learning from repetition and learning from a previous outcome. The learning variables from repetition are, for both the Veto and Control sessions, Dj, which is 1 if j = T, and 0 otherwise, where T indexes games 3 through 10, and Exp_it, which is 1 if the player is an experienced subject, and 0 otherwise. Thus, Dj and Exp_it represent learning from repetition within sessions and between sessions, respectively. The learning variables from a previous outcome are ShareT-1_it, the share the player acquired in the immediately preceding game; DShareT-1_50_it, which is 1 if the player gained 50% or more of the share in the previous game, and 0 otherwise; DShareT-1_0_it, which is 1 if the player gained a zero share in the previous game, and 0 otherwise; and ShareT-3_it and ShareT-5_it, the shares the player gained three and five games earlier, respectively. The results of the random effects probit models for the Veto and Control sessions are shown in Tables 3 and 4. For each session, five types of subequations are estimated.
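Assuming the data form a subject-by-game panel (the column names below are illustrative), the previous-outcome variables described above can be constructed with grouped lags, for example in pandas:

```python
import pandas as pd

# Toy panel: two subjects, four games each.
df = pd.DataFrame({
    "subject": [1, 1, 1, 1, 2, 2, 2, 2],
    "game":    [1, 2, 3, 4, 1, 2, 3, 4],
    "share":   [0.0, 0.6, 0.3, 0.5, 0.5, 0.0, 0.2, 0.9],
})
df = df.sort_values(["subject", "game"])
g = df.groupby("subject")["share"]

df["ShareT_1"] = g.shift(1)                                # share one game back
df["DShareT_1_50"] = (df["ShareT_1"] >= 0.5).astype(int)   # got >= 50% last game
df["DShareT_1_0"] = (df["ShareT_1"] == 0).astype(int)      # got zero last game
df["ShareT_3"] = g.shift(3)                                # share three games back
# Lags are NaN for the first games of each subject and are dropped in estimation.
```

Grouping by subject before shifting keeps lags from leaking across subjects when the panel is stacked.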
In Table 3, the estimates of the coefficients of DV, PS, and their interaction terms are mostly statistically significant at better than the 1% level. This implies that veto players are more likely to propose MWCs, other things being equal. 12 For players in the Control sessions, PS is positive and statistically significant at better than the 1% level under both approaches, as shown in Table 4. 13 However, Urg_it is not statistically significant at any conventional level in either the Veto or the Control sessions.
Looking at the learning variables, there are clear differences in learning from repetition between the Veto and Control sessions, as shown in Tables 3 and 4. The estimates of the coefficients of D7 and D10 are not statistically significant at any conventional level in the Veto sessions, unlike those in the Control sessions, which are positive and statistically significant at better than the 1% or 5% levels. This implies that, as they repeat games, players in the Control sessions are more likely to propose MWCs, other things being equal, whereas in the Veto sessions this is not clearly identified. The Exp estimates for veto players in the Veto sessions are negative and statistically significant at better than the 1% or 5% levels, but they are not statistically significant in the Control sessions. This implies that veto players with prior experience in the same experiments were less likely to propose MWCs; for players in the Control sessions, the effect is not clear.

Footnotes:
12. The marginal effect of DV is positive, as calculated from Tables 2 and 3.
13. Unlike the other equations, the estimates of all the coefficients in Eq. 4-(5) in Table 4 are not statistically significant at any conventional level.

Standard errors in parentheses: * significant at 10%; ** 5%; *** 1%; + at 5.2%.
Note: The estimates for D3, D4, D5, D6, D8, and D9 are intentionally omitted without a loss of generality in the arguments.
Thus, players in the Veto games seem not to learn the benefits of MWCs, whereas players in the Control games do learn as they repeat the games. However, it is hypothesized that the estimates for D7 and D10 are overwhelmed by PS in the Veto games. To identify learning from repetition in the Veto sessions, Eq. 5 excludes the independent variable PS from Eq. 3. Although the estimates of the coefficients of D7 and D10 and their interaction terms with DV (in Table 5) are not individually statistically significant, according to the log-likelihood ratio test results (in Table 6), the game-time dummies (e.g., D7 and D10) and their interaction terms (D7*DV and D10*DV) are jointly statistically significant at the 1% or 5% levels in Equation 5. From these findings, it can be concluded that for players in the Veto games there are two opposing learning processes from repetition within sessions. The first is a growing preference for proposing MWCs, as observed in the Control games, which is partly shown in Table 6; this is typical behavior for subjects in laboratory experiments as well as in field practice. The second, which exceeds the first only in the Veto games, is "recognizing veto power." Judging from the experimental results, the veto players, as strong players, might want to "show off" their power during the games when facing identically weak non-veto players. Thus, the veto players did not hesitate to use MWCs to expand their PS as they repeated the games, and this prevailed over all other considerations (shown in Figs. 1 and 3). Concerning learning between sessions, veto players were less likely to propose MWCs than other players when participating in experienced sessions. This is a bit odd, as there had previously been an empirical finding of a higher frequency of MWCs by veto players expanding their shares in inexperienced sessions.
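The joint-significance check via log-likelihood comparison works as follows (a generic sketch with hypothetical numbers, not the paper's actual test statistics): the statistic 2(logL_full − logL_restricted) is compared against a chi-squared distribution whose degrees of freedom equal the number of restrictions.

```python
from scipy.stats import chi2

def lr_test(llf_full, llf_restricted, n_restrictions):
    """Likelihood-ratio test that a set of coefficients is jointly zero."""
    stat = 2.0 * (llf_full - llf_restricted)
    return stat, chi2.sf(stat, n_restrictions)

# Hypothetical example: dropping D7, D10, D7*DV, and D10*DV (4 restrictions)
# costs 5 log-likelihood points, which is jointly significant at 5%.
stat, p = lr_test(-100.0, -105.0, 4)
print(round(stat, 1), round(p, 3))   # 10.0 0.04
```

Individually insignificant coefficients can still be jointly significant, which is exactly the pattern reported for the game-time dummies in Table 6.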
However, it turned out that in the experienced sessions, two of the five veto players, who had been non-veto players in the inexperienced sessions, proposed non-MWCs in many of the chances they had to propose MWCs. Because their experience as non-veto players was an untoward one of exclusion, they might have realized that a little generosity in offering a positive share to both non-veto players would attract their cooperation, while such offers involved amounts small enough not to reduce their own shares much. Furthermore, a player who was a veto player in both the inexperienced and experienced sessions did nothing but propose MWCs.

Conclusion 1:
Players were more likely to propose MWCs as they repeated games within sessions, but only in the Veto games did learning about veto power dominate, reflecting the usual self-interested exercise of veto power. In addition, the experience of being a non-veto player would lead veto players to propose positive shares to both non-veto players so as to induce more cooperation.
Unlike learning from repetition, learning from previous outcomes is identified in the Veto sessions but not in the Control sessions. The estimates of the coefficient of ShareT-1, shown in Table 3, are negative and statistically significant at better than the 5% level in the Veto sessions; moreover, the interaction terms DV*DShareT-1_50 are negative and statistically significant at better than the 5.2% or 6.3% levels. In addition, the estimates of the coefficient for DShareT-1_0 are positive and statistically significant at better than the 1% level. These results imply that players in the Veto games were more likely to propose MWCs the smaller the shares they received in the previous games, but that veto players were less likely to propose MWCs after acquiring 50% or more of the total share, other things being equal. In addition, when non-veto players had received zero shares in the previous game, they were more likely to propose MWCs, other things being equal. However, as can be ascertained from Table 4, no estimates representing learning from previous outcomes in the Control sessions are statistically significant at any conventional level.
Tables 3 and 4 also report the results of regressions for the Veto and Control sessions in which the short-memory variable, ShareT-1, is replaced by relatively long-memory variables, ShareT-3 and ShareT-5. The results show that these variables are not statistically significant at any conventional level. This implies that the players' long memories are not clear determinants of their current proposing behaviors.
Learning from a previous outcome captures the players' current behavior in response to their previous outcomes in games within sessions. If the players are rational as the theory predicts, then their responses should be independent of history because of the stationarity of the games, and the probability that players propose MWCs should not depend upon their payoffs in previous games. In particular, in the experiments, stationarity between games was reinforced by regrouping players at the end of every game. However, the findings show that, unlike players in the Control games, who seemed not to care what they had gained in previous games when they proposed, the proposing behaviors of players in the Veto games depend upon their previous payoffs, at least in the short run.

Hankyoung Sung ⓒ Korea Institute for International Economic Policy
As such, players in the Veto games behave more strategically than those in the Control games. The overall experimental results imply that in the Control games, with relatively less noise owing to the simpler incentive structure, the players have room to converge on the stationary equilibrium as the games progress, because they need not cope with different types of players. As a result, learning in the Control games is relatively simple and clearly identified. Players in the Veto games, however, had to face idiosyncratic players during the games; hence they behave more strategically than those in the Control games, and their learning is mixed and in some cases not clearly identified.
Another interesting finding concerns the estimates about memory. This paper finds that long memory is not as effective as short memory in responses to previous outcomes, whereas Frechette (2009), who adopted the learning models of Camerer and Ho (1999), found similar memory effects only for learning from repetition. This finding suggests extending the learning models to interactive behaviors.

Conclusion 2:
The players in the Veto sessions were more likely to propose MWCs the smaller the shares they had gained in the previous games, but this is not clear for the players in the Control sessions. Responses to the results of games three or five periods earlier are not clearly identified in either the Veto or the Control games.

V. POLICY IMPLICATIONS

Is There Any Learning as Time Passes?
It is desirable that most participants benefit from multilateral trade negotiations and that none are excluded from the benefits. This leads to the question: what is the role of learning in fulfilling such expectations in negotiations? In this paper I focus on that issue by observing the players' proposing behaviors. This study suggests that learning may not fulfill those expectations, and that the learning process in real trade negotiations such as the DDA should be expected to differ considerably from simple repetition. I argue that real trade negotiations are more like the Veto games than the Control games because the participants do not have equal power, and some of them have sufficient power to defeat an agenda on the table that may be acceptable to others. In addition, power, especially so-called veto power, is generally not conferred by the rules of a negotiation but is a result of the "real world" power of a political-economic entity. 17 For example, in the DDA, some countries, such as the U.S., the EU, China, India, and Brazil, have power based on their economic or political clout rather than on the intrinsic rules of trade negotiations; they may thus be considered veto players, whereas the others are non-veto players. Hence, if we try to analyze the DDA negotiations using the Control games, which are less strategic than the Veto games because of the simplicity of their structure (although both are based on strategic game theory), the analysis may not yield plausible explanations or suggestions concerning the DDA negotiations. As observed, the structure of a game can totally change the type of learning and make the game more complicated by admitting inequality of power between the players.
The results of this study show that, for veto players, learning about power exceeds the expected gradual learning from simple repetition. This can be observed in real trade negotiations: most of the time in multilateral trade negotiations (e.g., the DDA and the UR) is spent on the sharing of interests between strong participants, whereas the weaker ones are excluded. The less powerful countries attempt to gain support for their positions; however, as shown in the DDA negotiations in July 2008, the more powerful countries determine the outcome.
As for policy implications from the experimental results of the Veto games, although the existence of strong players mostly leads to a less egalitarian equilibrium, there is a possibility of reducing the inequality between players, in that certain players who had previously been weak could become generous as they become strong, possibly because of their memories of being relatively powerless and alienated. If we endowed weaker players with power comparable to that of the stronger ones, for example by allowing weak players to have proposal rights, then the players' shares might become more egalitarian. 18 In more realistic terms, where special advantages are provided to individual members, one should allow only one group of weak players to have special rights, so as to form an efficient coalition of weaker players against the stronger ones.

Footnotes:
17. So, it is not an ideal assumption that some players have multiple votes and others do not. In addition, veto power prevails over multiple votes, according to Sung (2012).

Does Memory Matter?
According to the theory, the memory of outcomes from previous games should not matter in the process of negotiations; however, the Veto game experiments suggest that players do care about their gains in the previous game when they make proposals. It is not clear, though, that they consider benefits from games further in the past.
First, when participants in negotiations face inequality in power between players and have bad memories of past outcomes, they become more selfish and parsimonious in the process of negotiations, but they tend to lose those memories as time goes by. In the Veto games, the results suggest that a smaller share in a previous game leads to a less egalitarian proposal. In particular, as shown experimentally, excluding weak players could make the participants more selfish and the games more intense, at least in the short run. However, because memory is relatively short, players in real trade negotiations could gain an advantage from the timing of negotiations; it would be ideal to have enough time to wrap up the issues after blunt negotiations. A "respite" is thus sometimes necessary to achieve a more egalitarian distribution between the players.
Second, as long as stronger players gained sufficient benefits, which need not be as large as some theories predict (e.g., Winter, 1996), they could become generous. This suggests a policy choice between guaranteeing a substantial share to stronger players while distributing small shares to many weaker players, and giving relatively smaller shares to stronger players while excluding some weaker partners. Note that it would not be desirable for all players to become veto players, because that is generally costly as a whole.

Footnotes:
18. Some theoretical and experimental studies of proposal power in negotiations have been carried out by Kagel, Sung, and Winter (2010), Sung (2012), Baron and Ferejohn (1989), Winter (1996), and Frechette, Kagel, and Lehrer (2003).
VI. CONCLUDING REMARKS

This paper suggests two types of learning in multilateral trade negotiations: learning from repetition and learning from previous outcomes. It shows that learning from simple repetition is dominated by learning about power in the Veto games, which are closer to the structure of real trade negotiations. In addition, in the Veto games, as learning from previous outcomes, the players care about how much they gained in the previous games, and they tend to be more selfish the smaller their shares, although their memories do not last long. Based on these results, this paper suggests a policy alternative for a more generous distribution of benefits and argues the necessity of a respite in the process of negotiations, especially long ones such as the DDA.
This paper does not argue that every aspect shown in an experimental study would also be observed in the DDA negotiations, because the experiment is a controlled environment, whereas the DDA negotiations have many uncontrollable factors. Nevertheless, the results offer valuable implications: precisely because they are controlled, they may represent plausible insights for the DDA negotiations, which otherwise reveal mixed signals.
This study has some limitations. First, it does not present any formal models of learning. But as it opens up new perspectives on learning in the DDA negotiations, it may be worthwhile to modify existing learning models to include the issues this paper tackles. Second, as bilateral trade negotiations become more popular, it would be worthwhile to study through experiments bilateral negotiations, which could be more strategic, cognitive, and intense. Third, even though the experiments were designed to test bargaining games resembling trade negotiations, they may not reflect all aspects of real trade negotiations. All these limitations are possible issues for future research.