Solving the second-order free rider problem in a public goods game: An experiment using a leader support system

Punishment of non-cooperators—free riders—can lead to high cooperation in public goods games (PGG). However, second-order free riders, who do not pay punishment costs, reduce the effectiveness of punishment. Here we introduce a “leader support system,” in which one group leader can freely punish group followers using capital pooled through the support of group followers. In our experiment, participants engage in three stages repeatedly: a PGG stage in which followers decide to cooperate for their group; a support stage in which followers decide whether to support the leader; and a punishment stage in which the leader can punish any follower. We compare a support-present condition with a no-support condition, in which there is an external source for the leader’s punishment. The results show that punishment occurs more frequently in the support-present condition than the no-support condition. Within the former, both higher cooperation and higher support for a leader are achieved under linkage-type leaders—who punish both non-cooperators and non-supporters. In addition, linkage-type leaders themselves earn higher profits than other leader types because they withdraw more support. This means that leaders who effectively punish followers could increase their own benefits and the second-order free rider problem would be solved.

The leaders of Groups 23 to 27 are other types. The leaders of Groups 23 to 25 punish only followers who neither contribute nor support leaders, and thus, it is difficult to categorize them as L-, S-, or G-type leaders.
The leaders of Groups 26 and 27 punish followers who contribute and support leaders. pool and support their leaders throughout the 15 periods in this group and the leader punishes 0 in total.
** The leader of Group 1 never encounters followers who do not contribute and do not support their leaders throughout the 15 periods, but this leader punishes both non-contributors and non-supporters. Thus, we regard the leader as L-type.

Supplementary analysis 1
Analysis including groups in which leaders punished followers who contributed and supported leaders
In the main text, we show the analysis that excludes the data of two groups (26 and 27), in which the leaders punish followers who contribute and support the leaders, because of interpretation difficulties. Below, we describe the results of the analysis including these two groups.
First, we describe the comparison between L- and NL-types. We categorize the leaders of the two groups as NL-type because these two leaders behave differently from the typical L-type leaders, who punish only non-contributors and non-supporters. A Mann-Whitney U-test is conducted in this categorization and the results show the same tendencies as the results in the main text. There is a significant difference in all indexes: PGG contribution, p < .001, L-type higher; support for the leader, p < .001, L-type higher; profit of the leader, p < .001, L-type higher; profit of followers, p < .001, L-type higher.
In addition, when we ignore the punishment of followers who contribute and support their leaders, we can categorize the leader of Group 26 as NL type and the leader of Group 27 as L type. A Mann-Whitney U-test is conducted in this categorization and the results show the same tendencies as the results in the main text. There is a significant difference in all indexes: PGG contribution, p < .001, L-type higher; support for the leader, p = .002, L-type higher; profit of the leader, p = .007, L-type higher; profit of followers, p = .002, L-type higher.
Second, we perform the comparison among L-, S-, and G-type leaders. When we ignore the punishment of followers who contribute and support their leaders, we can categorize the leader of Group 26 as G type and the leader of Group 27 as L type. A Mann-Whitney U-test is conducted in this categorization.
Bonferroni's correction is used to determine the significance of the comparisons of the three leader types L, S, and G from this point onward. The results show the same tendencies as the results in the main text: PGG contribution, L versus S, p = .052, L-type higher, L versus G, p = .037, L-type higher; support for the leader, L versus S, p = .087, L-type higher, L versus G, p = .005, L-type higher; profit of the leader, L versus S, p = .693, L versus G, p = .022, L-type higher; profit of followers, L versus S, p = .055, L-type higher, L versus G, p = .011, L-type higher.
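As a rough illustration of the two procedures named above, the following Python sketch computes a Mann-Whitney U statistic by direct pair counting and a Bonferroni-adjusted significance level. The sample values are hypothetical and are not the experimental data. The base level of .05 and the count of three pairwise comparisons are assumptions, since the text does not state them. No p-value is computed here.

```python
from itertools import product


def mann_whitney_u(x, y):
    """Return the Mann-Whitney U statistic for sample x against sample y.

    Each (x_i, y_j) pair scores 1 if x_i > y_j, 0.5 if tied, 0 otherwise.
    """
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(x, y)
    )


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under Bonferroni's correction."""
    return alpha / n_comparisons


# Hypothetical group-level PGG contributions (NOT the experimental data).
l_type = [480.0, 460.0, 440.0]
g_type = [120.0, 80.0, 60.0, 150.0]

u_l = mann_whitney_u(l_type, g_type)  # pairs in which the L-type value is larger
u_g = mann_whitney_u(g_type, l_type)  # pairs in which the G-type value is larger

# Assumed: base level .05 and three pairwise comparisons among L, S, and G.
per_test_alpha = bonferroni_alpha(0.05, 3)
```

The two U statistics always sum to the number of pairs, which gives a quick sanity check on the counting.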
In conclusion, the findings reported in the main text are robust, even when we include the data of the two groups in which the leaders punished followers who contribute and support their leaders.
The categorization of leaders in the main text might have a problem because a single decision can shift a leader from one type to another. For example, when a leader who punishes only non-contributors throughout the first 14 periods punishes a follower who contributes but does not support the leader only in the 15th period, this leader's type shifts from G-type to L-type due to this single decision. Because the categorization is this sensitive, the leader type can change easily and readers might question the validity of the analysis in the main text.
Here, we perform two analyses that are less sensitive to single decisions. First, we regard a punishment percentage of less than 5% for any follower type as zero. In this categorization, the leaders of Groups 18, 20, and 21, who are G-type in the main analysis, shift to the other type. A Mann-Whitney U-test is conducted in this categorization and the results show the same tendencies as the results in the main text: PGG contribution, L versus S, p = .013, L-type higher, L versus G, p = .042, L-type higher; support for the leader, L versus S, p = .024, L-type higher, L versus G, p = .003, L-type higher; profit of the leader, L versus S, p = .315, L versus G, p = .006, L-type higher; profit of followers, L versus S, p = .008, L-type higher, L versus G, p = .041, L-type higher.
Second, we regard a punishment percentage of less than 10% for any follower type as zero. In this categorization, the leaders of Groups 18, 20, and 21, who are G type in the main analysis, shift to the other type and the leader of Group 9, who is L type in the main analysis, shifts to S type. A Mann-Whitney U-test is conducted in this categorization and the results also show the same tendencies as the results in the main text: PGG contribution, L versus S, p = .012, L-type higher, L versus G, p = .059, L-type higher; support for the leader, L versus S, p = .019, L-type higher, L versus G, p = .004, L-type higher; profit of the leader, L versus S, p = .573, L versus G, p = .008, L-type higher; profit of followers, L versus S, p = .008, L-type higher, L versus G, p = .059, L-type higher.
In conclusion, the tendencies reported in the main text are strong, even in the less sensitive categorizations of leader punishment types.
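The threshold-based categorization can be sketched in Python as follows. This is only an illustration under stated assumptions: the decision rule is inferred from the descriptions in this supplement (L-type punishes both non-contributors and non-supporters, S-type focuses on non-supporters, G-type on non-contributors, and leaders who punish only followers who neither contribute nor support, or who punish followers who contribute and support, are "other"). The labels "CS", "CN", "NS", and "NN" and the example percentages are hypothetical.

```python
def classify_leader(rates, threshold=0.0):
    """Classify a leader from punishment percentages per follower type.

    rates maps hypothetical follower-type labels to percentages:
      "CS": contributes and supports
      "CN": contributes, does not support
      "NS": does not contribute, supports
      "NN": neither contributes nor supports
    Percentages below `threshold` are regarded as zero.
    The decision rule is an inferred reading of the text, not the
    authors' stated procedure.
    """
    def punished(key):
        value = rates.get(key, 0.0)  # a missing follower type counts as zero
        return value > 0.0 and value >= threshold

    if punished("CS"):
        return "other"  # punishes followers who contribute and support
    if punished("CN") and punished("NS"):
        return "L"      # punishes non-supporters and non-contributors
    if punished("CN"):
        return "S"      # punishes non-supporters only
    if punished("NS"):
        return "G"      # punishes non-contributors only
    return "other"      # punishes only "NN" followers, or nobody


# Hypothetical leaders (NOT the experimental data).
leader_a = {"CS": 0.0, "CN": 0.0, "NS": 4.0, "NN": 60.0}
leader_b = {"CS": 0.0, "CN": 50.0, "NS": 8.0, "NN": 70.0}
```

With these made-up percentages, `leader_a` moves from G to other once the 5% threshold is applied, and `leader_b` moves from L to S at the 10% threshold, which is the kind of shift described above.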

Analysis with categorization of leaders by cluster analysis
We categorize leader punishment types without a priori assumptions. We perform cluster analysis with Ward's method, in which the clustering variables are the percentages of punishment for each of the four follower types: followers who contribute and support their leaders, who contribute and do not support their leaders, who do not contribute but support their leaders, and who do not contribute and do not support their leaders. The data of Group 1 are eliminated because this group does not have data on punishment of followers who do not contribute and do not support their leaders. Figure S1 shows the results of the cluster analysis. A solution with three clusters is utilized in the present analyses. Apart from the leaders of other types, only three leaders, those of Groups 8, 9, and 10, are placed in clusters that differ from their original categorization as L, S, or G type.
This result indicates that these three clusters are very similar to the original categorization of L, S, and G types.
We calculate the means of the important indexes (see Table S2). Cluster 1 leaders strongly punish both non-contributors and non-supporters. The Wilcoxon matched-pairs signed-rank test reveals no difference in punishment among three follower types, that is, followers who contribute and do not support their leaders, followers who do not contribute but support their leaders, and followers who do not contribute and do not support their leaders (ps > .10). Therefore, Cluster 1 leaders can be regarded as L type.
Cluster 2 leaders punish followers who contribute and do not support their leaders and followers who do not contribute and do not support their leaders more than those who do not contribute but support their leaders (Wilcoxon matched-pairs signed-rank test, ps < .001). This means that Cluster 2 leaders focus more on punishment of non-supporters, and thus, they can be regarded as S-type leaders. Cluster 3 leaders punish followers who do not contribute but support their leaders, and those who do not contribute and do not support their leaders more than those who contribute but do not support their leaders (Wilcoxon matched-pairs signed-rank test, ps < .001); thus, they can be regarded as G-type leaders.
Figure S1. Cluster dendrogram of punishment behavior by leaders
A Mann-Whitney U-test is conducted in this clustering and the results show the same tendencies as the results in the original categorization of L-, S-, and G-type leaders: PGG contribution, Cluster 1 versus Cluster 2, p = .009, Cluster 1 higher, Cluster 1 versus Cluster 3, p < .001, Cluster 1 higher; support for the leader, Cluster 1 versus Cluster 2, p = .001, Cluster 1 higher, Cluster 1 versus Cluster 3, p < .001, Cluster 1 higher; profit of the leader, Cluster 1 versus Cluster 2, p = .263, Cluster 1 versus Cluster 3, p = .006, Cluster 1 higher; profit of followers, Cluster 1 versus Cluster 2, p = .001, Cluster 1 higher, Cluster 1 versus Cluster 3, p < .001, Cluster 1 higher. We consistently find the same results in this more objective categorization, and thus, we conclude that the original categorization of L-, S-, and G-type leaders is valid and reasonable.

Analysis without categorization of punishment type
Here, we present an analysis that does not rely on categorized punishment types, because the categories L, S, and G might be somewhat arbitrary.
In the support-present condition, cooperation levels are clearly polarized (see Figure 1). We perform cluster analysis with Ward's method, in which the clustering variable is PGG contribution. The results are shown in Figure S2. They reveal that groups are categorized as high cooperation groups (N=10, from 326.7 to 493.3 for average total PGG contribution) and low cooperation groups (N=17, from 26.7 to 180.0 for average total PGG contribution).
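For readers unfamiliar with Ward's method, the following Python sketch shows the idea for a single clustering variable. It is not the software used for the analysis, it does not draw a dendrogram, and the input values are hypothetical. At each step it merges the two clusters whose union causes the smallest increase in the within-cluster sum of squares.

```python
def ward_clusters_1d(values, n_clusters):
    """Agglomerative clustering of one-dimensional data with Ward's criterion.

    The cost of merging clusters a and b is the increase in the
    within-cluster sum of squares:
        n_a * n_b / (n_a + n_b) * (mean_a - mean_b) ** 2
    """
    clusters = [[v] for v in values]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                mean_a = sum(a) / len(a)
                mean_b = sum(b) / len(b)
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * (mean_a - mean_b) ** 2)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical average total PGG contributions (NOT the experimental data).
contributions = [26.7, 60.0, 120.0, 180.0, 326.7, 400.0, 493.3]
low_high = ward_clusters_1d(contributions, 2)
```

With these made-up values, the two-cluster solution separates the four low values from the three high values, which mirrors the kind of polarization described above.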

Figure S2. Cluster dendrogram of PGG contribution
We compare the punishment of leaders in high cooperation groups with those in low cooperation groups in order to investigate why this polarization occurred. Table S3 shows the comparison between low and high cooperation groups for the 15 periods. These results clearly indicate that the leaders of high cooperation groups are more likely to punish both non-contributors and non-supporters. In other words, strong linkage punishment by a leader leads to a high cooperation level in PGG. The Mann-Whitney U-test reveals that the leaders of high cooperation groups are more likely to punish followers who do not contribute but support their leaders (p<.001), followers who contribute and do not support their leaders (p<.001), and followers who do not contribute and do not support their leaders (p<.001) than the leaders of low cooperation groups.
In addition, support for the leader and total profit of the leader are larger in high cooperation groups than in low cooperation groups (p< .001, p< .001, respectively), which indicates that strong linkage punishment induces support for the leader and benefits not only the group but also the leader himself or herself.
In summary, the analysis without punishment type of leaders suggests that linkage punishment leads to high group cooperation and is beneficial for the leader.
After a brief verbal introduction, participants read the following instructions on the computer monitor telling them that they will take part in an experiment on decision making.

General Guidance
This is an experiment about decision making. You will be paid for participating, and the amount of money you will earn depends on the decisions that you and the other participants make. At the end of today's session you will be paid in cash for your decisions privately.
You will never be asked to reveal your identity to anyone during the course of the experiment.
Your name will never be associated with any of your decisions.
At this time, you will be given 500 yen (about 5 to 6 dollars) for coming on time. All the money that you earn after this experiment will be yours to keep.

Earnings
In this experiment you are in a group of size 6 (you plus 5 others) and you will be asked to make a series of choices about how to allocate a set of tokens. You and the other subjects have been randomly assigned to the group, and you will not be able to know each other's identities. The group members will remain the same throughout the experiment.
The details of the experimental transactions are as follows. There are two different roles in the experiment. Five members named A, B, C, D and E will play the same role, but one member named Z will play a different role. Who will be assigned as Z will be selected randomly at the beginning of the experiment and these roles will remain the same throughout the experiment. The experiment comprises three stages: the 1st stage, the 2nd stage and the 3rd stage. These stages will be repeated 15 times, and the tokens you earn during transactions will be redeemed as monetary remuneration.
Now, let us explain the details of each stage.

1st stage:
Each of the six members, including Z, is given 100 tokens at the beginning of the stage. The members except for Z are asked to decide whether to contribute all 100 tokens to the group pool or nothing at all.
The tokens each member contributes are doubled and distributed equally among the five members except for Z.
This means that each time one member makes a contribution, all five members except for Z receive 40 tokens each. Z is completely independent of the other members. Although Z is given 100 tokens, like the other members, s/he does not make decisions during this stage and simply earns 100 tokens.
Then, Z determines, in increments of 20 tokens, how many tokens to reduce from A to E. If Z uses 20 tokens to reduce the tokens of a certain member, that member will lose 40 tokens. As long as there is sufficient capital, Z can reduce anyone's amount of tokens. The amount Z does not use for reduction is added to Z's own profit.
Examples of choices you will make in this experiment and earnings
Example 1: Suppose that you are A, not Z. You obtained 200 tokens in the 1st stage and 20 tokens in the 2nd stage. Z decides to reduce 40 tokens from you. You will earn: 200 (1st-stage earning) + 20 (2nd-stage earning) - 40 (the reduction by Z) = 180 (the total earning in the period).
Example 3: Suppose that you are Z. You obtained 100 tokens in the 1st stage and 80 tokens in the 2nd stage. You decide to use 60 tokens in total to reduce the other members' tokens. You will earn: 100 (1st-stage earning) + 80 (2nd-stage earning) - 60 (that used to reduce the other members' tokens) = 120 (the total earning in the period).
Feedback: All six members are informed about the results of the 1st stage, that is, who contributes or does not contribute to the group, after the 2nd stage. In addition, all the members are informed about members who provide their 20 tokens for Z after the 2nd stage as well. Thus, during the 3rd stage, Z is able to decide whose tokens to reduce after ascertaining who contributed in the 1st stage and who provided their tokens for Z. Furthermore, all members are informed whose tokens were reduced and by how much immediately after Z's decision.
These three stages will be repeated 15 times. The total attained score will be converted to money at the rate of 1 token = 0.7 yen, and the converted amount will be paid to you, together with the 500 yen show-up fee, at the end of this experiment.
After the general instructions above, all participants complete a confirmation test and then start the experiment.
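The payoff arithmetic in the instructions above can be summarized in a short Python sketch. It is an illustration only: the function and constant names are hypothetical, the 2nd-stage earning is passed in as a number because its rule is not fully described in the surviving text, and only quantities stated in the instructions are encoded (100-token endowment, 40 tokens per contribution to each of A to E, a reduction of 40 tokens per 20 tokens used by Z, 0.7 yen per token, and the 500 yen show-up fee).

```python
ENDOWMENT = 100              # tokens given to every member in the 1st stage
SHARE_PER_CONTRIBUTION = 40  # 100 tokens doubled, split among A to E
REDUCTION_RATIO = 2          # 20 tokens used by Z removes 40 tokens
YEN_PER_TOKEN = 0.7
SHOW_UP_FEE_YEN = 500


def follower_stage1(contributed, n_contributors):
    """1st-stage earning of a member A to E.

    n_contributors counts every contributing member, including this one.
    """
    kept = 0 if contributed else ENDOWMENT
    return kept + SHARE_PER_CONTRIBUTION * n_contributors


def reduction_from_tokens_used(tokens_used):
    """Tokens a member loses when Z uses `tokens_used` tokens on that member."""
    return REDUCTION_RATIO * tokens_used


def follower_period_total(stage1, stage2, reduction_by_z):
    """Total earning of a member A to E in one period."""
    return stage1 + stage2 - reduction_by_z


def leader_period_total(stage2, tokens_used):
    """Total earning of Z in one period (Z always earns 100 in the 1st stage)."""
    return ENDOWMENT + stage2 - tokens_used


def payment_in_yen(total_tokens):
    """Cash payment: converted tokens plus the show-up fee."""
    return SHOW_UP_FEE_YEN + YEN_PER_TOKEN * total_tokens


# Example 1 from the instructions: 200 + 20 - 40 = 180.
example_1 = follower_period_total(200, 20, reduction_from_tokens_used(20))
# Example 3 from the instructions: 100 + 80 - 60 = 120.
example_3 = leader_period_total(80, 60)
```

A 1st-stage earning of 200 tokens, as in Example 1, corresponds to a member who contributes when all five members contribute, that is, `follower_stage1(True, 5)`.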

Confirmation Test
Before you start to make your decisions, you should solve all the questions on the paper. Read carefully through the provided information and write down the number of points on the paper. We will watch you solving the examples, check whether you get the right answers, and help you in case there is a problem or a question.
Before the decision-making
Good, now everybody has correctly solved the problems. We will distribute the form on which you will write down the results of each stage, such as who contributed, who provided for Z, and how many tokens Z reduced from A to E (see Figure S3). Whenever you want, you can refer to the previous results by referring to the form. If anybody has any more questions, raise your hand now. Otherwise, let us practice how to make your decisions on your computer screens and how to write down the results on the form.
Figure S3. Form in which the participants fill out the results of each period in the support-present condition (above) and the no-support condition (below)