Sex differences in trust and trustworthiness

We present a meta-analytic review of the literature on sex differences in the trust game (174 effect sizes) and the related gift-exchange game (35 effect sizes). Based on parental investment theory and social role theory, we expected men to be more trusting and women to be more trustworthy. Indeed, men were more trusting in the trust game (g = 0.22), yet we found no significant sex difference in trust in the gift-exchange game (g = 0.15). Regarding trustworthiness, we found no significant sex difference in the trust game (g = −0.04), and we found men, not women, to be more trustworthy in the gift-exchange game (g = 0.33). These results suggest that men send more money than women do when their money is going to be multiplied, thereby creating an efficiency gain. This so-called “male multiplier effect” may be explained by a stronger psychological tendency in men to acquire resources.


Introduction
Trust is one of the pillars of society. Without people trusting each other, there would arguably be no intimate relationships, no economic transactions, and no effective institutions. As Simmel (1978) put it: "Without the general trust that people have in each other, society itself would disintegrate." Indeed, trust has been shown to play a vital role in the development and durability of close, personal relationships (Mogilski, Vrabel, Mitchell, & Welling, 2019; Van de Rijt & Buskens, 2006) as well as anonymous, transactional relationships (Kim & Peterson, 2017; Ter Huurne, Ronteltap, Corten, & Buskens, 2017). In turn, these relationships foster cooperation in communities (Balliet & Van Lange, 2013), create value for organizations (Caldwell & Ndalamba, 2017; Dirks & Ferrin, 2001), and increase compliance with governmental policies (Batrancea et al., 2019).
Despite its importance for a well-functioning society, not everyone can be trusted all the time. Indeed, studies have found that people differ in their levels of trust and trustworthiness and this typically depends on the particular situation they find themselves in (Thielmann & Hilbig, 2015). Several attempts have been made to systematically review studies on individual differences in trust and trustworthiness. For example, meta-analyses have linked trust and/or trustworthiness to people's facial appearance (Bzdok et al., 2011), personality (Thielmann, Spadaro, & Balliet, 2020), leadership style (Dirks & Ferrin, 2001), group membership (Balliet, Wu, & De Dreu, 2014) and age (Bailey & Leon, 2019). Surprisingly, no meta-analysis has linked trust and trustworthiness to people's sex. Our study aims to fill this gap in the literature by presenting a meta-analysis of sex differences in the most commonly used games to measure trust behavior: the trust game and the gift-exchange game.
☆ The authors would like to thank Daniel Balliet, Raoul Grasman, and Matthijs van Veelen for valuable comments during the different phases of the project, and Anton Olsson Collentine for his assistance in the coding of the data. This work was partly supported by a Consolidator Grant (IMPROVE) from the European Research Council (ERC; grant no. 726361).
The results of this meta-analysis could highlight the conditions in which men and women differ in trusting behavior, which might help improve interventions aimed at increasing trust among women and men. One application could be in the treatment of personality disorders that are characterized by a lack of trust in other people, like paranoid personality disorder and borderline personality disorder. Knowledge about the underlying mechanisms of trusting behavior in women and men could be instrumental in developing tailor-made treatments for such disorders (Langley & Klopper, 2005).
The rest of the introduction is structured as follows. First, we will provide a theoretical framework explaining potential sex differences in trust and trustworthiness from both an evolutionary and sociocultural perspective. Second, we will describe the rules of the trust game and gift-exchange game and explain why these games are suitable to measure trust and trustworthiness. Finally, we will look at previous research on sex differences in trust games and gift-exchange games and link this research up with our own predictions.

Evolutionary explanations for trust and trustworthiness
Evolutionary psychology suggests that sex differences in social behavior may result from an asymmetry between the sexes in the costs of parental investment. Parental investment theory (Trivers, 1972) is based on the idea that the investments of men and women in producing and raising offspring are different. Women are faced with a 9-month gestation period and a lactation period after birth that can take several years. Men's investment, on the other hand, requires at a minimum only a contribution of their sperm. Because women have to spend a large amount of energy and time raising a child, they are only able to raise a limited number of children during their reproductive lifecycle. This means that women must be selective in choosing a mate as the fitness of the child is influenced greatly by the quality of the father. The higher selectivity of women implies that men engage in intense competition to obtain the best mates. In practice, men compete on traits that convey their genetic qualities as well as their parenting qualities. Such traits can be physical (e.g., physical dominance) as well as psychological (e.g., social dominance). Differences in parental investment thus select for traits in men that enable them to compete with other men as well as traits that enable them to attract potential mates. This is the core tenet of sexual selection theory (Darwin, 1871).
An important psychological trait difference between men and women that may result from parental investment theory is risk-taking. Whereas women may want to avoid taking certain excessive physical and social risks so as to avoid compromising their reproductive potential, men may take risks to signal that they possess high-quality genes and a capacity to procure as well as provide resources. Taking risks has been shown to be beneficial for men to achieve a higher social status because it can lead to the acquisition of resources (e.g., through risky financial investments or through collaborations with uncertain outcomes like in hunting or warfare) or to a higher place in the social hierarchy (e.g., through engaging in competition with other males) (Wilson & Daly, 1985). This is important because a high social status is often seen by women as an indicator of a man's potential to aid in raising a child (Wilson & Daly, 1985). In addition, men signal genetic quality by taking risks. This is because taking risks is costlier for men with lower genetic quality than for men with higher genetic quality, which means that usually only high-quality men take risks (Baker & Maner, 2009). In short, it pays for men to be relatively more risk-taking in contexts where they can acquire resources like status, goods and money. This is evidenced by many studies that find men to take more (social) risks than women across different ages and cultures, and from modern to traditional societies (Apicella, Crittenden, & Tobolsky, 2017; Byrnes, Miller, & Schafer, 1999; Fischer & Hills, 2012; Wilson & Daly, 1985).
Differences in risk-taking might help explain sex differences in trust because trust (as defined above) involves a willingness to be vulnerable to other people's adverse behaviors (i.e., to take social risks). Given the risky nature of the first transfer in the trust game and gift-exchange game, and given the finding that men take more social risks than women on average, we predict that men transfer more money than women as the first mover in the trust game and gift-exchange game.
An evolutionary perspective can also illuminate potential sex differences in trustworthiness. Whereas men may benefit from engaging in competitions to acquire more resources than other men, women may benefit from engaging in reciprocal arrangements to protect valuable resources. In raising offspring, women benefit from making reciprocal arrangements with both men, the fathers of their children, and other women to assure mutual parental care (Hrdy, 2005; Mace & Sear, 2005). Mutual parental aid is based on a simple reciprocity principle: "If you help me with raising my child, I will help you with yours." Indeed, this cooperative breeding hypothesis is supported by many anthropological studies that find that women engage in reciprocal, egalitarian relationships with other women, kin and non-kin, to raise their children (for a review of this literature, see Kramer, 2010). In addition, mutual aid in child care has been found to be related to higher infant survival and child well-being (Sear & Mace, 2008). From this evolutionary mechanism, we infer that in trust games and gift-exchange games, women will reciprocate more than men. In other words, women will be more trustworthy in their decisions as second movers.

Sociocultural explanations for trust and trustworthiness
Potential sex differences in trust behavior may also be explained from a sociocultural perspective. Note that sociocultural explanations are often complementary to evolutionary explanations because they assume that certain evolved sex differences, even minor ones, may either be exacerbated or undermined by differences in socialization practices (like parental upbringing or formal education; Laland, Brown, & Brown, 2011). For instance, some cultures enhance men's greater propensity to take physical and social risks (e.g., through conveying culturally masculine stereotypes in social play or through offering single-sex education) whereas other cultures may suppress these propensities (e.g., through a gender-neutral upbringing).
Theories of gender role socialization, notably social role theory, state that men and women internalize cultural expectations about the way they ought to behave, based on traditional sex roles, and that men and women will behave accordingly (Eagly, 1987; Wood & Eagly, 2012; for an overview see Dulin, 2007). Traditional sex roles, which often follow a deeper evolutionary logic such as that inferred from parental investment theory, may convey social norms that men should take more risks, behave more competitively and independently, and be more self-confident. In contrast, cultural expectations may demand from women that they behave in a more nurturing, communal, and caring way, thus fulfilling the feminine stereotype role. Bakan (1966) labeled these distinct clusters of stereotypically masculine versus feminine traits as agentic and communal, respectively.
Like evolutionary explanations, sociocultural theories assume that men will behave in more agentic ways and thus should be more willing to take risks to acquire resources in cooperative interactions with others. This should lead men to be more willing to send money and expect returns in games of trust. In contrast, because women are more communally oriented (Schmitt, Realo, Voracek, & Allik, 2008; Weisberg, DeYoung, & Hirsh, 2011), social obligations are expected to have a stronger impact on the behavior of women than men (Buchan, Croson, & Solnick, 2008). This suggests that women are less likely to want to violate trusting relationships by failing to reciprocate in interactions with strangers. In short, sociocultural theories, like evolutionary psychological theories, predict that men will be more trusting than women in trust and gift-exchange games, and that women will be more trustworthy than men in those games.

The trust game and the gift-exchange game
The trust game, originally called the investment game (Berg, Dickhaut, & McCabe, 1995), was developed more than two decades ago and works as follows. One player, the first mover, is endowed with a certain amount of money. This first mover has the choice to send a proportion of this money to another player, the second mover. The money they decide to give away is multiplied by a given factor before reaching the other player. This multiplication factor varies across studies, but is typically three. In the second and final round, the second mover can decide how much of the money they will send back to the first mover. The amount sent by the first mover is seen as a manifestation of trust, whereas the amount returned by the second mover is seen as a manifestation of trustworthiness (Ben-Ner & Halldorsson, 2010; Croson & Gneezy, 2009; Sutter & Kocher, 2007).
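The rules above can be sketched in a few lines of code. This is only an illustrative sketch: the function name, the parameter names, and the assumption that the second mover can return at most the multiplied amount received are ours and not part of any particular study's protocol.

```python
def trust_game_payoffs(endowment, sent, returned, multiplier=3):
    """Return (first_mover_payoff, second_mover_payoff) for a one-shot trust game.

    Hypothetical helper: the multiplier defaults to three, the typical value
    mentioned in the text; the second mover is assumed to have no endowment.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must lie between 0 and the endowment")
    received = sent * multiplier  # the transfer is multiplied before it arrives
    if not 0 <= returned <= received:
        raise ValueError("returned must lie between 0 and the amount received")
    first_mover = endowment - sent + returned
    second_mover = received - returned
    return first_mover, second_mover


# A first mover endowed with 10 sends 5; the second mover receives 15 and returns 7.
payoffs = trust_game_payoffs(endowment=10, sent=5, returned=7)
```

In this example the first mover ends with 12 and the second mover with 8, so both are better off than if nothing had been sent, which is the efficiency gain created by the first transfer.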
A game that is conceptually similar to the trust game is the bilateral gift-exchange game (Fehr, Kirchler, Weichbold, & Gächter, 1998). In the gift-exchange game the first mover can allocate a certain amount of money (the wage, w, typically an integer between 20 and 120) to the second mover. In many (but not all) studies, the second mover can either accept or reject this wage offer. In case of a rejection, both players get a payoff of zero. In case of acceptance, the second mover must decide on an effort level, e. This effort level is costly to himself or herself, but beneficial to the first mover, sometimes creating a multiplier effect similar to the one in the trust game. Typically, the payoff for the first mover is Π1 = (120 − w)e, and the payoff for the second mover depends on the wage and on the cost of effort, c, which is related to the effort level according to Table 1. However, note that the payoff functions for both the first and second mover vary markedly over studies. Common for all studies, though, is that the wage is seen as a measure of trust while the effort level is seen as a measure of trustworthiness (Rau, 2011).
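As a minimal sketch of the typical first mover payoff described above, the function below computes (120 − w)e and returns zero after a rejection. The function and parameter names are hypothetical, the effort values in the example are illustrative inputs rather than values taken from Table 1, and the second mover's payoff is deliberately left out because its form varies over studies.

```python
def first_mover_payoff(wage, effort, accepted=True, redemption_value=120):
    """Typical first mover payoff in the gift-exchange game: (120 - w) * e.

    Hypothetical helper: `redemption_value` is our name for the constant 120.
    A rejected offer yields a payoff of zero, as stated in the text.
    """
    if not accepted:
        return 0
    return (redemption_value - wage) * effort


# Illustrative inputs only: a wage of 60 combined with two different effort levels.
low_effort = first_mover_payoff(wage=60, effort=0.5)
high_effort = first_mover_payoff(wage=60, effort=1.0)
```

Holding the wage fixed, the first mover's payoff rises with effort, which is why the effort choice can be read as the second mover's reciprocation of the wage offer.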
There are three noteworthy differences between the trust game and the gift-exchange game that are relevant to understanding potential behavioral differences in gameplay. First, in some variants of the gift-exchange game, the second mover has the option to reject the first mover's offer. This option can have important consequences for the first mover's behavior, as he or she may be concerned that the offer might be rejected (which leads to a payoff of zero). Second, the experimental instructions of the gift-exchange game are often framed in terms of a working relationship. That is, the first mover is referred to as the firm or the employer, while the second mover is referred to as the worker or the employee. This labor context could also have implications for the behavior of both players, although the literature does not provide guidance as to what these implications could be. Third, the added value of the exchange comes about differently for both games. In the trust game the added value of the exchange comes from the first transaction (i.e., the transfer made by the first mover) because the offer of the first mover is multiplied by the experimenter before the money arrives at the second mover. In the gift-exchange game, in contrast, the added value comes about through the decision of the second mover, the decision of both the first and second mover, or it can be ambiguous which player determines the added value. Whenever the added value comes about through the second mover, the gift-exchange game can be seen as a 'reversed' trust game, where the multiplication factor resides in the second transfer instead of the first transfer. In the other cases, determining the origin of the added value in the gift-exchange game is more complex. More information about the determination of efficiency in the gift-exchange game can be found in the coding protocol for the gift-exchange game at https://osf.io/dp9xu.
Despite these differences between the trust and gift-exchange games, both games are commonly used to measure trust because the first mover's decision problem corresponds to a trust problem according to most definitions of trust. Trust is commonly defined as "a psychological state comprising the intention to accept vulnerability based upon positive expectations of the intentions or behavior of another" (Rousseau, Sitkin, Burt, & Camerer, 1998). This definition implies that trust has two components: the intention to make yourself vulnerable to another person (i.e., social risk taking), and an expectation that the other person will not take advantage of your vulnerability (i.e., trustworthiness expectations). The first transfer in the trust game and the gift-exchange game appears to capture these two components well. In our analyses, we only use the anonymous, one-shot variants of these games because the trust decision in these variants is not contaminated by implicit bias, reputation management, and other factors that play a role in non-anonymous and repeated games.
However, this individual, nonsocial measure of someone's willingness to take risks might not be appropriate to measure the type of risk that is associated with trust. Trusting behavior is inherently related to a trustee and as such is captured better by measures of social risk than by measures of nonsocial risk (Bohnet & Zeckhauser, 2004; Fairley, Sanfey, Vyrastekova, & Weitzel, 2016; but see Fetchenhauer & Dunning, 2012). The studies that directly measure the social aspects of risk (e.g., through people's willingness to participate in an interpersonal system of loans, or by directly asking participants about their willingness to take risks in varying social settings) do find social risk taking to predict first mover decisions in the trust game (Ben-Ner & Halldorsson, 2010; Karlan, 2005; Lönnqvist, Verkasalo, Walkowitz, & Wichardt, 2015; Thielmann, Spadaro, & Balliet, 2020). Moreover, men are also found to be more risk-taking in experiments where social elements are manipulated, like the presence of a potential romantic partner (Baker & Maner, 2009) or the presence of a potential reproductive competitor (Fischer & Hills, 2012). Finally, prenatal testosterone has been linked with more social risk-taking (Stenstrom, Saad, Nepomuceno, & Mendenhall, 2011), suggesting that social risk-taking is a sex-typical behavior (Hines, 2006).
The relationship between trustworthiness expectations and first mover decisions in trust games has been investigated more directly by specifically asking for people's expectations. Almost all studies find that the higher people's expectations are about the second mover's return transfer, the more they send in a trust game (Barr, 2003; Fetchenhauer & Dunning, 2009; Holm & Danielson, 2005; Naef & Schupp, 2009; Garbarino & Slonim, 2009; Sapienza et al., 2013; for a review see Thielmann & Hilbig, 2015).
The relationship between trust and first mover decisions in trust games is further supported by studies that have found that "trust" and "risk" are the concepts that most frequently come to mind when people are asked to describe the trust game (Dunning, Fetchenhauer, & Schlösser, 2012) and a study that found trust game behavior to be associated with self-reported trusting behaviors in everyday life (Glaeser, Laibson, Scheinkman, & Soutter, 2000).
Building on the definition of trust above, we label people as trustworthy when they do not take advantage of the vulnerability of someone else when given the opportunity to do so. In the case of the trust game and the gift-exchange game, that means that higher second mover transfers can be seen as more trustworthy than lower second mover transfers. This operationalization has been used by many researchers (e.g., Berg et al., 1995; Derks, Lee, & Krabbendam, 2014; Fehr, Fischbacher, Von Rosenbladt, Schupp, & Wagner, 2002) and makes sense in the light of findings that second mover behavior is related to trustworthy behavior in real-world situations (Baran, Sapienza, & Zingales, 2010; Karlan, 2005). However, we only focus on anonymous, one-shot games to avoid confounding in our measure of trustworthiness.
Besides the trust game and the gift-exchange game there are other economic games that measure trust and/or trustworthiness, but they have not been used frequently enough to attempt a meta-analysis. Examples of such games are the trading game (Lyons & Mehta, 1997), the real-effort dictator game (Heinz, Juranek, & Rau, 2012) and the moonlighting game (Abbink, Irlenbusch, & Renner, 2000).

Empirical evidence of sex differences in trust and trustworthiness
Two earlier narrative reviews have been undertaken to summarize the evidence of sex differences in the trust game and gift-exchange game (Croson & Gneezy, 2009; Rau, 2011). The current meta-analysis is more complete than these narrative reviews because it is based on a systematic literature search, includes more (recent) studies, and also includes studies that did not intentionally set out to study sex differences (but did register participants' sex). Nonetheless, these narrative reviews are informative because both reviews found that men are more trusting and women are more trustworthy, supporting evolutionary and sociocultural theories as explanations for sex differences in trust and trustworthiness. Given this match between theory and empirical findings, we hypothesized that our meta-analysis would reveal that, overall, men would transfer more money as the first mover and women would transfer more money as the second mover in both the trust game and the gift-exchange game.

Table 1
The typical relationship between the second mover's effort and cost in the gift-exchange game.

Another important study directly relevant to the current project is a meta-analysis of the trust game by Johnson and Mislin (2011). The authors of that meta-analysis lacked sufficient data to study sex differences, yet they found that changes in the experimental protocol significantly altered behavior in the trust game (also see Chaudhuri, Li, & Paichayontvijit, 2016; Alós-Ferrer & Farolfi, 2019). Changes in the experimental protocol could also affect sex differences in the trust game (and gift-exchange game). With regard to other economic games, Croson and Gneezy (2009) found that changes in the experimental protocol of several public goods games studies influenced female behavior more than male behavior, while Andreoni and Vesterlund (2001) found that changes in the price of a modified dictator game changed men's behavior more so than women's behavior. Based on these findings it has been suggested that sex differences are sensitive to the protocol and context of economic games (Chermak & Krause, 2002; Croson & Gneezy, 2009). Because we do not yet know whether this holds for the trust game and the gift-exchange game, we added five moderators to the analysis pertaining to the experimental protocol of both the trust game and the gift-exchange game, one moderator pertaining to the protocol of the trust game, and four moderators pertaining to the protocol of the gift-exchange game. We chose these moderators because they are the most common variations of the trust game and gift-exchange game in the literature.
The common moderators are (1) whether participants were paid based on their decisions in the game, (2) whether participants played as both the first mover and the second mover, (3) whether the second mover had an initial endowment, (4) whether the strategy method (Selten, 1967) was used to elicit the decisions of the second movers, and (5) how many times the game was played during the experiment. The moderator unique to the trust game is the multiplication factor of the first transfer. The moderators unique to the gift-exchange game are (1) whether the experimental instructions were framed neutrally or in a labor context, (2) whether the first mover had to fill out a desired effort level, (3) whether second movers were able to reject the first mover's wage offer, and (4) whether the efficiency in the game is determined by only the second mover, by the first mover and the second mover, or whether that is ambiguous.
Parental investment theory and social role theory can be used to derive some predictions about sex differences with regard to the effects of these moderators. First, several studies have shown that people become more risk-averse when games are played for higher stakes or use real money instead of hypothetical money (Holt & Laury, 2002, 2005; Xu et al., 2016), so when participants get paid for their choices they may send less as first movers in the trust game and gift-exchange game. Second, if the second mover has the opportunity to reject the first mover's offer in the gift-exchange game, risk-averse first movers may be less inclined to trust second movers. Based on the higher tendency for men to take risks (Byrnes et al., 1999) we predict that men are more trusting than women in games where they are paid based on their choices versus games where they are not, and in gift-exchange games with a rejection phase versus games without such a rejection phase.
Our theoretical framework also predicts that women may be influenced more by the presence of social obligations (Buchan, Croson, & Solnick, 2008). Two moderators may tap into these social obligations. First, it could be that participants feel less social obligation in games where multiple periods are played (with different opponents) as their decision in a single period has less impact on the other player's total earnings. Second, the presence of a desired effort level in a gift-exchange game can make the social contract more concrete and with that the social obligation more salient. Based on this reasoning, we predict that women send more as second movers than men in games with more iterations, and in gift-exchange games where first movers have to set a desired effort level.
Regarding the other moderators, neither parental investment theory nor social role theory provides information about what to expect with regard to sex differences in the trust game and gift-exchange game. Therefore, we looked at these moderators in an exploratory way.
In summary, based on the theoretical and empirical reviews above, our main predictions are: (1) men send more than women as first movers in both the trust game and the gift-exchange game, and (2) women send more than men as second movers in both the trust game and the gift-exchange game. With regard to the moderators we expect men to send more as first movers than women in games where participants are paid for their decisions (versus games where they are not), and in games with a rejection phase (versus games without such a phase). We expect women to send more as second movers in games with more iterations (versus games with fewer iterations), and in gift-exchange games where the first movers have to provide a desired effort level (versus games where first movers need not).

Search strategy
To find eligible studies for our meta-analyses of the trust and gift-exchange games, we employed four search strategies. The searches were planned and executed at different times, so they are not completely similar. The first search strategy was to use the terms "trust game" and "investment game", and the term "gift-exchange game" in searches on a range of databases. For trust game studies we searched on PsycINFO, EconLit, the Web of Science Core Collection, SSRN (for unpublished papers), and OpenGrey (for grey literature). These searches were carried out in April and May 2017 and included papers from 2011 onwards. For gift-exchange game studies we searched on Google Scholar and the Web of Science Core Collection. We only looked for gift-exchange game papers published since 1998 because the bilateral gift-exchange game was introduced in that year. These searches were carried out in February 2016 and September 2017. None of the search terms were cross-referenced with terms pertaining to a person's sex or gender because studies typically ask participants to indicate their sex.
Our second search strategy was to check for papers citing the original trust game paper (Berg et al., 1995) and the original gift-exchange game paper (Fehr et al., 1998). For these searches, we used the Web of Science Core Collection and Google Scholar respectively. Third, we looked at references in review articles and other relevant articles that we found using the first two search strategies. Examples of such review articles are the articles by Croson and Gneezy (2009) and Rau (2011). Fourth, we sent out a call for papers in the Economic Science Association's experimental methods discussion group (https://groups.google.com/forum/#!forum/esa-discuss). This call for papers can be found at https://osf.io/3tves. The search for trust game studies yielded a total of 1648 references (of which 1199 were unique) and the search for gift-exchange game studies yielded a total of 1200 references (of which an unknown number was unique). For a flow diagram of the search for papers, see https://osf.io/3ga6p (trust game) and https://osf.io/y8zhe (gift-exchange game). The flow diagram is more extensive for the trust game search than for the gift-exchange game search as that search was logged in more detail. A complete overview of the trust game and gift-exchange game search results can be found at https://osf.io/jbrf6 and https://osf.io/pgm7n, respectively.

Inclusion criteria
We used several inclusion criteria to select studies for our analysis. First, because of language barriers, we decided to only include papers written in English. Second, for obvious reasons we only included studies with data on both men and women (e.g., excluding Kurzban, Rigdon, & Wilson, 2008). Third, we included only games wherein players thought they played against a human player because we wanted to investigate trust among humans (e.g., excluding Kirkebøen, Vasaasen, & Teigen, 2013). Fourth, only studies with student samples or adult samples were included because there are indications that behavior in trust games and gift-exchange games might differ between children and adults (Sutter & Kocher, 2007; Owens, 2011). Fifth, participants had to be from a sample that is not characterized by physical or psychological dysfunctions. Examples of excluded studies were studies that used participants with Parkinson's disease (Javor, Riedl, Kirchmayr, Reichenberger, & Ransmayr, 2015) and borderline personality disorder (Ebert et al., 2013). Sixth, to make the studies in our analysis comparable we only included studies that involved the trust and gift-exchange games as described in the introduction. Specifically, we only used games with two players, wherein the first mover could transfer a certain amount of money to the second mover, the money was multiplied by a given factor (in the trust game only), and the second mover could return some of the money to the first mover (in the gift-exchange game that second transfer is costly to the second mover and beneficial to the first mover). Any games that deviated from these designs, aside from the variations captured by the moderators, were not included.
For example, we excluded studies in which participants could communicate with each other (e.g., Fooken, 2013; Kimbrough & Rubin, 2015), games in which players had personal information about the other player (e.g., Chaudhuri, Paichayontvijit, & Shen, 2013; Hargreaves Heap & Zizzo, 2009; Lönnqvist, Verkasalo, Wichardt, & Walkowitz, 2013), and repeated games with the same partner (e.g., Fehr, Tougareva, & Fischbacher, 2014; Samson & Kostyszyn, 2015). These exclusions were necessary because communication, personal information, and the possibility of establishing a reputation are likely to influence behavior in the games such that the games do not measure trust and trustworthiness in isolation.
Finally, we excluded games that were non-continuous (e.g., Servátka, Tucker, & Vadovic, 2008; Simpson & Eriksson, 2009). We a priori defined continuous games as games with ten or more response options for the first and second mover. This was mainly done to exclude sequential prisoner's dilemma games where first movers decide to either send money or not and second movers decide to either return money or not. The binary nature of these games makes it impossible to compare them to more continuous games where trust is measured by comparing the amount sent by the first mover to the initial endowment, and trustworthiness is measured by comparing the amount sent by the second mover to the amount sent by the first mover. A similar incomparability holds for games with three or four response options. For this reason, we decided to take the original games as our baseline and only include games with as many or more response options (for an overview of the different types of trust games, see Alós-Ferrer & Farolfi, 2019).
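The continuity criterion and the two comparisons described above can be illustrated with a short sketch. The function names are hypothetical, and expressing each comparison as a simple ratio is our assumption for illustration; the text only states which quantities are compared.

```python
def is_continuous(options_first_mover, options_second_mover, minimum=10):
    """Continuity criterion: ten or more response options for both movers."""
    return options_first_mover >= minimum and options_second_mover >= minimum


def trust_ratio(sent, endowment):
    """Amount sent by the first mover relative to the initial endowment (assumed ratio)."""
    return sent / endowment


def trustworthiness_ratio(returned, sent):
    """Amount sent back by the second mover relative to the amount sent by the first mover.

    Returns None when nothing was sent, because the comparison is then undefined.
    """
    if sent == 0:
        return None
    return returned / sent


# A binary send-or-not game fails the criterion; a game with 11 options per mover passes.
binary_game_included = is_continuous(2, 2)
graded_game_included = is_continuous(11, 11)
```

With only two response options each ratio can take just two values, which shows why binary games cannot be compared with games that allow graded transfers.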
In all, we found 167 trust game papers and 35 gift-exchange game papers with one or more studies eligible for inclusion. For the trust game meta-analysis, we were able to retrieve 174 effect sizes from 77 papers (see Table 2). For the gift-exchange game meta-analysis, we were able to retrieve 35 effect sizes⁵ from 15 papers (see Table 3). Excel files of the two datasets can be found at https://osf.io/5bmsa (trust game) and https://osf.io/u8zjc (gift-exchange game).
The search and the selection of papers were carried out solely by the first author. However, an independent coder applied the inclusion criteria to a random sample of papers (N = 95 for the trust game and N = 81 for the gift-exchange game) to verify the first author's coding. The decisions of the first author and second coder were consistent for 95.8% of the trust game papers and 96.3% of the gift-exchange game papers. The coding protocol can be found at https://osf.io/xm9pk (trust game) and https://osf.io/dp9xu (gift-exchange game), while the detailed results of the recoding effort can be found at https://osf.io/8kv4w (inclusion criteria) and https://osf.io/sgekf (moderators).

Data collection
Extracting the required information for the meta-analyses proved to be difficult because only three of the trust game papers and none of the gift-exchange game papers included the required data for us to calculate the effect sizes. The remaining papers in our database only used sex as a control variable or did not mention sex at all, so in those cases we had to contact the authors to request the required information. We first contacted the corresponding authors of each paper, and if we received no response, we sent out a reminder e-mail about three weeks later. If we still did not receive a response after six weeks, we sent out a final data request e-mail to the co-author(s) of the paper with a remark that we had already tried to reach the corresponding author. Templates of the different e-mails can be found at https://osf.io/pjrku. Authors could either provide us with the raw data or with the summary statistics we needed to calculate the effect sizes ourselves. From the 164 trust game papers for which we contacted the authors, we received the data 74 times, we did not receive the data 32 times, and we were unable to contact the authors (i.e., they did not reply even after two reminders or we could not find up-to-date contact information) 58 times. From the 35 gift-exchange game papers for which we contacted the authors we received the data 18 times, we did not receive the data 26 times, and we were unable to contact the authors 12 times. Thus, we received data from around 51% of the papers.

Note: The 'Condition' column indicates which of the conditions in the paper was included. The 'Pay' column indicates whether participants of the study were paid based on their decisions in the game. The 'Both' column indicates whether players in the trust game had to play as both the first mover and the second mover. The 'Sec' column indicates whether the second mover in the trust game was allocated an initial endowment. The 'SM' column indicates whether the second mover decisions were elicited using the strategy method. The 'It' column indicates the number of iterations of the game. The 'Mtp' column indicates the multiplication factor used in the study. The 'N_t' and 'N_tw' columns indicate the sample size for the first mover's and second mover's decisions, respectively. The 'g_t' column indicates the effect size of a sex difference with respect to the first mover's decision. The 'g_tw' column indicates the effect size of a sex difference with respect to the second mover's decision. If the effect sizes are positive, that means that men sent more than women.

O.R. van den Akker et al.

Coding procedure
To measure trust in both the trust game and the gift-exchange game, we used the proportion of the first transfer to the initial endowment. This meant that we had to retrieve the following information: the mean first transfer for both sexes, the standard deviations of those means, the number of men and women, and the initial endowment. If we were able to retrieve these values for a particular study we were able to calculate an effect size of sex differences in trust for that study. The calculation of effect sizes is described in the section Statistical Analysis.
Finding a good measure of trustworthiness was more complex because it can be argued that the second transfer by itself is not a good measure of trustworthiness. This is because the concept of trustworthiness is only relevant with regard to a preceding behavior, in this case the first transfer. For that reason, in line with other studies (e.g., Ashraf et al., 2006), we chose to use the second transfer divided by the multiplied first transfer as the measure of trustworthiness. For instance, if the first transfer was 8, the second mover would receive 24 in the trust game. We then divided the second transfer by that multiplied amount, so when the second transfer is 12 the trustworthiness measure will be 0.5 and when the second transfer is 18, the trustworthiness measure will be 0.75. This proportion was calculated for every individual participant, and then the mean and standard deviation of those proportions were calculated, for both sexes. Coupled with the number of men and women, we were then able to calculate the effect sizes of sex differences in trustworthiness. When trustworthiness was assessed using the strategy method, the reciprocity of individual participants was first calculated by averaging their proportions for every amount they received. These numbers were then used to calculate the average reciprocity for men and women.
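The two measures described above can be sketched in a few lines of Python. This is an illustration only: the function names, the endowment of 10, and the default multiplier of 3 (implied by the example in which a transfer of 8 becomes 24) are assumptions made for the example, not part of the coding protocol.

```python
from statistics import mean


def trust(first_transfer, endowment):
    """Trust: share of the initial endowment sent by the first mover."""
    return first_transfer / endowment


def trustworthiness(second_transfer, first_transfer, multiplier=3):
    """Trustworthiness: second transfer divided by the multiplied first transfer.

    The multiplier of 3 is an assumption taken from the worked example (8 -> 24).
    """
    received = multiplier * first_transfer
    if received == 0:
        raise ValueError("trustworthiness is undefined when nothing was received")
    return second_transfer / received


def strategy_method_trustworthiness(returns_by_amount, multiplier=3):
    """Average the proportion returned over every possible (non-zero) first transfer.

    returns_by_amount maps each hypothetical first transfer to the amount
    the second mover said they would send back.
    """
    return mean(
        trustworthiness(back, sent, multiplier)
        for sent, back in returns_by_amount.items()
        if sent > 0
    )


# Worked example from the text: a first transfer of 8 is multiplied to 24.
print(trustworthiness(12, 8))  # 0.5
print(trustworthiness(18, 8))  # 0.75
print(trust(8, 10))            # 0.8, assuming a hypothetical endowment of 10
```

Per-participant values computed this way would then be averaged within each sex to obtain the means and standard deviations that enter the effect size calculation.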
Besides coding all effect sizes, we coded for ten moderators that concerned the protocol of the games. One of those moderators pertains to the trust game only (the multiplication factor), four pertain to the gift-exchange game only (whether the experimental instruction was framed neutrally or was framed in a labor context, whether first movers had to suggest a desired effort level, whether second movers were able to reject the first mover's wage offer, and whether efficiency was determined by only the second mover, both the first and second mover, or whether that was ambiguous), and five pertain to both games (the number of iterations of the game, whether participants were paid based on their decisions in the game, whether participants played the game as both first and second mover, whether the second mover was allocated an initial endowment, and whether second mover decisions were elicited using the strategy method). In addition, three other moderators were coded to carry out sensitivity analyses: whether the study was published in a scientific journal, whether the experiment involved additional (unrelated) tasks, and whether sex differences were part of the main hypothesis in the paper. Table 4 provides an overview of these moderators. The coding of the moderators was carried out solely by the first author. However, an independent coder used the coding protocol on a random sample of papers (N = 14 for the trust game and N = 32 for the gift-exchange game) to determine whether there was potential bias in the coding of the first author. The decisions of the first author and the independent coder were consistent for 88.9% of the trust game papers and 95.1% of the gift-exchange game papers. The coding protocol can be found at https://osf.io/xm9pk (trust game) and https://osf.io/dp9xu (gift-exchange game), while the detailed results of the recoding effort can be found at https://osf.io/8kv4w (inclusion criteria) and https://osf.io/sgekf (moderators).

Note: The 'Condition' column indicates which of the conditions in the paper was included. The 'Pay' column indicates whether participants of the study were paid based on their decisions in the game. The 'Both' column indicates whether players in the trust game had to play as both the first mover and the second mover. The 'SM' column indicates whether the second mover decisions were elicited using the strategy method. The 'It' column indicates the number of iterations of the game. The 'Sec' column indicates whether the second mover in the trust game was allocated an initial endowment. The 'Frame' column indicates whether the experimental instruction of the game was framed neutrally (0) or was framed in a labor context (1). The 'Des' column states whether the first movers in the study had to fill out a desired effort level. The 'Rej' column states whether the second movers were able to reject the first mover's wage offer. The 'Eff' column states whether the efficiency was determined by only the first mover ('FM'), both the first and second mover ('Both'), or whether that is ambiguous ('Amb'). The 'N_t' and 'N_tw' columns indicate the sample size for the first mover's and second mover's decisions, respectively. The 'g_t' column indicates the effect size of a sex difference with respect to the first mover's decision. The 'g_tw' column indicates the effect size of a sex difference with respect to the second mover's decision. Positive effect sizes correspond to higher means for men than for women.

Statistical analysis
To calculate the effect sizes of individual studies, we used the Hedges' g effect size measure, which is preferred over Cohen's d because the latter is biased for small sample sizes (Hedges & Olkin, 1985). Hedges' g is calculated as follows:

\[ g = c(n_1, n_2) \, \frac{\bar{x}_1 - \bar{x}_2}{s^*} \]

where $\bar{x}_1$ is the mean for men, $\bar{x}_2$ is the mean for women, $s^*$ is the pooled standard deviation, and $c(n_1, n_2)$ is a constant that depends on group sizes. Both $s^*$ and $c(n_1, n_2)$ are defined below:

\[ s^* = \sqrt{\frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}}, \qquad c(n_1, n_2) = 1 - \frac{3}{4(n_1 + n_2) - 9} \]

where $n_1$ is the number of males, $n_2$ is the number of females, $s_1^2$ is the variance for males, and $s_2^2$ is the variance for females. Finally, we also estimated the variance of the Hedges' g effect size measure:

\[ \operatorname{Var}(g) = \frac{n_1 + n_2}{n_1 n_2} + \frac{g^2}{2(n_1 + n_2)} \]
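As a rough illustration of this calculation, the sketch below computes Hedges' g and its variance from summary statistics. It assumes the textbook forms associated with Hedges and Olkin (1985): the pooled standard deviation, the approximate small-sample correction 1 − 3/(4(n₁ + n₂) − 9), and the variance (n₁ + n₂)/(n₁n₂) + g²/(2(n₁ + n₂)). The function name and the example numbers are hypothetical.

```python
import math


def hedges_g(mean_m, sd_m, n_m, mean_f, sd_f, n_f):
    """Return (g, variance of g) for men versus women from summary statistics.

    Positive g means a higher mean for men. The correction factor and the
    variance use standard approximations, assumed here for illustration.
    """
    pooled_sd = math.sqrt(
        ((n_m - 1) * sd_m**2 + (n_f - 1) * sd_f**2) / (n_m + n_f - 2)
    )
    correction = 1 - 3 / (4 * (n_m + n_f) - 9)  # small-sample bias correction
    g = correction * (mean_m - mean_f) / pooled_sd
    variance = (n_m + n_f) / (n_m * n_f) + g**2 / (2 * (n_m + n_f))
    return g, variance


# Hypothetical study: men send 55% and women 50% of the endowment on average.
g, var_g = hedges_g(0.55, 0.25, 50, 0.50, 0.25, 50)
print(round(g, 3), round(var_g, 4))
```

With larger groups the correction factor approaches 1, so g converges to Cohen's d.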

Table 4
Moderators Analyzed in the Current Meta-Analysis.
Because we cannot exclude that sex differences in the studies in our meta-analysis vary on a host of unknown factors, we used a random effects model to combine the individual effect sizes into an overall effect size.⁶ In a random effects model the true effect size is allowed to vary between studies (i.e., there can be heterogeneity between studies). To assess the heterogeneity in our sample, we computed both the Q-statistic, which tests the null hypothesis of no heterogeneity, and the I²-statistic including confidence interval, which measures the extent of the heterogeneity. To assess the extent of heterogeneity we used the commonly used threshold values of 0.25, 0.5, and 0.75 for small, moderate, and large amounts of heterogeneity, respectively (Higgins, Thompson, Deeks, & Altman, 2003).
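The logic of pooling under a random effects model, together with the Q- and I²-statistics, can be sketched as follows. The text does not state which between-study variance estimator was used, so this example substitutes the simple DerSimonian-Laird estimator as an assumption; the function name and the input values are hypothetical.

```python
import math


def random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random effects model.

    Returns the pooled effect, its standard error, a 95% confidence
    interval, Cochran's Q, tau^2, and I^2 (as a proportion).
    """
    k = len(effects)
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "g": pooled,
        "se": se,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "tau2": tau2,
        "I2": i2,
    }


# Three hypothetical studies with equal precision but very different effects.
result = random_effects([0.0, 0.5, 1.0], [0.01, 0.01, 0.01])
print(result["g"], result["Q"], result["tau2"], result["I2"])
```

In this toy example I² is well above the 0.75 threshold, which would count as a large amount of heterogeneity under the benchmarks cited above.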
The main downside of using a random effects model is that it leads to biased estimates in the presence of publication bias, the tendency to publish significant findings more often than non-significant findings. Publication bias has been prevalent in many meta-analyses (Bakker, Van Dijk, & Wicherts, 2012) and is troublesome because it unjustly inflates the overall effect size. Unlike some other methods, a standard random effects meta-analysis does not correct for publication bias, so it should only be used when publication bias is unlikely or absent. In our case, publication bias is unlikely because the primary studies in our sample overwhelmingly focused on factors other than sex differences. Indeed, most papers did not even report results relating to sex. Given that publication decisions are usually based on the primary outcomes, we deem it unlikely in the current sample that studies with significant sex differences were published at a higher rate than studies with non-significant sex differences (i.e., there is probably no publication bias). Nevertheless, we tested for publication bias using funnel plots and Egger's test for funnel plot asymmetry (Egger, Smith, Schneider, & Minder, 1997). In all, we used a random effects model to compute the overall effect sizes because heterogeneity between the studies in the meta-analysis is probable and publication bias is unlikely.
The random effects meta-analyses were complemented by moderator analyses in which we regressed the effect size of sex differences in trust and trustworthiness on each moderator variable separately (see Tables 5 and 6 for the trust game analyses and Tables 9 and 10 for the gift-exchange analyses). Because this involves multiple significance tests, we applied the Benjamini-Hochberg procedure (Benjamini & Hochberg, 1995) to control for false positives. Benjamini-Hochberg critical values were calculated using the spreadsheet accompanying the textbook of John H. McDonald (2014), where we used a false discovery rate of 0.10.
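For readers unfamiliar with the procedure, a minimal Python sketch of Benjamini-Hochberg critical values is given below. It is not the spreadsheet used in the analysis; the function name is hypothetical, and treating the six p-values reported for the trust moderators in the trust game as one family of tests is an assumption made for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return (critical values, significance flags) in the original order.

    Each p-value with rank i (smallest first) out of m tests gets the
    critical value (i / m) * fdr. All tests ranked at or below the largest
    rank whose p-value does not exceed its critical value are significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    critical = [0.0] * m
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        critical[idx] = rank / m * fdr
        if p_values[idx] <= critical[idx]:
            largest_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        significant[idx] = rank <= largest_rank
    return critical, significant


# p-values reported for the six trust moderators in the trust game.
p_trust = [0.940, 0.452, 0.881, 0.298, 0.925, 0.131]
critical, significant = benjamini_hochberg(p_trust, fdr=0.10)
print(any(significant))  # False: no moderator survives the correction
```

The smallest p-value here (.131) is compared against (1/6) × 0.10 ≈ .017, so none of these moderators would be flagged even before looking at the larger p-values.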
Finally, we carried out sensitivity analyses in which we used several criteria to select subsets of the studies in the meta-analysis. We then ran the meta-analyses on the studies in those subsets only. The criteria we used to select subsets of studies were the use of additional, unrelated tasks, whether sex differences were part of the main hypotheses in the paper, and the sample size of the study.

Overview
The trust game meta-analyses encompassed 77 papers with 174 effect sizes, and 17,082 unique participants from 23 countries. The meta-analysis regarding trust involved 76 papers and 94 studies, each with one effect size, while the meta-analysis regarding trustworthiness involved 65 papers and 80 studies, also with one effect size each.

Publication bias analysis
We did not find evidence of publication bias in our meta-analysis on trust and trustworthiness. Egger's regression test for funnel plot asymmetry yielded a non-significant intercept for trust studies, z = 1.30, p = .193, and trustworthiness studies, z = −0.58, p = .560. This result can be visually confirmed by the fact that the effect sizes for both trust (Fig. 1) and trustworthiness (Fig. 2) are distributed evenly around the mean in their respective funnel plots. Similarly, in a dummy-coded regression using publication status as predictor, we found no significant difference between the overall effect of published studies (k = 83 for trust, k = 70 for

⁶ A priori we planned to use several other methods besides the random effects model. This is preferred over using only one method because it allows checking the robustness of the results (Steegen, Tuerlinckx, Gelman, & Vanpaemel, 2016). The additional meta-analytic methods we planned to use were PET-PEESE (Stanley & Doucouliagos, 2014), p-curve (Simonsohn, Nelson, & Simmons, 2014b), and p-uniform (Van Assen, Van Aert, & Wicherts, 2015). However, most of these methods come with fairly stringent assumptions, the crucial one being a homogeneous set of studies. Two different comparisons of meta-analytic methods have shown that PET-PEESE, p-curve and p-uniform all may lead to a significant bias in estimating the overall effect size when study heterogeneity is present (Carter et al., 2017; Stanley, 2017; Van Aert, Wicherts, & Van Assen, 2016). Unfortunately, heterogeneity analyses indicated substantial heterogeneity between the studies in the current meta-analyses. Therefore, based on recommendations from Carter et al. (2017) and Stanley (2017) we ruled out the use of these methods. A priori we already ruled out the use of the trim-and-fill method (Duval and Tweedie, 2000a, 2000b) because it shows bias both when publication bias is present (Simonsohn, Nelson, & Simmons, 2014a) and when publication bias is absent (Terrin, Schmid, Lau, & Olkin, 2003).

Main effects analysis
Consistent with our predictions, the random effects analysis showed males to be more trusting in the trust game than females, although the average effect was small, g = 0.22, 95% CI = [0.15, 0.30], p < .001. On the other hand, contrary to our expectation, the analysis on trustworthiness failed to show a significant average sex difference, g = −0.04, 95% CI = [−0.10, 0.02], p = .21.

Moderator analysis
Because of the possibility of inflated error rates, we decided to adjust all p-values in the moderator analysis using the Benjamini-Hochberg procedure (Benjamini & Hochberg, 1995). We ran the procedure separately for the trust analyses and the trustworthiness analyses. The procedures can be found at https://osf.io/w6kfn (trust game) and https://osf.io/dcsnb (gift-exchange game).
None of the features of the experimental setting proved to moderate sex differences in trust (see Table 5): whether the participants got paid based on their decisions in the game, β₁ = 0.01, p = .940, whether the participants played as both the first mover and the second mover, β₁ = −0.06, p = .452, whether the second mover was endowed with their own endowment, β₁ = −0.01, p = .881, whether the strategy method was used to elicit the behavior of the second mover, β₁ = −0.08, p = .298, the number of iterations of the trust game, β₁ = 0.001, p = .925, and the size of the multiplier, β₁ = −0.30, p = .131.
The same holds for the moderation of sex differences in trustworthiness (see Table 6): whether the participants got paid based on their decisions in the game, β₁ = −0.13, p = .274, whether the participants played as both the first mover and the second mover, β₁ = 0.02, p = .942, whether the second mover was endowed with their own endowment, β₁ = −0.09, p = .890, whether the strategy method was used to elicit the behavior of the second mover, β₁ = 0.02, p = .784, the number of iterations of the trust game, β₁ = −0.002, p = .957, and the size of the multiplier, β₁ = 0.16, p = .292.

Sensitivity analyses
To gauge the robustness of the overall sex difference in trust that we found in the main analysis, we performed several sensitivity analyses. We only performed those analyses on the trust decisions because we only found a significant overall sex difference in that domain. For the sensitivity analyses, we used several variables to create subsets of studies (with a higher than average expected quality) and re-ran the main analysis. First, we looked at a subset of studies that did not have a main hypothesis regarding sex. When only those studies were included, the overall effect size remained significant, g = 0.23, 95% CI = [0.15, 0.30], p < .001. In line with that finding, we did not find a significant difference in effect size between studies that did (k = 6) and studies that did not have a main hypothesis regarding sex (k = 88), β₁ = −0.04, p = .763. Second, we looked at a subset of studies that did not involve additional tasks that could bias the trust game experiment. When only studies without additional tasks were included, the overall effect size remained significant, g = 0.26, 95% CI = [0.17, 0.35], p < .001. Again, we did not find a significant difference in effect size between studies with (k = 42) and without (k = 52) additional tasks, β₁ = −0.08, p = .263.
Third, we looked at a subset of studies that were published in a scientific journal. When only those studies were included, the overall effect size still remained significant, g = 0.23, 95% CI = [0.15, 0.31], p < .001. As we already discussed in the publication bias analysis, we did not find a difference in effect size between studies that were published in a scientific journal and studies that were not. An overview of the sensitivity analyses can be found in Table 7.
Finally, as recommended by Kraemer, Gardner, Brooks, and Yesavage (1998) and Ioannidis, Stanley, and Doucouliagos (2017), we carried out several sensitivity analyses using subsets of studies with different sample sizes. To this end, we ran several power analyses to find the required sample sizes corresponding to varying a priori estimated effect sizes and a power of 0.8. The first column of Table 8 provides the a priori estimated effect sizes, while the second column provides the corresponding required sample size per group. We ran several random effects analyses with only the studies that matched these required sample sizes. For example, the first analysis was run with only studies that had an average sample size per group of at least 394. The third column provides the number of studies that fulfilled this requirement and the remaining columns provide the results from this particular sensitivity analysis. Fig. 3 illustrates the information in Table 8 graphically. We discuss the relevance of this result for our main findings in the Discussion.
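To give a sense of where such sample size thresholds come from, the sketch below computes the per-group sample size needed to detect a standardized mean difference with a two-sided, two-sample test. It uses the normal approximation and a significance level of .05, both of which are assumptions made for this example; the function name is hypothetical, and the software actually used for the power analyses is not stated in the text.

```python
import math
from statistics import NormalDist


def required_n_per_group(effect_size, power=0.80, alpha=0.05):
    """Approximate per-group n for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d)^2,
    rounded up to the next whole participant.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(d, required_n_per_group(d))
```

For d = 0.2 the approximation yields 393 per group. Exact calculations based on the t distribution run slightly higher, so the threshold of 394 mentioned above would be consistent with an a priori effect size of 0.20, although the text does not state this explicitly.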

Overview
The gift-exchange game meta-analyses encompassed 15 papers with 35 effect sizes, and 1362 participants in 9 countries. The meta-analyses regarding both trust and trustworthiness involved 15 different papers that included 17 effect sizes in the case of trust and 18 effect sizes in the case of trustworthiness.

Publication bias analysis
We did not find evidence of publication bias in the meta-analysis on the gift-exchange game. Egger's regression test for funnel plot asymmetry gave a non-significant intercept for trust studies, z = −1.53, p = .126, and trustworthiness studies, z = −0.35, p = .725. These results can be visually confirmed when looking at the funnel plot of the studies on trust (see Fig. 4) and the funnel plot of the studies on trustworthiness (see Fig. 5). Finally, we found a non-significant difference in effects for the published studies (k = 14 for both trust and trustworthiness) as opposed to the unpublished studies (k = 4 for both trust and trustworthiness) for both trust, β₁ = 0.12, p = .541, and trustworthiness, β₁ = 0.20, p = .456.

Main effect analysis
Inconsistent with the prediction from parental investment theory, the random effects analysis indicated no significant overall sex difference in trust in the gift-exchange game, g = 0.15, 95% CI = [−0.03, 0.32], p = .100. For trustworthiness, the random effects analysis indicated an overall effect opposite to what was expected based on social role theory; men appear to be more trustworthy than women, g = 0.33, 95% CI = [0.11, 0.56], p = .003. This overall effect can be qualified as small to moderate.

Moderator analyses
Because of the possibility of inflated error rates with multiple tests, we decided to adjust all p-values in the moderator analysis using the Benjamini-Hochberg procedure (Benjamini & Hochberg, 1995). We ran the procedure separately for the trust analyses and the trustworthiness analyses. The procedures can be found at https://osf.io/w6kfn (trust game) and https://osf.io/dcsnb (gift-exchange game).
None of the features of the experimental setting proved to moderate sex differences in trust (see Table 9): whether the participants got paid based on their decisions in the game, β₁ = −0.42, p = .289, whether the second mover received their own initial endowment, β₁ = 0.31, p = .050, whether the strategy method was used to elicit the behavior of the second mover, β₁ = −0.26, p = .486, the number of iterations of the game, β₁ = −0.08, p = .727, whether the gift-exchange game was framed neutrally or in a labor context, β₁ = −0.02, p = .917, whether first movers in the gift-exchange game needed to suggest an effort level, β₁ = −0.30, p = .138, whether second movers in the gift-exchange game were able to reject the first mover's offer, β₁ = −0.09, p = .630, and whether efficiency in the gift-exchange game was determined by the second mover, both first and second mover, or whether that was ambiguous, β₁ = 0.32, p₁ = .173, β₂ = 0.25, p₂ = .234. There were no gift-exchange game studies where participants had to play both roles, so we were not able to calculate a moderating effect of that variable.
The results for the moderators of sex differences in trustworthiness are similar to those of trust (see Table 10): whether the participants got paid based on their decisions in the game, β₁ = −0.37, p = .451, whether the second mover received their own initial endowment, β₁ = −0.17, p = .456, whether the strategy method was used to elicit the behavior of the second mover, β₁ = −0.29, p = .512, the number of iterations of the game, β₁ = −0.004, p = .893, whether the gift-exchange game was framed neutrally or in a labor context, β₁ = −0.21, p = .486, whether first movers in the gift-exchange game needed to suggest an effort level, β₁ = 0.27, p = .303, whether second movers in the gift-exchange game were able to reject the first mover's offer, β₁ = 0.29, p = .192, and whether efficiency in the gift-exchange game was determined by the second mover, both players, or whether this was ambiguous (reference group), β₁ = −0.66, p = .02, β₂ = −0.47, p = .078.

Sensitivity analyses
We performed sensitivity analyses on the trustworthiness decisions only because the gift-exchange game results showed a significant overall sex difference only for trustworthiness and not for trust. First, we looked at a subset of studies that did not have a main hypothesis regarding sex. When only those studies were included, the overall effect size remained significant, g = 0.29, 95% CI = [0.06, 0.52], p = .012. Studies that did have a main hypothesis regarding sex (k = 1) and studies that did not (k = 17) failed to show significantly different effect sizes, β₁ = 0.48, p = .196. Second, we looked at a subset of studies that did not involve additional tasks that could bias the gift-exchange game experiment. When only those studies were included, the overall effect size remained significant, g = 0.31, 95% CI = [0.05, 0.57], p = .018. Again, we failed to find a significant difference in effect sizes between studies that did (k = 3) and studies that did not have (k = 15) any additional tasks, β₁ = 0.10, p = .719. Third, we looked at a subset of studies that were published in a scientific journal. When only published studies were included, the overall effect size still remained significant, g = 0.38, p = .01. As we already discussed in the publication bias analysis, we did not find a significant difference in effect size between studies that were or were not published in a scientific journal. An overview of these sensitivity analyses can be found in Table 11.
Finally, as we did for the trust game results, we carried out sensitivity analyses using subsets of studies based on sample sizes. An overview of these analyses is provided in Table 12 and Fig. 6. We return to these results in the Discussion.

Discussion
In this meta-analysis, we reviewed the literature on sex differences in the one-shot trust game and the one-shot gift-exchange game. In these games, the decision of the first mover is an indication of trust, while the decision of the second mover indicates trustworthiness. Based on parental investment theory and social role theory, we predicted men to send more than women as first movers and women to send more than men as second movers. We ran separate random effects analyses for all main hypotheses, and we included ten moderators to study the relevance of the experimental protocol for explaining sex differences in both games.
In line with our predictions, we found that male first movers, on average, send more than female first movers in the trust game (g = 0.22; small effect), but we did not find a significant overall sex difference in the gift-exchange game (g = 0.15). With regard to second mover behavior, we failed to find an overall sex difference in the trust game (g = −0.04), but we did find that, on average, male second movers send more than female second movers in the gift-exchange game (g = 0.33; small to moderate effect). The second mover results in both games are in contrast with our predictions, but they may make some theoretical sense, as we discuss later. Finally, we found that none of the moderator variables significantly moderated the main effects outlined above. Below we discuss the main results and their implications in more detail. We start out with the moderator analyses and then continue with the main effects.

Moderating variables of sex differences in trust and trustworthiness
Our analyses indicated that the ten moderators explained sex differences neither in the trust game nor in the gift-exchange game.
However, before we conclude that men and women are influenced equally by the experimental protocol of these games, it is good to note that we might have failed to find a moderating effect because our moderation analyses lacked statistical power (Hedges & Pigott, 2004; Hempel et al., 2013). Low power for moderator analyses occurs frequently in cases where the subgroup sample sizes are unequal (Alexander & DeShon, 1994). In our case this holds too. For example, participants were paid based on their choices in only seven studies out of the 94 in the trust analysis. Similarly, only two of the 18 studies in the gift-exchange game analysis involved the use of the strategy method. Because power is lower in these cases, caution is warranted when interpreting the null result in our moderator analyses.

A sex difference in trust
Regarding the trust game, our analysis indicated that men, on average, send more than women as first movers. While this finding was expected a priori, a sensitivity analysis raised some initial concerns about the robustness of this effect. Specifically, we found the overall effect size to be smaller and non-significant when only the six largest studies were included. This difference could be explained by publication bias, possible moderator effects, and/or chance in combination with decreased statistical power. However, as outlined in the results section, we found no evidence of publication bias and no moderator effects. Additional examination also did not indicate structural differences between the six largest studies and the 88 other studies (e.g., regarding representativeness of the sample). We therefore conclude that chance in combination with decreased statistical power is the most likely explanation of the non-significant effect of sex on trust in the six largest studies using the trust game, and stick to our conclusion that, on average, men send more money as first movers than women. However, it is important to keep in mind that the average effect size of g = 0.22 is small.⁷
Regarding the gift-exchange game, we did not find a significant sex difference in first mover behavior. However, it should be noted that the effect size in the gift-exchange game is in the same (expected) direction and only slightly lower (g = 0.15) than the effect size in the trust game (g = 0.22). Because the gift-exchange game meta-analysis had far fewer studies (k = 18) than the trust game meta-analysis (k = 94), it could be the case that our non-significant result was caused by a lack of statistical power. To verify this, we did a post hoc power analysis (Quintana, 2017; Valentine, Pigott, & Rothstein, 2010) using information from our meta-analyses. We used an effect size similar to that found in the trust game (0.20), together with the number of effect sizes (18), the average group size (20), and a measure of heterogeneity (τ² = 0.02) from our gift-exchange game meta-analysis. We found a post hoc power of 0.75, which indicates that we had a 75% chance of finding, and a 25% chance of not finding, a significant sex difference if the true effect size is g = 0.20. This means that our results are not conclusive as to whether there is a sex difference in trust in the gift-exchange game, and whether it is similar to the sex difference in the trust game. What we can conclude is that if a sex difference exists in the gift-exchange game, it is likely small.

A sex difference in trustworthiness
In contrast to our predictions, the gift-exchange game meta-analysis showed that men, not women, send more as second movers, with an average effect of g = 0.33.⁸ Our sensitivity analysis showed no reason to doubt the robustness of the overall sex difference in the gift-exchange game. Thus, we conclude that men are more trustworthy than women in the gift-exchange game, where the effect is small to moderate, g = 0.33.
However, we did not find men to be more trustworthy than women in the trust game (g = −0.04). As the meta-analysis had a (post hoc) power of 1.00 to detect a small effect of g = 0.20, we conclude there is no sex difference in trustworthiness in the trust game, or if an effect exists, it is small and practically insignificant (g < 0.20). This does mean that we still have to explain why men send more as second movers in the gift-exchange game, but not in the trust game. We provide a possible explanation in the next section.

Reconciling the results from the two games
Our results indicate that the difference between male and female behavior in trust and trustworthiness depends on the game that is used to assess these constructs. More specifically, we found men to be more trusting in the trust game, but found no difference in the gift-exchange game (although this could be due to a lack of statistical power). Additionally, we found men to be more trustworthy in the gift-exchange game, but found no difference in the trust game. The obvious way to make sense of these inconsistencies is by considering the key difference between the trust game and the gift-exchange game. Whereas in the trust game efficiency comes about through the decision of the first mover (i.e., the resources transferred by the first mover are multiplied), in the gift-exchange game efficiency can come about in three ways. In a first variant of the game, efficiency comes about through the second mover only; in a second variant, through both the first and second mover; and in a third variant, it is ambiguous which of the players determines efficiency (see the coding protocol of the gift-exchange game at https://osf.io/dp9xu). Let us consider only gift-exchange games where the efficiency is solely determined by the second mover (k = 8), as this game can be seen as a 'reversed trust game' where instead of the first transfer the second transfer is multiplied. When taking only studies using these gift-exchange games into account, we again find no significant overall sex difference for first movers, g = 0.19, 95% CI = [−0.07, 0.44], p = .15, and we again find that men send significantly more than women, on average, as the second mover, g = 0.30, 95% CI = [0.02, 0.58], p = .036.
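As a plausibility check on how the reported estimates, confidence intervals, and p-values relate, the following minimal sketch recovers a standard error and a two-sided p-value from a 95% interval. It assumes a symmetric, normal-based (Wald-type) interval, which may differ from the method used in the actual analyses, and the function names `normal_cdf` and `p_from_ci` are hypothetical. Because the interval bounds are rounded to two decimals, the recovered p-values can deviate slightly from the reported ones.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def p_from_ci(estimate, ci_low, ci_high, z_crit=1.959964):
    """Recover (standard error, z, two-sided p) from a 95% Wald-type interval."""
    # A symmetric 95% interval spans 2 * z_crit standard errors
    se = (ci_high - ci_low) / (2.0 * z_crit)
    z = estimate / se
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return se, z, p


# Values reported in the text for the 'reversed trust game' subset (k = 8)
se_first, z_first, p_first = p_from_ci(0.19, -0.07, 0.44)    # first movers
se_second, z_second, p_second = p_from_ci(0.30, 0.02, 0.58)  # second movers
```

Under these assumptions the second-mover values are consistent with the reported p = .036, and the first-mover values yield a p-value close to the reported .15, with the small gap attributable to rounding of the interval bounds.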
When taking these trustworthiness results at face value and combining them with the trust results we can distill an interesting pattern. That is, our results suggest a 'male multiplier effect': when a multiplication factor is involved, men send more than women, both as a first mover (in the trust game) and as a second mover (in the gift-exchange game). What can we make of this so-called 'male multiplier effect'? An evolutionary psychology explanation may shed some light on this.⁹ Throughout our evolutionary past, men and women have encountered different evolutionary challenges, which may have contributed to sex differences across different kinds of psychological traits. We already discussed the fairly well-established sex difference in risk-taking, which is likely the result of biological differences between men and women in parental investment. Yet this difference may have also selected for a stronger motivation among men to acquire resources and share them within their community. In hunter-gatherer societies that characterize our evolutionary past (Von Rueden & Van Vugt, 2015), women were largely responsible for taking care of the children and offering family support, while men were largely responsible for acquiring resources through hunting, trading, and warfare (Hooper, Demps, Gurven, Gerkey, & Kaplan, 2015; Kaplan, Hill, Hurtado, & Lancaster, 2001).
The tendency of men to acquire these surplus resources may be driven ultimately by female mate choice. There are plenty of findings that show that women prefer men as partners who signal a potential to attain resources and share them (e.g., traits like intelligence and generosity; Buss, 1989; Fales et al., 2016; Iredale, Van Vugt, & Dunbar, 2008; Van Vugt & Iredale, 2013). Thus, men may have evolved a stronger drive to acquire shareable resources. Support for this idea comes from anthropological studies that indicate sex differences in the division of labor, where men more often than women pursue high-risk activities to acquire resources, like hunting large game and vying for leadership positions (Hawkes, O'Connell, & Blurton Jones, 2001; Seabright, 2012; Von Rueden, Alami, Kaplan, & Gurven, 2018). Obviously, the first transfer in the trust game and the second transfer in the gift-exchange game are excellent opportunities for resource acquisition, as in both cases the initial amount of resources is multiplied. It might be that men, more so than women, respond to the multiplication factor in both games and send more when that multiplier is present. In these situations, it does not matter that the acquirer does not always reap the benefits of those resources because the ultimate goal is not the resources themselves, but the prestige and status that come with acquiring them. Indeed, evidence from the trust game suggests that participants derive pleasure from the value-creating power in their role as first mover (Becchetti & Degli Antoni, 2010).
However, this explanation hinges on the acquirer's drive to build a reputation as a resource provider, and in the anonymous one-shot games we analyzed it was not possible to establish such a reputation. Indeed, in a meta-analysis on sex differences in public goods games, which also involve a multiplier, Balliet, Li, Macfarlan, and Van Vugt (2011) found that men only provide more to a public good in repeated games (i.e., games where reputation building is possible). However, it should be stressed that evolutionary mechanisms do not always work at a conscious level. For example, non-human animals are triggered by mating rituals to engage in sexual behaviors, even though they do not appear to understand that sex can lead to reproduction (Dunsworth & Buchanan, 2017). The ultimate explanation of their behavior (that they can propagate their genes) is thus independent of the proximate explanation of their behavior (that they are triggered by mating rituals). Similarly, the ultimate explanation of sending money in trust and gift-exchange games (building a reputation to attract a mate) could be independent of the proximate explanation of sending money (that they are triggered by a potential for multiplying resources). In short, the fact that men pursue resource acquisition even though reputation building is impossible does not necessarily contradict this evolutionary explanation for the male multiplier effect.
However, the fact that Balliet et al. (2011) did not find a sex difference in one-shot public goods games sits less well with our evolutionary explanation. Because there is a multiplier present in public goods games, just like in the trust and gift-exchange games, we would expect men to give more than women in those games. But this is not the case; apparently men are not triggered by a multiplier in public goods games even though they seem to be triggered by a multiplier in the trust game and the gift-exchange game. The empirical evidence for a male multiplier effect is therefore ambiguous. To overcome this ambiguity we need additional empirical evidence with which we can create a solid theoretical framework explaining sex differences in both public goods games and trust and gift-exchange games.

Future directions
A straightforward step in resolving the ambiguity in the empirical literature is to corroborate the male multiplier effect using a large-scale preregistered experimental study. This study could employ two games that are exactly the same, except for which transfer gets multiplied. In the 'trust game' the first transfer would be multiplied, and in the 'gift-exchange game' the second transfer would be multiplied. All other features of the games should be identical. If we find that men send more than women as first movers in the trust game and as second movers in the gift-exchange game, we can be more confident that a male multiplier effect actually exists.
Moreover, this experiment could help shed light on an aspect that we were not able to properly study in the current meta-analysis: the size of the multiplier. Although we did include the multiplier as a moderator in our trust game analysis, our statistical test lacked power because only 4 studies involved multipliers other than 3. In the proposed experiment we could manipulate the multiplier for both games, which means that the experiment would not only be able to replicate the male multiplier effect, but also extend it in a theoretically meaningful way.
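To make the proposed design concrete, the sketch below shows one way the two games could share every feature except which transfer is multiplied. All specifics are illustrative assumptions rather than a fixed protocol: both players receive the same endowment, the second mover may return any amount up to what he or she holds after the first transfer, and the names `Outcome` and `play` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    first_mover: float
    second_mover: float

    @property
    def total(self):
        """Joint payoff; anything above the two endowments is the efficiency gain."""
        return self.first_mover + self.second_mover


def play(endowment, sent, returned, multiplier, multiply="first"):
    """Payoffs for a two-stage transfer game with one multiplied transfer.

    multiply="first"  -> 'trust game': the first transfer is multiplied.
    multiply="second" -> 'gift-exchange game': the second transfer is multiplied.
    """
    if multiply not in ("first", "second"):
        raise ValueError("multiply must be 'first' or 'second'")
    m_first = multiplier if multiply == "first" else 1
    m_second = multiplier if multiply == "second" else 1
    if not 0 <= sent <= endowment:
        raise ValueError("first transfer must lie between 0 and the endowment")
    received = m_first * sent
    if not 0 <= returned <= endowment + received:
        raise ValueError("second transfer exceeds what the second mover holds")
    first = endowment - sent + m_second * returned
    second = endowment + received - returned
    return Outcome(first, second)


# Same choices and multiplier, differing only in which transfer is multiplied
trust = play(endowment=10, sent=5, returned=5, multiplier=3, multiply="first")
gift = play(endowment=10, sent=5, returned=5, multiplier=3, multiply="second")
```

In this setup the efficiency gain equals (multiplier − 1) times the multiplied transfer, so only the first mover can create surplus in the 'trust game' and only the second mover in the 'gift-exchange game'. Varying `multiplier` across conditions corresponds to the manipulation of multiplier size described above.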
Based on the findings from this large-scale preregistered study we could start to develop a testable theoretical framework that can explain this effect. An interesting starting point would be the evolutionary explanation (based on parental investment and sexual selection theory) we outlined above, but there may be other explanations, some of which may be rooted in other fields such as cultural psychology or sociology.
Interestingly, the evolutionary explanation would predict the male multiplier effect to occur across a wide range of cultures, whereas sociocultural explanations might predict cultural differences. For instance, a sociocultural account would predict that the male multiplier effect is weaker for people who are less exposed to gender roles, which means the effect should be weaker for children and people in societies with weak gender roles. An evolutionary account could predict that the type of interaction partner is important. For women, trusting behaviors may mainly arise with interaction partners they are personally close to because it is close relationships that matter with respect to reciprocal arrangements in child care. For men, trusting behavior could be influenced by the presence of a female third party because parental investment theory suggests that one of the goals of risk-taking behavior is to impress potential sexual partners. These are all interesting possibilities to study this male multiplier effect in more detail, either using the trust game and gift-exchange game, or using other measures.

Limitations
Four limitations come to mind with regard to our meta-analyses. First, we were not able to include the entire sample of trust games and gift-exchange games in our analyses as not all authors provided us with data on sex differences. Even though we managed to retrieve the data for more than half of the eligible papers (which is somewhat higher than data sharing rates for psychology papers, see Vanpaemel, Vermorgen, Deriemaecker, & Storms, 2015; Wicherts, Borsboom, Kats, & Molenaar, 2006), the fact that we were not able to include all studies does raise questions about the representativeness of our sample. It is possible that there are systematic differences between the study results of authors who sent us their data and the study results of authors who did not. However, we could not think of any convincing reasons why this would be the case, especially given that almost none of the authors explicitly looked at sex differences in their papers.
Additionally, we did everything we could to rule out systematic biases in our own choices during the search and selection of primary studies. For example, we used as many as six databases to find papers, we included many unpublished papers, and a research assistant independently checked whether the inclusion choices of the first author were biased. Moreover, our search is fully transparent and reproducible. A detailed overview of our search for papers can be found at https://osf.io/qmz2h (trust game) and https://osf.io/pgm7n (gift-exchange game).
A second limitation of our meta-analyses is that we used fairly stringent inclusion criteria. For example, we excluded nonexperimental studies that use Likert questions to measure trust and trustworthiness (e.g., Herd, Carr, & Roan, 2014; Reeskens & Hooghe, 2007; Yamagishi & Yamagishi, 1994) because they show signs of social desirability bias (Naef & Schupp, 2009) and fail to show a consistent correlation with behavioral measures of trust (Glaeser et al., 2000). In addition, we excluded non-standard trust games and gift-exchange games because such games involve elements that could confound our results (e.g., repeated games involve reputation building). Our strict selection does mean that we have a relatively homogeneous sample, which makes it easier to draw proper conclusions from our results (Thompson, 1994). However, it also means that these conclusions should be limited to concepts measured by the games that meet our inclusion criteria. That is, our conclusions do not generalize to concepts measured by repeated games, non-continuous games, or games that deviate in another way from the 'standard' trust and gift-exchange games. This seems like a valuable area for future research.
Similarly, our choice to include only papers written in English may leave some doubt about the generalizability of our results to non-English speaking countries. However, English is the lingua franca of the academic community and, as such, the vast majority of papers are written in English. This is exemplified by the fact that, even though we only included English papers, we were able to include studies from 24 different countries from all over the world in our sample.
A third limitation is based on suggestions that the trust game and the gift-exchange game do not necessarily only measure trust and trustworthiness in isolation (Dunning, Anderson, Schlösser, Ehlebracht, & Fetchenhauer, 2014; Dunning et al., 2012; Thielmann & Hilbig, 2015). More specifically, Cox (2004) and Ashraf et al. (2006) provide evidence that first and second mover behavior in the trust game are related to altruistic preferences more so than trust and trustworthiness. Similarly, given the intricate relationship between social risk-taking and trust, the results from our trust game analysis could also indicate that men are simply willing to take more social risk. Indeed, most researchers agree that trust is multifaceted (e.g., Alós-Ferrer & Farolfi, 2019), but studies that pit these facets against each other do find that trust games mostly measure trust and trustworthiness instead of related concepts (Ben-Ner & Halldorsson, 2010; Thielmann & Hilbig, 2015).
A fourth limitation relates to trustworthiness, which we measure by the ratio of the amount sent by the second mover to the amount sent by the first mover. This measure of trustworthiness does not take into account the absolute amount sent by the first mover, while it could be argued that someone is likely to reciprocate more when they receive a larger amount (Johnson & Mislin, 2011). Our measure does not provide us with information about this on an individual level. Fortunately, we were able to compute the correlation between average (absolute) first mover transfers and the ratio of second mover transfers and first mover transfers at the study level (see R code for the trust game at https://osf.io/3rvkc and for the gift-exchange game at https://osf.io/4ajp5). We found non-significant correlations of 0.17 (trust game) and 0.08 (gift-exchange game), suggesting that our choice of trustworthiness measure did not bias our results and affirming the validity of using the ratio as our measure of trustworthiness.

Concluding remarks
In our meta-analyses of sex differences in the trust game and the gift-exchange game, we used parental investment theory and social role theory to hypothesize that men would be more trusting and women would be more trustworthy. Our trust meta-analyses indicated that men are more trusting only in the trust game and that there is no significant sex difference in the gift-exchange game. Our trustworthiness meta-analyses indicated that men, not women, are more trustworthy in the gift-exchange game and that there is no sex difference in the trust game. These results suggest a possible 'male multiplier effect', whereby men are more strongly triggered than women by the possibility to acquire surplus resources. However, earlier studies are not entirely consistent with this explanation, so more empirical work is required to substantiate this effect. Hopefully, this empirical work will lead to a theoretical framework with which we can explore the proximate and ultimate explanations of this male multiplier effect as well as its boundary conditions.

Fig. 1. Funnel Plot of the Studies on Sex Differences in Trust in the Trust Game.

Fig. 2. Funnel Plot of the Studies on Sex Differences in Trustworthiness in the Trust Game.

Fig. 3. A Graphical Representation of the Sensitivity Analysis on Sex Differences in Trust in the Trust Game Using Sample Size as the Subset Variable.

Fig. 4. Funnel Plot of the Studies on Sex Differences in Trust in the Gift-Exchange Game.

Fig. 5. Funnel Plot of the Studies on Sex Differences in Trustworthiness in the Gift-Exchange Game.

Fig. 6. A Graphical Representation of the Sensitivity Analysis on Sex Differences in Trustworthiness in the Gift-Exchange Game Using Sample Size as the Subset Variable.

Table 2
Studies Included in the Meta-Analyses on Sex Differences in the Trust Game.

Table 3
Studies Included in the Meta-Analyses on Sex Differences in the Gift-Exchange Game.

Table 5
Summary of the Moderator Effects on Sex Differences in Trust.
Note: * p < .05, ** p < .01, *** p < .001. Q refers to the value of the Q-statistic, which is used to test the null hypothesis of no heterogeneity. 'k' indicates the number of studies. 'g' refers to the Hedges' g effect size measure.

Table 6
Summary of the Moderator Effects on Sex Differences in Trustworthiness.

Table 7
Summary of the Sensitivity Analyses on Sex Differences in Trust.
Note: * p < .05, ** p < .01, *** p < .001. Q refers to the value of the Q-statistic, which is used to test the null hypothesis of no heterogeneity. 'k' indicates the number of studies. 'g' refers to the Hedges' g effect size measure.

Table 8
Overview of the Sensitivity Analysis on Sex Differences in Trust in the Trust Game Using Sample Size as the Subset Variable.

Table 9
Summary of the Moderator Analysis of Sex Differences in Trust in the Gift-Exchange Game.
Note: * p < .05, ** p < .01, *** p < .001. Q refers to the value of the Q-statistic, which is used to test the null hypothesis of no heterogeneity. 'k' indicates the number of studies. 'g' refers to the Hedges' g effect size measure.

Table 10
Summary of the Moderator Analysis of Sex Differences in Trustworthiness in the Gift-Exchange Game.

Table 11
Summary of the Sensitivity Analyses on Sex Differences in Trustworthiness in the Gift-Exchange Game.
Note: *** p < 0.001. Q refers to the value of the Q-statistic, which is used to test the null hypothesis of no heterogeneity. 'k' indicates the number of studies. 'g' refers to the Hedges' g effect size measure.

Table 12
Overview of the Sensitivity Analysis on Sex Differences in Trustworthiness in the Gift-Exchange Game Using Sample Size as the Subset Variable.