Gender Differences in Private and Public Goal Setting

We conduct a field and an online classroom experiment to study gender differences in self-set performance goals and their effects on performance in a real-effort task. We distinguish between public and private goals, performance being public and identifiable in both cases. Participants set significantly more ambitious goals when these are public. Women choose lower goals than men in both treatments. Men perform better than women under private and public goals as well as in the absence of goal setting, consistent with the identifiability of performance causing gender differences, as found in other studies. Compared to private goal setting, public goal setting does not affect men’s performance at all but it leads to women’s performance being significantly lower. Comparing self-set goals with actual performance we find that under private goal setting women’s performance is on average 67% of goals, whereas for men it is 57%. Under public goal setting the corresponding percentages are 43% and 39%, respectively.


Introduction
Gender equality is one of the 17 Sustainable Development Goals (SDG 5) elaborated by the United Nations Development Programme in the 2030 Agenda for Sustainable Development. 1 A more specific goal is to increase women's participation and leadership in all forms of decision-making in the public, judiciary, and private sector. 2 Why should we care about the underrepresentation of women in leading positions? The empirical literature provides initial evidence that gender equality in the workplace yields benefits because of complementarities between the two genders. 3 There are many reasons for the imbalance between women and men in leading positions. An important distinction is that between demand-side and supply-side factors (Gino et al., 2015). Demand-side factors are those that women face because of the different ways in which women are judged and treated on the labor market and in society at large (prejudice, discrimination, etc.). Supply-side factors are related to differential beliefs and behavior of women and men that are relevant for access to high-level positions. On the supply side, gender differences in the reaction to various aspects of competition have been studied in detail in a large experimental literature (see Niederle, 2016, for a survey of relevant studies). 4
In this paper, we study gender differences in a novel, potentially important supply-side dimension of behavior: performance goal setting. The specific question that motivates our work is whether gender differences in public goal setting could be one reason for the female underrepresentation in high-level positions. Leaders in the public and private sector often announce their goals for the society or the firm publicly; for instance, governments set public targets for debt, unemployment, vaccination, or emission levels, and companies declare self-set targets for a more diverse workforce. 5 Leaders are arguably more visible than lower-ranked employees. If females have (more) difficulties with this part of the job, this may be part of the explanation of why they apply less often for leading positions (among other factors). With respect to the specific issue of gender balance in organizations, public self-set goals are, compared to more disruptive interventions like quotas, a 'softer', less invasive, and potentially cheaper intervention.
1 https://www.undp.org/content/undp/en/home/2030-agenda-for-sustainable-development/people/genderequality.html (link accessed on Dec 15, 2020).
2 As pointed out by the World Economic Forum (2018): " [...] while there are still relevant gender-biased labour market outcomes, the presence of women in management roles is today one of the main barriers to overcome, both in the public and private sector, in order to achieve full economic gender parity [...]."
3 These benefits are especially important in organizations requiring high-skill workers. García-Meca et al. (2015) show that board-level gender diversity improves the performance of firms. The positive effects on firm performance are especially large for those whose strategy is based on innovation (Dezsö and Ross, 2012) and for firms in high-tech manufacturing and knowledge-intensive services (Christiansen et al., 2016). Gender diversity on the boards of banking-supervision agencies has also been associated with greater financial stability (Sahay and Cihak, 2018).
4 More recently, research has also addressed the role of public observability for gender differences in public speaking. Survey-based and observational data show that women are more afraid to speak up in public (Stein et al., 1996; Turk et al., 1998; Behnke and Sawyer, 2001; Marinho et al., 2017), feel more stressed about it (Buser and Yuan, 2020), and also speak up less frequently in public settings (Hinsley et al., 2017; Carter et al., 2018; Parthasarathy et al., 2019).
5 Since 2016, German firms not falling under the gender quota for supervisory boards (that is, firms that are publicly listed or that are third codetermined and have at least 500 employees) need to set their own targets for the proportion of women in boards and the leading management levels below the executive board (Annadanam et al., 2021).
Public institutions like the European Central Bank and large private corporations opt ever more often for publicly announcing self-set goals for a more diverse workforce (and for committing to publish the corresponding data of whether the self-set goals are reached). 6 Not only leaders, but individuals in general face many situations on the labor market and in private life, in which their performance is observable and identifiable and they may use self-set goals -private or public -as a commitment device. An example from the labor market is a researcher's performance in terms of number and quality of publications. This information is publicly observable. To overcome self-control problems, s/he may set a goal upfront -either privately (that is, without telling anyone) or publicly (for instance, in a yearly assessment with the head of department or in front of colleagues). An example from the private environment is an individual's intention to lose weight. If the intended weight loss is large enough it will be publicly observable. Also here, the individual may want to set a goal for him-/herself -either privately (that is, without telling anyone) or publicly (for instance, by sharing it with friends or family).
In this paper, we focus on the goal-setter in a non-strategic setup. A feature that all previously mentioned examples and many real-life situations have in common to a varying degree is the presence of strategic concerns. In a strategic situation, the goal-setter's performance (and goal) and at least one other party's decision(s) determine the outcome. The goal-setter is influenced by individual factors like confidence and also considers how others react to his or her own performance (and goal). To the best of our knowledge, we are the first to investigate explicitly the difference between private and public goal setting and its interaction with gender differences. As a first step, we implement a non-strategic setup in this paper to rule out beliefs about how others react to one's own goal and performance.
In this paper, we shed light on this question, and investigate experimentally how women and men set goals for themselves and perform in a particular real-effort task, where performance is publicly observable and identifiable. We analyze behavior under two distinct goal-setting conditions: Private goals -that is, the self-set goal is only observable by oneself -and public goals -that is, the self-set goal is observable and identifiable by the public. We run a field experiment and an online classroom experiment, in which participants perform a real-effort task. Depending on the randomly assigned treatment, participants perform the task without goal setting (control condition), after setting a goal privately, or after setting a goal publicly. The primary outcome variables are participants' self-set goals and performance.
Our results show that participants set significantly more ambitious goals when these are public. Men perform better than women under private and public goals as well as in the absence of goal setting, consistent with the identifiability of performance causing gender differences as found in Schram et al. (2019). Compared to private goal setting, public goal-setting does not affect men's performance at all but it leads to women's performance going significantly down. In terms of the ratio between performance and goals, participants are more realistic under private than under public goal setting, with women being more realistic than men in both cases.
6 See for instance, https://www.ft.com/content/0d1d2d4d-8bb8-42ce-b263-9863a1f377ed for the goal set by the European Central Bank, https://www.bloomberg.com/news/articles/2021-04-14/amazon-pledges-to-promote-morewomen-black-employees for Amazon's diversity goals, or https://www.cnbc.com/2020/06/09/gm-ceo-its-myresponsibility-to-drive-change-after-floyds-death.html for GM's diversity goals (links accessed on May 06, 2021).
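The ratio between performance and goals referred to above can be illustrated with a minimal Python sketch. The records below are invented for illustration and are not data from the experiment. The function name `mean_attainment` and the choice to average individual ratios (rather than dividing mean performance by mean goal) are assumptions.

```python
from collections import defaultdict
from statistics import mean


def mean_attainment(records):
    """Average performance/goal ratio per (treatment, gender) group.

    Assumption: the ratio is formed per participant and then averaged.
    """
    ratios = defaultdict(list)
    for rec in records:
        if rec["goal"] <= 0:
            raise ValueError("goal must be positive to form a ratio")
        key = (rec["treatment"], rec["gender"])
        ratios[key].append(rec["performance"] / rec["goal"])
    return {group: mean(values) for group, values in ratios.items()}


# Hypothetical records, NOT data from the study.
example_records = [
    {"treatment": "PrivGoal", "gender": "female", "goal": 20, "performance": 10},
    {"treatment": "PrivGoal", "gender": "female", "goal": 10, "performance": 8},
    {"treatment": "PubGoal", "gender": "male", "goal": 40, "performance": 10},
]

attainment = mean_attainment(example_records)
```

A ratio below 1 means that performance fell short of the self-set goal; in this sketch, values closer to 1 correspond to what the text calls more realistic goals.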
The remainder of the paper is structured as follows. In section 2, we present a review of the relevant literature in psychology and economics. Section 3 discusses the experimental procedures, the design, the hypotheses, and the research questions. Section 4 contains the results and section 5 the discussion and conclusion.

Review of the Literature on Goal Setting
In this section we briefly refer to some related research, starting with the general issue of goal setting and moving then to gender differences and then to the distinction between private and public goals. There is a rich literature in psychology on goal setting and performance which finds that setting goals, whether self-set, assigned by others or set jointly through participation, is better for performance than not setting any goals (Latham and Locke, 2007). These goal-setting effects have been shown to be salient in the realm of sports, academic performance, managerial and professional jobs, and teamwork, to mention a few examples (see Locke, 1996; Locke and Latham, 2002; Locke and Latham, 2006, for literature reviews). Support for goal-setting effects on performance has also been found worldwide in experimental and non-experimental research with samples consisting of participants from Asia, Australia, Europe, and North America (Locke and Latham, 1990). 7
Economists have studied possible mechanisms behind the effects of goal setting in several theoretical studies. Heath et al. (1999) propose the "goals as reference points" approach to explain the motivational process behind the achievement of goals. Their idea is that a reference point systematically alters the value of outcomes as described by the psychological principles of Prospect Theory (Kahneman and Tversky, 1979). The reference point divides outcomes into regions of gains and losses and the value function includes loss aversion in the loss domain. Because loss aversion implies that losses are more painful than gains of the same size are pleasurable, individuals will be more motivated to perform better after moving their reference point up through the goal. Wu et al. (2008) provide a formal model for Heath et al.'s (1999) assumptions and Koch and Nafziger (2011) show how self-set goals act as a commitment device to help overcome self-control problems. The key idea is that loss aversion makes low performance/non-compliance with goals psychologically painful and thereby motivates individuals to commit to their goal.
7 The psychology literature further shows that relevant mediators in goal-setting and performance are individual choice, effort, persistence, and goal-achievement strategy. Potential moderators are the ability to achieve a goal, goal commitment, feedback concerning goal pursuit, the complexity of a goal, and other situational factors (e.g., presence of needed resources to achieve a goal) (see Locke and Latham, 2006; Latham and Locke, 2007; Locke and Latham, 2019 for an overview of mediators and moderators).
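The "goals as reference points" argument can be made concrete with a minimal Python sketch of a prospect-theory value function in which the goal is the reference point. The functional form (a power function with a loss-aversion multiplier), the parameter values `alpha=0.88` and `loss_aversion=2.25`, and the function names are illustrative assumptions; they are not taken from the models cited above.

```python
def goal_value(performance, goal, alpha=0.88, loss_aversion=2.25):
    """Value of an outcome measured relative to the goal (the reference point).

    Assumed form: gains are valued as gap**alpha, losses as
    -loss_aversion * |gap|**alpha, so losses weigh more than equal-sized gains.
    """
    gap = performance - goal
    if gap >= 0:
        return gap ** alpha
    return -loss_aversion * (-gap) ** alpha


def marginal_value(performance, goal):
    """Value gained from one additional correct answer at a given performance."""
    return goal_value(performance + 1, goal) - goal_value(performance, goal)


# Same performance (12), evaluated against a low goal (10) and a high goal (15).
above_goal = marginal_value(12, goal=10)  # already in the gain domain
below_goal = marginal_value(12, goal=15)  # still in the loss domain
```

Under these assumptions, one more correct answer is worth more when the participant is still below the goal than when the goal has already been passed, which is the sense in which moving the reference point up can raise motivation.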

Self-set goals and gender differences
The psychology literature posits that there are gender differences in goal-setting behavior and offers potential explanations. However, this research is relatively old and likely not fully generalizable to contemporary times, because societal events, these days, continuously put pressure on individuals to change or adapt their attitudes towards gender differences (Szekeres et al., 2020). This could also influence men and women's goal-setting behavior (Latham and Locke, 2007;Locke and Latham, 2019). According to the older stream of goal-setting research in the psychology literature, men chose higher task-based goals than women (e.g., Kurman, 2001;Levy and Baumgardner, 1991;de Pater et al., 2009). 8 The psychology literature offers a likely explanation: men are generally perceived by both genders as more competent, leading to superior male performance (McCarty, 1986;Wood and Karten, 1986). Thus, men are more confident about their competences (e.g., Beyer, 1990;Lundeberg et al., 1994;Beyer and Bowden, 1997) and opt therefore for more challenging goals than their less confident female peers (McCarty, 1986;Wood and Karten, 1986).
More recently, economists started investigating gender differences in goal-setting behavior. For instance, Clark et al. (2017) examine whether self-set goals that are task-based or performance-based improve student performance. 9 The authors find that the task-based goal setting increases task completion (i.e., practice exams) and course performance, but only for men. Women completed more practice exams in the control group without goal setting. In another study, Dalton et al. (2016) provide a model of self-chosen goals that predicts that (i) the self-chosen goal contract is more cost-effective than a piece-rate contract for an employer aiming for a specific output level, and that (ii) workers set goals that they systematically outperform. The authors test these predictions in the laboratory and find that the self-chosen goal contract increases men's performance compared to the piece-rate contract. However, this is not the case for women. Concerning the self-set goals, women set lower goals than men but outperform their self-set goals to a greater extent than men. 10 The experimental economics literature offers potential explanations for such a gender difference in goal setting. First, there is evidence that women take less risk than men. In a series of ten experimental studies, Croson and Gneezy (2009) find evidence that women are indeed more risk-averse compared to men. Dohmen et al. (2011) measure and validate self-assessed risk aversion and show that women are much less likely to take risks in general. This finding applies to several domains: car driving, finance, sports and leisure, health, and career. A similar conclusion on gender differences in risk aversion is echoed in more recent studies (Falk et al., 2018). In addition to being more risk-averse, women report a higher intensity of nervousness and fear than men in anticipation of negative outcomes (Brody, 1993; Fujita et al., 1991).
Therefore women might want to avoid negative outcomes more than men (i.e., lower performance than the self-set goal) by taking less risk in not meeting the self-set goal. Given that higher goals are more challenging to achieve, females might reduce the probability of this negative outcome by setting a lower goal than men. 11 Another individual characteristic that influences which goal a person sets is self-efficacy, that is, self-confidence that the goal for a specific task is attainable (Bandura, 1997;Latham and Locke, 2007;Locke and Latham, 2006). There is consensus in the literature that men are more confident than women. For example, Niederle and Vesterlund (2007) asked participants in a laboratory experiment to solve a real task, first under a non-competitive piece-rate and afterwards under a competitive tournament incentive scheme. After solving the task in the competitive tournament incentive, participants were asked to select which of these two compensation schemes they wanted to apply to their next performance. They found that 73 percent of the men and only 35 percent of the women selected the competitive tournament. The authors conclude that this gender difference is to a large extent driven by men being more confident about their performance than women. Möbius et al. (2011) implement an experimental test with a sample of 656 undergraduate students. The authors track the evolution of students' beliefs about their own relative performance on an IQ test and find that women are less confident about their performance than men.  use data from lab experiments on preferences for redistribution conducted in the U.S. and several European countries to investigate gender differences and their causes. Across all sampled locations, they found that men are more confident about their ability compared to women.
In the psychology literature, goal-setting theory (Bandura, 1997; Latham and Locke, 2007; Locke and Latham, 2006) offers an explanation for why women, being less confident about their competences and performance, set lower goals compared to men (Dalton et al., 2016). Goal-setting theory suggests that higher goals lead to higher performance than easier, lower goals because the former motivate individuals to put more effort into achieving the challenging goal, for example by looking for new knowledge and developing new skills (Locke and Latham, 1990, 2002; Locke and Latham, 2006).

Gender differences in private versus public environments
We are not aware of any experimental work that examines gender differences between publicly and privately self-set goals. However, there are some studies on how men and women are differentially affected by various dimensions of the publicness of the environment in which they perform. Schram et al. (2019) study the difference between providing public ranking (referred to as status ranking) and private ranking information about performance. They found no gender differences in performance or attempted summations when there was only private ranking (as well as in the absence of any ranking). By contrast, inducing status ranking leads to gender differences in performance. Men significantly increased the number of attempted summations and performance, while women significantly decreased the number of attempted summations and performance.
In another related study, Ariely et al. (2009) examine the impact on performance when an audience watches the subject working on a cognitive task that involves performance-contingent payment. Across the two conditions (public and private), there was no evidence of any gender difference in the ability to solve anagrams, nor any evidence for the two genders to be differentially influenced by social pressure.
Moreover, research suggests that in a competitive environment, a "desire to win" can emerge within individuals, which motivates them to beat the other side, rather than focusing solely on maximizing their payoffs (Cooper and Fang, 2008; Delgado et al., 2008). In environments in which self-set goals and performance are revealed publicly, competition is triggered, which could motivate individuals to opt for higher goals and performance just to beat others. Whether this is indeed the case has not yet been studied in goal-setting research. Moreover, given that research on gender differences in performance and attitudes in such competitive environments shows mixed results, it is yet to be explored how men and women will set goals and achieve their performance in public versus private environments.

Experimental design and procedures
The experiment is composed of three parts: the goal setting, a real-effort task, and a questionnaire, that includes socio-economic background questions. 12 The participants were not allowed to communicate with anyone during the whole experiment.
It is a crucial design feature that the audience can identify participants. Therefore, we reveal the students' names in a particular way. Before the task takes place, the students indicate their first and last names. In all treatments, we display the following information on a shared public screen at the end of the experiment: Students' first and last names together with their performance in the task. Our experiment was pre-registered and approved by the Research Ethics Review Board (School of Business and Economics, Vrije Universiteit Amsterdam). 13 Section 3.1 explains the 15-minute real-effort task in detail. Section 3.2 describes the three treatments that allow us to investigate potential gender differences in private and public goal setting and whether women and men perform differently under private and public goal setting as well as in the absence of any goal setting. Section 3.3 presents the procedure and the subject pool.
12 Appendix B provides the instructions as displayed on the screen.
13 The study was pre-registered at AsPredicted.org (#48703, https://aspredicted.org/blind.php?x=4t5by6) and the experimental design of the online classroom experiment was approved by the Research Ethics Review Board, School of Business and Economics, Vrije Universiteit Amsterdam (20200828.1.xxx where xxx stands for one of the authors' employee ID). Ethics approval is not required, but we still opted to apply for it.

Task
The task is identical to the one used in Schram et al. (2019) and before that in Weber and Schram (2017). 14 Participants are presented with a sequence of pairs of 10x10 matrices filled with random two-digit numbers (Figure 1). 15 For each matrix pair, the participants' task is to search for the highest number in each matrix and then calculate the sum of these numbers. Participants have to enter this sum at the center bottom of the computer screen. After entering the sum, the participant immediately learns whether she/he has entered a correct answer. Regardless of whether the sum was correct or not, a new pair of matrices appears. The task stops after 15 minutes, and participants can see the remaining time at the top left of the screen. We measure a participant's performance by the number of correct summations within the time limit of 15 minutes. The instructions inform the participants that there is no ceiling on their possible performance (and hence on the performance goals they can set for themselves). The instructions state: "(...) there will always be a new pair of matrices as long as you are within the 15 minutes limit". We programmed the task such that we could be sure that nobody (not even the participant with the highest ability) could run out of matrix pairs. All participants perform this task individually without interacting with other participants.
14 The task choice is an important issue, which we do not study here. See Flory et al. (2015) and Günther et al. (2010).
15 A possible alternative would have been to use the summation task of Niederle and Vesterlund (2007). As discussed in Schram et al. (2019), this task involves a risk of a stereotype threat (Shurchkov, 2012), where females feel that men have an advantage in this task. Therefore, we use the summation task of Weber and Schram (2017) and Schram et al. (2019), as these studies have found no gender performance differences.
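The mechanics of the task can be sketched in a few lines of Python. This is an illustration only, not the Qualtrics implementation used in the experiment: the function names are hypothetical, and drawing two-digit numbers uniformly from 10 to 99 is an assumption.

```python
import random


def make_matrix(rng, size=10):
    """Return a size x size matrix of random two-digit numbers (assumed 10..99)."""
    return [[rng.randint(10, 99) for _ in range(size)] for _ in range(size)]


def correct_sum(left, right):
    """Correct answer for one pair: highest number in each matrix, added up."""
    return max(max(row) for row in left) + max(max(row) for row in right)


def performance(answers, solutions):
    """Performance measure: number of correct summations."""
    return sum(1 for given, true in zip(answers, solutions) if given == true)


rng = random.Random(0)  # fixed seed so the example is reproducible
left, right = make_matrix(rng), make_matrix(rng)
solution = correct_sum(left, right)
```

Because a new pair appears after every answer, whether correct or not, performance counts only the correct entries, as in `performance` above.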
The instructions highlight the importance of doing well in the task by informing the participants that doing well in such a task is positively correlated with professional life success. In addition, we give the participants information about the performance distribution of similar participants doing this task in previous studies. 16 The performance of each participant is public information to all participants. We inform the participants that their performance (i.e., the total number of correct answers within the 15 minutes), together with their name, will be displayed on a shared screen after the study is finished.
As in Clark et al. (2017), we decided not to incentivize participants. The literature in psychology on goal setting theory suggests that self-set goals induce intrinsic motivation---in contrast to externally-set goals. Intrinsically motivated behavior is commonly referred to as a behavior that is engaged for its own sake without any external inducement (Pinder, 1984;Cerasoli et al., 2014), whereas extrinsically motivated behaviors are guided towards achieving some instrumental outcomes such as money or financial rewards (Erez et al., 1990). Self-set goals allow for personal control in setting a goal that is attainable with one's ability (Erez et al., 1990). A similar argument has been recently echoed by Welsh et al. (2020) that self-set goals induce positive feelings such as enthusiasm, because they are perceived as beneficial and achievable. Whether financial rewards motivate individuals to perform is deemed to depend on individual values and personal dispositions. A failure to consider these individual differences could decrease one's motivation and even result in lower performances (Malik et al., 2015). Since intrinsic motivations are considered to mainly trigger self-set goals and drive performance, we decided to not incentivize our participants financially. 17

Treatments
We implement two goal-setting treatments, private and public, next to a control treatment without goal setting. We randomly assigned participants to one of the treatments (between-subject design). Performance is public information in all treatments, including the control treatment. In the control treatment, participants are not asked to set a goal, but in the two goal-setting treatments they are. The only difference between the two goal-setting treatments is that the self-set goal is private or public information at the end of the study. We thus implement a clean design in which only one feature changes at a time. Figure 2 highlights the implementation of the treatment variation. Panel A shows the implementation of the control treatment, Panel B the implementation of the private goal setting, and Panel C the implementation of the public goal setting.
16 The instructions state: "This is an important task that is often used to measure people's talents. Many scientific studies have found that people who do well in a task like this are more successful in professional life than people who do less well." This statement is identical to the statement in Schram et al. (2019). The instructions continue: "In a previous session, students like you performed the same task. Most of them gave between 9 and 17 correct answers." We refer to the participants' performance in Schram et al. (2019). We included the statement about success in the task being correlated with professional success to keep our environment close to the one in Schram et al. (2019), where the same real-effort task and this statement were used. More importantly, given that we do not have monetary incentives, we thought that the statement would help participants to take the task seriously. We mention the prior performance to give the participants a broad idea of what their performance could be in a task with which they have no experience.
17 One could argue that we should incentivize performance (independently of the goal-setting). However, the orthogonality of goals and incentives has not been tested either in the economic or in the psychological literature, to the best of our knowledge. Hence, not incentivizing performance seems to us the appropriate design choice here to avoid possible interaction effects of goals and incentives.
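A between-subject random assignment of this kind can be sketched as follows. This is a hypothetical illustration in Python, not the procedure actually used (assignment was handled through the course platform and Qualtrics); the function name, the participant identifiers, and the use of independent equal-probability draws are assumptions.

```python
import random

TREATMENTS = ("NoGoal", "PrivGoal", "PubGoal")


def assign_treatments(participant_ids, seed=None):
    """Assign each participant to exactly one treatment (between-subject).

    Assumption: independent draws with equal probability, so group sizes
    need not be equal.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(TREATMENTS) for pid in participant_ids}


# Ten hypothetical participant IDs; the seed only makes the example repeatable.
assignment = assign_treatments(range(1, 11), seed=2020)
```

Each participant appears once in `assignment`, so no one experiences more than one treatment.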

Control treatment (NoGoal):
Participants do not set a goal, but they are informed that the total number of correct answers (i.e., performance) together with the participant's name will be displayed on the shared screen at the end of the experiment. 18

Private Goal Setting Treatment (PrivGoal): Before performing the 15-minute task, the participants are asked to set a goal (the number of correct answers). Participants know that their goal will be private before they are asked to set a goal. They are informed about the setup in the feedback (and goal-setting) instructions (see Figure 2). On the subsequent screen, participants indicate their goal. The precise wording of the goal question is: "What is your self-set goal -How many questions do you WANT to answer correctly in the 15 minutes available?" On the decision screen next to this question, the instructions remind the participant that the self-set goal will NOT be displayed, but the total number of correct answers (i.e., performance), together with the participant's name, will be displayed on the shared screen at the end of the experiment. Hence, the goal setting is private but the performance is public.

Public Goal Setting Treatment (PubGoal): Before performing the 15-minute task, the participants are asked to set a goal (the number of correct answers). Participants know that their goal will be public before they are asked to set a goal. They are informed about the setup in the feedback (and goal-setting) instructions (see Figure 2). On the subsequent screen, participants indicate their goal. The precise wording of the goal question is: "What is your self-set goal -How many questions do you WANT to answer correctly in the 15 minutes available?" On the decision screen next to this question, the instructions remind the participant that the self-set goal will be displayed, together with the total number of correct answers (i.e., performance) and with the participant's name, on the shared screen at the end of the experiment. Hence, both the goal setting and the performance are public.
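The disclosure rule that separates the three treatments can be summarized in a short Python sketch. The function name and the dictionary layout are hypothetical; only the rule itself (name and performance always shown, goal shown only in PubGoal) comes from the treatment descriptions.

```python
def shared_screen_entry(name, treatment, performance, goal=None):
    """Return what is displayed on the shared screen for one participant."""
    if treatment not in ("NoGoal", "PrivGoal", "PubGoal"):
        raise ValueError(f"unknown treatment: {treatment}")
    # Name and performance are public in every treatment.
    entry = {"name": name, "performance": performance}
    # The self-set goal is revealed only under public goal setting.
    if treatment == "PubGoal":
        entry["goal"] = goal
    return entry


# Hypothetical participants, not data from the study.
private_row = shared_screen_entry("A. Example", "PrivGoal", performance=11, goal=15)
public_row = shared_screen_entry("B. Example", "PubGoal", performance=11, goal=15)
```

The two example rows differ only in whether the goal appears, which mirrors the single feature that changes between PrivGoal and PubGoal.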
Before performing the matrix-task, all participants -irrespective of the treatment assignment -are asked how many questions they expect to answer correctly in the 15 minutes available. Note that this is different from their goal, which refers to how many questions participants want to answer correctly. We think that eliciting expected performance is important because it can point to a potential explanation of why women and men possibly react differently to the goal-setting environment.

Experimental Sessions, Procedure, and Participant Pool
The experiments were conducted at the School of Business and Economics of the Vrije Universiteit Amsterdam in September 2019 and October 2020. Participants were first-year Bachelor students enrolled in the course 'People in Business and Society' from the International Business Administration program. We used the Qualtrics software to program the experiment, and the duration of the experiment was on average less than 30 minutes. The experiments in 2019 and 2020 differ in one key dimension. The experiment in 2019 was conducted during an in-person lecture, while the experiment in 2020 was conducted in an online-lecture. 19 Next, we describe the details of the implementation of both experiments.

Field experiment in September 2019:
The experiment was conducted on location at the School of Business and Economics of the Vrije Universiteit Amsterdam. The field experiment was integrated in one lecture of the first-year Bachelor course 'People in Business and Society' as a quiz. Participation was absolutely voluntary and students were informed that it would not have any impact on their assessment in the course and that they could leave the quiz at any moment in time.
The students were randomly assigned to different treatments on the online course platform and a different Qualtrics link was sent to each treatment group. In total, 302 students participated in our experiment, out of which 124 (41%) were female and 178 (59%) male. 20 69 students were assigned to the control treatment (NoGoal), 97 to the Private Goal Setting Treatment (PrivGoal), and 136 to the Public Goal Setting Treatment (PubGoal). 21

Online classroom experiment in October 2020:
The experiment was conducted during one online lecture of the first-year Bachelor course 'People in Business and Society'. The Qualtrics link was sent to the attending students during the lecture. The participation was voluntary. Students could earn a fixed number of course participation credits, which were given independently of whether they consented to participate in the study, finished, or left the study, and these conditions were announced one week before the experiment. The students were provided with one Qualtrics link and randomly assigned to one treatment within the survey. In total, 333 students participated in the experiment, out of which 144 (43%) were female and 189 (57%) male. 113 students were randomly assigned to the control treatment (NoGoal), 112 to the Private Goal Setting Treatment (PrivGoal), and 108 to the Public Goal Setting Treatment (PubGoal). 22
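The sample figures reported for the two experiments can be cross-checked with a few lines of Python. The counts are those stated in the text; the dictionary keys and the helper name `summarize` are illustrative choices.

```python
samples = {
    "field_2019": {"female": 124, "male": 178,
                   "NoGoal": 69, "PrivGoal": 97, "PubGoal": 136},
    "online_2020": {"female": 144, "male": 189,
                    "NoGoal": 113, "PrivGoal": 112, "PubGoal": 108},
}


def summarize(sample):
    """Totals by gender and by treatment, plus the female share in percent."""
    by_gender = sample["female"] + sample["male"]
    by_treatment = sample["NoGoal"] + sample["PrivGoal"] + sample["PubGoal"]
    return {
        "total_by_gender": by_gender,
        "total_by_treatment": by_treatment,
        "female_share_pct": round(100 * sample["female"] / by_gender),
    }


summary = {label: summarize(sample) for label, sample in samples.items()}
```

In both experiments the gender counts and the treatment counts add up to the same total, and the rounded female shares match the percentages given in the text.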

Research Question and Hypotheses
How do women and men set goals and perform on a real-effort task when the self-set goal is observable by the public (versus private)? 23 This primary research question and our main hypotheses are motivated by two research streams, discussed in detail in the literature review in Section 2. The first is the large body of mainly psychological studies analyzing goal setting and, in particular, self-set goals and their impact on behavior and individuals' performance. The second is the broad stream of experimental economics literature addressing gender differences in contexts involving various elements of competition. Since this is, to the best of our knowledge, the first study addressing gender differences in private versus public environments for self-set goals, we developed our hypotheses based on the findings in these two motivating research fields.
The focus in this paper is on the comparison of public versus private goal setting. Starting with private goals, the main difference between treatments NoGoal and PrivGoal may be related to the goal setter's usage of the goal as a commitment device and his/her perception of compliance with the goal. A way to conceptualize this is in terms of additional benefits and costs of setting goals. A private goal may entail a benefit (higher motivation and satisfaction) and a cost (disappointment in case of private failure). No (empirical or theoretical) studies exist that investigate whether private goal setting differs for women compared to men. We conjecture that the net benefit might be lower for women, as several studies find that women attribute private failure mainly to their own ability while men tend to attribute it to bad luck (Hankin and Abramson, 2001; Boggiano and Barrett, 1991; Mezulis et al., 2004).
The main difference between treatments PrivGoal and PubGoal may be related to the goal setter's belief about and preference for how others perceive the goal and compliance with it. 24 When goal-setting is public, the individual may receive an additional benefit (for instance, looking confident, skillful, or ambitious) and an additional cost (for instance, beliefs about and utility from others judging failure to meet the self-set goal) related to public image concerns. Individuals will set higher public than private goals if the additional benefits are larger than the additional costs. Such benefits and costs may differ across genders. Public image concerns related to the appropriateness of ambitious goals are likely to affect men positively: The benefits from high public goals are likely to outweigh the costs of high public goals. For women, the impact of public image concerns is more ambiguous, since women are often expected to be more modest and less competitive than men. Hence, women's public image concerns can lead to lower goals in the public setting.
How does goal setting affect performance across treatments? The only difference between treatments NoGoal and PrivGoal is the private self-set goal. As discussed in the literature review section, prospect theory provides a mechanism through which goals can improve performance (loss aversion transforms goals into commitment devices). In the private goal setting, goals can be seen as the expression of commitment devices. Combined with previous findings on the effect of self-set goals on performance (e.g., van Lent and Souverijn, 2020), this leads us to expect a positive impact of private goals on women's and men's performance compared to no goals. Public goals can function as commitment devices just as private goals do, with the additional effect of the visibility of goals referred to above.
Taken together, these arguments allow us to formulate our hypotheses on the gender gap in goal-setting and performance across treatments. We define a gender gap as the absolute mean difference of a variable between women and men. We expect to find a gender gap in self-set goals and consequently in performance in the private goal setting treatment PrivGoal, whereby men set higher goals and perform better than women. In addition, we expect that the additional public image concerns in the public goal-setting condition PubGoal increase the gender differences as argued above. For the control treatment NoGoal we relate to the previous literature employing this task. Using the same task in a setting without goals, Schram et al. (2019) find no gender differences in performance under private ranking and without ranking and a gender gap under public ranking (women perform worse than men). In our case we do not provide any type of explicit public ranking. Weber and Schram (2017) also do not find gender differences in performance in this task. We therefore conjecture no gender differences in the NoGoal condition. This expectation seemed to us a strong one and it is the one we use in Hypothesis 2. However, one has to keep in mind that individual performance is public at the end of the experiment. This could create a feeling of public comparability, which may create an environment that is closer to the one with public ranking in Schram et al. (2019).
The following hypotheses summarize our conjectures about participants' behavior (as formulated in the pre-registration 25):
Hypothesis 1: Men set significantly higher goals than women when self-set goals are private information (treatment PrivGoal) and this difference becomes larger when goals are set publicly (PubGoal).
Hypothesis 2: While women and men do not perform differently without goals (control treatment NoGoal), a significant gender gap in performance emerges with privately self-set goals (treatment PrivGoal), and it becomes larger when goals are set publicly (treatment PubGoal).

Results
Before turning to the analysis of participants' self-set goals and their performance, we present some descriptive statistics. The distribution of socio-demographics does not differ across treatments and experiments overall. In treatments NoGoal, PrivGoal, and PubGoal, the respective share of women is 47%, 43%, and 38% (chi2 test, p = 0.136), the respective average age is 19.0, 19.2, and 18.9 (Kruskal-Wallis test, p = 0.3605), and 58%, 63%, and 64% of the participants indicate that they feel attached to the Dutch culture (chi2 test, p = 0.346).
Comparing the field and the online classroom experiments, respectively 41% and 43% of the participants are women (chi2 test, p = 0.578) while 62% indicate affinity with the Dutch culture in either cohort (chi2 test, p = 0.919). There is a small, yet significant age difference across cohorts (18.9 in the field vs. 19.2 in the online classroom experiment; Mann-Whitney U test---hereinafter MWU test, p = 0.0387).
One might be concerned that dropout from the study is selective and differs across treatments and/or gender. In the field experiment, the dropout percentages are 39% (44 of 113) in NoGoal, 36% (54 of 151) in PrivGoal, and 36% (78 of 214) in PubGoal. In the online classroom experiment, the dropout percentages are 10% (12 of 125) in NoGoal, 11% (14 of 126) in PrivGoal, and 14% (18 of 126) in PubGoal. 26 The differences across treatments are not significant, neither in the field experiment (chi2 test; p = 0.859) nor in the online classroom experiment (chi2 test; p = 0.498). The gender distribution in the study is comparable to the gender distribution of the students enrolled in the course in 2019/20 and 2020/21: with both academic years pooled, 39% (394) of the students enrolled in the course were female and 61% (604) were male, compared to 42% (268) female and 58% (367) male participants in the study. The distributions are also comparable for each study year and experiment separately. 27 This leads us to believe that there was no selective dropout by gender.
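The dropout comparison can be checked directly from the counts reported above. The following is a minimal sketch, assuming the chi2 test is a Pearson chi-square test of independence on a 2 x 3 table (dropout versus completion, by treatment) without continuity correction; the function name `chi2_dropout_test` is ours and purely illustrative. With two degrees of freedom the p-value has the closed form exp(-chi2/2), so no statistics library is needed.

```python
import math


def chi2_dropout_test(dropouts, starters):
    """Pearson chi-square test of independence for dropout by treatment.

    dropouts[i] -- number of dropouts in treatment i
    starters[i] -- number of students who started reading the instructions
    This sketch assumes exactly three treatments, so the test has
    2 degrees of freedom and the p-value equals exp(-chi2 / 2).
    """
    if len(dropouts) != 3 or len(starters) != 3:
        raise ValueError("this sketch only covers three treatments (df = 2)")
    dropout_rate = sum(dropouts) / sum(starters)
    chi2 = 0.0
    for d, n in zip(dropouts, starters):
        expected_drop = n * dropout_rate
        expected_stay = n * (1 - dropout_rate)
        chi2 += (d - expected_drop) ** 2 / expected_drop
        chi2 += ((n - d) - expected_stay) ** 2 / expected_stay
    return chi2, math.exp(-chi2 / 2)


# Counts as reported in the text (order: NoGoal, PrivGoal, PubGoal).
chi2_field, p_field = chi2_dropout_test([44, 54, 78], [113, 151, 214])
chi2_online, p_online = chi2_dropout_test([12, 14, 18], [125, 126, 126])
print(f"field:  chi2 = {chi2_field:.3f}, p = {p_field:.3f}")
print(f"online: chi2 = {chi2_online:.3f}, p = {p_online:.3f}")
```

Under these assumptions the sketch gives p-values of about 0.859 (field) and 0.498 (online classroom), in line with the values reported in the text.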
In the following, we present the results on participants' goal setting and performance across gender and treatments. We focus on the pooled analysis of both experiments, but also present the results for the field and the online classroom experiment separately. We show non-parametric tests and regression results from Ordinary Least Square regressions with robust standard errors. 28
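Footnote 28 describes the multiple-testing adjustment applied to the main tests. As a rough sketch of that arithmetic (not the authors' code), the snippet below computes the chance of at least one false positive among 14 independent tests at the 10% level and applies the Benjamini-Hochberg step-up rule with a false discovery rate of 0.20. The p-values in `example_p` are hypothetical placeholders, and the function names are ours.

```python
def familywise_error(alpha, m):
    """Probability of at least one false positive in m independent tests."""
    return 1 - (1 - alpha) ** m


def benjamini_hochberg(p_values, fdr):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans in the original order of p_values:
    True where the null hypothesis is rejected at the given FDR.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is below k / m * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


print(round(familywise_error(0.10, 14), 2))  # 0.77, as stated in footnote 28

example_p = [0.50, 0.001, 0.10, 0.90, 0.09]  # hypothetical p-values
print(benjamini_hochberg(example_p, fdr=0.20))
```

In the hypothetical example, the p-value of 0.09 misses its own threshold (2/5 x 0.20 = 0.08) but is still rejected because the next-ranked p-value of 0.10 clears its threshold of 0.12. This is the step-up property of the procedure.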

Goal Setting
In the treatments PrivGoal and PubGoal, participants choose a goal for the number of correct answers that they aim to give. Participants are free to choose any goal between 0 and 99, and the goal does not have any monetary or assessment consequences for them. The average self-set goal in PrivGoal is 18.0, and the corresponding goal of 23.3 is significantly higher in PubGoal (MWU test, p = 0.0682). This difference is driven by the behavior in the field experiment (MWU test, p = 0.0906; online classroom experiment: MWU test, p = 0.9058). This larger treatment effect in the field experiment can be explained by the observability and identifiability of goals being arguably higher in an in-person situation than in an online environment. Figure 3 shows the average goal set by women and men in the treatments PrivGoal and PubGoal. The corresponding Ordinary Least Square regression analysis with robust standard errors and post-estimation F-tests are presented in Table 1.
Second, moving to the gender gap results pertinent to Hypothesis 1, in treatment PrivGoal, men are significantly more ambitious than women, as revealed by the male average self-set goal of 20.6 compared to the average goal of 14.7 set by women (MWU test, p = 0.0000; Table 1).
Notes to Table 1. The dependent variable is the participant's self-set goal for the matrix-task and the explanatory variables are a gender dummy (taking value 1 if female and 0 if male), a treatment dummy (taking value 1 if PubGoal and 0 if PrivGoal), and their interaction term. Controls (Risk Attitudes, Age, Age^2, and a dummy for feeling attached to the Dutch culture) are included in models (2), (4), and (6). The data come from treatments PrivGoal and PubGoal and the samples are both experiments pooled in models (1) and (2), the field experiment in models (3) and (4), and the classroom experiment in models (5) and (6).
26 Note that some students dropped out before seeing the treatment instructions, whom we do not consider here. Hence, we define a dropout as a student that started reading the instructions, but decided to not participate in the experiment.
27 In the academic year 2019/20 and the field experiment, the corresponding distributions are 36% (165) female and 64% (292) male enrolled students versus 41% (124) female and 59% (178) male participants. In the academic year 2020/21 and the online classroom experiment, the corresponding distributions are 42% (229) female and 58% (312) male students versus 43% (144) female and 57% (189) male participants.
28 The main text and the tables refer to uncorrected p-values. For our main analysis (gender differences), we run a total of 14 tests with two outcome variables (goal setting and performance) testing for gender gaps across treatments and treatment effects on women, men, and the gender gap. This is reflected by the regression post-estimation tests in Table 1 for goal setting and Table 2 for performance. The chance of at least one false positive result with 14 (independent) tests and a significance level of 10% is 0.77. We apply the Benjamini-Hochberg correction (Benjamini and Hochberg, 1995) for 14 multiple comparisons with an acceptable false discovery rate of 0.20 and apply the correction to the OLS post-estimation F-tests as well as the MWU tests. With this multiple testing correction, all uncorrected significant results remain significant. In Appendix A we show additional results for the mediator variables 'expected performance' and 'attempts'. They do not pertain to our main hypotheses.
Third, in the treatment PubGoal, men are more ambitious and set a higher goal than women. This gender difference in goal setting is substantial: men set a 24% larger goal than women. The gender difference is significant with non-parametric tests, but insignificant with parametric tests (female goal of 20.3 versus male goal of 25.1; MWU test, p = 0.0299; Table 1, model 1, p = 0.122). A closer look reveals that the weaker results in treatment PubGoal stem from different responses in the field and in the online classroom experiment. In the field experiment, men set a 40% larger goal than women, whereas in the online classroom experiment men set only a 4% larger goal than women. To be precise, while women and men aim publicly for a better performance than privately in the field experiment (15.0 versus 21.9 for women: MWU test, p = 0. We can only speculate about the reasons for this difference across experiments. As already mentioned above, perhaps the higher degree of observability and identifiability in an in-person versus online environment can explain part of these differences. 30
Fourth, the difference in the gender gap between public and private goals is minor. Women's self-set goals are 29 percent lower than men's in the treatment PrivGoal (14.7 versus 20.6), but only 19 percent lower in the treatment PubGoal (20.3 versus 25.1). Moving from private to public goal setting thus reduces the gap by ten percentage points, but this is an insignificant change (Table 1, model 1, p = 0.765). This result also holds separately for the field and the classroom experiments.
We can summarize our findings with respect to private and public goal setting in the following results:
Result 1a: Women set significantly higher goals when goals are public compared to private information, in particular in the field environment.
Result 1b: Men set higher goals when goals are public compared to private information, but only significantly in the field environment.
Result 1c: Men set significantly higher goals than women, both when goals are private or public.

Result 1d: The gender gap in goal-setting is not larger when the goal is public versus private.
30 For example, the observability and identifiability of goals might be higher in an in-person situation (field experiment) than in an online environment (online classroom experiment). The treatment effect is weaker for both genders in the online classroom experiment. Women set a 46% (25.7%) larger goal in the PubGoal treatment in the field (online classroom) experiment. And men set a 46% larger goal in the PubGoal treatment in the field experiment and basically the same (1.4% lower) goal in the online classroom experiment compared to the PrivGoal treatment. The choice of the self-set goal is more conscious than performance in the matrix-task, which might explain why differences across experiments are less pronounced for performance as discussed in section 4.2.
Results 1a and 1b address the goal-setting part of the pre-registered primary research question and Results 1c and 1d speak directly to the pre-registered Hypothesis 1. Our data are consistent with the first prediction of Hypothesis 1 (Result 1c) but not with the second prediction (Result 1d). The above analysis of the goal-setting part of the primary research question and of the corresponding hypothesis was laid out in the pre-analysis plan in the pre-registration.
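To see how regression coefficients of the kind described in the Table 1 notes relate to the averages quoted in this section, the following sketch (an illustration under assumptions, not the authors' estimation code) uses the fact that an OLS model containing only a female dummy, a PubGoal dummy, and their interaction, with no controls, reproduces the four cell means exactly. The pooled means are taken from the text; the dictionary keys and function names are hypothetical. The sketch also shows that a relative gap depends on the base chosen: men's goal relative to women's, or women's shortfall relative to men's.

```python
# Pooled average self-set goals as reported in the text.
mean_goal = {
    ("male", "PrivGoal"): 20.6,
    ("female", "PrivGoal"): 14.7,
    ("male", "PubGoal"): 25.1,
    ("female", "PubGoal"): 20.3,
}


def saturated_coefficients(means):
    """Coefficients of goal ~ female + pubgoal + female:pubgoal, no controls.

    In a saturated 2 x 2 model the OLS coefficients are differences of
    cell means; men in PrivGoal form the reference group.
    """
    intercept = means[("male", "PrivGoal")]
    female = means[("female", "PrivGoal")] - intercept
    pubgoal = means[("male", "PubGoal")] - intercept
    gap_in_pubgoal = means[("female", "PubGoal")] - means[("male", "PubGoal")]
    interaction = gap_in_pubgoal - female
    return {
        "intercept": intercept,
        "female": female,
        "pubgoal": pubgoal,
        "female_x_pubgoal": interaction,
    }


def relative_gaps(means, treatment):
    """Relative gender gap in goals, expressed against both possible bases."""
    men = means[("male", treatment)]
    women = means[("female", treatment)]
    return {
        "men_above_women": men / women - 1,
        "women_below_men": 1 - women / men,
    }


coef = saturated_coefficients(mean_goal)
gap_priv = coef["female"]                            # gender gap in PrivGoal
gap_pub = coef["female"] + coef["female_x_pubgoal"]  # gender gap in PubGoal
print(f"gap PrivGoal = {gap_priv:.1f}, gap PubGoal = {gap_pub:.1f}")
for treatment in ("PrivGoal", "PubGoal"):
    gaps = relative_gaps(mean_goal, treatment)
    print(treatment, {name: round(value, 2) for name, value in gaps.items()})
```

With the reported means, the absolute gap is 5.9 goal points in PrivGoal and 4.8 in PubGoal, so the interaction term (the change in the gap) is 1.1. In relative terms, men's goals exceed women's by about 40% and 24%, while women's goals fall short of men's by about 29% and 19%, in PrivGoal and PubGoal respectively.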

Performance
Before we turn to the second main outcome variable, women's and men's performance in the different goal setting conditions, we briefly discuss the impact of the goal treatments on the (potential) mediating factors 'expected performance' and 'attempts'. The analysis of these factors was not explicitly pre-registered as part of the main research question, hypotheses, and analysis, but we believe that it is of interest. Recall that before performing the matrix task, participants are asked to indicate how many problems they expect to solve correctly. 31 Participants' expected performance and their actual performance are positively, though weakly, correlated (correlation coefficient = 0.0867, p = 0.0299). The correlation of attempts and performance is very strong (correlation coefficient = 0.6671, p = 0.000). In Appendix A, gender differences across treatments are shown visually and with regression analysis (Figure A1 and Table A1 for expected performance; Figure A2 and Table A2 for attempts).
While women's performance expectations go slightly up when the self-set goal is publicly visible (13.5 in NoGoal, 13.6 in PrivGoal, 17.5 in PubGoal; NoGoal versus PubGoal: MWU test, p = 0.0964; otherwise p > 0.2844), the effect is strongly pronounced among men (13.5 in NoGoal, 17.8 in PrivGoal, 21.5 in PubGoal; NoGoal versus PrivGoal: MWU test, p = 0.0083; NoGoal versus PubGoal: MWU test, p = 0.0001). The post-estimation F-tests of Ordinary Least Square regressions in Table A1 draw a similar picture. Foreseeably, expected performance and self-set goals are strongly positively correlated (correlation coefficient = 0.8708, p = 0.0000). The vast majority of participants (91.5%) believe that they will at most achieve their self-set goal. Some interesting patterns emerge: while half of the participants (50.6%) are confident that they will meet their self-set goal precisely, 40.9% expect to perform worse than their self-set goal. Among these 91.5% of participants, the goal setting environment does not significantly affect the distribution of participants' confidence to meet their goal (chi2 test, p = 0.605), nor does it do so separately for women and men (chi2 tests, p > 0.244). However, while roughly half of women and men expect to meet their self-set goal in PrivGoal (51.9% of women and 55.5% of men; chi2 test, p = 0.622), a gender gap emerges in PubGoal: 45.6% of women versus 62.8% of men indicate that they are confident that they will meet their self-set goal (chi2 test, p = 0.014). While the public goal-setting environment seems to boost men's goal-compliance confidence, the opposite tendency can be observed for women.
With respect to the number of attempted summations we find a large and highly significant gender gap in attempts across treatments (2.4 in NoGoal, 3.7 in PrivGoal, 4.3 in PubGoal; MWU tests, p < 0.0023; Table A2, model 1, p < 0.00242), which is consistent across experiments (Table A2, models 3-6). While men attempt to solve more matrix summations after setting a goal for themselves (17.4 in NoGoal, 18.9 in PrivGoal, 19.2
With respect to performance, we make three observations. First, we observe an interesting pattern in women's performance: a privately set goal improves female performance slightly and insignificantly compared to no goal. However, women's performance worsens significantly when they set a goal publicly compared to privately (MWU test, p = 0.0983; Table 2, model 1, p = 0.0801). This is an interesting and novel observation that is worth attention and further research. In contrast, men's performance is not affected at all by the goal setting environment (MWU test, p = 0.8960; Table 2, model 1, p = 0.976).
Notes. *** p<0.01, ** p<0.05, * p<0.1. Ordinary Least Square Regression results with robust standard errors (in parentheses). The table shows post-estimation F-tests with corresponding p-values [in parentheses]. The dependent variable is the participant's number of attempts in the matrix-task and the explanatory variables are a gender dummy (taking value 1 if female and 0 if male), a treatment dummy for NoGoal and PubGoal (taking value 1 if applies and 0 otherwise), and the interaction terms of the gender dummy with each treatment dummy. Controls (Expected Performance, Risk Attitudes, Age, Age^2, and a dummy for feeling attached to the Dutch culture) are included in models (2), (4), and (6). The data come from treatments NoGoal, PrivGoal, and PubGoal and the samples are both experiments pooled in models (1) and (2), the field experiment in models (3) and (4), and the classroom experiment in models (5) and (6).
Second, a preliminary result pertains to the case of NoGoal. While women solve on average 9.0 summations correctly in the NoGoal treatment, men give 11.3 correct answers (MWU test, p = 0.0016; Table 2, model 1, p = 0.000253). This result is at odds with our pre-registered Hypothesis 2. However, considering that performance is made public after the study, we can give an ex-post explanation of why men outperform women in NoGoal, which we have already mentioned above. The control condition NoGoal could be more comparable to the public ranking treatment than to the private or no ranking treatments in Schram et al. (2019). There, a third person can observe and compare participants' performance in public ranking. Though we do not provide any explicit (public) ranking, performance is publicly observable in our control treatment NoGoal. This might create a feeling of comparability and be closer to the setting with public than private or no ranking in Schram et al. (2019). 32
Third, we find a robust and strong gender gap in performance across treatments, illustrated in Figure 4 and analyzed with post-estimation F-tests in Table 2. 33 In both treatments, when setting a goal (privately or publicly), men perform significantly better than women (9.9 versus 11.9 in PrivGoal: MWU test, p = 0.0015, Table 2, model 1, p = 0.00237; 8.8 versus 11.9 in PubGoal: MWU test, p = 0.000, Table 2, model 1, p = 0.000), with the effects being stronger in the online classroom than in the field experiment. The gender gap is hardly affected by the goal-setting environment (only one significant change from PrivGoal to PubGoal in the online classroom experiment: Table 2, model 5, p = 0.0647).
Our results can be summarized as follows:
Result 2a: Women perform significantly worse when goals are public compared to private information.
Result 2b: Men's performance is not affected by the goal setting conditions.

Result 2c: A gender gap in performance exists in all treatments with and without goal setting.
Result 2d: The gender gap in performance is larger, but not significantly so, when the goal is public compared to private.
32 Given our design, we cannot rule out alternative explanations for finding gender differences in performance in NoGoal. We have to take into account that the no-difference result stems from two laboratory studies, Weber and Schram (2017) and Schram et al. (2019), and that we are using a different subject pool in a field experiment and not a laboratory experiment. Some results in the literature suggest that men could outperform women because of (minor) differences in math-related skills (Halpern et al. 2007) or because of gender differences in visual perception performance (Shaqiri et al. 2018) and these tendencies may come through in our experiment, although they did not in Weber and Schram (2017) and Schram et al. (2019). Implicit goals that participants set for themselves without being asked to do so could also play a role (for instance, Kaiser et al., 2021).
33 Though performance is not incentivized in our experiments, participants' performance is strikingly similar to the incentivized performance in Schram et al. (2019) where women's and men's average performance range between 10 and 14 correct answers across treatments. We are thus confident that participants in our experiments take the study and the real effort task seriously. The same is true for the number of attempts, see Schram et al. (2019).
Results 2a and 2b address the performance part of the pre-registered primary research question and Results 2c and 2d speak directly to the pre-registered Hypothesis 2. Our data are overall not consistent with Hypothesis 2. Though we find a significant gender gap in performance in PrivGoal and PubGoal as hypothesized, we also find a gender gap in performance in NoGoal (Result 2c). Though the gender gap changes in the expected direction when moving from PrivGoal to PubGoal, the gender gap does not vary statistically significantly across treatments overall (Result 2d). The above analysis of performance was laid out in the pre-analysis plan in the preregistration.
A potential explanation for the absence of a significant effect of goals on performance and on the gender performance gap could be that the marginal effect of the accumulated commitment devices is too small to affect performance significantly as we move from NoGoal (public performance) to PrivGoal (public performance and private self-set goal) to PubGoal (public performance and public self-set goal). Note, however, that men do try to solve significantly more matrices after setting a goal, which affects the gender gap in attempts in the expected direction. In addition, the emerging gender gap in goal-compliance confidence in PubGoal (compared to PrivGoal) provides supporting evidence for our hypothesis. Attempts and expected performance/goal-compliance confidence had not been pre-registered as part of the main research questions, hypotheses, and analysis.
To link the findings on goal setting and performance, we compare set goals with actual performance. Under private goal setting, women's performance is on average 67% of goals, whereas for men it is 57%. Under public goal setting, the corresponding percentages are 43% and 39%, respectively. Seen ex-post, participants are more realistic under private than under public goal setting, with women being more realistic than men in both cases. Put differently, one may want to take publicly announced goals with a grain of salt. This result speaks directly to our pre-registered primary research question and indicates that the key difference between the public and the private goal-setting environment (i.e., social image concerns) leads to larger differences between goals and actual performance for both men and women. First, it provides preliminary evidence that the perceived benefit of social image concerns outweighs the perceived associated costs (i.e., beliefs about how failure to meet the self-set goal is perceived). Second, the difference between these perceived benefits and costs is larger for men than for women. A more in-depth analysis of these differences is beyond the scope of this paper.
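The text does not spell out whether these percentages are averages of each participant's performance-to-goal ratio or ratios of group averages, and the two need not coincide. The sketch below computes both from hypothetical (goal, performance) pairs, with illustrative function names. Skipping participants with a goal of 0 in the individual-ratio version is an assumption of this sketch, not a documented rule of the study.

```python
def mean_of_ratios(pairs):
    """Average of individual performance / goal ratios.

    pairs -- iterable of (goal, performance) tuples; goals of 0 are
    skipped because the individual ratio is undefined (an assumption).
    """
    ratios = [performance / goal for goal, performance in pairs if goal > 0]
    if not ratios:
        raise ValueError("no participant with a positive goal")
    return sum(ratios) / len(ratios)


def ratio_of_means(pairs):
    """Group-average performance divided by group-average goal."""
    pairs = list(pairs)
    total_goal = sum(goal for goal, _ in pairs)
    total_performance = sum(performance for _, performance in pairs)
    if total_goal == 0:
        raise ValueError("average goal is zero")
    # The common factor 1 / len(pairs) cancels, so totals suffice.
    return total_performance / total_goal


# Hypothetical data: (self-set goal, number of correct answers).
hypothetical_pairs = [(10, 8), (20, 10), (40, 10)]
print(f"mean of ratios: {mean_of_ratios(hypothetical_pairs):.3f}")
print(f"ratio of means: {ratio_of_means(hypothetical_pairs):.3f}")
```

In this made-up example the mean of ratios is about 0.517 and the ratio of means is 0.400. The gap arises because participants with very ambitious goals carry more weight in the ratio of means.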

Discussion and Conclusion
We conduct a field and an online classroom experiment to study whether men and women set different goals and perform differently in the absence of goals, when information about goals is private and when it is public. Our main findings point to the following: 1) Women and men set higher goals when they are publicly observable and identifiable than when they are private. This effect is particularly observable during the field experiment. 2) Women choose both lower private and public goals than men. 3) Men perform better than women under private and public goals as well as in the absence of goal setting. 4) Women's performance does not change between no goals and private goals, but worsens between public and private goals, whereas men's performance is unaffected by the goal-setting condition. As a consequence of differential goal setting and performance we find that, in terms of the ratio between performance and goals, participants are more realistic under private than under public goal setting, with women being more realistic than men in both cases. In what follows we connect our results to some relevant literature.
The difference in behavior between the public and the private goal-setting environments is consistent with several explanations. One is that it is the result of public image concerns (Bénabou and Tirole, 2006). Such social image concerns can be thought of in terms of a benefit (belief of being perceived as ambitious/high ability) and a cost (beliefs of how failure is perceived). Our result shows that both men and women set more ambitious goals when these are public which suggests that the benefits resulting from social image concerns are larger than the costs.
Social conformity is a related, but distinct concept that could also explain our result that men and women set more ambitious goals in the public treatment. Social conformity is considered a powerful social phenomenon that encourages individuals to adapt their opinions and behaviors to conform to the majority in the group, especially to fit in the group and to be "liked" by others (Asch, 1951; Deutsch and Gerard, 1955). It has been widely observed in face-to-face groups, but more recently the psychological mechanism was also found in online environments (Wijenayake et al., 2020). Both men and women were aware that their self-set goals were observable and identifiable by their peers in the public condition; they could have increased their goals because they expected that the majority would do so. 34
We find that women choose lower goals than men in both treatments. In terms of a benefit-cost explanation, women may perceive the net benefits (benefits minus costs) to be lower than men do. Goal setting theory posits that self-efficacy and self-confidence that the goal for a specific task is attainable are important individual characteristics for self-set goals (Bandura, 1997; Latham and Locke, 2007; Locke and Latham, 2006). It could thus be that women set lower goals than men because they are likely less confident about their competences (e.g., Beyer, 1990; Lundeberg et al., 1994; Beyer and Bowden, 1997) than their confident male peers (McCarty, 1986; Wood and Karten, 1986), as suggested in the older stream of psychology literature and also more recently in the experimental economics literature showing that men are more confident about their abilities than women. Our finding that women set lower goals than men could also be explained by gender differences in the attribution of success and failure to internal factors (personal abilities and skills) and external factors (for instance, luck). Some evidence suggests that young boys show a stronger self-serving attributional bias than young girls (e.g., Stipek and Gralinski, 1991); the findings seem to be stronger for adolescents (Hankin and Abramson, 2001) and adults (Boggiano and Barrett, 1991; Mezulis et al., 2004). If not achieving a self-set goal can be seen as failure (increasing in the size of the mismatch), men might be more likely to attribute such 'failure' to external factors whereas women possibly tend to internalize it. These attribution differences could explain why women set lower goals than men in both treatments.
Men perform better across all treatments. Surprisingly, for both men and women, performance does not increase significantly with the introduction of privately self-set goals. Hence, we do not necessarily find evidence that participants are using goals as successful commitment devices. Further, this result contradicts findings from experimental and non-experimental studies testing goal setting theory, which have shown that setting goals is better for performance than not setting any goals (for literature reviews see Locke, 1996; Locke and Latham, 2002; Locke and Latham, 2006; Latham and Locke, 2007). Our environment differs from the aforementioned studies in that performance is public and identifiable in all treatments. Hence, a potential ex-post explanation of our result could be that, even in the no-goal condition, individuals feel committed to performing well because their performance is publicly observable. The marginal effect of an additional commitment device in the form of a private or public self-set goal could be too small to affect performance significantly. An interesting observation in this context, however, is that men attempt to solve significantly more problems when they set a goal privately or publicly compared to no goal setting, while women's attempts are virtually unchanged across treatments.
Finally, whereas both women's and men's performance is unaffected by private self-set goals, public self-set goals negatively affect women's performance and do not affect men's performance. Our finding that women perform worse with public than with private goals, while men are unaffected, is in line with other experimental studies which show that women underperform in competitive environments (Gneezy et al., 2003; Gneezy and Rustichini, 2004). We do not have enough information to identify the mechanisms behind this result. However, we find that under public goal setting women's goal-compliance confidence declines, whereas men's increases, resulting in a significant gender gap.
Several important and interesting research questions emerge from our study. Can an interpretation of gender goal-setting differences in terms of gendered (perceived) benefits and costs be backed up by data? To which factors do women and men attribute (un)successfully met self-set goals? Does failing to achieve a publicly (self-)set goal result in gender differences in negative consequences such as reputational damage and stress in competitive environments? It would be interesting to examine whether such consequences occur, as they might have important implications for practice, especially since women tend to be more sensitive to negative outcomes than men (Buser and Yuan, 2019; Brody, 1993; Fujita et al., 1991). In this context, it would be interesting to analyze how individuals set goals and perform when they are held accountable. Additionally, public observability and public perception might affect self-selection into public versus private goal setting or into (leading) positions that entail public goal announcements. Our experimental setup was entirely non-strategic. However, the strategic context in which individuals set goals might have an important impact on goal setting (and possibly performance).
Figure A1. Expected performance. Average expected performance by women and men in the treatments NoGoal, PrivGoal, and PubGoal. 90% confidence intervals are calculated with robust standard errors.
Figure A2. Expected performance. Average expected performance by women and men in the treatments NoGoal, PrivGoal, and PubGoal. 90% confidence intervals are calculated with robust standard errors.
Notes. *** p<0.01, ** p<0.05, * p<0.1. Ordinary least squares regression results with robust standard errors (in parentheses). The table shows post-estimation F-tests with corresponding p-values [in brackets]. The dependent variable is the participant's number of attempts in the matrix task and the explanatory variables are a gender dummy (taking value 1 if female and 0 if male), treatment dummies for NoGoal and PubGoal (each taking value 1 if the treatment applies and 0 otherwise), and the interaction terms of the gender dummy with each treatment dummy. Controls (Expected Performance, Risk Attitudes, Age, Age^2, and a dummy for feeling attached to the Dutch culture) are included in models (2), (4), and (6). The data come from treatments NoGoal, PrivGoal, and PubGoal and the samples are both experiments pooled in models (1) and (2), the field experiment in models (3) and (4), and the classroom experiment in models (5) and (6).
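The regression described in the table note above can be written out explicitly. This is a sketch reconstructed from the variable description only: the coefficient symbols, the control vector $X_i$, and the reading of PrivGoal as the omitted baseline category (inferred from the fact that only NoGoal and PubGoal dummies enter) are notational assumptions and are not taken from the original tables.

```latex
\begin{equation*}
\begin{aligned}
\text{Attempts}_i ={}& \beta_0 + \beta_1\,\text{Female}_i
  + \beta_2\,\text{NoGoal}_i + \beta_3\,\text{PubGoal}_i \\
&+ \beta_4\,(\text{Female}_i \times \text{NoGoal}_i)
  + \beta_5\,(\text{Female}_i \times \text{PubGoal}_i) \\
&+ X_i'\gamma + \varepsilon_i
\end{aligned}
\end{equation*}
```

Under this notation, $\beta_1$ would be the gender gap in attempts under PrivGoal, while $\beta_1 + \beta_4$ and $\beta_1 + \beta_5$ would be the gaps under NoGoal and PubGoal. Post-estimation F-tests of linear combinations of this kind are what the note refers to, although which combinations were tested is not stated in this excerpt. The term $X_i'\gamma$ appears only in models (2), (4), and (6).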

Appendix B. Instructions and Screenshots

Welcome to this study!
Your answers in this study will not be used in this course in any way. They will not affect your assessment in the course at all.
The study will take approximately 30 minutes. This study is part of a research project in social sciences that we - a group of professors from different universities - are conducting. Your engagement and attention when responding to all parts of the study are very valuable for the success of this study, which will contribute to a better understanding of our society.
The study is divided into a task and a questionnaire. You are not allowed to communicate with anyone else until the study is over. For each part, you will receive instructions. We guarantee that everything we tell you in these instructions will happen exactly as described.
The answers to the questionnaire are entirely voluntary. The file with personal information will be password protected and saved on a secure university drive. It will be deleted after publication. We will immediately create two datasets: one dataset for feedback and one anonymized dataset. We will use only the anonymized and non-identifiable dataset for the analysis. We have no interest whatsoever in identifying an individual's decisions and answers or in sharing that information with a third party.
Your participation in this study is voluntary. You have the right to withdraw at any point during the study, for any reason, and without any prejudice.

Welcome Screen: Identical across treatments

Task instructions
You will now independently perform a task during 15 minutes. This is an important task that is often used to measure people's talents. Many scientific studies have found that people who do well in a task like this are more successful in professional life than people who do less well. In a previous session, students like you performed the same task. Most of them gave between 9 and 17 correct answers.
The task is as follows. You will see two matrices on the computer screen. Each matrix has 10 rows and 10 columns and is filled with randomly generated numbers.
Your job is to find the largest number in each of the two matrices and then to add them up. You are not allowed to use calculators, but you can use paper and pencil.

Example:
The largest number in the left matrix above is 92. The largest number in the right

What is your first name?
What is your last name?

Feedback
The total number of your correct answers together with your name will be displayed on the shared screen after the study is finished. This information will be public, hence all participating students will see it.
At the end of the study, the results will be shown in a This information will NOT be recorded and will NOT be made available to the other participating students in any other way.

Feedback Instruction Screen: NoGoal Treatment
What is your first name?
What is your last name?

Goal instructions and feedback
Before you perform the task, you will be asked to set a goal. Your goal and the total number of your correct answers together with your name will be displayed on the shared screen after the study is finished. This information will be public, hence all participating students will see it.
At the end of the study, the results will be shown in a This information will NOT be recorded and will NOT be made available to the other participating students in any other way.