Social power and dimensions of self-control: Does power benefit initiatory self-control but impair inhibitory self-control?

Abstract People in power positions should be able to control their impulses and act in line with long-term goals. However, two influential theories disagree as to whether power is conducive or detrimental to exercising self-control. We propose to resolve this contradiction by distinguishing between initiatory (“start”) and inhibitory (“stop”) self-control components that may be differentially affected by social power. Ninety-five female participants were randomly assigned to either a powerful role (interviewer) or a powerless role (applicant) and interacted in a simulated job interview (i.e. a modified Trier Social Stress Test). They then completed two inhibitory (d2 Test of Attention and emotion regulation) and two initiatory (handgrip and creative problem-solving) self-control tasks. We tested the hypotheses that social power benefits task performance if the task requires start self-control but impairs task performance if the task requires stop self-control. Although the power manipulation strongly affected participants’ sense of power, it did not significantly affect self-control performance. Considering that this preregistered study had 80% power to detect an effect of d = 0.64, we conclude that the population effect size is smaller than that.

ABOUT THE AUTHORS Sonja Heller is a PhD student in social psychology at the University of Zurich, Switzerland. She holds a Bachelor of Science in Economics and a Master of Science in Social, Organizational and Business Psychology. In her dissertation project she examines the relationship between social power and responsible behavior.
Florence Borsay is an MSc student in Social, Organizational and Business Psychology at the University of Zurich. She holds a Bachelor of Science in Psychology from the University of Fribourg. She helped develop and conduct this study and wrote her Master's thesis on data collected within this project.
Johannes Ullrich is Professor of social psychology at the University of Zurich. He holds a PhD in psychology from the University of Marburg, Germany. His research is guided by the ideas that social psychological experiments can make the world a better place, and that good experiments do so whether the results are statistically significant or not.

PUBLIC INTEREST STATEMENT
It seems desirable that people who have power over us have good self-control. They should be disciplined, self-possessed, persistent and focused on strategic long-term goals. But does the experience of power change their ability to exercise self-control, and if so, how? The authors report a study from a research program which tackles this question. The key idea from their theoretical integration of self-control and power literatures is that characteristics of the task at hand may explain why the powerful have sometimes been found to have more self-control and sometimes less. In a role-playing experiment, the authors reliably increased their female participants' sense of power. However, they did not observe similarly strong effects on self-control as reported in the published literature. Although power may affect self-control and this may depend on task characteristics, the authors conclude that such effects are not large.

Introduction and theoretical background
In general, the stereotypical view of powerful people's self-control performance is inconsistent. On the one hand, we tend to see powerful people as following their impulses; frequently cited examples are sex scandals involving top-ranking politicians (e.g. Bill Clinton, Dominique Strauss-Kahn). On the other hand, it belongs to the image of powerful people that they are always thinking at least 10 steps ahead and aligning their actions with strategic long-term goals (such as Francis Underwood, portrayed by Kevin Spacey, in the popular TV series House of Cards). The question arises as to whether experiencing social power (i.e. control over the material and immaterial resources of others; see Galinsky, Gruenfeld, & Magee, 2003; Keltner, Gruenfeld, & Anderson, 2003) benefits or impairs self-control performance.
Psychological research to date is not able to answer this question. The social distance theory of power (Magee & Smith, 2013) postulates that power should improve self-control performance. In contrast, the approach/inhibition theory of power (Keltner et al., 2003) suggests that power should worsen self-control performance. Empirical findings are similarly contradictory. On the one hand, participants in high-power conditions are better able to focus on the task at hand (Guinote, 2007b; Smith, Jostmann, Galinsky, & van Dijk, 2008), and they persist longer and make more attempts to solve (unsolvable) tasks (Guinote, 2007a) than participants in low-power conditions. On the other hand, they are worse at suppressing thoughts (Guinote, 2007c) and withstanding impulses to act (Scholl & Sassenberg, 2015), and they take more risks (Anderson & Galinsky, 2006).
If self-control is a unitary construct, these findings cannot all be true. We propose to resolve this contradiction by considering self-control to be a two-dimensional construct consisting of both initiatory and inhibitory components. We call the predominantly initiatory component start self-control and propose that it is needed for initiating and maintaining goal-directed behavior. Stop self-control, the predominantly inhibitory component, is needed for suppressing behavior or refraining from acting impulsively.
In the following, we will first develop our two-dimensional view of the self-control construct and then review previous research on power and self-control with this distinction in mind. We then test the hypotheses that social power benefits task performance if the task requires start self-control but impairs task performance if the task requires stop self-control.

Operationalization and dimensionality of self-control
Self-control (also called self-regulation, willpower, effortful control, among other terms; Duckworth & Kern, 2011) has been defined as "the capacity for altering one's own responses, especially to bring them into line with standards such as ideals, values, morals, and social expectations, and to support the pursuit of long-term goals" (Baumeister, Vohs, & Tice, 2007, p. 351). Impulsivity can be seen as the opposite of self-control. It describes "the tendency to act on a whim and, in so doing, disregard a more rational long-term strategy for success" (Madden & Bickel, 2010, p. 11) and "has been defined variously as an inability to wait, a tendency to act without forethought, insensitivity to consequences, an inability to inhibit inappropriate behaviors" (Reynolds, Ortengren, Richards, & de Wit, 2006, p. 306).
To better understand the inconsistent findings regarding the effects of power on self-control, we turned to the operationalization of self-control and impulsivity. For measurement of both impulsivity and self-control there are psychometric and experimental approaches. Examples of the psychometric approach are the Eysenck Impulsiveness Scale (Eysenck, Easting, & Pearson, 1984), Self-Control Scale (Tangney, Baumeister, & Boone, 2004), and Barratt Impulsiveness Scale Version 11 (Barratt, 1985). Typical tasks administered in the laboratory are tasks targeting executive functions, such as the Go/No-Go task, Stroop task, or delay of gratification/delay discounting tasks (Smith & Hantula, 2008).
The idea that self-control measures may reflect a smaller number of higher-order constructs is not new. Looking at both psychometric and experimental approaches, we find several classifications for both. Table 1 presents an overview of the different conceptualizations. First, self-report measures have repeatedly been reported to yield a two-factorial structure (de Ridder, de Boer, Lugtig, Bakker, & van Hooft, 2011); those authors first proposed to split the self-control construct into an inhibitory and an initiatory component. A recent examination of the factor structure of the Brief Self-Control Scale (Maloney, Grawitch, & Barber, 2012) revealed a structure consisting of two significantly negatively correlated factors, referred to as restraint, "the tendency to resist temptation" (p. 113), and impulsivity, "acting on spontaneous thoughts and feelings" (p. 113).
Second, behavioral tasks can also be classified according to different taxonomies. Hagger, Wood, Stiff, and Chatzisarantis (2010) proposed two content-related classifications of tasks (without testing them empirically): (a) according to the demands placed on cognitive or affective processing systems, and (b) according to task content (controlling attention, emotions, thoughts, and impulses; cognitive processing; choice and volition; and social processing). The idea of multidimensionality of behavioral self-control measures also received empirical support. A meta-analytic principal components factor analysis demonstrated that laboratory tasks typically used to measure impulsivity constitute four factors: inattention, inhibition, impulsive decision-making, and shifting (Sharma, Markon, & Clark, 2014).
Third, taking both self-report measures and behavioral tasks into consideration simultaneously, multidimensional conceptualizations emerge as well. A meta-analysis by Duckworth and Kern (2011) used a fourfold classification of self-control measures (executive functioning, delay of gratification/temporal discounting, self-report and informant-report questionnaires). Work on primary data usually finds that self-report measures load on a single factor, whereas the behavioral measures constitute more than one factor: (1) Self-report, impulsive decision-making, impulsive disinhibition (Reynolds et al., 2006).

The start/stop distinction
The evidence presented above clearly supports the idea of multidimensionality, although the taxonomies differ considerably. Self-report measures seem to be categorized best by a simple functional taxonomy. In contrast, behavioral measures seem to be organized best according to more multifaceted taxonomies based on task content. Here we propose to apply the functional distinction between inhibitory and initiatory self-control to organize the variety of behavioral measures. We acknowledge that a simple dichotomy may not suffice to explain method variance, but it may constitute an important step forward in developing the theoretical link between power and self-control.
These two dimensions have already received preliminary empirical support. For instance, within the health domain, inhibitory self-control seems to be important for behaviors that require stopping a response, such as limiting intake of foods high in saturated fat, whereas initiatory self-control (updating) is important for carrying out behaviors that require the initiation of a response, such as consuming fruit and vegetables (Allom & Mullan, 2014). In the organizational context, de Boer, van Hooft, and Bakker (2015) drew on these two types of self-control to predict contextual performance at the workplace: Results showed that only initiatory control was positively related to organizational citizenship behavior, personal initiative, and proactive coping. Both inhibitory control and initiatory control were negatively related to counterproductive work behavior.
Given this initial evidence for the predictive validity of the proposed two factor conceptualization, we categorize laboratory tasks used to measure tendency to act without thinking and suppression of impulses as behavioral operationalization of inhibitory (stop) self-control and laboratory tasks measuring persistence or capability to overcome one's weaker self as operationalization of initiatory (start) self-control. As noted above, within the research on social power and self-control there are numerous theoretical and empirical contradictions. We propose to begin organizing the different findings in the literature by explicitly considering previous classifications of self-control in other areas of research.

Effects of social power on dimensions of self-control
The two most influential theories of power make different predictions regarding the effects of power on self-control. More specifically, according to the approach/inhibition theory of power (Keltner et al., 2003), high power activates the behavioral approach system, which is sensitive to rewards and opportunities. Hence, high power should trigger approach-related positive affect, attention to rewards, automatic cognition, and disinhibited behavior. Correspondingly, due to their heightened attention to rewards and their drive to experience these rewards immediately, powerful people should show relatively poor self-control.
According to the social distance theory of power (Magee & Smith, 2013), high-power individuals feel more subjectively distant from others than low-power individuals. Based on assumptions of construal level theory (Trope & Liberman, 2010), this greater perceived social distance should lead to more abstract mental representation (i.e. higher level construal). High-level construal was shown to benefit self-control (e.g. Fujita, Trope, Liberman, & Levin-Sagi, 2006; Schmeichel, Vohs, & Duke, 2011). Accordingly, due to their use of high-level construal of goals and situations, powerful people should show relatively better self-control.
When two theories make different predictions and there is empirical evidence in support of both, the question arises as to what the conditions are under which one or the other theory is correct. We used the distinction between start self-control and stop self-control to structure published findings on power and self-control (for a detailed overview, see Table 2). It appears that regardless of the power manipulation (episodic priming, conceptual priming, not enacted role assignment, impact of opinion), participants in the high power condition showed better performance in start self-control tasks such as the dichotic listening task, problem-solving tasks, the Stroop task, and temporal discounting, whereas in most cases they performed worse in stop self-control tasks such as thought suppression, action planning, and deliberation tasks. We decided not to include findings on powerful people's increased risk-taking propensity (e.g. Anderson & Galinsky, 2006; Carney, Cuddy, & Yap, 2010; Jordan, Sivanathan, & Galinsky, 2011), because risk-taking is not a common operationalization of the self-control construct. However, as risk-taking is sometimes used as a measure in impulsivity research and entails acting without prior deliberation, we consider this to be indirect evidence in favor of our hypothesis of reduced stop self-control in powerful individuals.

Table 2. Studies on power and self-control classified according to inhibitory and initiatory self-control
Notes: HP = high power condition; LP = low power condition; C = control condition.

Method
All study materials and procedures can be accessed via https://osf.io/u9xa2/. This study was approved by the responsible ethics committee of the Faculty of Philosophy at the University of Zurich. Participants gave their written consent to take part in the study.

Power analysis
We planned to recruit 100 participants, but time and resource constraints only allowed us to collect data from 95 participants. A power analysis suggested that this sample size affords 80% statistical power to detect an effect of at least d = 0.64 with an experimentwise alpha of 5%, assuming one-sided tests. We will therefore declare an effect significant if it is in the expected direction and p < 0.025. This criterion follows from one-sided testing and from spreading the experimentwise alpha error across the four hypothesis tests that follow from our use of four self-control tasks: a one-sided alpha of 0.05/4 = 0.0125 per test corresponds to a two-sided p-value below 0.025. Otherwise we will conclude that the effect is less than the minimal detectable effect size.
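The arithmetic behind this criterion can be checked with a rough sketch. The code below uses a normal approximation rather than the noncentral t distribution, and the group sizes of 47 and 48 are an assumption made only for illustration (the text reports the total of 95, not the split). The function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n1, n2, alpha_one_sided):
    """Normal-approximation power of a one-sided two-group comparison."""
    z = NormalDist()
    # Expected value of the test statistic under a true effect of size d.
    noncentrality = d * sqrt(n1 * n2 / (n1 + n2))
    z_crit = z.inv_cdf(1 - alpha_one_sided)
    return z.cdf(noncentrality - z_crit)


# Experimentwise one-sided alpha of .05 spread across four tests.
alpha_per_test = 0.05 / 4
# Equivalent cutoff for a two-sided p-value as reported by most software.
two_sided_cutoff = 2 * alpha_per_test

# ASSUMPTION: 47 and 48 participants per role (95 in total).
power = approx_power(d=0.64, n1=47, n2=48, alpha_one_sided=alpha_per_test)
print(f"per-test one-sided alpha: {alpha_per_test}")
print(f"two-sided p cutoff:       {two_sided_cutoff}")
print(f"approximate power:        {power:.2f}")
```

Under these assumptions the approximation lands close to the reported 80%; an exact calculation based on the t distribution would give a slightly lower value.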

Participants
We recruited 95 women in the age range 19 to 48 years (M = 25.36, SD = 6.12; 82 students) from the pool of psychology students and interested community members at the University of Zurich to take part in a study on personality and interpersonal behavior. We decided in favor of an all-female sample to: (1) avoid possible confounds induced by mixed gender interactions in the role play task, and (2) minimize the risk of self-handicapping effects (e.g. in the creative problem-solving task), as these effects seem to be less pronounced for women than for men (McCrea, Hirt, & Milner, 2008). Participants were paid either 30 Swiss francs (about 30 US dollars) or received partial course credit. One participant misunderstood the creative problem-solving task (i.e. rather than crossing the lines, she retraced the lines) and was therefore excluded only for the analyses of the creative problem-solving data.

Procedure
Before coming to the laboratory, participants were asked to complete an online questionnaire that contained questions on demographics, personality (BFI-15; Gerlitz & Schupp, 2005), and potential moderator variables. When they arrived at the laboratory, participants learned that they and another participant (who was in fact a confederate) were scheduled for the same experimental session and would work on one task (among others) together. At the very beginning of the first part of the experimental session, a female experimenter administered the baseline measure of the handgrip task for each participant separately. Then, a modified version of the Trier Social Stress Test (TSST; Birkett, 2011) was used to manipulate the feeling of power. Participants were randomly assigned to a powerful role (as interviewer) or a powerless role (as applicant) and interacted in a simulated job interview (the self-presentation part of the TSST) with the female confederate. At the end of the first part of the experimental session, participants completed a manipulation check and several filler questions concerning mood and feelings during the interaction. The first part of the session took approximately 25 min.
To guarantee that the experimenter who administered the dependent measures was not aware of the experimental condition, participants were instructed by a second female experimenter to complete four well-established self-control measures: For stop self-control we used the d2 Test of Attention (Brickenkamp, 1994) and an emotion regulation task (avoiding emotional displays and facial expressions while watching a funny, a distressing, and a boring short film in counterbalanced order). For start self-control, we used the handgrip task and an ostensible test of creative problem-solving abilities. These four tasks were administered in four different orders, so that every task was once in the first, the second, the third, and the last position. Subsequently, participants completed a second manipulation check and several filler questions concerning mood and their overall impressions of the experiment. Participants received their compensation and were debriefed and thanked. The second part of the session took approximately 50 min.
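One way to obtain four orders in which every task appears once in each position is a cyclic Latin square. The sketch below is illustrative only: the text does not state which four orders were actually used, and the task labels and the function name `cyclic_orders` are hypothetical.

```python
# Hypothetical labels for the four self-control tasks.
TASKS = ["d2", "emotion_regulation", "handgrip", "creative_problem_solving"]


def cyclic_orders(tasks):
    """Return len(tasks) orders; each task occupies each position exactly once."""
    n = len(tasks)
    # Order number `start` begins with task `start` and wraps around the list.
    return [[tasks[(start + pos) % n] for pos in range(n)] for start in range(n)]


orders = cyclic_orders(TASKS)
for number, order in enumerate(orders, start=1):
    print(f"Order {number}: " + " -> ".join(order))
```

A cyclic square balances position only. Each task is always followed by the same successor, so it does not balance carry-over between adjacent tasks.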

Power manipulation
To create a highly involving and naturalistic hierarchically structured situation we used a modified version of the TSST. The TSST generally consists of a waiting period upon arrival, anticipatory speech preparation, speech performance, and verbal arithmetic performance periods, followed by one or more recovery periods. Our implementation of this paradigm differed in the following main aspects from the standard procedure (see Birkett, 2011): First, we were not interested in assessing stress hormone reactivity, so we omitted both the waiting period at the beginning and the verbal arithmetic performance period at the end. Second, the interviewers wore their normal clothes instead of lab coats to keep the situation more naturalistic and provide a better fit with the cover story.
Participants believed that the dyadic task was a mock interview in which one person would play the role of the interviewer, a professional recruiter, and the other person the role of the job applicant. The research assistant would play the role of assistant to the interviewer. Depending on the condition, the real participant would either act as the interviewer (high power role) or the applicant (low power role). Participants were informed that the applicant had 7 min to mentally prepare a 4-min speech in which she presents herself as an ideal candidate for her dream job. Her speech was videotaped. In the preparation period, the interviewer had to determine evaluation criteria for the presentation and could prepare up to three questions for the applicant. Participants believed that the role allocation was randomly determined; in fact, it was randomly determined before the experiment started, and the real participant drew either one of two interviewer lots or one of two applicant lots.
The research assistant brought the applicant to an adjacent room where she had to mentally prepare her self-presentation. Applicants had to use an annoyingly ticking egg timer to monitor their 7-min preparation time. They were informed that the interviewer would pick them up for the presentation.
In the meantime, the interviewer and the research assistant prepared the setting, arranging chairs and table so that both of them sat on the same side of the table, placing the chair for the applicant to face them, and positioning the video camera. When the interviewer was a real participant (i.e. in the high power condition), she was given written instructions that summarized the role requirements in order to make her feel as comfortable as possible in her role. The instructions summarized the interviewer's goals (find out possible strengths and weaknesses, evaluate the quality of the presentation) and procedural rules (e.g. ask for what job the applicant is applying, do not interrupt the presentation, prompt the participant to continue speaking if she remains silent for more than 10 s, take notes if needed, prepare up to 3 questions).
Then participants played their respective roles: The interviewer welcomed the applicant, the applicant gave her speech and answered questions, and the interviewer thanked the applicant and brought her back to the preparation room. The interviewer and the research assistant took approximately 3 min to discuss the presentation in a way that allowed the interviewer to feel like the person in charge. After this discussion, the interviewer was asked to evaluate the applicant. The interviewer was informed that her evaluation was important because the applicant's chances of winning a bonus would depend on the evaluation, whereas the interviewer's bonus would be randomly decided.

Manipulation checks
After the interview role play (i.e. the power manipulation), we assessed how powerful and in charge of the situation each participant felt during their interaction. On a scale from 0 (not at all) to 5 (completely) participants indicated how much they agreed with six self-descriptive adjectives: "powerful," "self-confident," "unassertive" (reverse-coded), "subordinate" (reverse-coded), "responsible," and "competent." These items were averaged to build an indicator of felt power. Cronbach's alpha was α = .81. At the very end of the experiment, we asked the participants again how powerful they felt (0 = not at all, 5 = completely) in order to have an indicator of the stability of the power manipulation.
http://dx.doi.org/10.1080/23311908.2017.1288351
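The construction of the felt-power index can be sketched as follows. The responses are invented purely for illustration and are not data from the study. Reverse-coding is assumed to be 5 minus the rating, which follows from the 0 to 5 response scale. The function names are hypothetical.

```python
from statistics import mean, variance

ITEMS = ["powerful", "self-confident", "unassertive",
         "subordinate", "responsible", "competent"]
REVERSED = {"unassertive", "subordinate"}
SCALE_MAX = 5  # ratings run from 0 (not at all) to 5 (completely)


def recode(response):
    """Return item scores in ITEMS order with reverse-coded items flipped."""
    return [SCALE_MAX - response[item] if item in REVERSED else response[item]
            for item in ITEMS]


def felt_power(response):
    """Average of the six recoded items."""
    return mean(recode(response))


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of recoded item scores (one row per person)."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# INVENTED responses, listed in ITEMS order, for illustration only.
invented = [[4, 4, 1, 0, 3, 4], [2, 3, 2, 3, 2, 3],
            [1, 2, 4, 4, 2, 2], [5, 4, 0, 1, 4, 5]]
responses = [dict(zip(ITEMS, values)) for values in invented]
rows = [recode(response) for response in responses]

print([round(felt_power(response), 2) for response in responses])
print(round(cronbach_alpha(rows), 2))
```

Alpha is computed on the recoded items; leaving the two reverse-coded items unflipped would understate the internal consistency of the scale.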

Behavioral measures
We used four tasks to represent the two self-control components. We chose well-established measures in the self-control and ego-depletion literature (e.g. Hagger et al., 2010) that are easy to administer. To our knowledge, these tasks have not yet been used in conjunction with any power manipulation.
Handgrip task.
This task was used as an operationalization of start self-control. We implemented this task based on the description by Muraven, Tice, and Baumeister (1998). The apparatus used for this task was a hand exerciser consisting of two handles and a metal spring. Participants were told to squeeze the handles together and maintain that grip for as long as they could with their dominant hand. A small eraser was inserted between the handles so that when the grip relaxed the eraser would fall down, thereby providing a clear audio-visual and objective signal to stop timing. The experimenter timed how long the participant squeezed the handles (i.e. endurance in seconds). Participants completed a baseline measure at the beginning of the experiment and a second measure after the power manipulation.
We conceptualize persistence as an integral component of start self-control. The handgrip task requires physical stamina and accordingly becomes taxing with time. Therefore, a person must exert self-control to continue squeezing the handles despite the uncomfortable condition. The longer participants kept on squeezing, the better their self-control performance on this task.

Creative problem-solving task.
This task was used as an operationalization of start self-control. We created this task using elements from tasks used by Vohs et al. (2008) and by Guinote (2007b). Participants were given time to study for an ostensibly upcoming creative problem-solving abilities test that was framed as a predictor of many desirable life outcomes. Additionally, participants were told of past research showing that being familiar with the test materials significantly improved performance on the test and that a practice period of 15 min had proven to be sufficient. The experimenter announced that she would leave the room for 15 min and gave participants a sample item for practice. This alleged sample item was a geometrical form that looked like the contours of a building. Participants had to cross each wall only once in one continuous line. They were told that they could use as many copies as they needed to find the solution.
However, to make the task more difficult in terms of self-control, participants were also allowed to read magazines or surf the Internet (magazines and an iPad were on the table) if they did not wish to work on the sample item for the entire practice period. We used the number of attempts that participants made to solve the sample item as an indicator of task performance.
Focusing on the task at hand and getting (unpleasant) work done is part of our definition of start self-control. The creative problem-solving task requires participants to start and keep working on a frustrating task, while distraction in the form of pleasant activities to occupy their time is nearby. Therefore, the participant must exert self-control to stay focused and continue working on the task.

d2 Test of Attention.
This task was used as an operationalization of stop self-control. It is a timed test of selective attention/concentration, and it measures processing speed, rule compliance, and quality of performance in response to the discrimination of similar visual stimuli. The test consists of 14 lines, each comprised of 47 characters (the letters d and p) with one to four dashes, for a total of 658 items. The participant must scan each line and cross out each d with two dashes. Participants had 4 min to work on as many characters as possible. We used the error rate (i.e. the total number of errors divided by the number of attempted items) as an indicator of task performance.
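The error-rate indicator can be sketched as below. The sketch assumes that total errors are the sum of omissions (targets left unmarked) and commissions (non-targets crossed out), which the text does not spell out. The `Item` record and the function names are hypothetical, and the example sheet is invented.

```python
from typing import NamedTuple


class Item(NamedTuple):
    letter: str   # "d" or "p"
    dashes: int   # one to four dashes
    marked: bool  # True if the participant crossed the item out


def is_target(item):
    """Targets are the letter d accompanied by exactly two dashes."""
    return item.letter == "d" and item.dashes == 2


def error_rate(attempted):
    """Errors divided by attempted items (ASSUMED: omissions + commissions)."""
    if not attempted:
        raise ValueError("no attempted items")
    omissions = sum(1 for item in attempted if is_target(item) and not item.marked)
    commissions = sum(1 for item in attempted if not is_target(item) and item.marked)
    return (omissions + commissions) / len(attempted)


# INVENTED responses to the first few items of a line.
sheet = [
    Item("d", 2, True),   # correct hit
    Item("d", 2, False),  # omission
    Item("p", 2, True),   # commission
    Item("d", 3, False),  # correct rejection
]
print(error_rate(sheet))
```

Dividing by the number of attempted items rather than by all 658 items keeps the indicator comparable between participants who work at different speeds.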
Stop self-control entails the suppression of impulses. On the d2 Test, participants should be tempted to cross out all ds irrespective of the number of accompanying dashes, because the visual distinction between ds and ps is far more obvious than the distinction between the target stimuli and all other kinds of characters. Self-control is required to ignore distractors and override the tendency to act naturally (cross out all ds).

Emotion regulation task.
This task was used as an operationalization of stop self-control. Participants were seated in front of an iMac and asked to watch a video that contained three short films separated by a short break of 12 s. To ensure that the effects were due to self-control rather than the particular emotional response, we used positive, negative, and boring stimuli. Based on the results of our pretest, we chose an Ice Age compilation (5:56 min), the trailer for Amityville Horror (2:12 min), and a documentary on traffic near Lucerne's main train station (3:48 min). Participants were told that the iMac was also filming their faces while they watched the video. They were instructed to watch the video and not show any emotions, so that another person watching the filming of their faces would not be able to guess which video they were watching.
Two raters blind to the experimental condition rated the emotional expressiveness of participants' faces on a 5-point scale (1 = absolutely non-expressive, 5 = very expressive). The interrater reliability (consistency definition) on the basis of mean ratings over three emotions per participant was ICC = .66. We used the mean rating of two raters across the three films as an indicator of task performance.
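A consistency-type intraclass correlation can be computed from the mean squares of a two-way participants-by-raters layout. The sketch below returns both the single-rater and the mean-of-raters version, because the text does not say which of the two the reported ICC refers to. The ratings are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean


def icc_consistency(ratings):
    """Consistency ICCs for rows = participants and columns = raters.

    Returns (single-rater ICC, mean-of-raters ICC).
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)

    ms_rows = ss_rows / (n - 1)
    # Residual variance after removing participant and rater main effects.
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average


# INVENTED mean expressiveness ratings (two raters, five participants).
ratings = [[2, 3], [1, 1], [4, 3], [3, 4], [2, 2]]
single, average = icc_consistency(ratings)
print(round(single, 2), round(average, 2))
```

Under the consistency definition, a rater who is systematically stricter than the other does not lower the coefficient, because rater main effects are removed before the residual variance is computed.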
Keeping oneself from doing something one would want to do or one would naturally do is a defining part of stop self-control. In this emotion regulation task, videos prompted the participants to show emotions, but the participants were instructed not to do so. They had to exert self-control to override the natural tendency to spontaneously express their emotions.

Results
After data collection but prior to the analyses, we preregistered an analysis plan on the Open Science Framework (osf.io/u9xa2), specifying scale construction, decision rules, and planned confirmatory analyses.

Start self-control
We regressed the second handgrip measurement on the baseline measurement, saved the residuals, and used this residualized performance as a dependent variable in a t-test. Contrary to our hypothesis, interviewers (residualized handgrip performance in sec: M = 2.1, SD = 76.74) did not show more perseverance in the handgrip task than applicants (residualized handgrip performance in sec:

Discussion
Previous research has found that social power both benefits and harms self-control performance, but a theoretical explanation for this paradox was lacking. We suggested that social power would increase start self-control, which is necessary for initiating and maintaining behavior, but would decrease stop self-control, which is necessary for suppressing behavior or refraining from acting impulsively. This distinction helped us organize the contradictory findings in the published literature on power and self-control (see Table 2). However, the main goal of this study was to use the start/stop distinction to test a priori hypotheses regarding the differential effects of power on start self-control vs. stop self-control. To this end, we used a role play manipulation that allowed us to experimentally create large differences in participants' feeling of power or powerlessness. We used four tasks to represent the two self-control components. Participants completed the d2 Test of Attention and an emotion regulation task (both considered to be inhibitory) as well as a handgrip task and a creative problem-solving task (both considered to be initiatory).
The results of this pre-registered experiment are non-significant with regard to all four self-control tasks. More precisely, high power and low power participants do not differ in their endurance in the handgrip task, the number of attempts made to solve the problem-solving task, the number of errors made on the d2, and externally rated success in suppressing their emotions. We would like to preface the interpretation and discussion of these results by saying that effects of power on self-control may well exist, but they are unlikely to be large. In fact, the power analysis underlying our experiment allows us to conclude that the effects are most likely smaller than d = 0.64. This is the correct interpretation of non-significant effects in this study and should be kept in mind when we talk in the following more categorically about the presence or absence of effects.
Thus, the results of the present experiment are inconsistent with our predictions regarding the differential effects of social power on start/stop self-control. However, rather than merely disconfirming the direction of the effects postulated by the start/stop distinction (i.e. whether power decreases or increases self-control), our results more generally call into question the existence of the effects of power on self-control as postulated by the social distance theory of power (Magee & Smith, 2013) and the approach/inhibition theory of power (Keltner et al., 2003).

Possible explanations for the null findings
An effect depends on the outcome variable, the recipients of a treatment, the setting, the time, and the treatment (Reichardt, 2006). Reasons for the absence of an effect might be found in one or more of these five factors.
First, with regard to the outcome variable, we must state that every empirical study involves auxiliary assumptions regarding the operationalization of the outcome. It is possible that the chosen self-control measures are not good indicators of the two postulated self-control dimensions. For instance, the d2 Test of Attention might be considered to be an operationalization of start self-control, because the task (to focus on target stimuli and ignore distractors) is highly similar to the Stroop task, for example, which we classified as a start self-control measure in our literature review. However, even if our measures do not represent the initiatory and inhibitory component well, we used four operationalizations that are well established within the self-control and ego-depletion literature. Prior studies reported that these measures were sensitive to interindividual differences and experimental manipulations (e.g. Friese, Messner, & Schaffner, 2012; Guinote, 2007b; Muraven et al., 1998; Vohs, Baumeister, & Ciarocco, 2005). Accordingly, if self-control had been affected by our treatment, we would have expected to find variability in these measures as well.
Second, with regard to the recipients of the treatment (i.e. our participants), a possible alternative explanation is that we used a sample of female students, whereas most previous studies have relied on mixed gender samples. As stated in the introduction, we were interested in the participants' willingness to show self-control (not their ability), which we assumed could be altered by a power manipulation. Probably, highly conscientious participants would want to do their best in an experiment, irrespective of contextual factors such as manipulations; this would minimize variability in willingness to exercise self-control. Previous findings show that across different cultures, women score higher than men on Conscientiousness (Schmitt, Realo, Voracek, & Allik, 2008), and psychology students tend to be more conscientious than students majoring in other disciplines (e.g. Vedel, Thomsen, & Larsen, 2015). Indeed, our sample reported relatively high scores on conscientiousness (on a 7-point scale: M = 5.31, SD = 0.93). However, previous psychological research found interindividual differences in state self-control performance in predominantly female student samples (e.g. Martijn et al., 2007; Vohs et al., 2008), so that this is unlikely to be the reason for our null findings.
Third, with regard to the setting, we have to note that for administrative reasons the experimental sessions took place in two different rooms. However, we took great care to furnish and prepare the rooms as similarly as possible. Furthermore, it is very likely that our setting is highly similar to the settings of previous psychological research within a university context in which studies detected differences in self-control measures. Therefore, we do not think that the setting is responsible for the absence of power effects.
Fourth, with regard to the time factor, participants completed the experiment at different times of the day. Although it is possible that self-control performance might vary over the course of the day, for example due to ego-depletion (Kouchaki & Smith, 2014), we expect that this potential time effect was averaged out, as we tested participants both in the mornings and afternoons. Time also refers to the time lag between treatment and measurement of the outcome variable. In this regard, it is reassuring to note that our manipulation check indicates that participants in the high power condition still felt more powerful at the end of the experiment than participants in the low power condition did. This suggests that it is unlikely that the effect of the interview role play was too short-lived to affect all of the outcome variables.
Fifth, with regard to the treatment, we have to acknowledge that this study is the first study on power and self-control to manipulate the feeling of power via a structural manipulation. Previous research in this domain adopted experiential and conceptual manipulations, such as episodic priming, conceptual priming, and non-enacted role assignment (see Table 2). Diverse power manipulations could, mediated by distinct processes, have different effects on the same dependent variable. Experiential and conceptual manipulations are likely to activate the cognitive power network (see Tost, 2015), which might be responsible for the previous findings on the relationship between power and self-control. However, the role play manipulation that we used might have failed to activate this power network. Nevertheless, the results of our manipulation check support the conclusion that we succeeded in affecting participants' experience of power. Hence, we doubt that the absence of effects on self-control performance is due to our power manipulation.

Strengths of this study
This study has two notable strengths. First, this study employed a newly created power manipulation that contains important elements of previous role play manipulations. For example, the powerful participant has the mandate to direct the powerless participant and may evaluate the performance of the powerless participant with real consequences for the powerless participant. In addition, our manipulation has considerable mundane realism and fits the background of our participants, as most of them have probably already experienced comparable interview situations, either as candidates or interviewers or in both roles. We also see this mock interview as a suitable situation, because participants (mostly students and employed adults of the same age) in a prior study (Heller & Ullrich, in press) often provided interview situation examples when asked to describe a situation in which they had or lacked power. Moreover, this manipulation creates a strong situation in which the role requirements are clear, and therefore, the participant's personality dominance should not have a strong effect (as may be possible in open situation role plays with a non-naturalistic task, e.g. in Schmid Mast, Jonas, & Hall, 2009, Study 1). Second, the statistical conclusion validity of this experiment is high, given the a priori power analysis and the preregistration of hypotheses and planned analyses. To reiterate our main conclusion, the non-significant results mean that the population effect is most likely smaller than the one assumed in our power analysis, i.e. d = 0.64.
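To illustrate the sensitivity argument, the following minimal sketch approximates how group size, effect size, and statistical power relate for a two-group comparison. It is not the authors' power analysis: it assumes a two-sided independent-samples comparison at alpha = .05 and uses a normal approximation, neither of which is specified in this section, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for the normal approximation (assumption).
_STD_NORMAL = NormalDist()


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate two-sided power for two equal groups and effect size d."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = d * sqrt(n_per_group / 2)
    # Probability of exceeding the critical value in either tail.
    upper_tail = _STD_NORMAL.cdf(noncentrality - z_crit)
    lower_tail = _STD_NORMAL.cdf(-noncentrality - z_crit)
    return upper_tail + lower_tail


def approx_n_per_group(d, power=0.80, alpha=0.05):
    """Approximate per-group sample size needed to detect d with the given power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


if __name__ == "__main__":
    n = approx_n_per_group(0.64)
    print(f"per-group n for 80% power at d = 0.64: {n}")
    print(f"approximate power at that n: {approx_power(0.64, n):.3f}")
```

Under these assumptions, the sketch yields roughly 39 participants per group for 80% power at d = 0.64; an exact calculation based on the t distribution would give a slightly larger number. Smaller population effects would require considerably larger samples to be detected with the same power.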

Conclusion
In a conceptual replication, we observed effects that are inconsistent with the large effects of social power on self-control reported in the published literature. After considering several alternative explanations, the most plausible one is that the existing literature has overestimated these effects. This seems all the more plausible in light of the results of the large reproducibility project (Open Science Collaboration, 2015). That project aimed to replicate more directly a great number of effects from top journals. Effect sizes of the replications were consistently smaller than the original, published effect size if the original effect size was greater than d = 0.5, indicating a potential publication bias. Given the practical importance of the topic, future research should further explore the relationship between social power and self-control and try to identify possible moderators. Our newly developed role play power manipulation seems to be a promising starting point, in that it produces substantial differences in felt social power. With regard to the self-control operationalization, it might be worth investing further effort in validation of the start/stop distinction proposed here and sampling additional tasks that capture these aspects of self-control.