1 Introduction

Escalation of commitment to a failing course of action can be a costly decision error, both at an individual and at an organizational level (Brockner, 1992; Keil et al., 2000a; Sleesman et al., 2012; Staw, 1976). One of the reasons for such escalation is the sunk cost fallacy (Arkes & Blumer, 1985; Johnstone, 2002; Staw, 1976), which refers to the tendency of people to be more likely to invest in a project if they have invested in it previously. Various other project, psychological, social, and organizational factors have been shown to drive escalation of commitment as well (Hollar et al., 2012; Sleesman et al., 2012). A study by Hollar et al. (2012) suggests that, amongst these, psychological factors are the strongest driver of escalation of commitment behavior. A potentially important psychological factor that has remained unexplored thus far is construal level.

Construal level involves the degree of psychological distance that people experience at the time of making a decision (Trope & Liberman, 2010). Psychological distance increases as the object that is being mentally construed is further away (e.g., temporally, geographically, socially or in hypotheticality). In this paper, we investigate how construal level affects escalation of commitment.

Consistent with the widely held view of construal level as a primed effect, we employed a commonly used prime for manipulating this construct in a laboratory experiment. In addition, we measured subjects’ construal levels using the Behavior Identification Form (BIF) (Vallacher & Wegner, 1989). As explained later, while we found no evidence for the primed effect of construal level on escalation, we did find evidence for the effect on escalation of construal level as a trait. To probe the mechanism through which construal level may influence escalation, we examined three potential mediators that have previously been associated with construal level.

The remainder of the paper is organized as follows. First, we offer a brief overview of the relevant literature and the theory base that we draw upon, and introduce our hypotheses and research model. Next, we describe our experimental method. Then, we report the results that were obtained. Finally, we discuss the implications of our study.

2 Theoretical background and hypotheses

Trope & Liberman (2010) provide an overview of the various domains in which Construal Level Theory (CLT) has been examined. Construal level can influence predictions, such as predictions about one’s own future performance (Trope & Liberman, 2010) and the degree to which people base their predictions on historical trends or on recent results (Henderson et al., 2006). Furthermore, construal level has been studied in relation to behavioral intentions and self-regulation (Trope & Liberman, 2010). In addition, construal level has been found to influence evaluations and choices of decision makers (Trope & Liberman, 2010), by influencing the weight that decision makers assign to central and peripheral features when evaluating products, for example (Trope & Liberman, 2000).

While CLT originates from the field of social psychology, it has also received attention from researchers in other fields, such as behavioral economics. Trautmann describes that “psychological distance effects have attracted the attention of behavioral economists in the context of descriptive modeling and behavioral policy. Indeed, psychological distance effects have been shown for an increasing number of domains and applications relevant to economic decision-making” (Trautmann, 2019, p. 1). Amongst others, the theory has been linked to concepts such as temporal discounting (Liberman & Trope, 2003; Trope & Liberman, 2003), attitudes towards future prospects (Onay et al., 2013), preference reversals of consumers (Fiedler, 2007), predicting the choices of others, advice giving, the annuitization of assets and saving for retirement (Leiser et al., 2008).

Of particular interest to our research is the finding, reported in several studies, that decision-makers’ construal levels affect the perceived importance of feasibility aspects relative to desirability aspects of the object in question (Liberman & Trope, 1998; Liberman et al., 2007; Trope & Liberman, 2010). Specifically, these studies found that for decisions regarding more distant activities, desirability considerations were considered to be more important, and feasibility considerations less important, compared to decisions regarding less distant activities. For example, Liberman & Trope (1998) found that in several contexts, such as students’ decisions of whether or not to attend a guest lecture, participants rated the importance of desirability to be higher and the importance of feasibility to be lower when the temporal distance to the event was manipulated to be high (i.e., high construal level), rather than low (i.e., low construal level).

We suggest that adopting a low construal level increases the perceived importance of feasibility of a project, relative to the perceived importance of desirability. If construal level can influence the perceived relative importance of these two factors, then it is likely that this could in turn influence escalation of commitment to a failing project, also known as project escalation.Footnote 1 This may be relevant particularly in situations where the level of feasibility and desirability for a project differs (for example, in situations where the desirability of a project is high but the feasibility is low, or vice versa). Of course, the feasibility and the desirability of a project are both important elements in project decision making. However, when a troubled project has a low level of feasibility but a high level of desirability, the decision of whether or not to continue with said project can depend on how much weight a decision maker places on the feasibility aspect relative to the desirability aspect. The same applies for projects with a high level of feasibility and a low level of desirability. For that reason, it is necessary to consider the importance of project feasibility relative to project desirability.

Project escalation situations often involve projects with desirable outcomes which encounter significant feasibility issues. This is also the case for the project scenario which we employ in our experiment. Under these circumstances, a decrease in the relative importance of feasibility compared to desirability, as a result of adopting a higher construal level, is expected to make decision makers more willing to continue a troubled project (i.e., more willing to escalate their commitment to a failing project). This leads to the following hypotheses.

Hypothesis 1a

People with a higher construal level will find feasibility aspects of a project to be less important relative to desirability aspects (as compared to people with a lower construal level).

Hypothesis 1b

When the importance of feasibility aspects relative to desirability aspects decreases, people become more willing to continue a failing project.

Aside from perceptions of feasibility and desirability, construal level can influence other aspects of evaluations as well. For example, CLT predicts that with a high construal level, people will be able to think of more pros, and fewer cons, when deciding whether or not to perform a certain action (Eyal et al., 2004). Trope & Liberman (2010, p. 452) explain this as follows: “In deciding whether to undertake an action, cons are subordinate to pros. This is because the subjective importance of cons depends on whether or not pros are present more than the subjective importance of pros depends on whether or not cons are present.” In short, if one does not see a reason to perform an action in the first place (pros), then there is little reason to think about the possible disadvantages or problems (cons). As such, pros are associated more with a high construal level and cons with a low construal level.

In prior studies, this has resulted in people with a high construal level being able to bring to mind more pros and fewer cons than people with a low construal level in the same situations (Eyal et al., 2004; Herzog et al., 2007). We expect this effect to also occur in the context of escalation of commitment. Seeing more pros, and fewer cons, for a project is expected to increase a person’s willingness to continue with the project. This leads to the following hypotheses:

Hypothesis 2a

People with a higher construal level can think of more pros for a project.

Hypothesis 2b

People who think of more pros for a project are more likely to continue with the project.

Hypothesis 3a

People with a higher construal level can think of fewer cons for a project.

Hypothesis 3b

People who think of fewer cons for a project are more likely to continue with the project.

Figure 1 summarizes our hypotheses. To test the hypothesized relationships, we conducted a laboratory experiment with students based on a scenario that has been widely used in prior escalation studies (Conlon & Garland, 1993; Garland, 1990; Keil et al., 2000b; Moon, 2001), which we have tailored to fit the context of our study.

Fig. 1 Hypotheses linking construal level to willingness to continue a failing project

3 Methods

One hundred and fifty-four native Dutch undergraduate students participated in our experiment. Each of them was enrolled in an economics program. The average age of the participants was 21 and 31% were female. Sessions were conducted in a laboratory setting with groups of up to 30 students. Upon arrival at the lab, all participants received verbal instructions and were then seated at their cubicles. The experiment was administered digitally; participants recorded their answers on the computers in their cubicles. The average time spent by participants was 23 min and they received a flat fee of €7 for their participation.Footnote 2

3.1 Design

Participants were randomly assigned to one of two treatments: a low or high construal level treatment. They were told that the study was split into two separate parts. After the instructions, participants proceeded to the first part of the study which contained the experimental manipulation as well as the manipulation check. The manipulation was designed to induce either a high or low construal level depending upon the treatment group. The manipulation check and the rest of the experiment were identical across treatments.

3.2 Manipulation

For our treatments, we employed the category/exemplar word manipulation task (Fujita et al., 2006), a commonly used method of construal level manipulation. The manipulation involves a short exercise in which participants are presented with a set of words. In the low construal level treatment, participants were asked to think of specific examples of the presented word. For example, if the word was COMPUTER, then participants might write down LAPTOP or HP as examples. In the high construal level treatment, participants were presented with the same words but were asked to think about the higher level category to which each word belonged. Again, in the example of the word COMPUTER, participants might write down DEVICE or ELECTRONICS as categories. Since the experiment was in Dutch, the instructions and the words were translated from English to Dutch.

3.3 Manipulation check

We used the Behavior Identification Form (BIF) (Vallacher & Wegner, 1989) as a manipulation check to assess participants’ construal levels after the manipulation. The BIF is a short exercise consisting of 25 multiple-choice questions. It presents participants with a list of 25 actions and activities, each accompanied by two descriptions, and asks participants to choose the description that they prefer. One of the two descriptions always describes how one can perform the action or activity, while the other always describes why one would perform it. For example, the two descriptions for the activity LOCKING A DOOR are PUTTING A KEY IN THE LOCK (how) and SECURING THE HOUSE (why).

Prior studies have shown that people with high construal level are more likely to prefer ‘why’ answers on the BIF whereas people with low construal level are more likely to prefer ‘how’ answers (Vallacher & Wegner, 1989; Fujita et al., 2006). As such, the number of ‘why’ answers given on the BIF gives an indication of a participant’s construal level. This is typically done by giving a score of 1 for each ‘why’ answer selected and a score of 0 for each ‘how’ answer. Thus, participants can end up with a total score between 0 and 25, where higher scores indicate higher construal levels. We used these scores as a manipulation check to determine whether the manipulation had succeeded in creating statistically significant differences in construal level between treatment groups. This is in line with the recommendations from a recent article by Trautmann (2019) which emphasized using an accepted manipulation check in experimental studies involving CLT.
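The scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bif_score` and the encoding of answers as the strings "how" and "why" are assumptions made for the example and are not part of the original instrument or of our analysis scripts.

```python
from typing import Sequence

BIF_ITEMS = 25  # number of items on the BIF, as described in the text


def bif_score(answers: Sequence[str]) -> int:
    """Return the BIF score: 1 point per 'why' answer, 0 per 'how' answer.

    `answers` is a hypothetical encoding of one participant's 25 choices.
    """
    if len(answers) != BIF_ITEMS:
        raise ValueError(f"expected {BIF_ITEMS} answers, got {len(answers)}")
    normalized = [a.strip().lower() for a in answers]
    if any(a not in ("how", "why") for a in normalized):
        raise ValueError("each answer must be 'how' or 'why'")
    # Higher totals (closer to 25) indicate a higher construal level.
    return sum(1 for a in normalized if a == "why")


# Example: a participant choosing 13 'why' and 12 'how' descriptions.
example_score = bif_score(["why"] * 13 + ["how"] * 12)
print(example_score)  # 13
```

The resulting total always lies between 0 and 25, matching the score range described above.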

3.4 Decision task

The manipulation check was followed by the scenario for the experiment which is shown in Fig. 2.

Fig. 2 Experimental scenario

We constructed our own scenario so that it would be accessible and understandable to our participants. While it does not take place in the typical organizational context, it does contain all the basic elements needed to create a project escalation situation as described by Keil et al. (2007). First, the scenario has a previously chosen course of action: the initial decision to develop and market the app based on the positive feedback on, and the potential of, the app. Second, sometime later, problems with the selected course of action are encountered. This provides new information in the form of negative feedback which indicates that it may be best to abandon or redirect the previously chosen course of action. In our case, this refers to the technical obstacles encountered with the app and the high degree of uncertainty regarding whether these can be overcome. Third, there is the choice to either continue with, and escalate commitment to, the previously chosen course of action or to abandon or redirect the project. In our case, participants can choose to either continue development of the app despite the negative feedback, or abandon the development of the app.

The dependent variable of our study is decision-makers’ willingness to continue with the failing project. This variable was measured using three items which were based on the study by Du et al. (2007) and adapted to fit the context of our scenario. An overview of the exact wording of the items can be found in Table 1.

Table 1 Overview of item wording for the dependent variable and mediator variables

In most project escalation studies, experimental manipulations involve changes within the project scenario. For example, in sunk cost project escalation experiments, the level of sunk costs of the project in the scenario can be manipulated to be either high or low (Keil et al., 1995). In our study, however, the actual project scenario is identical for all participants and the manipulations involve priming to a high or low construal level.

3.5 Mediator variables

The decision task was followed by measurement items for the mediators in our hypotheses, the control variables age and gender, and questions to test for demand effects. The exact wording of the items used to assess the mediators in our model can be found in Table 1. The importance of feasibility of the project relative to its desirability was assessed using a fixed sum (fixed allocation) question format in which participants were asked to divide 100 points between feasibility and desirability of the project. The more points assigned to feasibility, the higher the importance of feasibility relative to desirability. As such, scores for this variable could range from 0 to 100. Items to elicit pros and cons were based on a prior CLT study by Eyal et al. (2004).
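As an illustration of the fixed sum format, the following minimal Python sketch checks that an allocation is valid and returns the score for relative importance of feasibility. The function name and the choice to raise an error on invalid allocations are assumptions made for this example; the sketch does not describe the software used in the experiment.

```python
def relative_feasibility_importance(
    feasibility_points: int, desirability_points: int, total: int = 100
) -> int:
    """Return the feasibility score (0..total) for a fixed sum allocation.

    Hypothetical helper: rejects allocations that are negative or that
    do not add up to exactly `total` points.
    """
    if feasibility_points < 0 or desirability_points < 0:
        raise ValueError("points must be non-negative")
    if feasibility_points + desirability_points != total:
        raise ValueError(f"points must sum to {total}")
    # The points given to feasibility serve as the relative importance score.
    return feasibility_points


# Example: 70 points to feasibility and 30 to desirability.
print(relative_feasibility_importance(70, 30))  # 70
```

An allocation such as 80 and 40 points sums to more than 100 and would be rejected by this check.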

3.6 Control variables

To control for possible differences due to age and gender, we included these factors in our analysis. Since escalation involves risk-taking, and age and gender differences have been observed with respect to risk taking, it is reasonable to assume that these factors could have an influence on escalation behavior.

3.7 Order effects and demand effects

To reduce the impact of order effects, we counterbalanced the order of the pros and cons questions, where half the participants were first asked to write down pros and then cons, and vice versa for the other half of the participants. Comparison between the participants who were first asked to write down pros and the participants who were first asked to write down cons revealed no statistically significant differences between those groups regarding the number of pros or cons that they wrote down.

Demand effects can potentially also play a role. If participants correctly guess the purpose of the experiment or the experimental manipulation, then they might give the answers that they believe the experimenter wants to hear, rather than answering honestly. To guard against the threat posed by demand effects we included an open ended question at the end of the experiment which asked participants what they believed to be the goal of the experiment. Furthermore, participants were allowed to make any additional comments about the experiment, which could also be used to identify any potential demand effects.

4 Results

4.1 Responses

Out of the total of 154 subjects who participated in the experiment, one ended the experiment prematurely (by accidentally closing the internet session) and was unable to complete it. Two other participants were dropped from the sample because they had given invalid responses to questions. One participant for example, allocated more than 100 points when asked to allocate 100 points between feasibility and desirability based on their relative importance. Analysis of the open ended questions asking participants to (1) describe what they thought was the goal of the experiment and to (2) give feedback on the experiment, showed no signs that any of them had guessed that the study was about construal level theory or that the manipulation exercise was related to the project case. Thus, we have no reason to believe there was any threat to validity due to demand effects. After these preliminary checks, 151 out of 154 responses were considered valid, 76 of which were in the low CL treatment and 75 were in the high CL treatment.

4.2 Validity

Validity was assessed for our dependent variable, which included multiple measurement items. Specifically, willingness to continue was modeled reflectively using a three-item scale. Several tests are recommended for testing the validity of variables that are measured with multiple items (Chin, 1998; Fornell & Larcker, 1981). The first test of convergent validity is to determine whether each item has a sufficiently high loading on its corresponding construct. Loadings of 0.7 or higher are typically considered acceptable (Chin, 1998; Fornell & Larcker, 1981). In our case, item loadings ranged from 0.885 to 0.945. Additional measures for validity include tests of Cronbach’s α. Our DV had Cronbach’s α of 0.915 and this value exceeds the customary threshold of 0.7.
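For readers unfamiliar with Cronbach’s α, the standard formula can be sketched as follows in Python. The response data in the example are invented purely for illustration and are not taken from our sample; the function name is likewise an assumption.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a multi-item scale.

    `responses` holds one row per respondent and one column per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in responses):
        raise ValueError("all respondents must answer the same number of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented responses of four people to a two-item scale.
hypothetical = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(hypothetical), 2))  # 0.75
```

Values of α closer to 1 indicate that the items vary together, which is why a threshold such as 0.7 is customarily used.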

4.3 Construal level

The answers provided to the BIF were used to assess participants’ construal levels. As described in the method section, participants receive a score between 0 and 25 based on their answers. A higher score indicates a higher construal level. BIF scores were compared between treatment conditions to determine whether the manipulation was successful in creating a difference in construal level between the treatment groups. While participants in the high construal level treatment had a higher construal level, on average, than those in the low construal level treatment, the difference between groups was small. The high CL group had a mean score of 13.2 and a standard deviation of 5.39 while the low CL group had a mean score of 12.4 and a standard deviation of 4.27. The results from an ANOVA showed that this difference was not statistically significant (F(1,148) = 0.970, p = 0.326). This indicates that the treatment was not successful in generating a significant difference in construal level between the two groups.
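To make the logic of this comparison concrete, the sketch below computes a one-way ANOVA F statistic from group sizes, means, and standard deviations. It is an illustration only: the function name is an assumption, the reported spread values are treated as standard deviations (in line with the caption of Table 2), and the result need not reproduce the reported F exactly, because the published means are rounded and the reported test has (1, 148) degrees of freedom.

```python
from typing import List, Tuple


def one_way_anova_from_summary(
    groups: List[Tuple[int, float, float]]
) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) from (n, mean, sd) per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group sum of squares, recovered from each group's sample sd.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Summary values as reported in the text: high CL (n = 75), low CL (n = 76).
f_stat, df_b, df_w = one_way_anova_from_summary(
    [(75, 13.2, 5.39), (76, 12.4, 4.27)]
)
print(f"F({df_b},{df_w}) = {f_stat:.3f}")
```

A small F value of this kind arises when the difference between group means is small relative to the variation within each group.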

In addition, we found no evidence for an effect of the manipulation on decision makers’ willingness to continue with the failing project (F(1,148) = 0.147, p = 0.720). Similarly, the manipulation had no significant effects on the perceived importance of feasibility of the project relative to its desirability (F(1,148) = 0.027, p = 0.869), the number of pros (F(1,148) = 1.109, p = 0.294), and on the number of cons (F(1,148) = 1.160, p = 0.283) for the project. Table 2 provides an overview of the mean scores across treatment conditions for these variables.

Table 2 Mean scores and standard deviations (in parentheses) across treatment conditions

One explanation for these findings may be that the manipulation of CL was not strong enough. While we employed an experimental manipulation which is commonly used in the CLT literature, it is important to note that in many of these studies, manipulation checks were not performed during the experiment in order to confirm whether CL was manipulated successfully. Our results cast doubt on the extent to which the manipulations used in prior studies actually induced different CL levels as measured through the BIF scores. We thereby provide further support for Trautmann’s (2019) suggestion to include manipulation checks in tests of CLT.

Another reason for the absence of a statistically significant difference in construal level between treatment groups in our study, as measured by the manipulation check, could be what is known as the distance effect of CLT proposed by Maglio et al. (2013). The distance effect of CLT suggests that sensitivity to a difference in construal level can be reduced by a prior exposure to psychological distance. Specifically, the efficacy of a subsequent manipulation of construal level was found to be weaker for subjects who were previously exposed to a high psychological distance (Maglio et al., 2013).Footnote 3 Drawing on these findings, if participants in our study had already been exposed to psychological distance prior to taking part in the experiment, our manipulation of construal level may have been less effective. According to CLT, an exposure to psychological distance can occur when one thinks about objects, people or events that are near or distant either (1) temporally, (2) geographically or (3) socially (Trope & Liberman, 2010). Given all the possible events and objects that subjects can think about, it is plausible that our subjects had a predisposed construal level prior to taking part in our experiment due to an exposure to psychological distance.

Our manipulation thus failed to induce a significant difference in construal level between the two treatment groups. Nevertheless, the BIF scores themselves provide an indication of subjects’ predisposed individual differences in CL. We therefore decided to explore whether CL, as measured through the BIF score, is related to escalation and our other variables of interest. Accordingly, any references to construal levels in the analyses, tables, and figures that follow refer to participants’ construal levels as measured by the BIF.

4.4 Testing of research hypotheses

To test our hypotheses, we used partial least squares (PLS), specifically SmartPLS 3.0Footnote 4 (Ringle et al., 2015), which is a component-based approach for structural equation modelling. One main advantage of PLS is that it allows us to test the entire research model at once (Wong, 2013). Figure 3 provides the results of our structural model assessment including path coefficients,Footnote 5 standard errors, and whether each path was statistically significant.Footnote 6 As shown, the R² value for the dependent variable, willingness to continue, is 0.32, indicating that our model is able to explain about 32% of the variance in the dependent variable.

Fig. 3 PLS results showing path coefficients, standard errors and significance

Table 3 provides more in-depth information on the hypothesized relationships as well as the direct relationship between construal level and willingness to continue. We hypothesized that people with a higher construal level would consider the feasibility of a project to be less important, relative to its desirability (H1a). As can be seen in Table 3, the hypothesized negative relationship was found to be significant. Further, we hypothesized that if people found feasibility to be less important, they would be more willing to continue the failing project (H1b). This hypothesized negative relationship was also found to be significant. Combined, these findings demonstrate that there is an effect of construal level on escalation of commitment to a failing project. Specifically, people with a higher construal level perceive feasibility to be less important relative to desirability and this in turn leads them to be more willing to continue a failing project.

Table 3 Structural model effects and testing of hypotheses

We hypothesized that people with a higher construal level would list more pros for the project, as compared to people with a lower construal level (H2a). As can be seen in Table 3, the hypothesized positive relationship was significant, thus supporting H2a. We further hypothesized that the more pros people can think of for a project, the more likely they are to continue the project (H2b). However, this hypothesis was not supported. At first sight, this outcome seems counterintuitive: being aware of a higher number of pros should make it more likely for someone to continue with the project. However, the number of pros does not say anything about the importance of each of the pros or about the relevance of each of these pros to the decision to continue. As such, the relationship between pros and willingness to continue may not be as simple as we had theorized. This could explain why the relationship between the number of pros and willingness to continue in our analysis, while in the predicted direction, was not statistically significant.

We hypothesized that people with a higher construal level would list fewer cons for the project, as compared to people with a lower construal level (H3a). This hypothesis was not supported. We further hypothesized that there is a relationship between the number of cons people can think of for the project and the willingness to continue (H3b), but did not find support for this relationship. This finding is not that surprising given that we also found no effect of the number of pros on people’s willingness to continue.

In this study, our primary interest centered on the relationship between construal level and willingness to continue. As can be seen from Table 3, people with a higher construal level are more likely to continue a failing project and this effect is statistically significant. This suggests that escalation of commitment to a failing project is more likely when the decision maker adopts an abstract mode of thinking (i.e., a high construal level). In addition, our findings demonstrate that this relationship is mediated by the perceived importance of feasibility relative to desirability, but not by the number of pros and cons.

5 Discussion

Our study contributes to the existing literature not only by identifying an additional factor that can influence escalation of commitment in projects, construal level, but also by clarifying the mechanism through which this factor exerts influence, the perceived importance of project feasibility versus desirability aspects. We found no empirical support for a relationship between construal level and the number of cons that people identify, even though this is predicted by CLT. One possible explanation could be that there is something different about the decision context of our study, as compared to other studies. This constitutes a contribution to the existing literature on CLT since it could imply that the effect of construal level on the number of cons that people can identify may be dependent on situational factors.

One caveat when interpreting the results of our study is that the measured mechanism through which construal level exerts influence on escalation of commitment cannot be interpreted as causal. The importance of feasibility relative to desirability and the numbers of pros and cons were measured after measuring willingness to continue with the project. Therefore, we cannot rule out that participants may have answered these questions to justify their reported willingness to continue. Yet, if this were the case, we would have expected to find the number of pros and cons to be associated with willingness to continue, which we did not find. A suggestion for future research is to assess the causal effect of the relative importance of feasibility and of the number of pros and cons on escalation of commitment.

An important limitation of our study is that because we were unable to successfully manipulate construal level, we could only test for a statistical association, rather than a causal relationship, between construal level and escalation of commitment. At the same time, this limitation led us to what is arguably one of the more interesting findings of our study, namely that a commonly employed manipulation of construal level failed to have an effect. This finding is consistent with Trautmann (2019), who questions whether construal level effects are robust and shows that they can be difficult to replicate.

Our results suggest that construal level may not be easy to manipulate and this has important implications for research and practice. For researchers, it suggests the need to further examine the strength and reproducibility of construal level manipulations. For practice, our findings suggest that attempts to influence individuals’ construal level in the hopes of reducing the incidence of project escalation will have limited success.

While there was no statistically significant difference in construal level between treatment conditions, we did observe individual differences in construal level among participants in terms of BIF scores. This outcome is consistent with a line of research which suggests that there are trait-like elements to construal level (Vallacher & Wegner, 1989).

In the CLT literature, construal level is usually treated as a state that can be manipulated. However, there is evidence to suggest that individuals may exhibit trait-like differences in construal level. Based on theory on action identification, the authors of the BIF suggest that some individuals may prefer one type of description (action identification) across domains. They describe that this can be the result of differences in a personality dimension called level of personal agency (Vallacher & Wegner, 1989). This suggests that scores on the BIF might be determined not only by experimental manipulations of construal level but also by personality traits.

Further support for this notion can be found in the description that Vallacher & Wegner give for the personality trait of personal agency: “High-level agents think about their acts in encompassing terms that incorporate the motives and larger meanings of the action, whereas low-level agents think about their acts in terms of the details or means of action” (Vallacher & Wegner, 1989, p. 660). This suggests that there is a personality trait which determines whether individuals are more likely to think about actions in terms of their motives for the action or in terms of how to perform the action or the details involved with the action. As mentioned previously, thinking about actions in terms of how (why) or thinking about an action in a detailed and specific (or general and abstract) manner is associated with a low (high) construal level (Trope & Liberman, 2010). In short, these findings suggest that personality traits might also influence an individual’s construal level.

Our findings suggest that the trait-like aspects of construal level may even be more important than any primed effects of construal level in the context of escalation of commitment. From a policy perspective, this would indicate that rather than trying to change individuals’ construal level, it may be more practical to screen for individual differences in construal level before assigning responsibility for managing a project that could be at risk for escalation. In line with this, one contribution of our study is that BIF scores themselves could be a valuable source for the testing of effects of individual differences in construal level. Specifically, our findings show that subjects with a higher construal level, as reflected by their BIF score, acted in line with predictions from CLT in that they considered feasibility aspects of a project to be relatively less important. Our findings indicate that this, in turn, increases the willingness to continue a failing IT project. One direction for future research would be to further explore construal level as an individual difference factor, as measured by BIF scores, and to investigate its impact on behavior and decision-making in a wider context.

Our finding that a commonly used manipulation of construal level did not induce a significant difference in BIF scores may be troubling for the interpretation of results of previous studies on construal level. Most of the studies that used our manipulation do not include a manipulation check in their experiment. Our findings suggest that without a manipulation check, researchers may incorrectly assume that the manipulation was effective when it was not. For this reason, we support the recommendation by Trautmann (2019) for all CLT experimental studies to include an accepted manipulation check instrument. We believe that there is value for future research to further test the robustness of the employed manipulation of construal level as well as to test or refine manipulation checks to determine the effectiveness of these manipulations.