Explaining successful and failed investments in U.S. carbon capture and storage using empirical and expert assessments

Most studies of deep decarbonization find that a diverse portfolio of low-carbon energy technologies will be required, including carbon capture and storage (CCS) that mitigates emissions from fossil fuel power plants and industrial sources. While many projects essential to commercializing the technology have been proposed, most (>80%) end in failure. Here we analyze the full universe of CCS projects attempted in the U.S. that have sufficient documentation (N=39)—the largest sample ever studied systematically. We quantify 12 project attributes that the literature has identified as possible determinants of project outcome. In addition to costs and technological readiness, which prior research has emphasized, we develop metrics for attributes that are widely thought to be important yet have eluded systematic measurement, such as the credibility of project revenues and policy incentives, and the role of regulatory complexity and public opposition. We build three models—two statistical and one derived through the elicitation of expert judgment—to evaluate the relative influence of these 12 attributes in explaining project outcome. Across models, we find the credibility of revenues and incentives to be among the most important attributes, along with capital cost and technological readiness. We therefore develop and elicit experts’ judgment of 14 types of policy incentives that could alter these attributes and improve the prospects for investment in CCS. Knowing which attributes have been most responsible for past successes and failures allows developers to avoid past mistakes and identify clusters of near-term CCS projects that are more likely to succeed.


Introduction
Deep decarbonization of the global economy will require a portfolio of low-carbon energy technologies, most of which are not ready for deployment at commercial scales [1]. For this reason, essentially all studies on deep decarbonization call for investment in a diverse array of technologies, including energy systems that are immature today but, once commercialized, could diffuse over time as they improve technologically and as challenges with regulation, business models, and policy support are resolved [2][3][4]. Critical among these are carbon capture and storage (CCS) technologies, which are not only a leading candidate for capturing carbon dioxide (CO2) emissions from industrial sources, but can also be deployed in fossil fuel power plants [5,6]. Because it plays a pivotal role in multiple sectors, CCS is deployed aggressively in 1.5 °C and 2 °C scenarios within global climate and energy system models [7].
There are many active policy efforts to commercialize CCS [8][9][10], such as preferential financing as well as schemes to provide direct cash and tax incentives for capturing carbon pollution. In the U.S., these include storage tax credits implemented through section 45Q of the U.S. tax code in 2008, which were significantly expanded in 2018 [11]. Included among these are credits for utilizing captured carbon pollution, such as in enhanced oil recovery (EOR). (When the carbon is used and not merely stored the system is often called carbon capture, utilization, and storage (CCUS); here we use the general term 'CCS' unless the distinction is important.) Such CCS policies are intended to help lower costs, gain experience, and increase the technological maturity of key elements of CCS systems. All these efforts are motivated by the idea that, while the cost and performance of some individual components of CCS systems are mature technologically, commercial viability depends on how the entire system operates at scale. The success of these policies therefore hinges on getting projects built.
Actual investment in CCS has not kept pace with the large expected role for the technology. Commercial CO2 capture has been ongoing since the 1970s, deployed in gas processing plants to separate CO2 for use in EOR [12]. Those separation systems are now mature; indeed, they account for the majority (~70%) of CO2 that is captured annually across the globe [13,14]. Apart from projects that use these systems, the record of CCS project development is overwhelmingly one of failure. The 2000s saw the largest U.S. push to commercialize the technology, with private industry and government investing tens of billions of dollars in dozens of industrial and power plant capture projects. Despite extensive support, the vast majority of these failed [15,16]. That failure has come in many gradations: some projects acquired and spent resources on front-end engineering and design (FEED) but were terminated before final investment decision (FID). Others failed spectacularly, proceeding through FID and spending millions of dollars on construction only to be abandoned or reconfigured without CCS. By contrast, very few have succeeded in proceeding from FEED to FID to as-intended operation.
According to the U.S. Department of Energy's (DOE) National Energy Technology Laboratory (NETL), more than 300 CCUS projects of all types have been proposed or built worldwide [14]. Of these, approximately half (149) have sought to store some or all of the CO2 they captured. This universe of 149 projects is the full global historical experience that can be mined for insights about what has gone right and (mostly) wrong. All told, more than 100 of the 149 CCS projects originally planned to be operational by 2020 have been terminated or placed on indefinite hold (figure 1). These were set to capture more than 130 million tons of CO2 per annum (Mtpa) once completed, more than three times the amount of CO2 captured today [13]. Of particular importance is that the probability of failure depends on the type of project. Our analysis of the NETL database suggests that most (>70%) proposed gas processing projects, the most mature carbon capture application, have succeeded and are in operation today. By contrast, in the power sector, close to 90% of proposed CCS capacity was never built.
In this paper we explain this extreme variation in project outcome using two complementary methods: by analyzing the historical record and by eliciting the judgment of experts.

What explains variation in CCS project outcome?
We look systematically and empirically at the disconnect between CCS's potential and real-world experience. In doing so we make five novel contributions to the literature. One, we develop a fuller theory about why CCS projects succeed or fail, one that builds on the disparate hypotheses about project attributes that earlier studies have examined (often without considering the full space of covariates). Two, having identified these attributes, we develop new methods for quantifying each, turning conceptual attributes into measured variables. Three, we build statistical models to explore the relationship between those variables and project outcome across the historical record, employing the largest sample of CCS projects ever studied in this way. Many prior studies have looked at CCS projects individually or in small case studies in an attempt to glean the secrets of success and failure, but that approach has suffered from selection bias because case studies have focused on only the most visible projects (e.g. [17,18]). Four, to complement our analysis of the historical record, we conduct a structured elicitation of expert judgment, allowing us to evaluate expert intuitions regarding project outcomes. While statistical models based on the historical record identify relationships between variables, the expert-derived assessment elicits the 'weight' or 'importance' of each variable, generating in the process a multi-criteria decision-making model. This is the first time an elicitation has been conducted alongside a historical analysis using the same variables and concepts. Five, we apply what we learn from the historical record and expert judgment to assess both the feasibility and efficacy of policy reforms that can better incentivize new CCS development.
There is no single literature focused on CCS because the issues that arise with this technology (the need for complex system integration of components at varied stages of technological readiness, a big role for public policy and innovation, and novel regulatory requirements and industrial coalitions supporting or opposing development) implicate many disciplines, from engineering to political economy, sociology, and law. Broadly, the existing literature on CCS projects has considered four clusters of attributes: engineering economics, financial credibility, local political attributes, and broader political attributes. Nearly all the existing analytical literature fits into engineering economics. These types of studies have investigated cost, performance, and technological readiness [19,20,21]. Another, sparser set of studies has focused on financial credibility. These studies seek to explain how the size and credibility of financial flows, including contractual and tax benefits, affect project outcomes [22,23,24]. Things that erode the credibility of financial commitments jeopardize the financial integrity of investments [25]. Making policy pledges credible is a perennial challenge because credibility is often endogenous to the perceived success of the project: when a project starts failing, even a credible stream of payments can be undone by policymakers who do not want to bear the political costs of failures [26,27]. Local political attributes can affect project outcome, for example, by generating employment for local groups that are politically organized, gaining their support [28,29], or by encroaching on the interests of the local population [30]. The effect of broader political attributes like public opposition has also been studied for CCS [31,32]; more generally, the siting of energy infrastructure has been studied extensively, be it power plants, power lines, or oil and gas facilities [33,34,35].
Broader political concerns are distinguished from local concerns based on the size of organization: one of the most potent theories in the field of political science points to size and concentration of benefits from political organization as a key determinant of whether groups organize politically [36].
All told, we look at 12 attributes that comprise these four clusters (table 1). Methodologically, we create and test systems for measuring each of these attributes. Some of the attributes in table 1 are plainly quantitative, such as capital cost: scoring such attributes required only agreement on the scope of the variable and triangulating estimates. Others are less obviously quantitative and required establishing scoring scales, the endpoints of which (0 and 1) represent extremes in the data. Creating these scales was possible only after months of review of the full project database as well as further consideration of hypothetical projects so that the scales we adopted could reflect the full range of potential scores for the variable. For all attributes, two coders independently scored each project on each attribute, submitted their scores to the research team, then met to explain and debate their analysis. We tested our scales with a small sample of projects before proceeding to the full sample.
We provide an illustration here of how these less obviously quantitative attributes were handled. For institutional setting, creating the scale required compiling recent state-level policy and regulatory frameworks that support the development of CCS plants (recent meaning within the past 20 years, the window during which most of our projects were proposed and either succeeded or failed). States like Texas and Oklahoma, for instance, have laws in place to clarify the regulatory context for much of the value chain of CCS, from plant construction to pipeline development to sequestration. Thanks to state-level priorities and enduring experience regulating analogous infrastructure, Texas and Oklahoma have retired some of the factors that drive up costs and risks in other states where regulation is more contentious and uncertain. These states mark one extreme, while states whose regulators or legislative majorities are apathetic to or resist fossil infrastructure development set the opposite extreme. A detailed state-by-state and project-by-project scoring is essential because some states create mixed institutional settings: for example, California has a CCS law (which can be a favorable factor) yet offers erratic political and regulatory support for fossil fuel infrastructure. For detail on scoring, see SI (available online at stacks.iop.org/ERL/16/014036/mmedia), which also reports how we measure project outcome.
Table 1. We analyze 12 CCS project attributes that impact project success and that can be evaluated quantitatively in a replicable manner. Attributes are diverse, spanning engineering economics, finance, and political economy. Hypothesis statements summarize how attributes could positively impact the likelihood of project success.
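The double-coding procedure described above can be illustrated with a minimal sketch. It assumes a hypothetical tolerance of 0.2 for flagging disagreements and averages the two coders' scores otherwise; the study itself resolved differences through discussion rather than by a fixed rule. The function names, project names, and all numbers below are invented for illustration.

```python
def rescale(value, low, high):
    """Map a raw value onto a 0-1 scale whose endpoints are the extremes."""
    if high == low:
        raise ValueError("scale endpoints must differ")
    clipped = min(max(value, low), high)
    return (clipped - low) / (high - low)


def reconcile(coder_a, coder_b, tolerance=0.2):
    """Average two coders' 0-1 scores and flag gaps wider than `tolerance`.

    Returns (consensus, flagged): `consensus` maps project -> mean score and
    `flagged` lists projects the coders should discuss before finalizing.
    The tolerance and the averaging rule are assumptions, not the paper's.
    """
    consensus, flagged = {}, []
    for project in coder_a:
        a, b = coder_a[project], coder_b[project]
        if abs(a - b) > tolerance:
            flagged.append(project)
        consensus[project] = (a + b) / 2
    return consensus, flagged


# Hypothetical raw quantities (millions of dollars), rescaled so that the
# smallest and largest values in the sample sit at 0 and 1.
capital_cost = {"project_x": 200.0, "project_y": 1100.0, "project_z": 2000.0}
low, high = min(capital_cost.values()), max(capital_cost.values())
cost_score = {name: rescale(v, low, high) for name, v in capital_cost.items()}

# Hypothetical institutional-setting scores from two independent coders.
coder_a = {"project_x": 0.75, "project_y": 0.5, "project_z": 0.25}
coder_b = {"project_x": 0.75, "project_y": 1.0, "project_z": 0.25}
consensus, flagged = reconcile(coder_a, coder_b)
print(cost_score, consensus, flagged)
```

In this toy case only `project_y` is flagged, since the coders differ by 0.5 on it; the averaged value would be a starting point for their discussion, not a final score.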
We focus on the U.S., the country with the plurality of proposed or constructed CCS projects: 51 of 149 projects, or one-third of the global sample. Future research that applies our 12 variables to a transnational data set should keep in mind that institutions and the modes of political mobilization vary across countries, and hence variables and the scales we use to score variables should be modified to reflect those differences. To avoid selection bias, we score all 51 U.S. projects, omitting only those that lack accessible documentation across all variables. In total, we score a sample of 39 U.S. CCS projects (figure 2) with diverse CO2 sources and sinks. This diversity is characteristic of emergent systems that are ripe for experimentation with diverse technologies and business models. Part of our contribution is to determine which of these has led to successful project execution in order to guide near-term deployment of additional CCS projects. We use this database to build two statistical models: one employing a linear regression and the other a random forest. Details on models and their validation can be found in SI text.
To complement our review and statistical modeling of past projects, we elicited the judgment of experts: this occurred during a highly structured invitational workshop, conducted in September 2019, in which we led experts through a series of exercises that revealed their intuition about the importance of individual project attributes in explaining project outcomes.
After an extensive search to identify people who could offer judgments on the full range of attributes that might affect CCS projects, we invited 28, and 13 attended the workshop. Each invitee had been centrally involved with at least one CCS project, one policy effort to improve the landscape for CCS investments, or both. We conducted a virtual elicitation with a fourteenth expert who had agreed to attend the workshop but could not. In terms of expertise, one attendee is a geoscientist, seven are CCS project managers, one is a lawyer specializing in CCS project financing, one regulates CCS projects, and the remaining four have expertise in quantitative policy analysis in support of CCS development. In terms of current affiliations, two work in academia, two work at firms where they specialize in project finance for energy infrastructure, one works in state government, and nine work in either industry or industry support organizations. Finally, in terms of CO2 utilization, all project developers were deeply familiar with either EOR or dedicated geologic storage, and two have also worked in detail on other utilization options, namely synthetic fuel production and durable carbon.
We provided experts with a list of pre-readings on the workshop's focus: (a) exploring the successes and failures in past projects, including barriers to CCS development, and (b) identifying, analyzing, and recommending policy options to accelerate development. We guided participants through structured exercises that elicited their individual judgments, which experts provided confidentially in dedicated booklets (see SI text for a list of pre-readings and the booklet, and SI file 1 for anonymized results). No presentations or discussions were held prior to these exercises to avoid biasing expert judgments; instead, we asked experts to record their judgments about a particular topic, then engaged in group discussions about the topic, and finally offered them the option to record revised judgments.
Because we addressed the same questions regarding the importance of project attributes as those in our review of the historical record, this study compares, for the first time, two approaches to understanding common patterns of historical behavior. This is useful to the ongoing debate about the utility of elicitation methods [37].

Results
In figure 3, we employ both a linear regression model (figure 3(A)) and a random forest model (figure 3(B)) to identify functional relationships that map the 12 independent CCS project variables to the dependent variable, project outcome. The elicitation (figure 3(C)) produces a multi-criteria decision-making model that weights the relative causal importance of the same 12 variables. The order of variable importance across the three models is the core empirical result of this study.
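A weighted sum is one common way to write a multi-criteria decision-making model, and the sketch below uses it to show how elicited weights could be combined with attribute scores. The functional form is an assumption (the text here does not specify one), as are the four attribute weights, the project scores, and the convention that every score is oriented so that higher values are more favorable to success.

```python
def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {name: w / total for name, w in weights.items()}


def weighted_score(scores, weights):
    """Weighted-sum score of one project; every weighted attribute must be scored."""
    w = normalize(weights)
    missing = set(w) - set(scores)
    if missing:
        raise KeyError(f"unscored attributes: {sorted(missing)}")
    return sum(w[name] * scores[name] for name in w)


# Hypothetical importance weights (any positive units; normalized above).
weights = {
    "credibility_of_incentives": 4,
    "capital_cost": 3,
    "technological_readiness": 2,
    "population_proximity": 1,
}

# Hypothetical 0-1 scores, oriented so that 1 is most favorable to success.
project_a = {
    "credibility_of_incentives": 1.0,
    "capital_cost": 0.5,
    "technological_readiness": 1.0,
    "population_proximity": 0.0,
}

print(round(weighted_score(project_a, weights), 3))
```

Under these assumptions the project scores 0.4 + 0.15 + 0.2 + 0 = 0.75 out of 1. Because the weights are normalized inside the function, experts could report them in percentages or points without changing the result.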
Three variables emerge as significant across all models. First is capital cost: projects with larger capital costs are more likely to fail. In this respect, the world of CCS aligns with the wider world of megaprojects: billion-dollar engineering infrastructure projects often encounter difficulties with financing, site preparation, supply chain management, or system integration. Consequently, these projects are often commissioned over-budget and behind schedule, if not abandoned altogether [38,39]. This trend holds for CCS: of the 14 most expensive projects as measured by their original budget estimates, 13 were abandoned; developers of the fourteenth (Southern Company's Kemper Project) abandoned plans for CCS, reconfiguring the project as a combined cycle natural gas power plant instead.
Second, high levels of technological readiness improve the chance of project success. Employing systems that have been more frequently manufactured, transported, integrated into a facility, tested, and commissioned reduces technical and system integration risks. Low levels of technological readiness have been implicated in the failure of the most expensive CCS project ever attempted (Kemper), which sought to use a first-of-a-kind gasification system (Transport Integrated Gasification) [40]. Low technological readiness levels are also behind the delays faced by NET Power [41], a proof-of-concept project for a novel thermodynamic cycle that could, if successful, lower the cost of deploying CCS in natural gas power plants. By contrast, the class of projects with the highest success rate, natural gas processing, uses mature separation technologies.
Third is the credibility of project revenues. More credible sources of revenue, such as bilateral offtake agreements for CO2, strongly increase the odds of project success. The vast majority of successful industrial projects (11 of 15), for example, arranged to sell their captured CO2 for EOR. The only successful industrial project to opt for dedicated geologic storage, at Archer Daniels Midland's ethanol production plant in Decatur, Illinois, was supported substantially with upfront cash grants from the DOE's Industrial Carbon Capture and Storage program.
A fourth variable, the credibility of incentives, is significant in two of the three models: the linear regression and expert-derived models, but not the random forest. From the linear regression, we find that successful projects rely less on incentives than those that fail. Projects with high price tags have generally received government incentives; they are flagship, high-profile, and sometimes high-risk demonstration projects. It is precisely these types of projects that often fail, in many cases because they are vulnerable to 'vetoes' if policy makers waver in their support, especially given their potentially long lead times [42]. By contrast, projects that succeed are smaller, less costly, and rely less on incentives.
Despite this general agreement among models regarding the most consequential project attributes, there are three areas where they diverge. One is regulatory challenges, which both statistical models find to be the fourth most important in explaining project outcome, but about which experts are more circumspect, ranking it seventh in importance with a median weight of 7%. Analysis of the historical record suggests that projects that face permit denials, extended regulatory proceedings, or lawsuits are more likely to fail. Most notable is FutureGen 2.0, a collaboration between the DOE and numerous industrial partners to retrofit a coal power plant in Illinois with oxy-combustion CO2 capture. The project faced novel regulatory requirements for injecting CO2 and was challenged in multiple lawsuits that contributed to construction delays [43].
A second area of divergence, local employment impact, is important in the random forest model but not statistically significant in the linear regression; the experts judged it to be largely irrelevant (rank 12 of 12 with median weight of 3.5%). This result is, at first glance, counterintuitive: the regression coefficient is negative, meaning that projects that promise more employment (a higher number of promised construction and permanent jobs) are more likely to fail, all else equal. The historical record reveals why this is so: projects that propose more extravagant plans to improve economies through employment are those that are expensive, high-profile, and high-risk, the same conditions that attract promises of substantial government incentives yet frequently end in failure.
Third, experts rank the burden of CO2 disposal fourth of 12 in importance (median weight of 10%). By contrast, this variable is insignificant in the statistical models. Our coding of this attribute relied on the documentary evidence that existed in the historical record of a project's CO2 transportation and disposal plans. We found copious evidence outlining disposal plans in well-documented projects, including pipeline routes, discussion of access to pore space, and robust monitoring, verification, and assessment (MVA) regimes. The experts stated that the visibility of documentary evidence inherently ignores the groundwork that disposal requires on the part of project developers, such as characterizing storage site geology; securing access to pore space; constructing pipelines or linking capture facilities with the existing CO2 pipeline network; and complying with the regulatory requirements embodied in MVA regimes. These findings are a warning sign to future empirical research on CCS: the degree of documentation and visibility around features like CO2 disposal is endogenous to efforts to eliminate any risks before FID.

Extending the analysis: expert assessment of the credibility of incentives
Proponents of CCS maintain that incentives are essential to help commercialize the industry [10,44]. That is because, as an industry, CCS systems sit firmly in the so-called valley of death. They are stuck between a small number of early demonstrations that have received government support and later mass deployments that would stand on their own financial merit. In this context, the high importance that experts attribute to the credibility of incentives is unsurprising (importance rank 1 of 12). That finding also suggests that policy, if designed explicitly to address credibility, could have a huge impact on the success of projects. Such insights perhaps help explain the active and successful lobbying effort for the 2018 expansion of the 45Q tax credit, to $50/tCO2 for dedicated geologic storage and $35/tCO2 for EOR applications.
A challenge in historical research is that one can often only observe the effects of a single policy regime. Elicitations of expert judgment, however, allow for the characterization and assessment of the credibility of a fuller array of policies. This is perhaps especially important when the variable under investigation is the credibility of policy incentives, which is inherently tied to the intuitions and perceptions of decision makers. The history of CCS development so far is tied to pre-commercial projects that experiment with a diverse range of revenue streams and incentives. Near-term deployments will likely continue this trend, comprising additional data points on the learning curve to technological maturity. In such an environment, understanding decision makers' perceptions of the viability of these different experiments becomes even more important. We therefore elicited judgments about all 12 project attributes on day 1 of the expert workshop, assessed those results overnight, and reorganized exercises on day 2 to investigate policy responses in more detail. The results are summarized in figure 4. Starting with an existing catalog of CCS policies [10,44], through expert discussion we defined four clusters of possible future policies: CO2 production incentives; capital incentives; decarbonization incentives; and CO2 disposal incentives. Within each of these clusters we directed the experts to develop policy packages. The first policy package would be bare bones; each additional package within the cluster would add an additional element of policy reform and, with it, additional needs for political effort to get the package enacted. In this way the marginal political effort and marginal impact on CCS from each new element can be distinguished. We then asked experts to judge each policy package along two dimensions: its effectiveness in enhancing the viability of CCS projects and the likelihood of its implementation (in other words, its political feasibility).
Three of these 14 policies, namely 45Q storage tax credits (policy A), investment tax credits (policy D), and loan guarantees (policy G), existed at the time of the workshop, which accounts for their high feasibility scores. As of September 2019 (the time of the workshop and elicitation), 45Q (policy A) had yet to be confirmed in the tax code: it existed, but the Internal Revenue Service had not opined on how it might work. Thus even this 'existing' incentive elicited a median feasibility less than 1 (mean of 0.97, interquartile range of 0.93-1).
Two results are particularly noteworthy. First, there is an inverse relationship between political feasibility and impact. For example, policy K (cash grants for the very first four CCS projects developed; procurement of 'green' cement, steel, and fuels by the U.S. military; and a national low-carbon fuel standard) was deemed most effective in the aggregate judgment of our experts. Unsurprisingly, policy K (along with policy C, which would involve large direct payments via different means) was also deemed the least politically feasible.
Second, incentives that are restricted to the CCS industry (i.e. policies A through C) or tuned to reward CCS investment or CO 2 capture specifically (i.e. policies D through H) only become competitive with disposal or decarbonization incentives once they are extremely generous to developers. In other words, experts believe that it is not direct support for the CCS industry that will lead to the largest volumes of CO 2 capture; rather, what matters most are incentives that encourage systematic decarbonization, such as government procurement of decarbonized industrial products or a broad low-carbon fuel standard.
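The two-dimensional judgments described in this section can be summarized per policy package with a median and an interquartile range, as in the minimal sketch below. The two packages, the five experts' 0-1 judgments, and the function name `summarize` are hypothetical; they are chosen only to show the kind of feasibility versus effectiveness trade-off discussed above and do not reproduce the elicited results.

```python
from statistics import median, quantiles


def summarize(judgments):
    """Median and interquartile range (Q1, Q3) of one set of expert judgments."""
    q1, _, q3 = quantiles(judgments, n=4, method="inclusive")
    return {"median": median(judgments), "iqr": (q1, q3)}


# Hypothetical 0-1 judgments from five experts on two dimensions.
elicited = {
    "package_1": {
        "effectiveness": [0.2, 0.3, 0.4, 0.5, 0.6],
        "feasibility": [0.6, 0.7, 0.8, 0.9, 1.0],
    },
    "package_2": {
        "effectiveness": [0.5, 0.6, 0.7, 0.8, 0.9],
        "feasibility": [0.1, 0.2, 0.3, 0.4, 0.5],
    },
}

summary = {
    name: {dim: summarize(values) for dim, values in dims.items()}
    for name, dims in elicited.items()
}

# Rank packages on each dimension by the median judgment, best first.
by_feasibility = sorted(
    summary, key=lambda n: summary[n]["feasibility"]["median"], reverse=True)
by_effectiveness = sorted(
    summary, key=lambda n: summary[n]["effectiveness"]["median"], reverse=True)
print(by_feasibility, by_effectiveness)
```

In this toy data the package judged most feasible is the one judged least effective, so the two rankings are reversed. With a real panel, the interquartile range would also show how much the experts disagree on each dimension.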

Discussion
Many factors have been implicated in the success or failure of CCS projects. Using the historical record and expert judgment, we build three analytical models that relate project attributes with success and failure (figure 3). The three models paint a coherent picture of the importance of capital cost, technological readiness, and the credibility of project revenues. A majority of the models further align around the important roles played by credibility of incentives and regulatory challenges. The experts, in particular, identified the credibility of incentives (that is, policy design) as the single most important factor. Less significant features, by comparison, include stakeholder opposition, institutional setting, and population proximity. We conclude with three observations about the extensibility and utility of the methods we have employed in this work.
First, we built a systematic framework and a transparent coding system that can be replicated, debated, and adjusted. While we focus here on CCS, this framework can be employed in assessing a large number of fledgling technological systems that have been discussed as promising partial solutions to the climate crisis. The deployment of these technologies, such as advanced nuclear power, direct air capture, and novel biofuels, hinges not merely on economics but also on many similar interactions between engineering, economics, and politics: interactions that affect, for example, the ability of governments to offer credible investment incentives.
Second, we found that expert elicitation can act as a much-needed complement for assessing project attributes that are hard to quantify. For instance, in our historical analysis we found credibility of revenues to be among the hardest variables to measure because project finances are rarely available publicly. We scored credibility based on evidence of developers' plans for securing revenues, including agreements between developer and offtaker. We hypothesized, and experts corroborated, that contracting for predictable offtake arrangements for captured CO2 constitutes a highly credible form of revenue. The same is true for credibility of incentives: these attributes were much easier to assess through expert judgment. The multi-method approach that we employ here, which combines expert elicitation with statistical modeling, offers a new, structured way to assess credibility.
Third, the approach taken here, especially when augmented with the structured elicitation of expert judgment, can plausibly improve representations of CCS deployment in large energy system models. Those models include learning curves and aim to endogenize technological change so that costs fall as investments increase, a virtuous cycle that begets still more investment and continued improvement in performance. But those models are highly sensitive to initial assumptions [45], for which there has so far been little theory or evidence as a guide.
To illustrate such a guide for modeling initial conditions (e.g. the near-term upscaling of the industry), we asked experts about the number and type of projects that are likely to succeed over the coming decade of CCS development. We asked them how, if the industry scales up over the coming decade, the volume of captured CO2 would be distributed among project types (power plants vs. industrial sources) and among CO2 end uses (dedicated sequestration vs. all forms of utilization) (figure 5). There was consensus among experts that, by volume, CCS would be preferentially deployed at power plants, which would capture roughly twice as much CO2 as CCS at industrial sites (figure 5(A)). However, industrial CCS sites, which are smaller point sources of warming gases, would be more numerous. Further, captured CO2 is more likely to be utilized than sequestered in dedicated reservoirs (figure 5(B)), with most captured CO2 (90%) put to use for EOR (figure 5(C)). These answers reveal skepticism among experts about novel utilization options like synthetic fuel production and durable carbon.
Knowing which features of CCS projects have been most responsible for past successes and failures allows developers to not only avoid past mistakes, but also identify clusters of existing, near-term CCS projects that are more likely to succeed. These projects will become the seeds from which a new CCS industry sprouts: the early data points on learning curves that will extend over growing investment. On the policy front, assessments like ours empower both developers and policymakers; they enable developers to carefully assess the feasibility of different policy packages in their financial engineering, and they signal to policymakers the extent to which different policies are viewed as credible and effective by the communities responsible for deploying CCS projects.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).