Improving environmental change research with systematic techniques for qualitative scenarios

Scenarios are key tools in analyses of global environmental change. Often they consist of quantitative and qualitative components, where the qualitative aspects are expressed in narrative, or storyline, form. Fundamental challenges in scenario development and use include identifying a small set of compelling storylines that span a broad range of policy-relevant futures, documenting that the assumptions embodied in the storylines are internally consistent, and ensuring that the selected storylines are sufficiently comprehensive, that is, that descriptions of important kinds of future developments are not left out. The dominant approach to scenario design for environmental change research has been criticized for lacking sufficient means of ensuring that storylines are internally consistent. A consequence of this shortcoming could be an artificial constraint on the range of plausible futures considered. We demonstrate the application of a more systematic technique for the development of storylines called the cross-impact balance (CIB) method. We perform a case study on the scenarios published in the IPCC Special Report on Emissions Scenarios (SRES), which are widely used. CIB analysis scores scenarios in terms of internal consistency. It can also construct a very large number of scenarios consisting of combinations of assumptions about individual scenario elements and rank these combinations in terms of internal consistency. Using this method, we find that the four principal storylines employed in the SRES scenarios vary widely in internal consistency. One type of storyline involving highly carbon-intensive development is underrepresented in the SRES scenario set. We conclude that systematic techniques like CIB analysis hold promise for improving scenario development in global change research.


Introduction
Much research on large-scale environmental change, including climate change, concerns not only the current state of the environment but also its future. Anticipating environmental changes beyond what has already been observed can inspire policies for better resource management or adaptation (Alcamo 2008). Conceptualizing future states of the world requires exploring environmental, and often social, change over the long term (Burton et al 2004). Many methods may be employed to do this including projections, sensitivity analysis, or quantitative scenarios coupled with qualitative storylines (Carter et al 2007). This paper focuses on a particular technique for coupling scenarios and storylines known as story and simulation (SAS) and the selection of qualitative storylines in particular. Studies embodying the SAS approach have appeared in regional and global environmental assessments including the US National Assessment, the UK Climate Impacts Programme, the Millennium Ecosystem Assessment, and the Special Report on Emissions Scenarios commissioned by the Intergovernmental Panel on Climate Change, or IPCC (Parson et al 2007).
Scenarios in SAS studies embody quantitative elements that are modeled as well as qualitative elements that may not (or cannot) be captured with models (Rounsevell and Metzger 2010). Specifically, the purposes of qualitative elements of scenarios are (1) to provide a plausible backdrop for exogenous assumptions used in quantitative simulations as well as (2) to categorize simulation results into types of futures. Among the nearly infinite ways that the future could be conceptualized, a few storylines will constrain the vast space of possibilities. However, there is a quandary: why should a particular storyline be selected? Scholars in futures studies have advanced criteria for selecting storylines including plausibility, internal consistency, divergence among selected storylines, how challenging the storylines are in comparison to the present, and perceived likelihood (Raskin et al 2005). Regardless of the criteria employed, judgments are made to study a small set of scenarios at the expense of others, which could have deep implications for the results and recommendations of an environmental change study.
Additionally, the criteria for selecting storylines are subjective. Rarely is the fundamental criterion for a credible storyline, internal consistency, demonstrated. Internal consistency refers to the scenario's ability to represent dynamics consistent with current knowledge regarding plausible trends. For example, a scenario that describes a future society with high levels of wealth, high educational attainment and low fertility rates would be considered internally consistent, while a scenario describing high levels of wealth, low educational attainment and low fertility rates would be questionable. With SAS, the coupling of a storyline to a quantitative simulation is believed to provide a sufficient check for the internal consistency of the scenario overall (Alcamo and Heinrichs 2008, Raskin et al 2005). In fact, this coupling is only a limited check for internal consistency, as simulations provide internally consistent quantification only if the model embodies all of the dynamics with which the storyline aims to be consistent. This is a fundamental limitation of SAS, as storylines are used precisely to represent dynamics that models do not. Thus model simulations alone are insufficient for guaranteeing the internal consistency of scenarios, which must be consistent across their quantitative and qualitative components.
A recent development in futures studies, the cross-impact balance (CIB) method, provides an explicit check for the internal consistency of qualitative scenarios (Weimer-Jehle 2006). CIB analysis systematically represents relationships between scenario variables semi-quantitatively in order to evaluate the internal consistency of any possible scenario. In this paper, we demonstrate how CIB analysis can be used to identify alternate descriptions of the future satisfying the criterion of internal consistency. We chose to conduct this meta-analysis on the scenarios in the IPCC Special Report on Emissions Scenarios (Nakicenovic et al 2000), or SRES, because they are a salient, representative and well-documented case of global change scenarios. While we focus on the SRES, the method demonstrated here would be appropriate for any environmental SAS study.
To be clear, unlike many post-SRES scenario analyses, this paper does not assess the internal consistency of SRES scenarios with updated empirical information. Rather, this paper aims to assess the internal consistency of SRES scenarios with a new method, CIB analysis, using the best data available at the time the SRES was written, i.e., the studies the SRES cites. This type of analysis is useful because scenarios are devised under imperfect information. Therefore, methodological advancements that help scenario-makers get more out of available information have the potential to diminish surprise, or negative learning (Oppenheimer et al 2008), and to improve policy-related decision-making. In short, this study has three goals. First, we demonstrate how CIB analysis can identify internally consistent qualitative scenarios, where the SRES storylines are our case study. Second, we investigate the internal consistency of the SRES scenarios. Third, from an exhaustive ranking of the internal consistency of all possible scenarios, we examine whether internally consistent scenarios that are substantially interesting and different from those featured in the SRES can be found.

Method
In this section, we briefly introduce the CIB method and describe how it was applied in our assessment of the SRES scenarios.
2.1. Introduction to cross-impact balance (CIB) analysis
CIB analysis has been employed for studies relevant to environmental and global change research (ZIRN 2011) including energy technology innovation (Fuchs et al 2008), sustainability (Renn et al 2009), and water management (LiWa 2011). The CIB method represents judgments about relationships among variables selected to represent some system under study. Judgments can be collected from experts or through literature review. Collected judgments are then used to evaluate the internal consistency of any particular scenario. In the CIB context, a scenario is any combination of outcomes across system variables such as wealth level = high, educational attainment = high, fertility rate = low. For any CIB analysis, the following steps are completed.
• Define the system under study.
• Collect judgments for relationships among variables.
• Evaluate the internal consistency of scenarios.
Detailed descriptions of CIB analysis can be found in Weimer-Jehle (2006). Sections below provide a summary of the method using the SRES application as an example.

Defining the system under study
Variables that comprise a scenario in CIB analysis are called descriptors. Specific outcomes that descriptors can take are called descriptor states. When defining the system under study, it is necessary to delineate descriptors and their possible states.

Defining the SRES descriptors.
In the four SRES storylines (A1, A2, B1, B2), there are many options for variables to choose as descriptors. We began with the basic guide cited by the SRES, which is the Kaya (1990) identity

$$F = P \cdot g \cdot e \cdot f \tag{1}$$

where F is CO2 emissions; P is population; g is average income, measured as gross domestic product (GDP) per capita; e is average energy intensity, measured as a ratio of energy consumed to GDP; and f is average carbon intensity, measured as a ratio of CO2 emitted to energy consumed. In their consideration of these variables, SRES authors identified the following as fundamental: population, economic development, energy resources, carbon intensity of energy supply, energy intensity, land use decisions and policy orientations (economic versus environmental, global versus regional). In our determination of descriptors for the SRES storylines, which were specified at the global scale, we selected global population, growth in average income at the global scale (a proxy for economic development), global energy resources, average carbon intensity of global energy supply, average energy intensity at the global scale, global balance of economic policy orientation across nations (global versus regional) and global balance of environmental policy orientation across nations (global versus regional). Land use was omitted because it proved difficult to understand its relationship to the other descriptors (and even to the storylines) solely from a reading of the SRES. We therefore restrict our assessment of the internal consistency of the SRES scenarios to the portion of CO2 emissions that is energy related.
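As a minimal illustration of how the four Kaya factors combine, the sketch below multiplies them out in Python. The function name and all input values are hypothetical round numbers chosen for readability; they are not taken from the SRES or its scenario database.

```python
def kaya_emissions(population, gdp_per_capita, energy_intensity, carbon_intensity):
    """Return emissions F = P * g * e * f (the Kaya identity).

    population        P, persons
    gdp_per_capita    g, USD per person
    energy_intensity  e, MJ of energy per USD of GDP
    carbon_intensity  f, tonnes of CO2 per MJ of energy
    """
    return population * gdp_per_capita * energy_intensity * carbon_intensity


# Hypothetical inputs, for illustration only.
baseline = kaya_emissions(7.0e9, 1.0e4, 6.0, 6.0e-5)

# Because the identity is a pure product, halving any single factor
# (here carbon intensity) halves emissions when the others are held fixed.
decarbonized = kaya_emissions(7.0e9, 1.0e4, 6.0, 3.0e-5)

print(f"baseline emissions:     {baseline:.3e} t CO2")
print(f"decarbonized emissions: {decarbonized:.3e} t CO2")
```

The multiplicative form is why the descriptors chosen below map so directly onto the identity: each of population, income, energy intensity and carbon intensity scales emissions independently of the others in the accounting sense, even though the factors influence one another causally.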

Defining the SRES descriptor states.
As discussed in section 1, scenarios in the SAS tradition must be internally consistent across their qualitative and quantitative components. Thus when we identified what descriptor states to use in our evaluation of the SRES scenarios, we consulted three types of information: qualitative descriptions of trends and outcomes in the four SRES storylines (A1, A2, B1, B2 as detailed in supplementary data available at stacks.iop.org/ERL/7/044011/mmedia); scholarly literature cited in the SRES; and the results of 40 model quantifications featured in SRES (Nakicenovic et al 2000, p 186, IPCC 2000). It was necessary to consult model quantifications for some descriptor states, as qualitative descriptions in the storylines (such as low, balanced or high carbon intensity) were meaningless otherwise. For the descriptors of global population and global economic development, model quantifications for the four SRES storylines can be distinguished in well-defined ranges (see figure 1). 'Mixed' quantitative simulations rest on or near the boundaries of these descriptor states.
Regarding energy resources, the SRES notes, 'The most important long-term issue (for energy resources) is how the transition away from easily accessible conventional oil (and to a lesser extent conventional gas) reserves will unfold. Will it lead to a massive development of coal in the absence of alternatives or, conversely, to a massive development of unconventional oil and gas? Alternatively, could the development of post-fossil alternatives make the recourse to coal and unconventional oil and gas (such as methane clathrates) obsolete?' (Nakicenovic et al 2000, p 137).
We interpreted this to mean that the descriptor states for energy resources should focus on the technical availability of fossil resources. Additionally, it would be important to distinguish the availability of oil and gas (whether conventional or unconventional) from coal.
To better enable judgments for its relationships to other descriptors, quantitative ranges for states of the carbon intensity descriptor required interpretation. Typically, carbon intensity is presented in units such as a ratio of quantities of CO2 over joules of energy consumed or GDP. However, quantities expressed in such units may not be meaningful for recording judgments of how these outcomes would be expected to influence other descriptor states (nor how these outcomes could be influenced in turn) as discussed in greater detail in section 2.3.1. We considered that an alternative quantification might be acceptable, such as reliance upon fossil fuels for global primary energy demand. To investigate this, we examined the distribution of the 40 SRES quantitative simulations for average global carbon intensity versus global reliance on fossil fuels. A strong linear relationship exists (see figure 2). Thus the descriptor states for carbon intensity can be defined as percentage ranges for primary energy met by fossil resources in 2100. Scenarios fall primarily into three categories, where 50% or more of primary energy demand is met by fossil fuels (high carbon intensity); 30-49% is met by fossil fuels (balanced energy structure); and around 10-29% is met by fossil fuels (low carbon intensity). One scenario had less than 6% of primary energy met by fossil fuels. This scenario was considered representative of very low carbon intensity.
Ranges for energy intensity were determined by consulting ranges associated with storylines in the SRES (Nakicenovic et al 2000, p 186) and by the distribution of the 40 quantitative scenarios. In figure 3, it is difficult to distinguish a pattern for the states of energy intensity. We let the range referred to in the SRES as 'medium' primary energy intensity (4.3-6.5 MJ/USD) be the anchor for the energy intensity states. We then specified that values below this range represent low energy intensity while values above this range represent high.
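The state definitions above amount to simple threshold rules, which can be sketched in Python as follows. The function names are hypothetical, and two boundary choices are assumptions made only for this sketch: shares between 49% and 50% are treated as 'balanced', and shares from 6% up to 10% are left unassigned because the text does not place them in any state.

```python
def carbon_intensity_state(fossil_share_pct):
    """Map the share of 2100 primary energy met by fossil fuels (in %) to a state.

    Returns None for shares from 6% up to 10%, which the text leaves unassigned.
    """
    if fossil_share_pct >= 50:
        return "high"
    if fossil_share_pct >= 30:   # assumption: 49-50% counted as balanced
        return "balanced"
    if fossil_share_pct >= 10:
        return "low"
    if fossil_share_pct < 6:
        return "very low"
    return None


def energy_intensity_state(mj_per_usd):
    """Map primary energy intensity (MJ/USD) to a state, anchored on the
    'medium' range of 4.3-6.5 MJ/USD."""
    if mj_per_usd < 4.3:
        return "low"
    if mj_per_usd <= 6.5:
        return "medium"
    return "high"


# Hypothetical inputs, for illustration only.
print(carbon_intensity_state(55))    # high
print(carbon_intensity_state(35))    # balanced
print(energy_intensity_state(5.0))   # medium
```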
The global balances of economic and environmental policy orientations across nations were explicitly described in the SRES storylines. Readers interested in verbatim statements from the report interpreted for these descriptor states are referred to supplementary data (available at stacks.iop.org/ERL/7/044011/mmedia). Table 1 summarizes definitions for the aforementioned descriptor states. All states are for global outcomes or trends through 2100.

Collecting judgments for relationships among descriptors
Once the system to be studied has been defined, the descriptors and their states can be organized in a cross-impact matrix (see figure 4). This organization is useful for recording judgments about how any given descriptor state would be expected to directly influence target states for other descriptors. Rows represent given states, or descriptor states that would exert an influence upon each intersecting state across columns. Columns represent target states, or descriptor states that would receive influences. In other words, rows represent descriptor states acting as impact sources and columns represent descriptor states acting as impact sinks. It may be noted that the descriptors and their states summarized in table 1 appear as headings for both the rows and columns (states are indented in the rows and further subdivide the columns). This is because in CIB analysis, the use of a full cross-impact matrix is required, as the evaluation of a scenario's internal consistency is calculated with state-dependent influences.
The cells of the CIB matrix contain numerical cross-impact judgments about how descriptor states in the rows (impact sources) exert direct influences on descriptor states in the columns (impact sinks). The distinction between direct and indirect influences is important; otherwise the CIB analysis may result in 'double counting' of impact balances and skewed results. For each judgment section (circled in figure 4), one considers the cross-impact question, 'if the only information you have about the system is that [given] descriptor X has state x, would you evaluate the direct influence of X on [target descriptor] Y as a clue that descriptor Y has state y (promoting influence) or as a clue that descriptor Y does not have state y (restricting influence)?' (Weimer-Jehle 2006, p 339). Judgments can then be recorded according to a seven-point ordinal scale, where positive scores represent 'promoting' influences and negative scores 'restricting' influences. The stronger the direct influence, the greater the magnitude of the cross-impact judgment. A cross-impact judgment of 0 indicates that given state x has no direct influence on target state y.

Judgments in the SRES CIB matrix.
To enable this CIB analysis, we conceptualized the broad interactions of all descriptor states at a 'globally averaged' level and aimed for a basic analysis of internal consistency. Although quantitative modeling occurred at regional levels, the SRES storylines (with which model results aim to be internally consistent) describe human developments that are global in scale, especially world population and economic growth. The numerical cross-impact judgments in figure 4 summarize judgments used in our baseline analysis. Judgments were obtained from interpretations of verbatim statements in the SRES (cited below and in supplementary data available at stacks.iop.org/ERL/7/044011/mmedia). Judgment sections that lacked distinguishable direct impact relationships were assigned cross-impact judgments of 0 in the matrix. As noted previously, care must be exercised in recording judgments for direct influences only, as indirect influences are taken into account automatically during assessment of a scenario's internal consistency. To ensure that only direct influences were considered for cross-impact judgments, a diagram of direct influences was constructed independently and verified against literature referenced in SRES chapters 2-4. Detail about this influence diagram can be found in supplementary data (available at stacks.iop.org/ERL/7/044011/mmedia).
We identified 17 judgment sections that should have non-zero cross-impact judgments. In the baseline analysis, 13 of these were justified by documentation provided in the SRES; however, for four policy-related judgment sections, influences were introduced. Although SRES authors acknowledged influences between policies and many drivers of emissions, they also noted, 'Few of the policies and instruments identified . . . can be represented directly in the models typically used to produce GHG (greenhouse gas) emission scenarios. In general, the impacts of policies are highly uncertain (Houghton et al 1996). . . . Instead, the qualitative SRES scenario storylines give a broad characterization of the areas of policy emphasis thought to be associated with particular economic, technological, and environmental outcomes, as reflected in alternative scenario assumptions in the models used to generate long-term GHG emission scenarios' (Nakicenovic et al 2000, p 157).
For these reasons, we interjected non-zero policy judgment sections only if it would have been more absurd to assume that policy orientations (global or regional) would have no effect on descriptor states and vice versa. Thus we introduced non-zero judgment sections for the following: direct influences of environmental policy on carbon intensity; direct influences of environmental policy on primary energy intensity; direct influences of per capita GDP on the orientation of economic policy; and direct influences of carbon intensity on the orientation of environmental policy.
Qualitative statements in the SRES were translated into quantitative cross-impact judgments according to the following algorithm. Judgments were recorded in the matrix by each judgment group (boxed in figure 4), which represents all possible outcomes for a particular target descriptor. For each of the 17 non-zero judgment sections, the following were determined for each given descriptor state x.
• Cell(s) in the judgment group with non-zero influences from given descriptor state x on target descriptor state y.
For each of these cells, the impact direction (promoting or restricting influence) and impact strength were noted. An impact score of ±1 was assigned for weak/slight influences, ±3 for strong, and ±2 for more than weak but less than strong. Conservative judgments of ±1 were used for influences whose strength was ambiguously stated. These relationships were determined by consulting chapter 3 of the SRES, the Kaya identity (equation (1)), or results from the SRES scenario database. Wherever possible, direct quotes were used to assign judgment scores.
• Remaining impacts for the judgment group. A basic rule in CIB analysis that enables internal consistency scoring for scenarios is the principle of compensation (Weimer-Jehle 2006, p 340). This refers to the requirement that each descriptor state (see table 1) represents a mutually exclusive outcome for each descriptor. By this principle, a promoting influence for one target descriptor state necessarily implies a restricting influence for at least one of the alternative descriptor states. In other words, the set of descriptor states should be exhaustive for the descriptor so that promoting influences on some states imply restricting influences on complementary states. This explains why cross-impact judgments in each judgment group sum to zero. As long as data or other reference material can anchor some of the cross-impact judgments in a judgment group explicitly, the principle of compensation can be used to interpolate complementary cross-impact judgments even if reference data or literature is incomplete. In cases where judgments recorded in the judgment group may be contested, it is possible to perform sensitivity analyses with alternative interpolations or judgments to investigate whether differences in judgments matter. Sensitivity analyses we performed for this study are discussed in section 2.3.2.
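The two rules described above (scores drawn from the seven-point scale, and each judgment group summing to zero under the principle of compensation) lend themselves to a mechanical check. The following Python sketch shows one way such a check could be written; the function name and the example groups are hypothetical and are not values from the SRES CIB matrix.

```python
def is_valid_judgment_group(group):
    """Check one judgment group: the scores that a single given descriptor
    state assigns to every state of one target descriptor.

    A group passes if every score is an integer on the seven-point scale
    (-3 .. +3) and the scores sum to zero (principle of compensation).
    """
    on_scale = all(isinstance(score, int) and -3 <= score <= 3 for score in group)
    compensated = sum(group) == 0
    return on_scale and compensated


# Hypothetical groups for a target descriptor with four states.
slight_restriction = [1, 1, -1, -1]   # two states weakly restricted, two weakly promoted
no_influence = [0, 0, 0, 0]           # no direct influence recorded
uncompensated = [0, 0, -1, -1]        # restriction recorded without compensation

print(is_valid_judgment_group(slight_restriction))  # True
print(is_valid_judgment_group(no_influence))        # True
print(is_valid_judgment_group(uncompensated))       # False
```

A check of this kind only verifies the bookkeeping; whether the promoted and restricted states are the right ones remains a matter of judgment and sensitivity analysis.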
An example of this algorithm for a judgment section follows. In the example, we provide the SRES reference, our interpretation, and the sequence of cross-impact judgments recorded. The remaining 16 non-zero judgment sections are detailed in supplementary data (available at stacks.iop.org/ERL/7/044011/mmedia).
2.3.1.1. Relationship: direct influence of population on GDP growth per capita.
SRES passage: 'Prior to 1980, the overwhelming majority of studies showed no significant correlation between population growth and economic growth (National Research Council 1986). Recent correlation studies, however, suggest a statistically significant, but weak, inverse relationship for the 1970s and 1980s, despite no correlation being established previously (Blanchet 1991)' (Nakicenovic et al 2000, p 120).
Interpretation: (a) No relationship between descriptors exists for the low population state. (b) For population states of medium and high, population slightly restricts high or very high economic growth.
Compensation:
Figure 5. An example of impact balance calculations for a given scenario.

Sensitivity analysis of judgments in the SRES CIB matrix.
Qualitative statements in the SRES leave room for different interpretations of influences between descriptors. Thus some differences for cross-impact influences were investigated through sensitivity analysis, as discussed in detail in supplementary data (available at stacks.iop.org/ERL/7/044011/mmedia). This involved the consideration of 10 additional direct influences not included in the baseline matrix (introduction of up to 16 non-zero judgment groups) as well as adjustments to five baseline judgments (modification of up to nine non-zero judgment groups). In all, 14 sensitivity tests (that is, analysis with 14 different versions of the SRES CIB matrix) were investigated.

Evaluating the internal consistency of scenarios
As mentioned previously, internally consistent scenarios represent dynamics (or outcomes) consistent with current knowledge regarding plausible future trends. Since CIB analysis requires the documentation of knowledge relevant for scenario dynamics, the full collection of cross-impact judgments in the CIB matrix (cf figure 4) can act as a database for sets of influences that may be associated with any given combination of descriptor states (that is, a given scenario). In other words, the CIB matrix goes beyond providing a framework for collecting cross-impact judgments; it can also be used to perform the CIB internal consistency test, which is a logical check for scenario self-consistency.
Self-consistency is an important property of stable scenarios (von Reibnitz 1988; see footnote 4), or scenarios that describe long-term trends. This is precisely the aim of the storylines that appeared in the SRES. Recall that, in the CIB context, a scenario is any combination of outcomes across variables. This means that a scenario is a combination of descriptor states. Each combination of given descriptor states can be associated with a unique set of influences that will impact the outcomes of all target descriptors. Figure 5 provides a visual representation of this association between a given scenario and its unique set of cross-impacts on target descriptors (shaded rows). If a given scenario is self-consistent, the set of cross-impacts associated with it represents system influences that result in a target scenario identical to the given scenario. When this occurs, the internal consistency of the scenario has been demonstrated.
Footnote 4. An elaboration of von Reibnitz's concept of scenario stability is as follows: 'Internal stability means that, when subject to disruptions, these scenarios do not improve in the direction of greater consistency. Scenario instability means that, when subject to disruptive events, the scenarios change in the direction of greater consistency. The purpose of a stability analysis . . . is to ascertain which scenarios have high stability and hence in most cases have long-term validity.' 'The purpose of this type of [analysis] is to generate scenarios with maximum possible stability so that scenarios with long term [sic] validity are used in planning, and one avoids concentrating on scenarios that could only illuminate a momentary future situation (unstable scenarios)'. See p 47 of von Reibnitz (1988).
Perfect self-consistency is a challenging property for any scenario to meet. Among the many possible ways to combine descriptor states, most scenarios will not be perfectly internally consistent. This is the case for the given scenario in figure 5. The given scenario is population = low, GDP growth = very high, fossil fuel availability = high fossils, carbon intensity = balanced, energy intensity = medium, economic policy = global, environmental policy = global. Shaded rows show the unique set of influences associated with this scenario. The target scenario is the net result of these influences and can be determined by examining impact balance scores, which are the columnar sums of cross-impact judgments intersecting the shaded rows. For example, the impact balance score for 'fossil fuel availability = high fossils' is the columnar sum obtained from the shaded intersecting rows, −1 = (1 + 0 + (−1) + (−1) + 0 + 0). Impact balance scores represent the cumulative impacts of influences associated with the given scenario, which ultimately promote or restrict each state for each descriptor. The states of the target scenario are revealed by the set of maximum impact balance scores for each descriptor. For the example in figure 5, the target scenario descriptor states are denoted with an upward facing arrow in the 'target scenario states' row, specifically, population = low, GDP growth = very high, fossil fuel availability = low oil/gas (high coal), carbon intensity = balanced, energy intensity = medium, economic policy = global, environmental policy = global.
Inconsistencies can be identified by comparing the states of the target scenario (the most strongly 'promoted' states denoted by upward facing arrows in the 'target scenario states' row in figure 5) to the states assumed in the given scenario (denoted by downward facing arrows in the 'given scenario states' row in figure 5). When the descriptor states for the given and target scenarios are mismatched (denoted in figure 5 by upward and downward facing arrows that are not aligned), the combined influences of the given scenario promote some scenario different from itself, and the given scenario is therefore not self-consistent. In figure 5, the given scenario has an internal inconsistency, since the impact balance score for the given scenario's descriptor state for fossil fuel availability, 'high fossils', is not a maximum; rather, the fossil fuel availability state of 'low oil/gas (high coal)' is more strongly promoted.
From this discrepancy, an inconsistency score can be calculated. It is defined as the maximum difference between the impact balance score for the target scenario state and the given scenario state that can be found across all descriptors. For the given scenario in figure 5, there is only one descriptor with a discrepancy between the target and given scenario states: fossil fuel availability. Here the inconsistency score is 3 = 2 − (−1). Perfectly internally consistent scenarios have inconsistency scores of 0.
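The impact balance and inconsistency score calculations described above can be sketched in a few lines of Python. The toy system below (descriptors X, Y, Z and every judgment value) is entirely hypothetical and much smaller than the SRES matrix; it is constructed only so that one descriptor reproduces the same arithmetic as the figure 5 example, 3 = 2 − (−1). The data layout and function names are assumptions of this sketch, not a description of any CIB software.

```python
# Descriptors and their states (hypothetical toy system).
STATES = {"X": ["x1", "x2"], "Y": ["y1", "y2"], "Z": ["z1", "z2", "z3"]}

# Judgment groups keyed by (given descriptor, given state, target descriptor).
# Each list holds scores over the target descriptor's states and sums to zero.
# Missing keys mean "no direct influence" (all zeros).
JUDGMENTS = {
    ("X", "x1", "Z"): [-1, 2, -1],
    ("X", "x1", "Y"): [1, -1],
    ("Y", "y1", "X"): [1, -1],
    ("Z", "z2", "X"): [1, -1],
}


def impact_balances(scenario):
    """Sum, for every target state, the judgments from the scenario's given states."""
    balances = {}
    for target, target_states in STATES.items():
        totals = [0] * len(target_states)
        for source, source_state in scenario.items():
            if source == target:
                continue  # a descriptor does not influence itself
            row = JUDGMENTS.get((source, source_state, target))
            if row is not None:
                totals = [t + r for t, r in zip(totals, row)]
        balances[target] = dict(zip(target_states, totals))
    return balances


def inconsistency_score(scenario):
    """Largest gap, over all descriptors, between the best-scoring state
    and the state assumed in the given scenario."""
    balances = impact_balances(scenario)
    return max(max(b.values()) - b[scenario[d]] for d, b in balances.items())


given = {"X": "x1", "Y": "y1", "Z": "z1"}
print(impact_balances(given)["Z"])   # {'z1': -1, 'z2': 2, 'z3': -1}
print(inconsistency_score(given))    # 3
print(inconsistency_score({"X": "x1", "Y": "y1", "Z": "z2"}))  # 0
```

In the first scenario, the assumed state z1 has an impact balance of −1 while z2 has the maximum of 2, so the scenario promotes a target different from itself; switching Z to z2 yields a scenario whose influences reproduce it exactly.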

Outputs of interest from CIB analysis.
From the judgments collected in the matrix and the internal consistency evaluations of scenarios, many different outputs can be obtained (Weimer-Jehle 2006). In this study, two are of interest.
• The internal consistency scores of particular scenarios.
• The descriptor states of any scenarios with good internal consistency (that is, inconsistency scores that are equal to or near 0).
Internal consistency assessments of the SRES scenarios would be an example of the first output. This required us to specify which combinations of descriptor states should be considered SRES scenarios. We then performed internal consistency tests on each of the descriptor state combinations that are SRES scenarios.
The latter type of result can be obtained in three steps. First, internal consistency tests can be performed on all combinations of descriptor states possible. This would be an exhaustive scan of all scenarios possible. Second, all scenarios could then be ranked according to their internal consistency scores. This would provide groupings for all scenarios that perform well or poorly. Finally, the groups of scenarios that perform well could be more closely investigated to uncover their specific descriptor states. In our application of CIB analysis to the SRES scenarios, we would be interested in this output to see if scenarios with good internal consistency that are substantially interesting and different from those featured in the SRES can be found.
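The three steps above (exhaustive scan, ranking, inspection of the best group) can be sketched as follows. This is a self-contained toy in Python with a hypothetical three-descriptor system and made-up judgments; it is meant only to show the shape of the procedure, not the SRES matrix or the authors' software.

```python
from itertools import product

# Hypothetical toy system: descriptors, states, and zero-sum judgment groups
# keyed by (given descriptor, given state, target descriptor).
STATES = {"X": ["x1", "x2"], "Y": ["y1", "y2"], "Z": ["z1", "z2", "z3"]}
JUDGMENTS = {
    ("X", "x1", "Z"): [-1, 2, -1],
    ("X", "x1", "Y"): [1, -1],
    ("Y", "y1", "X"): [1, -1],
    ("Z", "z2", "X"): [1, -1],
}


def inconsistency_score(scenario):
    """Maximum, over descriptors, of (best impact balance - balance of assumed state)."""
    worst = 0
    for target, target_states in STATES.items():
        totals = [0] * len(target_states)
        for source, source_state in scenario.items():
            row = JUDGMENTS.get((source, source_state, target))
            if source != target and row is not None:
                totals = [t + r for t, r in zip(totals, row)]
        assumed = totals[target_states.index(scenario[target])]
        worst = max(worst, max(totals) - assumed)
    return worst


# Step 1: test every combination of descriptor states.
names = list(STATES)
all_scenarios = [dict(zip(names, combo)) for combo in product(*STATES.values())]

# Step 2: rank scenarios from most to least internally consistent.
ranked = sorted(all_scenarios, key=inconsistency_score)

# Step 3: inspect the descriptor states of the best-performing group.
consistent = [s for s in ranked if inconsistency_score(s) == 0]

print(len(all_scenarios))   # 12
for scenario in consistent:
    print(scenario)
```

The number of combinations grows as the product of the state counts, so an exhaustive scan is cheap for a handful of descriptors but would call for pruning or sampling in much larger systems.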
2.4.1.1. Combinations of descriptor states indicating the SRES scenarios.
Figures 1-3 provide clues for combinations of descriptor states that correspond to different SRES scenarios. From figure 1, it is clear what combinations of descriptor states for population and GDP per capita growth correspond to the SRES storylines. The A1 storyline represents a low population, very high GDP growth case; A2 a high population, low GDP growth case; B1 a low population, high GDP growth case; and B2 a medium population, medium GDP growth case.
In figure 2, it is less clear what states for carbon intensity should correspond to each storyline. Most A1 and B1 scenarios represent worlds with very low to balanced carbon intensity. Most A2 and B2 scenarios represent balanced or highly carbon-intensive worlds. However, there is also variation across the scenarios. This can be seen most clearly for A1 and demonstrates how the SRES separated the A1 storyline into futures with low, balanced, and high carbon intensity (specifically A1T: Technology that is advanced non-fossil; A1B: Balanced mix of technologies between fossil and non-fossil sources; A1FI: Fossil Intensive energy technology).
Variation can also be seen in figure 3. The energy intensity ranges for B1 and B2 storylines are well defined, but the ranges for A1 and A2 have variations. B1 is a low energy intensity world, and B2 predominantly a world of medium energy intensity. A1 is a world of medium or low primary energy intensity, while A2 is a world of medium or high energy intensity.
Appropriate descriptor states for fossil fuel availability were inferred from a data table in the report (Nakicenovic et al 2000, p 186). The A1 storyline describes a world predominantly of high fossil fuel availability, with a few cases of high coal. The A2 and B2 worlds are predominantly characterized by low oil and gas availability but high coal, although a few A2 scenarios with high fossil fuel availability exist. The situation for B1 is less clear, and we allow for interpretations of either high or low fossil fuel availability.
Descriptor states for economic and environmental policy orientations were fairly clear from the SRES storylines. The A2 and B2 worlds were shaped by regionalized economic and environmental policies, while the A1 and B1 worlds assumed a globalized economic approach. In the case of B1, environmental policies were also globalized. The environmental policy orientation in the A1 world was less clear; thus we investigated either globalized or regionalized interpretations.
In all, we identified 52 descriptor state combinations that could correspond to the 40 quantitative simulations in the SRES database (IPCC 2000). Multiple interpretations for individual model results were required in some cases, as model outputs for some descriptors fell on or near the boundaries intended to delineate descriptor states (cf figures 2 and 3). As previously mentioned, the SRES differentiated the A1 storyline according to dominant energy supply technologies (A1T, A1B, A1FI). In the scenarios investigated with CIB analysis, this differentiation was expanded to other storylines as applicable. More specifically, we use the identifiers T2, T1 and B to distinguish scenarios with very low, low and balanced carbon intensity. In addition, BC and FI denote scenarios with balanced (B) and high carbon intensity (FI) emerging in a situation of high coal (C) availability. The 52 SRES combinations are summarized in the appendix, and storylines used for these interpretations are in the supplementary data (available at stacks.iop.org/ERL/7/044011/mmedia).

Results for the baseline CIB matrix
A frequency distribution for internal inconsistency scores demonstrates both types of results that are of interest: first, the internal consistency scores of the SRES scenarios, and second, the total number of possible scenarios that perform best in terms of internal consistency. The frequency distribution of scenarios analyzed with the baseline matrix (cf figure 4) is shown in figure 6. Perfectly internally consistent scenarios appear at the top of figure 6 and have inconsistency scores of 0. Nearly perfectly internally consistent scenarios, which are labeled 'internally consistent', have inconsistency scores ≤2. Scenarios in bins toward the bottom of figure 6 are increasingly inconsistent. Compared to all 1728 descriptor combinations possible (which is a product of the number of states for each descriptor, or $1728 = 4^2 \times 3^3 \times 2^2$), SRES scenarios are more internally consistent, since their inconsistency scores as a group are lower. Acceptable thresholds for internal consistency are set by the analyst, though ideally, scenarios should be nearly or perfectly internally consistent. We set our threshold for the best scenarios at inconsistency scores ≤2. As inconsistency score increases, scenarios bear more serious internal inconsistencies. Note in figure 6 that 77% of the SRES scenarios have internal inconsistency scores ≥3.
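The count of 1728 combinations can be checked by enumerating the descriptor states directly. The sketch below is illustrative only: the state labels are inferred from the descriptions in this section (two descriptors with four states, three with three, two with two), and the authoritative definitions are those given in the paper's figures and tables, which are not reproduced here.

```python
from itertools import product
from math import prod

# State labels inferred from the text of this section (an assumption).
DESCRIPTOR_STATES = {
    "population": ["low", "medium", "high"],
    "GDP per capita growth": ["low", "medium", "high", "very high"],
    "carbon intensity": ["very low", "low", "balanced", "high"],
    "primary energy intensity": ["low", "medium", "high"],
    "fossil fuel availability": ["low", "low oil and gas, high coal", "high"],
    "economic policy": ["globalized", "regionalized"],
    "environmental policy": ["globalized", "regionalized"],
}

# Product of the number of states per descriptor: 4^2 * 3^3 * 2^2.
n_by_formula = prod(len(states) for states in DESCRIPTOR_STATES.values())

# Explicit enumeration of every combination of descriptor states.
all_combinations = list(product(*DESCRIPTOR_STATES.values()))

print(n_by_formula, len(all_combinations))
```

Both numbers agree with the 1728 reported above, which confirms that an exhaustive scan of this descriptor set is computationally trivial.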
Out of 1728 possible scenarios, only 11 are perfectly internally consistent (that is, having an inconsistency score of 0). None of these scenarios reflect worlds of low or very low carbon intensity. In general, the descriptor state 'low oil and gas, high coal' for fossil fuel availability was found to be prevalent among the perfectly internally consistent scenarios. Within this subset, only three SRES storylines have perfectly internally consistent representations, namely A1B (with globalized (G) environmental policy, low primary energy intensity), A2FI and B2FI (with high energy intensity). The remaining eight perfectly internally consistent scenarios are not SRES scenarios. Of these eight, five represent interesting futures characteristically different from those featured in the SRES. All five are economically globalized worlds with low or medium populations reliant on coal with high economic growth. Energy structures are balanced or carbon intensive. States of the other descriptors vary. For the features that these non-SRES scenarios have in common, they can be termed 'coal-powered growth' worlds. Other scenarios from the A1 and B1 families, describing globalized worlds with high economic growth and easy access to fossil fuel, can also be found among the more consistent scenarios with inconsistency scores ≤2. A discussion of the scenario logics that result in internal inconsistencies can be found in the supplementary data (available at stacks.iop.org/ERL/7/044011/mmedia).

Sensitivity of results to changes in the CIB matrix
Since the above inconsistency scores are wholly dependent upon assumptions and interpretations made in the CIB matrix, 14 potential sensitivities (that is, 14 different versions of the SRES CIB matrix) were analyzed. Six of these sensitivities involved introducing new cross-impact relationships (up to seven additional non-zero judgment sections). Four sensitivities consisted of adjustments to relationships already in the baseline matrix (adjustments for up to five judgment sections), and four were combinations of the aforementioned sensitivities. A complete description of the sensitivity cases and associated changes to judgment sections in the matrix can be found in the supplementary data (available at stacks.iop.org/ERL/7/044011/mmedia).
In general, inconsistency scores for all SRES scenarios deviated little from the baseline result. For inconsistency scores that did change, we found they were most sensitive to the following.
• Adding a restricting influence for economic growth on fossil fuel availability.
• Adding a restricting influence for regionalized economic policy on fossil fuel availability.
• Assuming that high fossil fuel availability (including natural gas) would strongly promote balanced energy structures, while higher coal availability would weakly promote balanced carbon intensity.
• Strengthening the promoting influence of globalized environmental policy on low carbon and energy intensity.
While the former three changes affected average inconsistency scores most, the latter change promoted scenarios with very low carbon intensity and low energy intensity to perfect internal consistency (e.g. A1T2, B1T2).

Robust SRES and non-SRES scenarios
Internally consistent SRES scenarios robust across all 14 sensitivity analyses and the baseline case (that is, inconsistency score ≤2 for all versions of the SRES CIB matrix) are summarized in table 2. Each storyline has at least one robust case. It should be noted that these robust scenarios describe coal-powered and carbon-intensive futures in regionalized worlds (A2 and B2 families), or balanced or low carbon energy systems with high fossil fuel availability in globalized worlds (A1 and B1 families). SRES scenarios not summarized in table 2 have weaker internal consistency. While four SRES scenarios maintain their internal consistency across all sensitivity analyses, 17 non-SRES scenarios did the same. None of these non-SRES scenarios reflect very low carbon intensity, nor are they futures of high oil and gas availability. Almost all of them reflect coal to be the readily available fossil resource. Of these 17, 13 represent potentially interesting futures characteristically different from those featured in SRES. Seven are futures of modest growth, where five achieve balanced energy structures, and two achieve low carbon energy structures. However, six of the non-SRES scenarios also resemble the 'coal-powered growth' worlds discussed in section 3.1, and three of them may be environmentally foreboding. Although population is low or medium, GDP growth per capita is high or very high. Coal is the highly available fossil fuel, and energy structures are carbon intensive rather than balanced. Primary energy intensity is medium or high rather than low, economic policy is globalized, yet environmental policy is regionalized. Most of these 'coal-powered growth' scenarios are particularly internally consistent, with their maximum inconsistency scores being 1 across all 14 sensitivity analyses.
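The robustness criterion used here (inconsistency score ≤2 under the baseline and every sensitivity version of the matrix) amounts to a filter on each scenario's worst score. A minimal sketch follows; the function name, the scenario labels `S1` to `S3` and all scores are hypothetical placeholders, not results from this study.

```python
def robust_scenarios(scores_by_matrix, threshold=2):
    """Return {scenario: worst score} for scenarios whose inconsistency
    score stays at or below `threshold` in every matrix version.

    `scores_by_matrix` maps a matrix label to {scenario: score}.
    """
    matrices = list(scores_by_matrix.values())
    # Only scenarios scored under every matrix version can be compared.
    common = set(matrices[0]).intersection(*matrices[1:])
    worst = {name: max(m[name] for m in matrices) for name in common}
    return {name: score for name, score in sorted(worst.items())
            if score <= threshold}


# Hypothetical scores for three placeholder scenarios.
toy_scores = {
    "baseline": {"S1": 0, "S2": 2, "S3": 1},
    "sensitivity_1": {"S1": 1, "S2": 3, "S3": 2},
    "sensitivity_2": {"S1": 0, "S2": 2, "S3": 2},
}

robust = robust_scenarios(toy_scores)
print(robust)
```

Keeping the worst score alongside each robust scenario makes it easy to single out cases such as the 'coal-powered growth' worlds, whose maximum inconsistency score stays at 1.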

Implications for future storyline scenarios
In this paper, we demonstrate how CIB analysis can be used to select alternate descriptions of the future that satisfy the criterion of internal consistency given a general set of descriptors and their interrelation. We also found that information about tendencies of a system can be uncovered through CIB and sensitivity analysis, which can have deep implications for the results and recommendations of an environmental change study. These applications are particularly appropriate for environmental SAS studies. On this note, results specific to our case study on emissions scenarios may be instructive for new socioeconomic scenarios for climate change research (Moss et al 2010, O'Neill and Schweizer 2011).
For the SRES scenarios that we analyzed, the goals of this study were (1) to critically examine the internal consistency of the SRES scenarios across their qualitative and quantitative components and (2) from an exhaustive scan of all scenarios possible, determine if substantially different, interesting and internally consistent scenarios could be identified that were not featured in the SRES. Results show that the SRES did identify internally consistent scenarios, and each storyline has at least one case in the nearly perfectly consistent set. However, not all scenarios featured in the SRES are equally internally consistent. Although it could be possible for scenarios to be realistic that are inconsistent with our current understanding of how emissions drivers interrelate, the accuracy of scenarios cannot be known until after the fact. Because scenarios aim to provide useful foresight, it is our position that scenario credibility must rest on their internal consistency with knowledge aimed to justify particular sets of trends as plausible.
In this regard, the characteristics of SRES and non-SRES scenarios robust to sensitivity analysis should be considered further. Overall, carbon-intensive futures with the highest emissions profiles, some of them not featured in the SRES, were found to be perfectly internally consistent and highly robust. This finding is especially relevant, since recent research showed that energy-related CO2 emissions were increasing rapidly prior to the global economic downturn. It should nevertheless be noted that recent high emissions trajectories were due to a combination of strong economic growth, highly carbon-intensive energy structures, increased use of coal and modest improvement in primary energy intensity, that is, scenarios our study found to be highly robust and, therefore, most consistent with information available at the time the SRES was written.
An important objective of scenario analysis is exploration; scenarios can potentially help users consider surprising developments or discontinuities (Bradfield et al 2005, EEA 2009). Bearing this in mind, one might question whether the results of this CIB analysis hew too closely to past trends. It is our position that the objective of exploration does not rest solely on the divergence of scenario results. Exploration is also achieved when the behavior of important system influences is elucidated. We found this to be the case for our SRES study in two ways. First, certain assumptions embedded in storyline logic may be crucial for internal consistency. We found this to be the case for the very low carbon intensity scenarios for A1 and B1. The internal consistencies of these futures are substantially enhanced when one makes strong assumptions about the influence of global environmental policy to promote improvements in energy and carbon intensity; modest assumptions about the influences of these relationships are insufficient. Second, the results from the sensitivity analyses for robust scenarios suggest there are substantial system inertias for achieving any low carbon intensity scenarios. Absent climate policies, policies affecting the availability of oil and gas, as well as carbon intensity, alter scenario consistency most. This implies that the global mitigation challenge should not be understated, yet the presentation of SRES storylines with very low carbon intensity as 'equally plausible' to those with high carbon intensity may do just that. Together these findings imply that policy discussions informed by scenarios containing qualitative elements could benefit from a systematic exploration of system dynamics as well as investigation of the internal consistency of all scenarios possible, which CIB analysis can provide.
As a caveat, we note that CIB analysis is based on the assumption that impacts can be summed linearly. Though some nonlinear, indirect influences are captured this way, any nonlinear interaction effects are not. Additionally, one can generate only rough scenarios, since a CIB matrix can get large very quickly. Thus there is incentive to aggregate descriptors and descriptor states. On this note, CIB analysis is not appropriate for problems that allow a theory-based or empirical treatment, as CIB analysis cannot deliver this level of detail. This is why CIB analysis is most appropriate for assessing the internal consistency of storylines as opposed to quantitative ranges of carbon emissions. Finally, since CIB analysis relies on judgments, it is best performed 'live' to allow iterations with storyline authors (the experts) on any controversial impact relationships. Because this was not possible in our particular analysis, we used statements from the SRES and sensitivity analysis as surrogates.
Notwithstanding these limitations, our paper demonstrates that the internal consistency of scenarios can be assessed in a systematic way. Additionally, information about tendencies of a system can be uncovered through CIB analysis, which can have deep implications for the results and recommendations of an environmental change study. For environmental assessments utilizing scenarios, including the next generation of socioeconomic scenarios for the IPCC's Fifth Assessment Report, we recommend that systematic techniques for selecting storylines, such as the CIB method, be incorporated.
Mellon University. The CDMC has been created through a cooperative agreement between the National Science Foundation (SES-0345798) and Carnegie Mellon University. Dr Schweizer is now at the National Center for Atmospheric Research, which is operated by the University Corporation for Atmospheric Research and supported by the National Science Foundation. Elmar Kriegler was supported by a Marie Curie International Fellowship (MOIF-CT-2005-008758) within the 6th European Community Framework Programme.

Appendix. SRES descriptor state combinations
Each quantitative SRES scenario technically maps to a unique combination of descriptor states. However, in situations where results of quantitative scenarios fall close to the boundaries of descriptor state ranges, this precision is artificial, and we assigned to such scenarios more than one combination of descriptors. Such 'borderline' quantitative SRES scenarios are italicized in table A.1 and set in bold face for the descriptor combination that technically describes them. Abbreviations for the descriptors and their states are the same as in figure 1, where GDP/capita is further abbreviated as 'GDP' and PE intensity is further abbreviated as 'PEI'. Scenario names that are underlined are SRES marker scenarios.