Understand what you measure: Where climate transition risk metrics converge and why they diverge

Climate risks are financial risks. To help manage them, researchers and practitioners are exploring which metrics to use to assess climate risks, and to what extent the metrics deliver heterogeneous results. We analyze a unique dataset including risk assessments from 9 providers for firms of the MSCI World Index. Convergence between metrics is higher for the firms most exposed to transition risk. The underlying modeling assumptions and scenario characteristics are associated with changes in the estimated transition risk. Users of climate risk metrics should properly understand the key assumptions underlying a metric to appropriately interpret its result.


Introduction
Risk assessments are a cornerstone of financial decisions. Climate risks are financial risks (NGFS, 2019; IMF, 2020; BCBS, 2021; FSB, 2022) and, as such, need to be addressed in risk management (Hong et al., 2019; Baldauf et al., 2020; Engle et al., 2020; Monasterolo and De Angelis, 2020; Bakkensen and Barrage, 2022; Bolton and Kacperczyk, 2021; Görgen et al., 2021). However, there is no agreement on how to measure climate risks or on which metrics to use. Traditional approaches based on historical data are likely to be misleading in the context of climate risks because of non-linearity, non-stationarity, path dependencies, unprecedented developments, and endogeneity issues (Weitzman, 2011; Chenet et al., 2019; Battiston et al., 2019; Karydas and Xepapadeas, 2019). Forward-looking approaches, usually based on scenario analysis, are therefore receiving increasing attention and developing fast.
Yet, the methodologies, data, and assumptions underpinning these forward-looking approaches vary substantially (Bingler and Colesanti Senni, 2022). This does not come as a surprise: it reflects the underlying complexity and uncertainty in the analysis of climate risks. It has been shown that environmental, social and governance (ESG) ratings can vary considerably across providers for the same firm (Berg et al., 2022). Focusing on physical risk metrics, Hain et al. (2022) find considerable divergence across six risk measurement approaches. Given the deep uncertainty around climate risks, this divergence is unavoidable and not in itself a problem, as long as the key drivers of risk are properly understood.
Understanding the drivers of such heterogeneity in climate risk assessments is key for investors and supervisors, and also crucial for reliable research. The present analysis shows that, although risk indicators tend to converge on firms most exposed to transition risk, the different methodologies and scenarios underlying them generate heterogeneity in climate transition risk assessments. As a consequence, the selection of specific climate risk metrics should be explicitly justified.
To this end, this paper explores whether climate risk metrics of different providers tend to converge or diverge for the same firm, and which metric characteristics are associated with changes in the estimated transition risk exposures. For our analysis, we construct a dataset of 105,466 observations, covering 69 different climate transition risk metrics from 9 providers for 1565 firms of the MSCI World Index.
The remainder of this paper is organized as follows: In Section 2, we describe the data used and the variables chosen, as well as the methodological approach adopted in this study. In Section 3, we report the main findings concerning the convergence and divergence of climate risk metrics. Moreover, we look at the association between risk exposure and metrics' characteristics, both across-metrics and within the same providers. Our conclusions are summarized in Section 4, where we also highlight key areas for future research, and implications for asset managers, investors, central banks, and financial supervisors.

Data and variables
We focus on a sample of 1565 companies included in the MSCI World Index as of 31 January 2020. We consider forward-looking climate risk metrics, which assess the transition risk at the individual firm level.¹ For our company sample, we obtained transition risk assessments from 14 providers (see Table C.7). We restricted our analysis sample to 69 metrics from 9 providers that fulfilled the following criteria: (1) the metrics provided aggregate information on climate risks at the firm and regional level, and (2) data were provided for more than half of the firms in the sample. Our final analysis sample consists of 105,466 observations.
To identify the most important drivers of climate risk metric values, we built on previous analyses of the core features of climate risk metric providers (Bingler and Colesanti Senni, 2022; Bingler et al., 2020) and set up a detailed questionnaire for the tool providers to obtain information on the relevant drivers. We then asked the providers to add to the questionnaire any further drivers that we might have overlooked and that they would consider relevant. Each provider filled in the detailed questionnaire on the key assumptions and drivers of their metrics. Based on their responses and bilateral exchanges, we identified six core categorical variables that are likely to exert the largest influence on the final risk value. The variables include the underlying climate scenario-specific variables, namely the temperature target in degrees Celsius, which consists of 5 possible levels (1.5 °C, below 2 °C, 2 °C, 3 °C, not applicable²), and the horizon of the analysis,³ with 6 levels (2025, 2030, 2040, 2050, 2100, not applicable). Moreover, we identify provider-specific variables to capture the methodology applied to translate the climate scenario developments into climate financial risk values. These include the output type, which can take four values (Balance sheet effects,⁴ Financial asset metric,⁵ Alignment gap, Risk score). We also define individual dummies for the inclusion of firm climate targets and of CAPEX plans in the methodology. Finally, we create a categorical variable for the model approach, which can be top-down, bottom-up, or combined (both top-down and bottom-up). Table 1 provides the definitions of the explanatory variables. For a descriptive overview of the variables, Table D.8 in the Appendix provides the weights of the different categories for each of the explanatory variables.

Results
We find that climate risk metrics display a significant degree of heterogeneity, which reflects the complexity of assessing climate risks, as well as the different methodologies and data underpinning these metrics. Yet, risk assessments across metrics tend to converge on firms that are most exposed to transition risks. Second, we find strong evidence that the methodology adopted and the inclusion of forward-looking information affect the estimated risk value more than the underlying scenario. Last, we show that within the same modeling approach, lower temperature targets increase risk estimates, longer time horizons increase the estimated risk, and an orderly transition scenario delivers lower risk estimates than a disorderly transition scenario.

¹ Note that we also include an alignment metric, as the output can be used as a proxy for risk through gap analysis.
² The category "not applicable" implies that the variable is not an input in the provider's assessment approach.
³ Note that this is not the horizon of the scenario. For example, a provider might employ a scenario that runs until 2100, but the horizon of the analysis is for example the risk in 2030 or 2050.
⁴ Assessing the effect of a scenario outcome on the cost and return structure of a specific firm.
⁵ Assessing the effect of a scenario outcome on the value of a financial asset (bond, equity, etc.) and/or a portfolio of financial assets.

Table 2
Drivers of coherence across providers.

Table 3
Convergence of high-risk exposed firms: difference in pairs (left), pairs per quintile (right).

Convergence across risk metrics
To assess the convergence between metrics, we first rank the firms according to their metric-specific estimated risk exposure. We then classify them into five risk categories, from 1 for the least exposed firms to 5 for the most exposed firms. The indicators that we use to assess the degree of convergence between a pair of metrics are the average difference in a firm's risk category between the two metrics ("Absolute distance", AD) and the percentage of firms with identical risk categories in the two metrics ("Agreement rate", AR).
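The classification and the two convergence indicators can be illustrated with a minimal sketch. It assumes that the five categories are rank-based quintiles and that AD is the mean absolute difference in category; the function names, the tie-breaking rule, and the ten firm scores are hypothetical and only serve the illustration.

```python
import numpy as np

def risk_categories(scores, n_bins=5):
    """Rank firms by estimated risk and assign categories 1 (least) to n_bins (most exposed)."""
    scores = np.asarray(scores, dtype=float)
    # Ordinal ranks 0..n-1; ties are broken by position (an assumption of this sketch).
    ranks = np.argsort(np.argsort(scores, kind="stable"), kind="stable")
    return ranks * n_bins // len(scores) + 1

def absolute_distance(cat_a, cat_b):
    """AD: average absolute difference in risk category across firms."""
    return float(np.mean(np.abs(np.asarray(cat_a) - np.asarray(cat_b))))

def agreement_rate(cat_a, cat_b):
    """AR: share of firms placed in the same risk category by both metrics."""
    return float(np.mean(np.asarray(cat_a) == np.asarray(cat_b)))

# Hypothetical scores for ten firms from two metrics expressed on different scales.
metric_a = [0.1, 0.4, 0.35, 0.8, 0.9, 0.05, 0.6, 0.7, 0.2, 0.3]
metric_b = [12, 55, 45, 80, 95, 5, 50, 70, 25, 20]

cat_a = risk_categories(metric_a)
cat_b = risk_categories(metric_b)
ad = absolute_distance(cat_a, cat_b)  # 0.2: two firms differ by one category
ar = agreement_rate(cat_a, cat_b)     # 0.8: eight of ten firms agree
```

Because both indicators operate on categories rather than raw values, they are insensitive to the units of the underlying metrics.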
We look at the convergence between providers and explore whether the scenarios underlying the providers' outputs are associated with changes in the coherence between two metrics. To do so, we assess whether coherence increases when the two metrics are based on similar hypotheses for the horizon of their assessments, for the temperature target that they consider, and for the shape of the transition that they model.⁶ Our results show that sharing similar scenario characteristics (a similar horizon, temperature target, and hypotheses on the shape of the transition) improves the coherence between metrics, whether measured with the AD or the AR (see Table 2).
Metrics from different providers tend to converge more for firms that are the most exposed to transition risk. To show that, we estimate the excess frequency of observing a combination of assessments for the same firm in our sample compared to the frequency that would occur if assessments were fully heterogeneous. To reflect the fact that characteristics of metrics might affect the coherence, we compare only the pairs of assessments for metrics that have a similar horizon, temperature target, hypotheses on the shape of the transition, and output indicators (see Table 3).
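One way to read the excess-frequency idea is sketched below. It assumes that "fully heterogeneous" means the two assessments are statistically independent, so that the benchmark frequency of each category combination is the product of the marginal shares. The function name and the example categories are hypothetical.

```python
import numpy as np

def excess_frequency(cat_a, cat_b, n_bins=5):
    """Observed minus expected joint frequency of risk-category pairs.

    The expected frequency is the product of the marginal shares, i.e. the
    benchmark of independent assessments (an assumption of this sketch).
    """
    cat_a = np.asarray(cat_a)
    cat_b = np.asarray(cat_b)
    observed = np.zeros((n_bins, n_bins))
    # Count how often each (category in metric A, category in metric B) pair occurs.
    np.add.at(observed, (cat_a - 1, cat_b - 1), 1)
    observed /= observed.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    return observed - expected

# Hypothetical categories for ten firms: the two metrics agree fully only in category 5.
cat_a = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
cat_b = [1, 2, 2, 1, 3, 4, 4, 3, 5, 5]
excess = excess_frequency(cat_a, cat_b)
top_excess = excess[4, 4]  # excess frequency of both metrics assigning category 5
```

A positive entry in the bottom-right cell that exceeds the other diagonal entries would indicate stronger agreement on the most exposed firms.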

Across-metrics analysis
The assessments produced by the different metric providers are expressed in different units and scales, which makes comparison difficult. To overcome this issue, we rescale the risk assessments according to
$$\tilde{r}_{i,m} = \frac{r_{i,m} - \min_{j} r_{j,m}}{\max_{j} r_{j,m} - \min_{j} r_{j,m}},$$
where $r_{i,m}$ is the original assessment, $i$ denotes the firm and $m$ the metric. The rescaling produces a new vector of assessments for each metric from the different providers, with values ranging between 0 and 1.
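A short sketch of such a rescaling is given here, assuming it is a min-max normalization within each metric (consistent with the stated range of 0 to 1). The function name, the handling of missing values, and the example numbers are hypothetical.

```python
import numpy as np

def rescale_min_max(values):
    """Rescale one metric's assessments to [0, 1]; missing values (NaN) stay missing."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.nanmin(values), np.nanmax(values)
    if hi == lo:
        # Degenerate case with no variation: returning zeros is a choice of this sketch.
        return np.where(np.isnan(values), np.nan, 0.0)
    return (values - lo) / (hi - lo)

# Hypothetical raw assessments of four firms by a single metric.
raw = [10.0, 20.0, 40.0, 25.0]
rescaled = rescale_min_max(raw)  # 0, 1/3, 1, 0.5
```

Min-max rescaling preserves the ordering of firms within a metric, but it is sensitive to extreme values, since the minimum and maximum fix the scale.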
To formally identify the main drivers of the divergence, we use an outlier-robust panel OLS with heteroskedasticity- and cluster-robust standard errors. Note that in our specification the panel dimension is given by the same firms being assessed across several providers rather than across time. We use as dependent variable the deviation in the risk assessment for company $i$ by metric $m$, denoted $y_{i,m}$:
$$y_{i,m} = \alpha + \sum_{k=1}^{K} \beta_k X_{k,p} + \varepsilon_{i,m},$$
where $i$ denotes the company, $X_{k,p}$ is the $k$-th characteristic of provider $p$ and $K$ is the total number of characteristics considered. The results of our regression are reported in Table 4. We find that changes in the temperature target are not statistically significant. Hence, we cannot infer that a higher temperature target, compared to the baseline (which is a 1.5 °C temperature target), is associated with a lower risk assessment, even though the sign of the estimated coefficients is in line with our expectations, as a higher temperature target implies a less stringent transition and hence lower risk.
⁶ Concretely, we divide metrics between those with an assessment horizon between 2025 and 2040 and those with a longer horizon, between the metrics with a temperature target of 2 °C or below and those above 2 °C, and between those that model an orderly or a disorderly transition.
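The regression setup can be sketched as a pooled OLS with standard errors clustered by firm. This is an illustration under stated assumptions, not the estimator used for Table 4: it uses plain least squares without any outlier-robust step, applies no small-sample correction, and runs on simulated data with two hypothetical dummy characteristics.

```python
import numpy as np

def ols_cluster_robust(X, y, clusters):
    """OLS coefficients with cluster-robust (sandwich) standard errors.

    No small-sample correction is applied (a simplification of this sketch).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        # Sum of regressor-weighted residuals within one cluster (here: one firm).
        score = X[clusters == g].T @ resid[clusters == g]
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))

# Simulated example: 50 firms, each assessed by 4 hypothetical metrics.
rng = np.random.default_rng(0)
n_firms = 50
capex = np.array([0, 1, 0, 1])     # hypothetical dummy: metric includes CAPEX plans
combined = np.array([0, 0, 1, 1])  # hypothetical dummy: combined modeling approach
firm_id = np.repeat(np.arange(n_firms), 4)
X = np.column_stack([
    np.ones(n_firms * 4),
    np.tile(capex, n_firms),
    np.tile(combined, n_firms),
])
firm_effect = rng.normal(0, 0.05, n_firms)[firm_id]
y = 0.1 + 0.2 * X[:, 1] - 0.05 * X[:, 2] + firm_effect + rng.normal(0, 0.02, n_firms * 4)

beta, se = ols_cluster_robust(X, y, firm_id)
```

Clustering by firm allows the errors of the same firm to be correlated across metrics, which matches a panel dimension that runs across providers rather than across time.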
Adopting a longer time horizon of analysis is associated with higher risks until 2050 compared to the baseline (2025). This is in line with the fact that most climate transition scenarios assume transition activities to start relatively slowly in the near future, ratcheting up ambition considerably until 2050, when the climate targets are then fulfilled. Yet, given that most of the coefficients are statistically non-significant, we cannot infer that a longer time horizon is associated with first a higher and later a lower risk.
For the year 2100, the estimated coefficient (although non-significant) suggests that such a long time horizon is associated with a lower risk, compared to 2025. This is likely because all transition activities are assumed to be implemented by then at the latest, and the risks in the very distant future, albeit very uncertain, have less impact on today's economic and financial values than risks in the near future. Specifying no time horizon, compared to adopting the baseline horizon, is associated with a decrease in risk, with a strongly statistically significant coefficient. This effect is likely to capture the fact that metrics that do not account for any time horizon are structurally different from metrics that do assess climate transition pathways over time. Hence, this effect might also capture modeling differences other than considering the time horizon itself.
In contrast to the temperature target and the time horizon, the types of output produced by the metric are statistically significant. Holding everything else constant, if the output is a financial indicator, a gap, or a risk score, the associated risk is higher than in the case in which the output metric captures balance sheet effects. This effect has a similar magnitude for financial metrics and alignment gaps and is a bit less pronounced for metrics that are risk scores. This finding suggests that the metrics' output type is an important driver of the metrics' risk assessments. In other words, the modeling approach adopted to produce a specific output type is a key driver of the quantified risk exposure.
Considering individual firms' climate targets and CAPEX plans in the quantification of climate transition risk metrics is associated with a higher risk. The variables are both statistically significant at the 1% level and exhibit a strong quantitative effect. Intuitively, firms' climate targets should be associated with lower risks, since the firm would be better prepared for the transition. However, the climate targets might not be sufficient for companies to align their activities with the transition. Hence, one reason for this result could be that analysts looking at the firm climate targets consider them insufficient, and hence find the respective firms riskier. With regard to the CAPEX plans, similar considerations can be made: today's CAPEX plans are rarely aligned with what would be required to achieve the climate targets. On the contrary, they currently tend to lock companies into carbon-intensive technologies. Considering this lock-in effect in the risk analysis intensifies the anticipated transition risks.
Finally, holding everything else constant, adopting a combined top-down and bottom-up approach compared to a bottom-up approach is associated with lower risk. Adopting a top-down approach does not have a significant impact on the risk assessment.
To check how our model would perform in predicting the risk exposure of companies, we also apply a LASSO regression to our model. The results are also reported in Table 4. Overall, we see that the LASSO estimates confirm the result obtained with the simple robust OLS regression in terms of coefficient signs, and in most cases also in terms of the magnitude of the estimated effect. Yet, as would be expected, the statistical significance of most coefficients is lower, except for the inclusion of CAPEX and the firm target variables.
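To illustrate how a LASSO shrinks coefficients and drops weak explanatory variables, the following sketch implements it by coordinate descent. It is not the estimation behind Table 4: the penalty value, the simulated regressors, and the function names are hypothetical, and the objective is assumed to be the usual (1/2n) squared error plus an L1 penalty with an unpenalized intercept.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - b0 - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centering leaves the intercept unpenalized
    col_sq = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual that excludes the contribution of coefficient j.
            partial = yc - Xc @ beta + Xc[:, j] * beta[j]
            rho = Xc[:, j] @ partial / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j] if col_sq[j] > 0 else 0.0
    intercept = y_mean - x_mean @ beta
    return intercept, beta

# Simulated data: the second regressor has no effect and should be dropped.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 0.3 * X[:, 0] - 0.2 * X[:, 2] + rng.normal(0, 0.01, 500)
intercept, beta = lasso_coordinate_descent(X, y, lam=0.05)
```

The surviving coefficients keep their sign but are shrunk toward zero, which is one reason why LASSO magnitudes can differ from OLS estimates even when the signs agree.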
The LASSO regression coefficients suggest that for the out-of-sample understanding of key risk drivers, the time horizon, CAPEX considerations, and the approach are important. The fact that the LASSO did not confirm the statistical significance of the output type, provides a hint that our intuition from the full sample linear regression from before might be reasonable. The output type variables might capture very specific analysis decisions and modeling approaches adopted by the metric provider, which we were not able to capture by the explanatory variables in the present analysis.

Within-provider analysis
Some providers deliver multiple specifications for their metrics. Specifically, they assessed the companies in our sample for different temperature targets, time horizons, and transition paths. We thus run an OLS regression to investigate the impact of these characteristics on the output produced by the same provider. We rescale the risk assessment by the smallest and largest value computed by the provider across all specifications instead of within a specific metric as before. We use heteroskedasticity-robust, but not cluster-robust standard errors because we focus on one provider in each regression. Accordingly, the number of metrics and the explanatory variables vary across the regressions, depending on the various metric specifications that were provided. The results are reported in Table 5.
Overall, we see that within a certain metric, temperature target, horizon, and transition path matter for the risk assessment. The provider delivering Metrics 5-6 produced the results reported in the first column. A below 2 °C temperature target is associated with a higher risk compared to a 2 °C target. The provider of Metrics 7-30 assessed risks for multiple temperature targets and time horizons under two different assumptions about firms' behavior (without adaptation/inaction; with adaptation/mainstream). In this case, considering a 3 °C instead of a 2 °C temperature target decreases the risk, as expected. Considering a below 2 °C instead of a 2 °C target is associated with a higher risk. A longer time horizon is associated with a lower risk compared to the baseline horizon of 2025, but the coefficients are not significant. Assuming that additional climate transition activities become mainstream across all firms is associated with a lower risk, compared to the situation in which companies are inactive in the transition and just follow the market, because it implies that companies are more ready for the transition. For the provider of Metrics 31-46, all the coefficients for the temperature targets have the expected sign, as the risk decreases when considering a higher temperature target compared to a temperature target of 1.5 °C. Considering a longer time horizon is associated with higher risks, compared to the baseline year 2025. Yet, the estimate for 2030 is not statistically significant. This provider assesses a delayed and an immediate transition. The assumption of an immediate transition pathway increases the risk, but not significantly. The results for the fourth provider's Metrics 53-64 show that, again, a higher temperature target is associated with lower risk compared to the baseline of 1.5 °C. In addition, for this provider, a longer time horizon is always associated with higher risk.
Finally, for the provider of Metrics 66-73, a longer time horizon is always associated with higher risk, compared to the baseline horizon of 2025. Differently from Metrics 31-46, an immediate transition is associated with lower risk.
As before, to provide an out-of-sample performance check of our results, we run a within-provider LASSO regression. The results are reported in Table 4. They are consistent with the ones obtained with traditional OLS, except that for the providers of Metrics 7-30, 31-46, and 66-73, the variable time_horizon_2030 was dropped.

Discussion and conclusion
We analyze whether climate transition risk metrics of different providers converge or diverge for a sample of 1565 firms, and which underlying metric characteristics (i.e. scenario and modeling choices) are associated with changes in the estimated values.
First, we find that climate transition risk metrics display a significant degree of heterogeneity, although they tend to converge for those firms that are assessed as most exposed to transition risks. Second, we find evidence that the estimation methodology underlying the metrics affects the estimated risk value more than the chosen scenario assumptions. Third, we show that within the same provider (i.e. with the same modeling approach), lower temperature targets increase risk estimates, longer time horizons increase the estimated risk, and an orderly transition scenario delivers lower risk estimates than a disorderly transition scenario.
Our results have important implications for the use of climate risk metrics. First, our finding that metrics tend to converge on which firms are the most exposed to transition risks shows that, despite the general heterogeneity, they provide a signal and information on the highest transition risk exposures. Second, the scenario and methodology underlying the metrics do have a considerable impact on the magnitude of the estimated risk. It is therefore important to understand how metrics are built and estimated, to choose the ones that are the most appropriate for specific use cases. Third, firms that disclose climate risks should also report the underlying methods, data sources, and scenario assumptions in addition to the metrics' values, to allow third parties to properly understand the disclosed information.
The present analysis focused on understanding the drivers of risk valuations, without assessing which underlying models are best suited to assess climate risk. Identifying which models would be best is beyond the scope of our analysis, and is left for future research. Also, despite the careful selection of variables, our findings could be enriched with more granular information on the modeling setup, and which specific assumptions and modeling decisions by the providers have the largest influence on the resulting estimated risk value. With regards to the comparability of the metrics, we furthermore had to re-scale the values to a common baseline. This approach entailed a certain loss of information. Future research should identify methods for comparing the estimated risk values without the loss of information on the relative assessments of the specific metric across the entire sample.
For finance research and academia, our results show that an explicit justification of the selection of a specific climate risk metric, instead of just using any metric which is available, should become a standard quality criterion. This also implies that all findings should be interpreted in the light of the metric assumptions.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
The authors do not have permission to share data.