Inherent uncertainty disguises attribution of reduced atmospheric CO2 growth to CO2 emission reductions for up to a decade

The growth rate of atmospheric CO2 on inter-annual time scales is largely controlled by the response of the land and ocean carbon sinks to climate variability. Therefore, the effect of CO2 emission reductions to achieve the Paris Agreement on atmospheric CO2 concentrations may be disguised by internal variability, and the attribution of a reduction in atmospheric CO2 growth rate to CO2 emission reductions induced by a policy change is unclear for the near term. We use 100 single-model simulations and interpret CO2 emission reductions starting in 2020 as a policy change from scenario Representative Concentration Pathway (RCP) 4.5 to 2.6 in a comprehensive causal theory framework. Five-year CO2 concentration trends grow stronger in 2021–2025 after CO2 emission reductions than over 2016–2020 in 30% of all realizations in RCP2.6 compared to 52% in RCP4.5 without CO2 emission reductions. This implies that CO2 emission reductions are sufficient by 42%, necessary by 31% and both necessary and sufficient by 22% to cause reduced atmospheric CO2 trends. In the near term, these probabilities are far from certain. Certainty implying sufficient or necessary causation is only reached after, respectively, ten and sixteen years. Assessments of the efficacy of CO2 emission reductions in the near term are incomplete without quantitatively considering internal variability.


Introduction
Year-to-year variations in the growth rate of global atmospheric CO2 concentrations are substantial and cannot be explained by land-use changes, fossil fuel emissions or the increase of carbon sink capacities due to increasing atmospheric CO2 concentrations (Friedlingstein et al 2019, Peters et al 2017). The variations originate instead from the variability of the global carbon cycle in response to climate variability, which is inherent to the physics of the Earth System. For instance, the variations of the tropical land carbon sink are dominated by the El Niño-Southern Oscillation (Jones et al 2001, Zeng et al 2005), and the pronounced Southern Ocean carbon sink is susceptible to changes in atmospheric circulation patterns (Landschützer et al 2015, McKinley et al 2017). Therefore, this internal variability of the global carbon cycle may disguise potential CO2 emission reductions in atmospheric CO2 observations. But CO2 emission reductions are required to achieve the targets of the Paris Agreement (UNFCCC 2015). Here we ask what the probability is that a slowdown in atmospheric CO2 growth is attributable to a policy change implementing CO2 emission reductions, represented as the difference between Representative Concentration Pathway (RCP) 4.5 and RCP2.6, in the face of internal climate variability. This question becomes policy-relevant as policy-makers assess the efficacy of CO2 emission reductions in the Global Stocktake every 5 years (Peters et al 2017, Schwartzman and Keeling 2020). Furthermore, we ask after how many years this policy change will cause atmospheric CO2 growth rates to slow down for certain.
The challenge of emissions reductions verification in atmospheric CO2 concentrations was first outlined by Peters et al (2017). We address this challenge by using a large ensemble of Earth System Model (ESM) simulations (Maher et al 2019).
We integrate 100 simulations based on the code of a single ESM with slightly perturbed initial conditions that serve as different realizations of the Earth System. Our analysis compares RCP4.5, which is close to the pledged and current policies until 2035 (Rogelj et al 2016, Hausfather and Peters 2020), with an emission reductions scenario compatible with the Paris targets under RCP2.6 (figure S4, https://stacks.iop.org/ERL/15/114058/mmedia). We attribute a reduction of trend in atmospheric CO2 concentrations to CO2 emission reductions in the comprehensive causation framework of Pearl (2000) and Hannart et al (2016). In the context of CO2 emission reductions, necessary causation means that a factual trend reduction would not have occurred without a policy change. By contrast, sufficient causation implies that while a policy change may trigger a trend reduction, this trend reduction is not certain.
We go beyond approaches in previous studies (Tebaldi and Friedlingstein 2013, Peters et al 2017, Marotzke 2019, Samset et al 2020, Schwartzman and Keeling 2020) by comprehensively diagnosing atmospheric CO2 variability in an ESM, which is compatible with the variations of the terrestrial and oceanic carbon sinks. The recently formalized emissions reductions verification of Schwartzman and Keeling (2020) uses an autoregressive model based on the observed carbon imbalance and a different statistical framework. While Tebaldi and Friedlingstein (2013) focus only on necessary causation, we here complete the probabilistic setting by also asking about sufficient causation. We compare two RCP scenarios in a single-model framework; formally, only internal variability may undermine the detectability of CO2 emission reductions. Assessing our quantitative results against structural model uncertainty and imperfections is left for future study.
From a policy-maker's perspective looking into the near-term future, necessary and sufficient causation of CO2 emission reductions slowing down atmospheric CO2 trends deal with two different questions (Pearl 2000, Hannart et al 2016):
1. Will a policy change towards CO2 emission reductions suffice to slow down atmospheric CO2 growth? Other factors, such as a weakening uptake by the natural carbon sinks, may induce an increase in atmospheric CO2 growth despite policy measures. From the viewpoint of a pathway without CO2 emission reductions, the uncertainty in this question is based on sufficient causation.
2. Would a factual atmospheric CO2 growth slowdown have occurred even without the policy change? This question asks whether the policy change was necessary to achieve the policy goal. From the viewpoint of a factual pathway of CO2 emission reductions and a factual slowdown, the uncertainty in this question is based on necessary causation.
Based on this causation framework, we obtain probabilities that a policy change causes atmospheric CO2 trends to decline. However, this causation may be far from certain depending on the time-scale assessed. Should CO2 emission reductions not soon lead to reduced atmospheric CO2 growth trends, we might face a debate analogous to the warming hiatus debate (Lewandowsky et al 2015, Fyfe et al 2016) about why CO2 rises faster despite falling emissions. Therefore, scientists need to communicate the role of internal variability to policy-makers and the public (Deser et al 2012).
Marotzke (2019) shows the uncertain effect of emission reductions on 15-year trends of global mean surface temperature (GMST). As atmospheric CO2 drives the forced GMST signal, the emissions reduction signal should become detectable earlier in atmospheric CO2. Analyzing the effect of individual climate forcers, Samset et al (2020) confirm that anthropogenic CO2 has the highest potential for emission reduction detection.
In our study, we also ask for how many years internal variability can still obscure the identification of CO2 emission reductions in atmospheric CO2. In other words, how long does it take until certainty arises in causation? This is a distinctly different question from the classical time-of-emergence of anthropogenic signals, which asks on which timescales the climate change signal emerges from natural variability (McKinley et al 2016). Here, we ask on which timescales a forcing change induced by this policy change causes a climate response considering sufficient and necessary causation (Marotzke 2019). These timescales of CO2 emission reduction detection might be longer than the periodicity of the Global Stocktake in which policy-makers will assess the efficacy of mitigation measures.

Causation attribution framework
To identify whether CO2 emission reductions cause a reduction in atmospheric CO2 growth, we apply the concept of event causation (Pearl 2000, Hannart et al 2016, Marotzke 2019). We use the scenario RCP2.6 as implementing CO2 emission reductions and RCP4.5 for the near-term future without CO2 emission reductions (for a detailed justification see section 2.2). Taking a decline in atmospheric CO2 growth as an effective consequence of CO2 emission reductions policy, we define a reduction in the linear trend in global atmospheric CO2 concentration as the policy goal, comparing the period before emission reductions started with the period afterwards. While this response is expected as the forced response averaged over all realizations, the trend of single ensemble members could potentially increase due to internal variability. For a five-year trend period and a scenario separation in 2020, compatible with the switch from RCP4.5 to RCP2.6, we hence compare the trends 2016-2020 and 2021-2025. The fraction of ensemble members with a reduced trend in a given scenario s yields the probability of trend reduction $P_{\mathrm{RCP}s}$. The two scenarios serve as either the real world, labeled as factual, or the alternative world, labeled as counter-factual. The probabilities of trend reduction can be translated into a probability $P_{\{S,N\}}$ that the trend reduction is caused by the policy change (Pearl 2000, Hannart et al 2016):
• In a currently pledged policy pathway (factual RCP4.5 world) without CO2 emission reductions in the near term, we ask in advance whether CO2 emission reductions (a policy change towards the counter-factual RCP2.6) would be sufficient to cause a reduction in atmospheric CO2 trends. The probability $P_S$ means how likely a policy change would be sufficient to cause a reduced trend:
$$P_S = \max\left\{1 - \frac{1 - P_{\mathrm{RCP2.6}}}{1 - P_{\mathrm{RCP4.5}}},\ 0\right\} \tag{1}$$
In our case of CO2 emission reductions, $P_S$ is important when answering, from the perspective of a planner, whether the policy goal of reduced atmospheric CO2 growth will be achieved.
• Considering CO2 emission reductions in RCP2.6 as the factual world, where in retrospect atmospheric CO2 trends have indeed declined, and no CO2 emission reductions in RCP4.5 as the counter-factual world, the probability $P_N$ that the policy change was necessary to cause the trend reduction is:
$$P_N = \max\left\{1 - \frac{P_{\mathrm{RCP4.5}}}{P_{\mathrm{RCP2.6}}},\ 0\right\} \tag{2}$$
In our case of CO2 emission reductions, $P_N$ is important when answering whether the policy goal would not have been reached without the policy change.
• Combining the two aforementioned, $P_{NS}$ describes the probability that the policy change is both necessary and sufficient to cause the respective trend reduction:
$$P_{NS} = \max\left\{P_{\mathrm{RCP2.6}} - P_{\mathrm{RCP4.5}},\ 0\right\} \tag{3}$$
This strongest causation probability $P_{NS}$ means how likely a reduced trend would occur in case of a policy change and would not occur without.
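The three causation probabilities can be evaluated with a few lines of code. The following is a minimal sketch, assuming the probabilities take the form given by Hannart et al (2016); the function name, argument names and the handling of zero denominators are illustrative choices and not part of the original analysis.

```python
def causation_probabilities(p_policy, p_reference):
    """Return P_S, P_N and P_NS from two trend-reduction probabilities.

    p_policy:    probability of a trend reduction with emission reductions
                 (here RCP2.6).
    p_reference: probability of a trend reduction without emission
                 reductions (here RCP4.5).
    """
    for p in (p_policy, p_reference):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    # Sufficient causation; returning 0 for a zero denominator is a
    # convention of this sketch, not taken from the paper.
    if p_reference < 1.0:
        p_s = max(1.0 - (1.0 - p_policy) / (1.0 - p_reference), 0.0)
    else:
        p_s = 0.0

    # Necessary causation; same convention for a zero denominator.
    if p_policy > 0.0:
        p_n = max(1.0 - p_reference / p_policy, 0.0)
    else:
        p_n = 0.0

    # Necessary and sufficient causation.
    p_ns = max(p_policy - p_reference, 0.0)

    return {"P_S": p_s, "P_N": p_n, "P_NS": p_ns}
```

The clipping at zero keeps the result a valid probability when the reference scenario happens to show more trend reductions than the policy scenario, which can occur for very short trend lengths.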
These probabilities hence describe how likely trend reductions over a given trend length are caused by the policy change, but how long do these trends need to be in order to be virtually certain that CO2 emission reductions caused them? To answer this question, we define the Time to Detection of CO2 emission reductions in a causation sense, $D_{\{S,N\}}$, as the trend length around the CO2 emission reductions start in 2020 for which $P_{\{S,N\}}$ > 99%, using the probability framing of Mastrandrea (2010). This time-scale marks the maximum range of influence of internal variability over changes in the forced signal due to a policy change.
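The definition of the Time to Detection can be sketched as a simple search over trend lengths. This is a minimal illustration under one assumption that the text does not spell out: detection is taken as the shortest trend length from which the probability stays above the threshold for all longer lengths tested. The function name and the input layout (a mapping from trend length in years to probability) are hypothetical.

```python
def time_to_detection(prob_by_length, threshold=0.99):
    """Shortest trend length (years) from which the causation probability
    exceeds `threshold` for every longer length tested.

    prob_by_length: dict mapping trend length in years to a probability.
    Returns None if the threshold is never exceeded persistently.
    """
    detected = None
    for length in sorted(prob_by_length):
        if prob_by_length[length] > threshold:
            if detected is None:
                detected = length
        else:
            # Probability fell back below the threshold: reset the search.
            detected = None
    return detected
```

Lowering `threshold` to 0.95 corresponds to the less strict certainty level that is also used for comparison with earlier studies.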

Choice of scenarios
We identify RCP2.6 as roughly compatible with the Paris targets (van Vuuren et al 2011). Compared to the pre-industrial control, the MPI-ESM Grand Ensemble warms by 1.4 ± 0.2 °C in RCP2.6 and 2.2 ± 0.2 °C in RCP4.5 until the end of the century (Suarez-Gutierrez et al 2018, Maher et al 2019). Anthropogenic CO2 emissions in RCP2.6 are increasing until 2020; after 2020, emissions are expected to decrease by 2% per year in RCP2.6 until 2030 (figure S3). By contrast, RCP4.5 has similar anthropogenic CO2 emission levels as RCP2.6 until 2020 and continues a moderate emissions increase until 2040 with a 1% per year increase until 2030 (figures S3, S4). This scenario was designed to reach a forcing stabilization at the end of this century (Thomson et al 2011) at about 3 °C warming. Although RCP8.5 is closer to the most recently recorded combined land-use and fossil-fuel emissions, we choose RCP4.5 as a reference scenario, because the differences between RCP4.5 and reported emissions until 2018 originate in land-use change whereas fossil-fuel emissions match (figure S3). More importantly, the levels of projected fossil-fuel emissions based on current and pledged policy until 2035 parallel RCP4.5 (Rogelj et al 2016, Hausfather and Peters 2020).
In the above-described comparison under the causal theory framework, we compare trends before and after this policy change to assess the causal effect of this policy change on changes in trends with respect to the period before the policy change. This policy change is assumed to happen in one scenario (RCP2.6), and not in the other one (RCP4.5). Therefore, we require simulations under mostly identical forcing before this policy change. We assume that this policy comes into effect as implemented by the RCP scenarios (Meinshausen et al 2011). We identify the combination of RCP2.6 vs RCP4.5 with a scenario split in 2020 as a suitable scenario comparison. This scenario combination and timing also describes the present quest aiming for an at most 2 °C warmer world with a net emissions reduction of 3% per year over 10 years.
Comparing RCP2.6 with RCP8.5 would be another possible combination. However, RCP8.5 entails higher fossil-fuel CO2 emissions than recently observed and much higher levels than what current policies pledge until 2035 (Rogelj et al 2016, Hausfather and Peters 2020). Furthermore, RCP2.6 and RCP8.5 separate at a time when emissions in RCP2.6 still grow. This would make the definition of the climate event as a trend reduction awkward, if our goal is to investigate the effect of policy on possible trend reductions. Therefore we compare RCP2.6 and RCP4.5 and include the comparison against RCP8.5 in the supplementary material.

Diagnostic atmospheric CO2 concentration
CO2 concentration-driven simulations do not represent a variable atmospheric CO2 concentration tracer. To quantify the expected variations in global atmospheric CO2 concentration that are compatible with variations of the global land and ocean carbon sinks, we diagnose a virtual tracer of global atmospheric CO2 based on the changes in atmospheric CO2 due to internal variability of the land and ocean carbon sinks (figure S2). The global residual CO2 flux $G_{i,s}$ is the difference between the CO2 flux $F_{i,s}$ of each ensemble member i and the ensemble mean CO2 flux $F_s$:
$$G_{i,s}(t) = F_{i,s}(t) - F_s(t), \qquad F_s(t) = \frac{1}{M} \sum_{j=1}^{M} F_{j,s}(t),$$
where M = 100 is the number of ensemble members, i the index of a single ensemble member and s the scenario. The ensemble mean $F_s$ is subject to all forcings (anthropogenic fossil-fuel CO2 emissions, non-CO2 emissions, land-use change, aerosols) on CO2 flux, but no internal variability. The remaining residual shows the variations of CO2 flux around neutral flux only due to internal variability. The forced atmospheric CO2 signal $f_s$ depends on the scenario s and is generated by a simplified climate model fed with emissions from integrated assessment models (Meinshausen et al 2011), incorporating the strengthening of the carbon sinks with higher CO2 concentrations and land-use CO2 emissions. The internal variability component of the time-accumulated global CO2 flux is then superimposed on the smooth atmospheric CO2 forcing $f_s$ and defines the internally varying diagnostic global atmospheric CO2 concentration $\mathrm{XCO}_{2,i,s}$:
$$\mathrm{XCO}_{2,i,s}(t) = f_s(t) + \sum_{t' \le t} G_{i,s}(t').$$
We assume that the internal variability of the global carbon cycle is driven by climate variability. This ignores the short-term effects of atmospheric CO2 variability on CO2 flux, as in all concentration-driven simulations. Explicitly, for diagnostic atmospheric CO2, we use as forcing $f_s$ the concentration scenarios generated by the simplified climate model and not directly the CO2 emissions generated by the integrated assessment models. This assumes that the emission scenarios from the integrated assessment model roughly match the resulting concentration scenarios from the simplified climate model (figure S9). The variations of global atmospheric CO2 diagnosed in this way capture the observed global atmospheric CO2 variations (figures 1, S7; Spring and Ilyina (2020)). For a detailed method description and verification in emission-driven simulations, see Supplementary Information section S1.
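The diagnostic steps described above (remove the ensemble mean flux, accumulate the residual in time, add the forced concentration) can be sketched as follows. This is an illustration only: it assumes that the fluxes are already expressed in ppm per year and counted positive into the atmosphere, and the function and variable names are hypothetical. The unit conversion and sign convention of the actual analysis are described in the Supplementary Information.

```python
from itertools import accumulate


def diagnostic_co2(flux, forced):
    """Diagnose internally varying atmospheric CO2 for each ensemble member.

    flux:   list of members, each a list of annual global CO2 fluxes
            (assumed in ppm per year, positive into the atmosphere).
    forced: list with the forced CO2 concentration f_s per year (ppm).
    Returns one concentration time series per member.
    """
    n_members = len(flux)
    n_time = len(forced)

    # Ensemble mean flux: the forced flux response without internal variability.
    ens_mean = [sum(member[t] for member in flux) / n_members
                for t in range(n_time)]

    result = []
    for member in flux:
        # Residual flux due to internal variability only.
        residual = [member[t] - ens_mean[t] for t in range(n_time)]
        # Accumulate the residual in time and superimpose it on the forcing.
        cumulative = list(accumulate(residual))
        result.append([forced[t] + cumulative[t] for t in range(n_time)])
    return result
```

By construction, the ensemble mean of the diagnosed concentrations equals the forced signal, so the members differ only through accumulated internal variability.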

Method limitations
The generalisability of our results strongly depends on the strength and timing of the CO2 emission reductions underlying the two compared scenarios, where the causation probabilities $P_{\{N,NS\}}$ are even more sensitive than the probabilities of reducing the atmospheric CO2 trend $P_{\mathrm{RCP}x}$ in scenario x. Here, we present one special case of CO2 emission reductions as the difference between RCP4.5 and RCP2.6, representing net 3% annual emission reductions until 2030. There is an active debate whether RCP4.5 (argued for by Hausfather and Peters 2020) or RCP8.5 (argued for by Schwalm et al 2020) tracks the current anthropogenic CO2 emissions pathway best. Also, the attribution probabilities are contingent upon whether the climate model simulates realistic magnitudes of internal variability (Marotzke 2019). Furthermore, our attribution method focuses only on one observable variable under internal variability, although atmospheric CO2 is the most important indicator for CO2 emission reductions. Lastly, to calculate diagnostic atmospheric CO2, we use the atmospheric CO2 concentration prescribed to MPI-ESM, which is generated by the simplified climate model based on CO2 emission scenarios from the integrated assessment models, and not the CO2 emissions from the integrated assessment models themselves.

Probability of CO2 emission reductions causing changes in atmospheric CO2 growth trend
We first assess the frequency distributions of five-year trends in atmospheric CO2. These distributions over the period 2016-2020 in RCP2.6 and RCP4.5 are nearly indistinguishable (figures 1, 2(a) and (d)). The most recent 2015-2019 observations-based estimate of the global atmospheric CO2 trend (Dlugokencky and Tans 2019) is in the upper tercile and thereby captured by our model (figures 2(a) and (d), S7). Comparing the distributions before and after the onset of CO2 emission reductions in 2020 in RCP2.6, we find overlapping distributions with a tendency towards lower trends after CO2 emission reductions (figures 2(a) and (b)). The ensemble mean responds to CO2 emission reductions with a decrease in trend of 1 ppm over 5 years. The trend reduces in 70 ensemble members, resulting in $P_{\mathrm{RCP2.6}}$ = 70% (figure 2(c)). This implies that with a 30% probability, atmospheric CO2 growth will strengthen despite emissions reductions. In RCP4.5, the distributions of atmospheric CO2 trends before and after 2020 look similar because the emissions rise steadily. Hence, only roughly half of the ensemble members show a reduced trend, with $P_{\mathrm{RCP4.5}}$ = 48% (figure 2(d)-(f)). Atmospheric CO2 may increase more strongly despite the onset of CO2 emission reductions when the global carbon cycle, triggered by internal climate variability, releases more CO2 than CO2 emission reductions save. For instance, this is possible when the tropical forests react to higher temperature and less precipitation caused by a strong El Niño event (Jones et al 2001, Zeng et al 2005). The CO2 released from the tropical biosphere persists in the atmosphere and can overwhelm the reduction of anthropogenic emissions (figure 1). These stronger atmospheric CO2 growth trends despite CO2 emission reductions might occur for trend comparisons around the CO2 emission reductions start of up to ten years (figure 3).
These probabilities of trend reduction of the two scenarios can be converted into probabilities of trend reduction being caused by CO2 emission reductions (see section 2.1). If asked in advance in 2015, the answer would be that a policy change from RCP4.5 to RCP2.6 representing CO2 emission reductions starting in 2020 is sufficient to cause a five-year trend reduction in atmospheric CO2 growth by $P_S$ = 42% (figure 3). Here, this policy change works toward a trend reduction, but the trend reduction might also be prevented by internal variability. Asking from a 2025 perspective looking into the recent past, CO2 emission reductions in 2020 were necessary by $P_N$ = 31% to cause trend reductions (figure 3). This policy change causes the five-year trend reduction in a necessary and sufficient sense by $P_{NS}$ = 22% (figure 3, dark blue in box). These results show that CO2 emission reductions are far from certain to cause trend reductions in global atmospheric CO2 growth when considering five-year trends.
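As a worked check of these five-year values, assuming the causation probabilities take the form given by Hannart et al (2016) with the RCP2.6 probability of trend reduction (0.70) as the policy case and the RCP4.5 probability (0.48) as the reference case:

```latex
\begin{align*}
P_S    &= 1 - \frac{1 - 0.70}{1 - 0.48} = 1 - \frac{0.30}{0.52} \approx 0.42, \\
P_N    &= 1 - \frac{0.48}{0.70} \approx 0.31, \\
P_{NS} &= 0.70 - 0.48 = 0.22.
\end{align*}
```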
To estimate the time-scales when CO2 emission reductions are virtually certain to cause reduced atmospheric CO2 growth trends, we consider trends calculated over different time window lengths around the CO2 emission reductions start. As expected, the shorter the trend lengths considered, the more dominant internal variability is. Therefore, trend reductions are less likely to occur in the CO2 emission reductions scenario. The 3-year-trend probabilities of trend reduction even overlap with the 50% random forecast (figure 3). Conversely, when longer trends are considered, the influence of the signal of emissions change becomes stronger. CO2 emission reductions reduce atmospheric CO2 trends in RCP2.6 virtually for certain only when considering ten-year trends (figure 3). In contrast, trend reductions are still possible due to internal variability despite the absence of CO2 emission reductions for much longer in RCP4.5 (figure 3). Note that under RCP4.5 the annual anthropogenic CO2 emissions increase very little until the 2040s. Therefore, a few members can still have reduced trends over time. Consequently, $P_{\mathrm{RCP4.5}}$ does not drop to 1% until 2042.
The low causation probabilities over short timescales show the inability to clearly attribute reduced atmospheric CO2 trends to a policy change from RCP4.5 to RCP2.6 due to the large internal variability. The longer the time-scales considered, the more strongly the two scenario pathways differ, and the attribution probabilities rise. If $P_{\mathrm{RCP2.6}} > P_{\mathrm{RCP4.5}}$, as assumed by the response to CO2 emission reductions, $P_S$ increases more quickly than $P_N$ when $P_{\mathrm{RCP2.6}}$ approaches 1 faster than $P_{\mathrm{RCP4.5}}$ approaches 0. Therefore, in the context of the scenarios RCP2.6 and RCP4.5, $P_S > P_N$ (Marotzke 2019). This means that in our context, sufficient causation is a stronger causation facet than necessary causation. Sufficient causation $P_S$ describes whether the objective of reduced atmospheric CO2 trends is met, which might be prevented by internal variability. As soon as growth trends decline in all realizations ($P_{\mathrm{RCP2.6}}$ = 1), $P_S$ also saturates. In contrast, necessary causation $P_N$ describes whether the response of reducing atmospheric CO2 would only have happened in the presence of CO2 emission reductions. Therefore, as long as trend reductions are possible even without CO2 emission reductions, necessary causation will not be certain, that is, if $P_{\mathrm{RCP4.5}}$ > 0, then $P_N$ < 1.
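The probability of trend reduction for a given window length can be sketched as below. This is a minimal illustration, assuming annual concentration values per ensemble member and ordinary least-squares linear trends; the function names and the data layout are hypothetical and not taken from the original analysis code.

```python
def linear_trend(values):
    """Least-squares slope of evenly spaced values (units per time step)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def trend_reduction_probability(ensemble, first_year, split_year, length):
    """Fraction of members whose trend over the `length` years after
    `split_year` is weaker than over the `length` years ending in it.

    ensemble:   list of members, each a list of annual CO2 concentrations
                starting in `first_year`.
    split_year: last year before emission reductions take effect (e.g. 2020).
    """
    i_split = split_year - first_year
    start = i_split - length + 1
    if length < 2 or start < 0:
        raise ValueError("window does not fit before the split year")

    reduced = 0
    for member in ensemble:
        before = member[start:i_split + 1]
        after = member[i_split + 1:i_split + 1 + length]
        if len(after) < length:
            raise ValueError("window does not fit after the split year")
        if linear_trend(after) < linear_trend(before):
            reduced += 1
    return reduced / len(ensemble)
```

With `split_year=2020` and `length=5`, the two windows are 2016-2020 and 2021-2025, matching the five-year comparison used throughout. Calling the function for a range of `length` values gives one probability per trend length.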
The time to detection of CO2 emission reductions $D_{\{S,N\}}$ describes after how many years this policy change is virtually certain to cause atmospheric CO2 growth trends to decline. CO2 emission reductions are a sufficient cause of trend reductions after $D_S$ = 10 years and a necessary cause after $D_N$ = 22 years. We note that once sufficient causation is certain, $P_S$ = 1 in 2030, see (1), necessary causation and causation both necessary and sufficient coincide, $P_N = P_{NS}$; compare (2) and (3) with $P_{\mathrm{RCP2.6}}$ = 1. Virtual certainty in $P_{\{N,NS\}}$ is hindered by $P_{\mathrm{RCP4.5}}$ above 1%. Due to the slow increase in emissions in the 2030s, internal variability allows a few members to have decreasing trends. Taking a less strict threshold of 95% certainty like in Tebaldi and Friedlingstein (2013), we obtain $D_N = D_{NS}$ = 16 years. This time-scale of CO2 emission reductions detection in a necessary causation sense $D_N$ is a bit longer than the similarly defined estimate based on IPSL-CM5A-LR (Tebaldi and Friedlingstein 2013, table 1). Our analysis also shows that whether this policy change from RCP4.5 to RCP2.6 can be identified as the cause of reduced atmospheric CO2 trends after 10 or 16 years depends on the causation attribute. The differently defined emission reduction detection protocol of Schwartzman and Keeling (2020) finds a similar detection delay of 9 ± 4 years for comparable 2% net annual emissions reduction.
Figure 3. Probabilities of trend reduction in diagnostic atmospheric CO2 between periods of varying trend length before and after CO2 emission reductions start in 2020. $P_{\mathrm{RCP2.6}}$ (green) shows the probability of trend reduction in the CO2 emission reductions scenario RCP2.6. $P_{\mathrm{RCP4.5}}$ (red) shows the probability of trend reduction in the currently most likely scenario for the near term, RCP4.5. $P_S$ (pale blue) shows the probability that a change from RCP4.5 to RCP2.6 causes the respective trend reduction in a sufficient causation sense. $P_N$ (blue) shows the probability that a change from RCP4.5 to RCP2.6 causes the respective trend reduction in a necessary causation sense. $P_{NS}$ (dark blue) shows the probability that a change from RCP4.5 to RCP2.6 causes the respective trend reduction in a sufficient and necessary causation sense. Error bars show the 1% and 99% confidence intervals based on bootstrapping with replacement. Dotted lines show the 99% confidence interval for the time of virtual certainty in trend reduction or causation ($D_{\{S,N\}}$). Results for policy-relevant five-year trends are highlighted in the gray box.
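The error bars in figure 3 are based on bootstrapping with replacement. A minimal sketch of such an interval for the probability of trend reduction is given below, assuming one flag per ensemble member that marks whether its trend reduced; the function name, the number of resamples and the percentile indexing are illustrative choices, not the original implementation.

```python
import random


def bootstrap_interval(reduced_flags, n_resamples=2000,
                       lower=0.01, upper=0.99, seed=0):
    """Percentile bootstrap interval of the fraction of members with a
    reduced trend.

    reduced_flags: list with 1 (trend reduced) or 0 (not reduced) per member.
    Returns the (lower, upper) percentile of the resampled fractions.
    """
    rng = random.Random(seed)
    n = len(reduced_flags)
    # Resample the ensemble members with replacement and recompute the fraction.
    fractions = sorted(
        sum(rng.choices(reduced_flags, k=n)) / n
        for _ in range(n_resamples)
    )
    low = fractions[int(lower * (n_resamples - 1))]
    high = fractions[int(upper * (n_resamples - 1))]
    return low, high
```

For example, `bootstrap_interval([1] * 70 + [0] * 30)` resamples a 100-member ensemble in which 70 members show a reduced trend.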

Summary and conclusions
In the context of potential future CO2 emission reductions, we ask whether atmospheric CO2 growth trend reductions in the near term can be attributed to a policy change. We focus on one specific pathway of CO2 emission reductions, interpreted as a policy change from scenario RCP4.5 without near-term CO2 emission reductions to the emissions reduction scenario RCP2.6, which is designed to achieve the Paris targets and represents 3% net annual CO2 emission reductions until 2030. We apply a causation framework comprising two perspectives of policy elaboration (Hannart et al 2016, Marotzke 2019). We diagnose atmospheric CO2 variations compatible with the variations of the natural carbon sinks and compare growth trends of atmospheric CO2 before and after the onset of CO2 emission reductions in 2020 in RCP2.6. While 5-year trends reduce in 70% of all realizations in the CO2 emission reductions scenario RCP2.6 (consequently implying increasing trends despite CO2 emission reductions in 30%), there is a 48% probability of trend reductions in RCP4.5. This translates into CO2 emission reductions from RCP4.5 to RCP2.6 being sufficient to cause a five-year trend reduction beforehand by 42% and in hindsight necessary by 31%. The probability that this policy change is both necessary and will suffice to bring the desired outcome considering five-year trends is only 22%. These probabilities are far from certain for up to a decade. It takes ten or sixteen years of CO2 emission reductions from RCP4.5 to RCP2.6 to cause a trend reduction with virtual certainty in a sufficient or necessary causation sense, respectively. Communicating these probabilities in a clear manner is challenging but needed to inform policy-makers about the impact of internal variability on CO2 emission reduction causation in the Earth system (Deser et al 2012, Hannart et al 2016, Howe et al 2019).
The five-year Global Stocktake following the Paris Agreement (UNFCCC 2015) makes the five-year internal variability highlighted in this study especially relevant for policy-makers. This study demonstrates the inherent uncertainty in near-term atmospheric CO2 projections. As a partial solution to this challenge, initialized ESM-based prediction systems can reduce this uncertainty by predicting natural variations of the global carbon cycle. Global oceanic CO2 flux is predictable for two to three years (Li et al 2019, Lovenduski et al 2019) and global atmospheric CO2 variations have the potential to be predicted for up to three years in advance (Spring and Ilyina 2020). These multi-year ESM-based predictions of the global carbon cycle thereby bring added value about the expected natural variations of atmospheric CO2 to policy-makers in the Global Stocktake process (UNFCCC 2015).
Our analysis shows that it is crucial to have realistic expectations of the efficacy of climate policy in the near term (Marotzke 2019, Samset et al 2020). Schwartzman and Keeling (2020) also find a detection delay of up to a decade with a different approach. Even if anthropogenic emissions begin to decline after 2020, there still remains a substantial probability that atmospheric CO2 trends will not have declined five years afterwards. In this case, the effects of CO2 emission reductions on other iconic climate variables, such as global mean surface temperature, very likely get delayed even longer (Marotzke 2019). The likelihood of this happening is substantial. For instance, there is a three-out-of-ten chance that atmospheric CO2 rises even more strongly in the five years after CO2 emission reductions started compared to before. Assuming the evolution of the RCPs (Meinshausen et al 2011) and the magnitude of internal variability in the global CO2 fluxes in MPI-ESM-LR, such increasing atmospheric CO2 growth trends despite CO2 emission reductions from RCP4.5 to RCP2.6 are possible for up to a decade. Although this analysis relies on only a single model, internal variability may disguise CO2 emission reduction efforts in the Earth System for a couple of years. Should this be the case, climate science should explain the observed atmospheric CO2 evolution, honoring internal variability. Policy-makers should rather be informed by initialized predictions about the internal variability in the near-term evolution of atmospheric CO2 (Betts et al 2018, Spring and Ilyina 2020). Evaluation of CO2 emission reduction efficacy from an atmospheric CO2 perspective needs to take internal variability, and therefore trends longer than 5 years, into account.