Cumulative carbon emissions and economic policy: in search of general principles

We exploit recent advances in climate science to develop a physically consistent, yet surprisingly simple, model of climate policy. It seems that key economic models have greatly overestimated the delay between carbon emissions and warming, and ignored the saturation of carbon sinks that takes place when the atmospheric concentration of carbon dioxide rises. This has important implications for climate policy. If carbon emissions are abated, damages are avoided almost immediately. Therefore it is optimal to reduce emissions significantly in the near term and bring about a slow transition to optimal peak warming, even if optimal steady-state/peak warming is high. The optimal carbon price should start relatively high and grow relatively fast. © 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
In the last decade, climate science has delivered two important and related insights. First, global warming appears to be approximately linearly proportional to cumulative emissions of carbon dioxide. Second, the temperature response to an emission of CO2 appears to be approximately instantaneous and then constant as a function of time. As Ricke and Caldeira (2014) write, "it is a widely held misconception that the main effects of a CO2 emission will not be felt for several decades" (p. 1).
In this paper, we build a climate-economy model based on these insights and assess the implications for optimal global climate policy, in search of general principles. At the heart of the model is a simple, linear function relating warming to cumulative CO2 emissions, with at most a very short delay. We combine this with reduced-form representations of climate damages and the costs of CO2 emissions abatement, each of which is capable of capturing stylised facts. Our model belongs to the class of 'analytical Integrated Assessment Models' (Golosov et al., 2014; Traeger, 2015; Rezai and van der Ploeg, 2016; van den Bijgaart et al., 2016; Lemoine and Rudik, 2017; Gerlagh and Liski, 2018),¹ developed to provide more transparent results than numerical IAMs (e.g. the DICE, FUND and PAGE models) and energy models (see Clarke et al., 2014).² Our approach, based on cumulative carbon emissions, is particularly useful for evaluating optimal peak warming of the planet, and the circumstances in which the 1.5-2 °C target range of peak warming in the Paris Agreement can be given support in a globally aggregated, welfarist framework. We show (Proposition 1 and Corollary 1) that optimal peak warming depends sensitively on several parameters that are highly uncertain, implying that optimal peak warming itself is highly uncertain. We suggest that if each parameter is calibrated on the breadth of relevant evidence and opinion (i.e. this does not necessarily reflect our own opinions), optimal peak warming is 3.4 °C. However, we are also able to identify a wide range of circumstances in which peak warming of 2 °C or less is optimal. We further show that the relatively short adjustment timescale of temperature to cumulative emissions can be ignored in calculating optimal peak warming and all that follows.
Our model is also simple enough to enable the characterisation of the optimal transition path to peak warming in closed form. A key insight of this exercise is that the optimal transition is slow: it is optimal to put in significant effort early on, in order to slow the rate of increase of cumulative CO2 emissions. Consequently the uncertainty about optimal transient warming in 2100 is much lower than the uncertainty about optimal peak warming. We show that this is fundamentally due to the stock-flow nature of CO2-induced warming, in the context of the structural assumptions made in our model about damages and abatement costs. Climate scientists have argued for some years that transient warming is a more policy-relevant variable than equilibrium warming (e.g. Allen et al., 2009) and our results give this view an economic grounding.
We obtain a closed-form solution for the optimal carbon price (Proposition 2). It shows that the optimal carbon price does not just increase at the growth rate of the economy, a key result of Golosov et al. (2014), rather it increases faster. The fundamental reason why is the saturation of carbon sinks (Corollary 2), a positive climate feedback whereby more of a CO2 emission remains in the atmosphere, the higher is the background atmospheric CO2 concentration. This is ignored, or given insufficient treatment, by economic models. Due to the saturation of carbon sinks, the marginal effect of cumulative emissions on warming is constant (barring the very short initial delay). Assuming damages are a convex function of warming, this implies the optimal carbon price increases faster than aggregate output. Quantitatively, this effect adds around 0.5 percentage points to the initial growth rate of the optimal carbon price under central parameter values, falling to about zero in 100 years.
Having characterised what we might call the unconstrained optimal path, we consider the effect of a policy constraint to reflect the temperature limits set out in the Paris Agreement, namely "Holding the increase in the global average temperature to well below 2 °C above pre-industrial levels and pursuing efforts to limit the temperature increase to 1.5 °C above pre-industrial levels". In our model this can be represented by an inequality constraint, an upper limit, on cumulative CO2 emissions. Proposition 3 shows that the optimal carbon price under a binding temperature constraint comprises the social cost of carbon, plus a Hotelling premium to ensure inter-temporally efficient use of the cumulative emissions budget implied by the 1.5-2 °C limit. Many studies have sought to derive optimal emissions and carbon prices under such a temperature constraint (see Lemoine and Rudik, 2017; Clarke et al., 2014, review numerical energy models). What distinguishes our approach is that the planner does not just minimise discounted abatement costs, rather the planner still values damages by minimising the discounted sum of abatement and damage costs.
We finish up by showing what difference this makes, by running the temperature-constrained model ignoring damages. The optimal price path to minimise abatement costs just follows the simple Hotelling rule (Proposition 4). This contrasts with the common conception that the cost-minimising carbon price follows an augmented Hotelling rule, increasing at the rate of interest plus the depreciation rate of atmospheric CO2. But it also contrasts with the recent findings of Lemoine and Rudik (2017). In their paper, the cost-minimising carbon price starts low and grows very slowly for more than half a century, because a long assumed delay between CO2 emissions and warming buys time. Our result again comes from taking into account the saturation of carbon sinks, as well as from not over-estimating the delay between emissions and warming. When we compare the cost-effective price path with the price path that maximises net benefits, we show that ignoring damages leads the planner to delay emissions cuts (Proposition 5). This effect is large: initial emissions are 31% lower when damages are included in its determination, under central parameter values.
The rest of the paper is structured as follows. Section 2 lays out the building blocks of the model and provides a detailed justification of them, starting with the science alluded to above. Section 3 studies optimal emissions in the model, focusing on peak warming, the speed of transition to peak warming, and carbon prices. Section 4 introduces the constraint on warming made salient by the Paris Agreement. Section 5 concludes.

Fig. 1. Temperature response of 16 × 16 climate model combinations to an instantaneous 100 GtC emission of CO2 on a background atmospheric CO2 concentration of 389 ppm, following the same method as Ricke and Caldeira (2014). The equilibrium climate sensitivity is fixed at 3.2 °C. The solid black line is the response of DICE-2013R.

A linear model of warming
Our climate model is based on two important results from Earth system modelling. First, as mentioned above, the temperature response to an emission of CO2 is approximately constant as a function of time, except for a short initial adjustment period of ten years or so (Matthews and Caldeira, 2008; Shine et al., 2005; Solomon et al., 2009; Eby et al., 2009; Held et al., 2010; Joos et al., 2013; Ricke and Caldeira, 2014). Fig. 1 demonstrates this result for a representative set of 16 carbon-cycle models and 16 atmosphere-ocean general circulation models, closely following the approach of Ricke and Caldeira (2014). Second, the warming effect of a CO2 emission does not depend on the background concentration of CO2 in the atmosphere (Gillett et al., 2013). As we now show, insofar as the temperature response to CO2 emissions is both time- and concentration-independent, warming is linearly proportional to cumulative CO2 emissions.
The two stages of (i) CO2 emissions raising the atmospheric CO2 concentration and (ii) elevated atmospheric CO2 causing global temperatures to rise can be collapsed into a single parametric relationship between cumulative emissions and warming. This has been defined by IPCC as the Transient Climate Response to Cumulative Carbon Emissions (TCRE: Collins et al., 2013). Formally, the TCRE is

$$ \mathrm{TCRE} \equiv \frac{\Delta T}{\Delta S} = \frac{\Delta T}{\Delta M} \times \frac{\Delta M}{\Delta S}. \tag{1} $$

The TCRE is the product of temperature change per unit increase of atmospheric carbon, ΔT/ΔM, and the increase in atmospheric carbon per unit of cumulative emissions, ΔM/ΔS. ΔT/ΔM is a concave increasing function of time, because of thermal inertia, i.e. it takes time before an energy imbalance will lead to a new equilibrium temperature given the large heat capacity of the oceans.
Conversely ΔM/ΔS is a convex decreasing function of time, because carbon is gradually absorbed by the biosphere and oceans. Warming from a CO2 emission is constant over time in Earth system models, because the rate of increase of ΔT/ΔM is cancelled out by the rate of decrease of ΔM/ΔS, except for the first five to ten years. The physical explanation for why these processes mirror each other is that the sequestration of heat and carbon by the oceans are both governed by the same mixing of surface and deep ocean waters (Solomon et al., 2009; Goodwin et al., 2015; MacDougall and Friedlingstein, 2015). Models also find the TCRE is independent of the background atmospheric CO2 concentration, M. As M increases, it is well known that ΔT/ΔM decreases, due to CO2 becoming less effective at absorbing outgoing long-wave radiation. The relationship is approximately logarithmic. However, again this is cancelled out in Earth system models by an increase in ΔM/ΔS, due to the saturation of the ocean carbon sink (MacDougall and Friedlingstein, 2015).³ In contrast to the time-independence of the TCRE, however, there is no obvious physical explanation for why these two processes more-or-less exactly offset each other.
Insofar as the TCRE is independent of time and CO2 concentration, we can interpret it as a time-invariant parameter and global warming is approximately linearly proportional to cumulative CO2 emissions (Zickfeld et al., 2009, 2013; Gillett et al., 2013; Collins et al., 2013). Fig. 2 reproduces an important chart from the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC), which illustrates this. The linear relationship is produced by almost all Earth system models. The observational record is more noisy, but does not imply a non-linear relationship.⁴ The resulting quasi-linearity between cumulative emissions and warming allows us to obtain an extremely simple climate model.
The global mean temperature at a point in time responds to cumulative emissions up until that point in time:

$$ \dot{T} = \varepsilon(\zeta S - T), \tag{2} $$

where T is warming since pre-industrial times, ζ is the TCRE and ε parameterises the 'initial pulse-adjustment timescale' of the climate system (Allen, 2016). According to the climate models in Fig. 1, this is about ten years and so ε ≈ 0.5. In the limit as ε → ∞ (0), the planet warms instantly (never warms) in response to emissions. S is cumulative emissions of CO2, so

$$ \dot{S} = E, \tag{3} $$

where E is the instantaneous flow of emissions.
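To make the adjustment dynamics concrete, the following minimal sketch integrates the temperature equation for a one-off emission pulse, assuming it takes the form Ṫ = ε(ζS − T) with ε = 0.5 as stated above. The value ζ = 0.00048 °C/GtCO2 is the central TCRE estimate quoted later in the paper; the conversion of 100 GtC to GtCO2 by the factor 44/12, the function names and the Euler step size are illustrative assumptions rather than part of the paper.

```python
import math

GTC_TO_GTCO2 = 44.0 / 12.0  # assumed molar-mass conversion from carbon to CO2


def pulse_response_exact(pulse_gtco2, years, eps=0.5, zeta=0.00048):
    """Closed-form warming (deg C) `years` after a one-off pulse, starting from T = 0."""
    return zeta * pulse_gtco2 * (1.0 - math.exp(-eps * years))


def pulse_response_euler(pulse_gtco2, years, eps=0.5, zeta=0.00048, dt=0.01):
    """Forward-Euler integration of dT/dt = eps * (zeta * S - T), with S fixed after the pulse."""
    warming = 0.0
    for _ in range(int(round(years / dt))):
        warming += dt * eps * (zeta * pulse_gtco2 - warming)
    return warming


if __name__ == "__main__":
    pulse = 100.0 * GTC_TO_GTCO2  # the 100 GtC pulse used in Fig. 1
    long_run = 0.00048 * pulse    # warming once adjustment is complete
    for year in (1, 5, 10, 50):
        share = pulse_response_exact(pulse, year) / long_run
        print(f"year {year:>2}: {share:.1%} of long-run warming realised")
```

Under these assumptions more than 99% of the long-run response to the pulse has been realised after ten years, which is the sense in which the delay is short.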
The climate science set out here has significant implications for how IAMs and analytical IAMs are parameterised. Some IAMs like William Nordhaus' DICE assume a substantial delay between emissions and warming. Fig. 1 shows that in DICE it takes 55 years for the temperature response to an emission of 100 GtC to peak, not ten.⁵ Therefore analytical IAMs calibrated on the DICE climate will also warm up too slowly in response to emissions (e.g. Lemoine and Rudik, 2017). In addition, IAMs and analytical IAMs typically do not include the feedback created by saturating carbon sinks, i.e. they do not model the removal of atmospheric CO2 as a function of the background CO2 concentration.⁶ Millar et al. (2017) have shown that, without such saturation of carbon sinks, simple climate models underestimate observed atmospheric CO2 decay in the past, overestimate decay in 2100 compared with the projections of Earth system models, and are unable to reproduce the linear response of warming to cumulative CO2 emissions set out in Fig. 2.
⁶ As the ocean absorbs CO2, it evolves towards a new equilibrium, i.e. acidification, a feedback that is of much greater importance. Therefore DICE has a lower TCRE under a high background CO2 concentration, because the decreasing effectiveness of CO2 in absorbing outgoing longwave radiation is modelled, while the saturation of carbon sinks is not.
Before moving on, there are a number of limitations of a linear model of warming, which we should point out. First, the relationship is an approximation. As S increases beyond a threshold of c. 7,700 gigatonnes of CO2, almost all models find that ζ begins to decrease slightly (MacDougall, 2016). Similarly some models find the TCRE varies slightly with the rate of CO2 emissions (e.g. Krasting et al., 2014), i.e. it is not fully path-independent. Second, the relationship only holds for CO2 and does not represent warming from other greenhouse gases. Ideally one would model the behaviour of other greenhouse gases separately, but this adds significant complexity. Instead, other forcing agents can be accommodated in the linear model by assuming total anthropogenic warming is a fixed fraction of warming induced by CO2 alone, e.g. 10% higher (Allen, 2016). Fig. 2 indicates this is a good characterisation of the past 150 years or so, while there is no clear case for assuming the ratio of CO2-induced warming to total anthropogenic warming will be higher or lower in the future; it could go either way. Accordingly, in our numerical modelling we multiply ζ by a factor of 1.1 to account for non-CO2 greenhouse gases. Third, the theory of linear warming comes from Earth system models; it cannot be directly tested using observations of the climate system, although related observations do not refute it. Lastly, we implicitly assume that the so-called Zero Emissions Commitment, i.e. the temperature change that occurs after emissions have ceased, is negligible. While many models find this is true, some models find non-trivial warming or cooling after emissions have ceased (Frölicher and Paynter, 2015).

Economic model
A social planner chooses consumption per capita c and emissions E in order to maximise a discounted classical/total utilitarian social welfare functional:

$$ W = \int_0^\infty e^{(n-\delta)t}\, u(c)\, dt, \tag{4} $$

where W is social welfare, n is the population growth rate (the initial population is normalised to unity) and δ is the utility discount rate. Appendix A confirms this is equivalent to a decentralised competitive market equilibrium with a Pigouvian tax on CO2 emissions. Instantaneous utility is

$$ u(c) = \frac{c^{1-\eta}}{1-\eta}, \tag{5} $$

where η is the negative of the elasticity of marginal utility. We assume production is homogeneous of degree one in capital and labour and technological change is labour-augmenting. Aggregate output is given by

$$ Q = \hat{L}\, f(\hat{k}) \exp\left(-\frac{\gamma}{2}T^2\right) \exp\left(\phi E - \frac{\varphi}{2}E^2\right), \tag{6} $$

where L̂ = e^{(n+g)t} is effective labour, g is productivity growth and k̂ = K/L̂ is capital per unit of effective labour. We assume positive and diminishing returns to capital and labour, and that the Inada conditions hold with respect to both. We keep the functional form of f(k̂) general, but assume that initial k̂ is close to its steady state (we return to the role of this assumption in a moment). We work with an exponential-quadratic damage function mapping warming to a loss of welfare-equivalent output:

$$ D(T) = \exp\left(-\frac{\gamma}{2}T^2\right), \tag{7} $$

where D(T) is a damage multiplier on output. It is important to recognise the appropriate form of the damage function is notoriously uncertain. There is little empirical evidence that is directly relevant (Pindyck, 2013) and it has been argued that costs have been systematically underestimated in the economic literature, at least at high levels of warming (Weitzman, 2009, 2012), if not indeed at any level (Stern, 2013). With this caveat front and centre, (7) is a good fit of the data in the meta-analysis by Nordhaus and Moffat (2017). We capture the relationship between production and emissions by thinking of E as an input (Brock, 1973). This simply captures the idea that, in order to produce a given amount with fewer emissions, more capital and labour are required.
By entering the production function through the multiplier exp(ϕE − (φ/2)E²), the marginal productivity of emissions is assumed to be linear decreasing in emissions, when expressed as a proportion of GDP:

$$ Q_E = (\phi - \varphi E)\, Q. \tag{8} $$

This also serves as the marginal abatement cost (MAC) function in our model, since abatement A can be defined as baseline or business-as-usual emissions ϕ/φ minus emissions, A ≡ ϕ/φ − E. Doing so allows us to rewrite (8) as

$$ Q_E = \varphi A\, Q. \tag{9} $$

This MAC function has two key properties. First, the MAC is proportional to output. The main driver of this proportionality is energy demand. Economic growth drives up energy demand, which in turn drives up the MAC, because most low-carbon energy technologies have decreasing marginal productivity (e.g. wind energy at less windy and/or more expensive locations). Second, the MAC, as a proportion of output, increases linearly as a function of abatement. This is an unrealistic assumption for a large instantaneous increase in abatement (where the MAC function is likely to be convex increasing), but a more realistic assumption for small increases in abatement over time, because technological progress provides a countervailing effect to any convexity of the MAC that results from moving along the instantaneous MAC curve (Bramoullé and Olson, 2005; Neij, 2008). Fig. 3 looks at evidence from the IPCC Fifth Assessment Report on the shape of the MAC function, when expressed as a proportion of GDP, and when abatement is a function of time. These are results derived from a variety of different energy models. It can be seen that a linear increasing function is a relatively good fit of the data. Capital accumulation equals production less consumption and depreciation at rate δ_K. Expressed per unit of effective labour,

$$ \dot{\hat{k}} = \hat{q} - \hat{c} - (\delta_K + n + g)\hat{k}. \tag{10} $$

Table 1 presents central values of the model's parameters, as well as ranges from the literature.⁷ In order to solve the model in closed form, we assume the economy is approximately on a balanced growth path throughout, with constant growth of output per capita net of climate damages and abatement costs ĝ ≡ q̇/q, and a constant savings rate.
⁷ We combine δ and n in view of their diametrically opposing effects in the model (on the utility discount rate). The parameters ϕ and φ are jointly determined, so their respective minima, central values and maxima must be taken together.

Table 2
Decomposition of the growth rate in a numerically optimised model with our central parameters (more details of the optimisation procedure in Appendix D). [Table body not recoverable; the first column is Time (years).]

Strictly speaking the economy will only converge to a balanced growth path when emissions approach zero,⁸ but we assume the economy starts sufficiently close to this path, for two reasons. First, we can assume the global economy has been on a balanced growth path in the past, because growth of global output per capita has been broadly trendless since the late 19th century (Maddison, 2010; World Bank, 2018). This implies productivity growth g is constant and k̂ is at its pre-climate-change steady state. Second, on the optimal path, temperature and emissions from now on should have a small effect on production relative to labour-augmenting technological progress, if this technological progress continues at the same rate as recent decades.
To see this formally, we can manipulate (6) into an expression for ĝ:

$$ \hat{g} = g + \frac{f_k}{f}\dot{\hat{k}} - \gamma T\dot{T} + (\phi - \varphi E)\dot{E}. \tag{11} $$

The term $(f_k/f)\dot{\hat{k}}$ is negligible, assuming k̂ is already close to its steady state at the start. What about γTṪ? T starts at about 1 °C and increases gradually, but because temperature increases slowly on the optimal path (Ṫ is small) and representative values of the damage function coefficient γ are small, γTṪ is much smaller than g. Similarly, even though E and Ė are significantly larger than g until the long run, the calibrated values of ϕ and φ are again very small relative to g, so that (ϕ − φE)Ė will amount to a small subtraction, overall. See Table 2. These arguments may appear novel, but they are not. The effects of temperature and emissions abatement typically found in the literature are very small when expressed as a reduction in the growth rate (e.g. Clarke et al., 2014). This does not mean climate change is a small problem, indeed we will derive low optimal emissions paths and high optimal carbon prices below. Temperature effects on growth rates may not be small on non-optimal, high emissions paths.

The optimal path
The (unconstrained) optimal path is obtained when the planner solves (4), subject to (2), (3), (10) and initial S, T and K. In the current value Hamiltonian, expressed per unit of effective labour, λ_S is the shadow price of cumulative emissions, λ_T is the shadow price of temperature and $\lambda_{\hat{k}}$ is the shadow price of capital per unit of effective labour. The Hamiltonian is defined such that all shadow prices are positive. Substituting for $\lambda_{\hat{k}}$, the necessary conditions for a maximum include

$$ \lambda_S = c^{-\eta} q\,(\phi - \varphi E), \tag{12} $$

$$ \dot{\lambda}_S = (\delta - n)\lambda_S - \varepsilon\zeta\lambda_T, \tag{13} $$

$$ \dot{\lambda}_T = (\delta - n + \varepsilon)\lambda_T - c^{-\eta} q \gamma T, \tag{14} $$

$$ \frac{\dot{c}}{c} = \frac{1}{\eta}\left(\hat{q}_{\hat{k}} - \delta_K - \delta\right). \tag{15} $$

⁸ When the factor exp(−(γ/2)T² + ϕE − (φ/2)E²) is constant, we have a standard Ramsey-Cass-Koopmans model with labour-augmenting technological progress, which is known to have a balanced growth path. Our climate model implies that, in the long run, non-zero emissions would lead to infinite warming (due to the linear warming response), which in turn would lead to zero consumption. For a wide range of parameter estimates, the model instead converges on a constant temperature in the steady state (see below). Therefore the factor exp(−(γ/2)T² + ϕE − (φ/2)E²) is constant in the long run and the model converges to a standard Ramsey-Cass-Koopmans model. On a balanced growth path, q̇/q is constant. Strictly speaking we need only assume ηċ/c − q̇/q is constant, which is a weaker condition, because both deviations from the balanced growth path, ċ and q̇, have the same sign (in the phase diagram of the Ramsey model, the stable arms are in the regions where ċ and k̇ have the same sign), and because η is relatively close to 1. For a log utility function (η = 1), balanced growth is not required, rather a constant savings rate is sufficient, as in Golosov et al. (2014).
Eq. (15) is the Ramsey rule and governs optimal capital accumulation. Eq. (12) expresses the well-known optimality condition that the social cost of carbon λ_S must always equate to the MAC.
Integrating (14) gives an expression for the shadow price of temperature:

$$ \lambda_{T,t} = \int_t^\infty e^{-(\delta - n + \varepsilon)(\tau - t)}\, c_\tau^{-\eta} q_\tau \gamma T_\tau \, d\tau. \tag{16} $$

Bearing in mind that λ_S is the social cost of carbon, λ_T just represents the welfare cost of an initial increase in temperature, for given cumulative emissions. Since the climate system is linearly proportional to cumulative emissions except for the short initial adjustment period of c. ten years, this means the effect of the initial temperature perturbation disappears after ten years too. That is why λ_T is found by discounting the flow of marginal disutility from the temperature perturbation by a delay-adjusted rate δ − n + ε, where the central value of ε is 0.5, so the delay-adjusted discount rate is more than 50%.
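Over this short period, we can also safely assume that the marginal disutility of warming is constant: while marginal utility c^{−η} decreases over the space of a few years in a growing economy, marginal damage γqT is increasing in a warming world, and neither will change much. This allows us to make the following approximation of (16):

Assumption 1. Because the climate system adjusts quickly to CO2 emissions, c^{−η}qγT is constant over short periods and therefore

$$ \lambda_T \approx \frac{c^{-\eta} q \gamma T}{\delta - n + \varepsilon}. \tag{17} $$

This assumption allows us to rewrite (13) as

$$ \dot{\lambda}_S = (\delta - n)\lambda_S - \frac{\varepsilon\zeta}{\delta - n + \varepsilon}\, c^{-\eta} q \gamma T. \tag{18} $$

Taking the time derivative of the first-order condition in (12) and substituting this into (13), we obtain

$$ \left(\frac{\dot{q}}{q} - \eta\frac{\dot{c}}{c}\right)(\phi - \varphi E) - \varphi\dot{E} = (\delta - n)(\phi - \varphi E) - \frac{\varepsilon\zeta\gamma}{\delta - n + \varepsilon}\, T. \tag{19} $$

Then applying the assumption of balanced growth gives us an expression for the evolution of emissions:

$$ \dot{E} = \left[\delta - n + (\eta - 1)\hat{g}\right]\left(E - \frac{\phi}{\varphi}\right) + \frac{\varepsilon}{\delta - n + \varepsilon}\,\frac{\gamma\zeta}{\varphi}\, T. \tag{20} $$

Integrating (2) gives

$$ T_t = \varepsilon\zeta \int_{-\infty}^{t} e^{-\varepsilon(t - \tau)} S_\tau \, d\tau. \tag{21} $$

As was the case with (16), the fact that ε = 0.5 means the value of the integral (21) is dominated by just a few years, in this case the most recent few years. In other words, to determine warming at time t it is nearly sufficient to know cumulative emissions at the same time, and the history of emissions has little effect. Over such a short period, we can treat the growth rate of cumulative emissions as a constant, ψ ≡ Ṡ/S. Then:

Assumption 2. Because the climate system adjusts quickly to CO2 emissions, ψ is constant over short periods and

$$ T \approx \frac{\varepsilon}{\varepsilon + \psi}\,\zeta S. \tag{22} $$

We can then substitute (22) into (20) to obtain

$$ \dot{E} = \left[\delta - n + (\eta - 1)\hat{g}\right]\left(E - \frac{\phi}{\varphi}\right) + \frac{\varepsilon}{\varepsilon + \psi}\,\frac{\varepsilon}{\delta - n + \varepsilon}\,\frac{\gamma\zeta^2}{\varphi}\, S. \tag{23} $$

Rearranging (23) and substituting Ṡ for E, we arrive at a linear differential equation for cumulative emissions:

$$ \ddot{S} = a\dot{S} + bS - c, \tag{24} $$

where a ≡ δ − n + (η − 1)ĝ, b ≡ [ε/(ε + ψ)][ε/(δ − n + ε)]γζ²/φ and c ≡ aϕ/φ. Clearly the linearity of (24), combined with constant coefficients and a constant term, is key to obtaining a closed-form solution for the optimal path.⁹ It is worth taking a moment to interpret the constants a, b and c, as they will often appear in the remainder of the analysis. The constant a is the standard 'Ramsey' discount rate minus the growth rate ĝ. As such it is the discount rate that is applied to the future flow of marginal damages from a tonne of CO2 emitted at time t, when those damages are expressed as a proportion of output.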
This can be shown by integrating (18) with respect to time, dividing both sides by $c_t^{-\eta} q_t = c_\tau^{-\eta} q_\tau e^{\hat{g}(\eta - 1)(\tau - t)}$ and defining

$$ \mathrm{SCC}_{\%,t} \equiv \frac{\lambda_{S,t}}{c_t^{-\eta} q_t} = \int_t^\infty e^{-[\delta - n + (\eta - 1)\hat{g}](\tau - t)}\, \frac{\varepsilon\zeta}{\delta - n + \varepsilon}\, \gamma T_\tau \, d\tau, \tag{25} $$

where SCC% is the social cost of carbon as a proportion of GDP.
The reason that marginal damages as a proportion of output are discounted at the reduced rate δ − n + (η − 1)ĝ is that output growth has two countervailing effects on the social cost of carbon at any instant. On the one hand it reduces the present value of future damages, because it reduces marginal utility in the future. This is the conventional effect of consumption growth on discounting. On the other hand it increases the undiscounted value of future damages, because they are proportional to output in the model. This is an important feature of models where damages are multiplicative.
The constant b is the product of two elements. The first element is the delay factor, which can be further broken down into the physical effect of delay on marginal damages, ε/(ε + ψ), and the discounting effect of delay, ε/(δ − n + ε). If temperature would adjust instantaneously to CO2 emissions, then the delay factor would be equal to one and b = γζ²/φ. This second element of b can also be written as (−Q_S/−Q_A)(A/S).
This can be interpreted as the ratio of the slope of the marginal damage function with respect to S and the slope of the MAC function with respect to E, when both marginal damages and abatement costs are expressed as a proportion of output.¹⁰ This ratio turns out to be central to interpreting our results for the optimal transition path. Lastly, the constant c = aϕ/φ, where ϕ/φ is baseline/business-as-usual emissions.
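The approximation behind Assumption 2 can be checked numerically. The sketch below assumes the temperature equation Ṫ = ε(ζS − T), lets cumulative emissions grow at a constant rate ψ, and compares the integrated temperature with the shortcut T ≈ εζS/(ε + ψ). The growth rate of 2% per year, the starting stock of 2,400 GtCO2, the 30-year horizon and the function names are hypothetical choices for illustration only.

```python
import math


def temperature_shortcut(s, psi, eps=0.5, zeta=0.00048):
    """Assumption 2 shortcut: T is approximately eps * zeta * S / (eps + psi)."""
    return eps * zeta * s / (eps + psi)


def temperature_simulated(s0, psi, years, temp0=0.0, eps=0.5, zeta=0.00048, dt=0.001):
    """Forward-Euler integration of dT/dt = eps * (zeta * S - T) with S = s0 * exp(psi * t)."""
    temp = temp0
    for step in range(int(round(years / dt))):
        s = s0 * math.exp(psi * step * dt)
        temp += dt * eps * (zeta * s - temp)
    return temp


if __name__ == "__main__":
    s0, psi, years = 2400.0, 0.02, 30.0  # hypothetical stock (GtCO2), growth rate, horizon
    s_end = s0 * math.exp(psi * years)
    print("simulated T:", round(temperature_simulated(s0, psi, years), 4))
    print("shortcut T :", round(temperature_shortcut(s_end, psi), 4))
    print("no-delay T :", round(0.00048 * s_end, 4))
```

Because the initial condition is forgotten at rate ε, the simulated temperature settles onto the shortcut path within a few years even when it starts from zero; the remaining gap to the no-delay value ζS is the physical delay effect ε/(ε + ψ).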
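Returning to the task of solving the optimal path, the solution to the differential Eq. (24) is:

$$ S_t = \left(S_0 - \frac{c}{b}\right) e^{\frac{1}{2}\left(a - \sqrt{a^2 + 4b}\right)t} + \frac{c}{b}. \tag{26} $$

The particular integral c/b is the inter-temporal equilibrium value of S. At the inter-temporal equilibrium, the growth rate of cumulative emissions ψ = 0 and from (11) it is clear that ĝ = g, so

$$ S^* = \frac{c}{b} = \frac{\delta - n + \varepsilon}{\varepsilon}\,\frac{\phi\left[\delta - n + (\eta - 1)g\right]}{\gamma\zeta^2}. \tag{27} $$

Appendix B demonstrates that S* is dynamically stable.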
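As a hedged illustration of the transition path, the sketch below evaluates the stable solution of a linear differential equation of the form S̈ = aṠ + bS − c, namely S_t = S* + (S_0 − S*)exp(½(a − √(a² + 4b))t) with S* = c/b. The numerical values of a, b and S_0 are hypothetical placeholders rather than the paper's calibration; only S* = 7,014 GtCO2 is taken from the text.

```python
import math


def stable_root(a, b):
    """Negative root of r**2 - a*r - b = 0, the characteristic equation of S'' = a*S' + b*S - c."""
    return 0.5 * (a - math.sqrt(a * a + 4.0 * b))


def cumulative_emissions(t, s0, a, b, c):
    """Stable solution: S moves from s0 towards the particular integral c / b."""
    s_star = c / b
    return s_star + (s0 - s_star) * math.exp(stable_root(a, b) * t)


def emissions(t, s0, a, b, c):
    """Emissions are the time derivative of cumulative emissions."""
    r = stable_root(a, b)
    return r * (s0 - c / b) * math.exp(r * t)


if __name__ == "__main__":
    a, b = 0.013, 0.0002   # hypothetical values
    s_star = 7014.0        # GtCO2, stationary cumulative emissions quoted in the text
    c = b * s_star
    s0 = 2400.0            # hypothetical initial cumulative emissions, GtCO2
    for year in (0, 25, 50, 100, 200):
        s = cumulative_emissions(year, s0, a, b, c)
        e = emissions(year, s0, a, b, c)
        print(f"t={year:>3}: S={s:8.1f} GtCO2, E={e:5.1f} GtCO2/yr")
```

Only the negative characteristic root is retained, since the positive root would make cumulative emissions diverge from S*.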

Peak warming
At S*, the linear climate model dictates that the maximum increase in the global mean temperature relative to the pre-industrial level is simply T* = ζS*, so:

Proposition 1. [Optimal peak warming] In the climate-economy system characterised by (2)-(6), optimal peak warming is given by

$$ T^* = \frac{\delta - n + \varepsilon}{\varepsilon}\,\frac{\phi\left[\delta - n + (\eta - 1)g\right]}{\gamma\zeta}. \tag{28} $$

Proposition 1 tells us the maximum warming of the planet that is optimal from an economic point of view. The first element is the delay factor, but, not for the first time, the fact that ε is much larger than δ − n is significant. It means the delay factor will invariably be close to one. Take the central values of these three parameters as set out in Table 1; ε = 0.5 and δ − n = 0.006. Then the delay factor is equal to 1.012. Even if we set δ − n = 0.03, which we can take as about the maximum value that is plausible, the delay factor is equal to a still modest 1.06.¹¹

⁹ An extension to our model would be to have marginal damages and MACs that are not linearly proportional to production, i.e. q_T = −γq^Θ T and q_E = q^Φ(ϕ − φE), where Θ and Φ are the elasticities. This leads to an alternative differential equation, which is linear if Θ = Φ, although there is no obvious reason why this equality should hold.
¹⁰ In a version of the model without delay, D(S) = exp(−(γ/2)ζ²S²) and so Q_S = −γζ²SQ. See Appendix C.

Table 3 (columns: Parameter; Point elasticity of T* with respect to parameter; Sign).

Corollary 1. [The delay factor is insignificant to optimal peak warming] Because the climate system adjusts quickly to CO2 emissions, optimal peak warming can be approximated by

$$ T^* \approx \frac{\phi\left[\delta - n + (\eta - 1)g\right]}{\gamma\zeta}. \tag{29} $$

This is also naturally the exact solution of the model when warming is simply assumed to be an instantaneous function of cumulative emissions, as shown in Appendix C. Appendix D shows using numerical techniques that the versions of the model with and without a temperature delay give very similar optimal warming and are both very close approximations of the numerical solution to the maximisation problem, which takes into account the short delay, the feedback from temperature and emissions to the growth rate, and does not depend on Assumptions 1 or 2. Comforted by this, we henceforth work with the model without a temperature delay.
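The delay-factor arithmetic in the discussion of Proposition 1 is easy to verify. The following sketch assumes the delay factor takes the form (ε + δ − n)/ε, which reproduces the figures quoted in the text and in footnote 11; the function name and the case labels are illustrative.

```python
def delay_factor(eps, delta_minus_n):
    """Ratio of optimal peak warming with a temperature delay to its no-delay approximation."""
    return (eps + delta_minus_n) / eps


if __name__ == "__main__":
    cases = {
        "central (eps=0.5, delta-n=0.006)": (0.5, 0.006),
        "high discounting (eps=0.5, delta-n=0.03)": (0.5, 0.03),
        "DICE-like delay (eps=0.06, delta-n=0.006)": (0.06, 0.006),
    }
    for label, (eps, delta_minus_n) in cases.items():
        print(f"{label}: {delay_factor(eps, delta_minus_n):.3f}")
```

The three cases give 1.012, 1.060 and 1.100, so the correction only becomes material when the climate is assumed to respond as slowly as in DICE.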
In Table 3 we compute the point elasticities of T * with respect to the parameters that feature in (29). We find that optimal peak warming is an increasing function of the pure rate of time preference , a new version of an old result. Since there is no delay between CO 2 emissions and warming from those emissions, this is fundamentally due to the long residence time of CO 2 in the atmosphere. Close inspection of the point elasticity of T * with respect to reveals that it is equal to the ratio of to a, the discount rate on SCC % .P o p u l a t i o ng r o w t hn has the opposite effect on peak warming to ,b e c a u s ei tr e d u c e st h e population-adjusted discount rate. 12 Increases in both and the productivity growth rate g result in an increase in T * ,providedthat ≥ 1. Moreover, comparing the two elasticities, it is clear that the elasticity of T * with respect to is larger by exactly g, which reflects the fact that, whereas only has an effect on the discount rate, g affects both the discount rate and the undiscounted value of marginal damages, as explained above. Three of the model parameters have an especially simple relationship with optimal peak warming. There is a negative unit elasticity of T * with respect to , the TCRE parameter, and , the coefficient of the damage function. A one per cent increase in either of these parameters reduces T * by one per cent. Conversely there is a unit elasticity of T * with respect to ,themarginal cost of zero emissions. Notice that peak warming is independent of the parameter that governs the slope of the MAC function. Fundamentally this is because T * is determined by comparing the social cost of carbon at T * with the abatement cost of zero emissions (see Eq. (25)), which does not depend on . If we plug the parameters' central values from Table 1 into Eq. (29), we obtain optimal peak warming of 3.4 • C, corresponding to stationary cumulative emissions of 7,014 GtCO 2 since the beginning of the industrial revolution. 
With central values of the pure rate of time preference ρ, population growth n, the elasticity of marginal utility η and productivity growth g, the consumption discount rate is about 3.1%, while the central value of the damage function coefficient γ implies that 2 °C warming causes a loss of output of 2% and 4 °C warming causes a loss of output of 8%. Therefore damages in the central case are relatively modest and they are discounted at a medium rate, which explains why T* is well above 2 °C. Considering the ranges of parameter values in Table 1, it is clear that T* is highly sensitive to most of the model parameters.
Take for instance the TCRE parameter ζ. A central estimate from climate science might be 0.00048 °C/GtCO2. But the range of uncertainty about ζ spans approximately ±50%. Given that T* has a unit elasticity with respect to ζ, T* varies by ±50% accordingly. Much the same is true of the other two parameters with a unit elasticity: the range of uncertainty either side of the central value of the damage function coefficient γ is −50% to +100%, while for the marginal cost of zero emissions it is −40% to +120%. The elasticities of T* with respect to the other four parameters are non-constant; however, in most cases they can also be expected to be large. Holding the other parameters to their central values, the elasticity with respect to the pure rate of time preference ρ will be close to one over the range of ρ, which according to Drupp et al. (2018) is −45% to +209%. The elasticity with respect to the elasticity of marginal utility η is particularly high, ranging from 1.3 for maximum η to 3.2 for minimum η, and equal to 2.1 for the central value of η (again holding the other parameters to their central values). This makes clear the limitations of models that assume log utility when thinking about uncertainty governing optimal warming.

Fig. 4 plots optimal peak warming as a function of variation in the model parameters. For this we impose a constraint that cumulative emissions may not exceed 'burnable carbon' embodied in the Earth's fossil fuel resources. 13 The constraint binds only with respect to . When looking at sensitivity with respect to ζ, bear in mind that, not only does lower (higher) ζ result in higher (lower) optimal cumulative emissions, it also results in lower warming as a result of those emissions. Observe that when ρ − n is set to its minimum value of 0.1%, T* = 2.0 °C. When η is set to its minimum value of roughly one, T* = 1.6 °C. When γ is set to its maximum value, such that 2 °C warming causes a loss of output of 4% and 4 °C warming causes a loss of output of 16%, T* = 1.7 °C. Many combinations of parameter values support optimal peak warming of 2 °C or below.

11 By contrast, if we were to set the parameter governing the speed of temperature adjustment to 0.06 in order to match DICE's climate dynamics, the delay factor would be equal to 1.1.

12 Notice that in the limit as η → 1 (i.e. log utility), the elasticity of T* with respect to ρ − n is one: a doubling of ρ − n leads to a doubling of optimal peak warming. Higher η tempers this, but given the magnitudes involved it does so only slightly.
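The elasticity results in this section can be checked numerically. The sketch below assumes that Eq. (29) has the form T* = φ[ρ − n + (η − 1)g]/(ζγ), which is reconstructed from the verbal description (unit elasticities with respect to the TCRE, the damage coefficient and the marginal cost of zero emissions) and from the fragment of the formula in Appendix C. The symbol φ and every numerical value except ζ = 0.00048 are illustrative assumptions, not the calibration in Table 1.

```python
def discount_rate_on_scc(rho, n, eta, g):
    """a = rho - n + (eta - 1) * g, the discount rate applied to SCC%."""
    return rho - n + (eta - 1.0) * g


def optimal_peak_warming(rho, n, eta, g, phi, zeta, gamma):
    """Assumed reconstruction of Eq. (29), for the model without delay."""
    return phi * discount_rate_on_scc(rho, n, eta, g) / (zeta * gamma)


def point_elasticity(params, name, rel_step=1e-6):
    """Numerical elasticity of T* with respect to one parameter."""
    base = optimal_peak_warming(**params)
    bumped = dict(params)
    bumped[name] = params[name] * (1.0 + rel_step)
    return (optimal_peak_warming(**bumped) - base) / (base * rel_step)


# Illustrative values only: phi and the preference and growth parameters are
# assumptions; zeta is the central TCRE estimate quoted in the text.
params = dict(rho=0.011, n=0.005, eta=1.35, g=0.02,
              phi=0.00126, zeta=0.00048, gamma=0.01)

for name in params:
    print(f"elasticity of T* with respect to {name}: "
          f"{point_elasticity(params, name):+.3f}")

# Consistency check on two figures reported in the text:
# 7,014 GtCO2 times the central TCRE should round to 3.4 degrees C.
print(f"7014 GtCO2 x 0.00048 = {7014 * 0.00048:.2f} degrees C")
```

Under this assumed form, the elasticity with respect to ρ equals ρ/a, as the text states, and the elasticity with respect to η equals ηg/a, which is why it is the largest of the non-constant elasticities.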

The slow transition to equilibrium
While an analysis of optimal peak warming reveals useful information, it does not reveal how long it takes for warming to peak along the optimal path and therefore it is unlikely to reveal the key features of optimal emissions in the near future.
Appendix B demonstrates that the transition to S* is governed by Eq. (30). Since b > 0, the exponent is negative and cumulative emissions approach their stationary value c/b asymptotically. Put another way, optimal emissions are strictly decreasing, at a decreasing rate. There is an intuitive explanation for this: the social cost of carbon as a proportion of output is an increasing function of S. 14 Since E = Ṡ > 0, SCC% increases all along the path. Since the MAC function is linear increasing as a proportion of output, the necessary condition for an optimum that SCC% = MAC% means that emissions must decrease all along the path. It is not optimal for emissions to peak at t > 0, for instance.
But how fast do emissions approach zero? In other words, how long does it take for warming to approach its peak? It turns out that the answer is slowly, very slowly indeed. Fig. 5 plots optimal paths of T over the next 250 years that correspond with our central parameter values, as well as with scenarios of high and low damages, which we choose as being illustrative of the transition path when optimal peak warming is low and high respectively. These optimal paths are obtained by plugging Eq. (30) into (2).

Although optimal peak warming corresponding with our central parameter values is 3.4 °C, optimal (transient) warming a century from now is just 1.7 °C; 250 years from now it is 2.5 °C. When damages are high, optimal peak warming is 1.7 °C, but optimal warming a century from now is just 1.3 °C. When damages are low, optimal peak warming is 6.7 °C, but optimal warming in a century's time is only 2.2 °C. So, while peak warming is highly sensitive to the parameters that determine it, warming over the next couple of centuries is much less so.

13 These are estimated to be in the region of 22,000 GtCO2, including fossil fuels burned since the beginning of the industrial revolution (Nordhaus, 2008). When some parameters take extreme values, optimal cumulative emissions may exceed this. This constraint gives peak warming of 10.6 °C for the central value of the TCRE.
Why is the transition so long? The rate of change of emissions is given by Eq. (31). A slow transition to peak warming implies that |Ė/E| is small. The reason for this is that b is very small. Recall that b is the ratio of the slope of the marginal damage function with respect to S and the slope of the MAC function with respect to E, both expressed as a proportion of output. Eq. (25) shows that Q_A is much larger than Q_S, because the latter is a perpetual stream of damages from a non-decaying stock of CO2. The second factor of b, A/S, is also small, because abatement A is a flow and S is a non-decaying stock. Therefore this result bears the imprint of the flow-stock nature of CO2-induced warming. It is this flow-stock property that leads to the result illustrated by Fig. 5, where optimal emissions in the near term are much less sensitive to parameter variations that lead to large differences in optimal peak warming. The short delay between emissions and warming is a driving force behind this result. Since damages occur almost immediately, it is worth avoiding them from the start. The flow-stock dynamic also stems from the fact that warming does not decay in our climate model.

However, a weakness of the model in characterising the transition to peak warming is that it ignores 'locked-in' emissions from the capital stock existing at t = 0, which will in reality constrain near-term emissions reductions, presumably leading to a transition path where emissions are higher in the near term and lower in the long term, and where warming thereby approaches its peak faster. A simple way to account for this, and therefore to test the robustness of our stylised finding of a slow transition, is to increase initial S by the cumulative emissions embodied in the global capital stock today, assuming it is operated to the end of its economic lifetime. Davis and Socolow (2014) have estimated that future cumulative CO2 emissions embodied in global power plants in 2012 were 307 GtCO2. 15 Adding this to initial S, the transition to peak warming is faster, but only marginally so. For central parameter values, optimal warming a century from now rises from 1.7 °C to 1.9 °C. When damages are high, it rises from 1.3 °C to 1.4 °C.
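To see why a small b produces such a slow transition, the path of cumulative emissions can be simulated directly. The sketch below assumes the transition takes the form S(t) = S* + (S_0 − S*)exp(λt) with λ = [a − sqrt(a² + 4b)]/2, which follows from the differential equation Ė = aE + bS − c stated in Appendix E together with E = Ṡ. The values of a, b and S_0 are hypothetical and chosen only for illustration; S* = 7,014 GtCO2 and the TCRE of 0.00048 °C/GtCO2 are the central figures quoted in the text.

```python
import math


def stable_root(a: float, b: float) -> float:
    """Negative characteristic root of S'' - a S' - b S = -c (requires b > 0)."""
    return 0.5 * (a - math.sqrt(a * a + 4.0 * b))


def cumulative_emissions(t, s0, s_star, a, b):
    """S(t): exponential convergence from s0 to the stationary value s_star."""
    return s_star + (s0 - s_star) * math.exp(stable_root(a, b) * t)


def emissions(t, s0, s_star, a, b):
    """E(t) = dS/dt along the same path."""
    lam = stable_root(a, b)
    return lam * (s0 - s_star) * math.exp(lam * t)


A, B = 0.013, 7.5e-5           # hypothetical values of a and b
S0, S_STAR = 2000.0, 7014.0    # S0 is hypothetical; S_STAR is from the text
ZETA = 0.00048                 # central TCRE, degrees C per GtCO2

for year in (0, 100, 250, 1000):
    s = cumulative_emissions(year, S0, S_STAR, A, B)
    e = emissions(year, S0, S_STAR, A, B)
    print(f"t = {year:4d}: warming {ZETA * s:.2f} C, emissions {e:.1f} GtCO2/yr")

half_life = math.log(2.0) / abs(stable_root(A, B))
print(f"half-life of the gap to S*: {half_life:.0f} years")
```

With b several orders of magnitude smaller than a, the stable root is close to −b/a, so the gap between S and S* closes over centuries rather than decades, whatever the level of S* itself.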

Carbon prices
As well as peak warming, we can characterise the optimal carbon price by differentiating (30) with respect to time, substituting the resulting expression into (8) and rearranging:

Proposition 2. [The optimal carbon price] In the climate-economy system characterised by T = ζS, where ζ is the TCRE, and (3)-(6), the optimal carbon price is .
Proposition 2 shows that the evolution of the carbon price depends on two factors. On the one hand, the carbon price is proportional to output, so as output grows at the rate g + n the carbon price does likewise, all else being equal. We call this the growth effect. On the other hand, the carbon price depends on emissions, which means that the evolution of the carbon price is also subject to the emissions dynamics set out above.
In particular, what we call the emissions effect increases, but it does so at a decreasing rate, since emissions converge to zero in the long run. The overall effect is that p* grows at a rate that is initially faster than aggregate output, but converges to g + n asymptotically, with the transition governed via E by a and b. Alternatively, the same path can be obtained by integrating (13). In the steady state, where emissions are zero, the optimal carbon price expressed as a percentage of GDP is equal to the marginal cost of zero emissions. As a corollary to Proposition 2, we can show that in our model the optimal carbon price grows at the same rate as aggregate output if damages are an exponential-linear rather than exponential-quadratic function of warming. If damages are exponential-linear in warming, then marginal damage is constant in cumulative emissions.

Corollary 2. [The optimal carbon price under exponential-linear damages] In a climate-economy system where D(T) = exp(−γT), with γ the damage function coefficient, the optimal carbon price grows at the rate g + n.

Proof. If D(T) = exp(−γT), with γ the damage function coefficient, then marginal damage is constant in cumulative emissions. Hence the carbon price is a fixed proportion of aggregate output, Q_E = Q γζ/[ρ − n + (η − 1)g], where ζ is the TCRE, ρ the pure rate of time preference and η the elasticity of marginal utility, and increases at g + n.

Golosov et al. (2014) also found that the optimal carbon price grows at the same rate as the economy, although they assumed damages are an exponential-linear function of atmospheric CO2, not of temperature, i.e. Q = Q_0 exp(−γM). The relationship between the two approaches can be better understood if we decompose marginal damage as a function of cumulative emissions, d ln Q/dS = (d ln Q/dT) × (dT/dM) × (dM/dS). Relating this back to Eq. (1), the right-hand side is marginal damage as a function of warming, multiplied by the TCRE in the limit as Δ → 0. In Golosov et al. (2014), d ln Q/dS is constant, because increasing marginal damages with respect to temperature, d² ln Q/dT² > 0, are assumed to be exactly offset by decreasing marginal climate sensitivity, d²T/dM² < 0 (not to be confused with equilibrium climate sensitivity), and marginal carbon sensitivity is constant, d²M/dS² = 0. By contrast, in our model d ln Q/dS is constant if and only if marginal damages are constant with respect to temperature, because decreasing marginal climate sensitivity is assumed to be exactly compensated by increasing marginal carbon sensitivity. That is, the TCRE is constant. So the optimal carbon price grows faster than the economy in our standard model (Proposition 2), because marginal damages are an increasing function of cumulative emissions, and the saturation of carbon sinks means that marginal carbon sensitivity is increasing.

Fig. 6 plots optimal carbon prices under our central parameter values, and in scenarios of low and high damages. The optimal carbon price corresponding with our central parameter values starts at $44/tCO2 today and increases to $59 in 10 years' time, $185 at t = 50 and $729 at t = 100. The rate of increase of the optimal price falls from 3.0% (real) initially to 2.7% after 100 years, which is close to the growth rate of aggregate output, assumed to be just under 2.5% (roughly 2% productivity growth, plus 0.5% population growth).

The optimal price in the low damages scenario starts at $26/tCO2 and increases to $36 after 10 years, $118 at t = 50 and $488 at t = 100. Again the rate of increase of the optimal price in this scenario falls over time, but at 3.2% it is initially higher than in the central case, falling to 2.8% after 100 years. This reinforces the message of the previous section that, even if optimal peak warming is high, optimal transient warming over the coming centuries is low. Achieving this requires a significant and significantly increasing carbon price.

The optimal price in the high damages scenario starts at $68/tCO2 and rises to $966 after a century. The price grows in this scenario at a rate of 2.8% initially, falling to 2.6% after a century.
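The growth rates quoted for the optimal carbon price can be cross-checked against the price levels reported for Fig. 6. The sketch below computes the average compound growth rate implied between each pair of reported dates. The prices are the figures given in the text; the helper names are introduced here for illustration, and interval averages are only an approximation to the instantaneous rates the paper reports.

```python
def annual_growth(p_start: float, p_end: float, years: int) -> float:
    """Average compound annual growth rate between two price observations."""
    return (p_end / p_start) ** (1.0 / years) - 1.0


# Optimal carbon prices in $/tCO2 by years from today, as reported in the text.
PRICE_PATHS = {
    "central": {0: 44.0, 10: 59.0, 50: 185.0, 100: 729.0},
    "low damages": {0: 26.0, 10: 36.0, 50: 118.0, 100: 488.0},
    "high damages": {0: 68.0, 100: 966.0},
}


def interval_growth(path):
    """Growth rate over each consecutive pair of reported dates."""
    years = sorted(path)
    return [(y0, y1, annual_growth(path[y0], path[y1], y1 - y0))
            for y0, y1 in zip(years, years[1:])]


for scenario, path in PRICE_PATHS.items():
    for y0, y1, rate in interval_growth(path):
        print(f"{scenario}: years {y0}-{y1}: {100 * rate:.2f}% per year")
```

In each scenario the implied rates lie above the assumed growth rate of aggregate output of just under 2.5% and decline over time, in line with Proposition 2.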

The optimal path under a temperature constraint
Important as it is to examine the unconstrained optimum of the model, so far 185 countries have ratified the Paris Agreement, the central aim of which is "Holding the increase in the global average temperature to well below 2 °C above pre-industrial levels and pursuing efforts to limit the temperature increase to 1.5 °C above pre-industrial levels". This indicates that, as a description of the real world, the maximisation problem in Section 3 could be under-specified. Rather, we might say that the Paris Agreement leaves us with the objective of maximising (4), subject to the usual constraints, plus the inequality constraint that S ≤ S̄, where ζS̄ = T̄, ζ is the TCRE and T̄ is 2 °C (or even 1.5 °C). 16

Maximising welfare subject to the temperature constraint
Technical details are relegated to Appendix E. Solving the constrained maximisation problem, we find:

Proposition 3. [The optimal carbon price under a binding temperature constraint] When cumulative CO2 emissions are constrained such that S ≤ S̄, where ζS̄ = T̄ and ζ is the TCRE, the optimal carbon price depends on t̄, the time when the cumulative emissions constraint binds.

Therefore the optimal carbon price under a temperature constraint equals the social cost of carbon, plus a premium, which is a function of the cumulative emissions constraint and which increases at the discount rate. 17 The premium therefore follows Hotelling's rule, ensuring that the cumulative emissions budget implied by S̄ < S* is allocated in an inter-temporally efficient manner. If the temperature constraint binds, the premium is strictly positive, which further implies that emissions will be lower everywhere on the constrained path compared with the unconstrained path.

The Hotelling price premium required to stay within the temperature constraint is significant, even in the relatively near term. Fig. 7 shows that the Hotelling premium begins at $4/tCO2 today, rising to $5 in 10 years, $23 in 50 years and $150 in 100 years, under central parameter values. This is on top of a social cost of carbon of $41/tCO2 today, $55 in 10 years, $170 in 50 years and $641 in 100 years. Notice that the social cost of carbon is lower than in the corresponding unconstrained optimisation (Fig. 6), because cumulative emissions and therefore warming are lower. When the Hotelling premium is added on, however, the overall carbon price is higher than its equivalent in the unconstrained optimisation. Fig. 7 also shows that when the damage function coefficient γ = 0.005 the Hotelling premium is a larger share of the carbon price, both because the social cost of carbon is lower and because, with higher optimal unconstrained warming, the constraint binds earlier.
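The comparison between the constrained and unconstrained carbon prices is simple arithmetic on the figures reported for Figs. 6 and 7, and can be laid out as follows. The numbers are those quoted in the text for central parameter values; the variable and function names are introduced here purely for illustration.

```python
YEARS = (0, 10, 50, 100)

# $/tCO2 under central parameter values, as reported in the text.
SCC_CONSTRAINED = {0: 41.0, 10: 55.0, 50: 170.0, 100: 641.0}      # Fig. 7
HOTELLING_PREMIUM = {0: 4.0, 10: 5.0, 50: 23.0, 100: 150.0}       # Fig. 7
UNCONSTRAINED_PRICE = {0: 44.0, 10: 59.0, 50: 185.0, 100: 729.0}  # Fig. 6


def constrained_price(year: int) -> float:
    """Carbon price under the temperature constraint: SCC plus premium."""
    return SCC_CONSTRAINED[year] + HOTELLING_PREMIUM[year]


def premium_share(year: int) -> float:
    """Hotelling premium as a share of the constrained carbon price."""
    return HOTELLING_PREMIUM[year] / constrained_price(year)


for year in YEARS:
    print(f"t = {year:3d}: constrained price ${constrained_price(year):.0f}, "
          f"unconstrained ${UNCONSTRAINED_PRICE[year]:.0f}, "
          f"premium share {100 * premium_share(year):.0f}%")
```

At every reported date the constrained social cost of carbon is below the unconstrained price, while the sum of social cost and premium is above it, which is the pattern described in the text. Because the reported figures are rounded to whole dollars, the near-term premium share is only approximate.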

Minimising abatement costs to meet the temperature constraint
Most studies on the costs of emissions abatement solve a different problem to the preceding section. In particular, they ignore climate damages and determine the emissions path that meets the constraint S̄ at minimum total discounted abatement cost (Clarke et al., 2014). This is often referred to as cost-effectiveness analysis, as opposed to cost-benefit analysis. In our set-up, the cost-effective policy is the solution to maximising (4), subject to S ≤ S̄, but where the marginal disutility of warming is zero. The optimal carbon price path follows straightforwardly from integrating Eq. (36):

Proposition 4. [The cost-effective carbon price] When cumulative CO2 emissions are constrained such that S ≤ S̄, where ζS̄ = T̄ and ζ is the TCRE, and damages are ignored, the optimal carbon price follows the simple Hotelling rule, which ensures inter-temporal efficiency.

This is different to the standard assumption that the cost-effective carbon price increases at the 'augmented' Hotelling rate, i.e. at the consumption discount rate plus the decay rate of CO2 in the atmosphere. This assumption rests on atmospheric decay creating a reason to postpone abatement, since CO2 emitted earlier has the chance to decay more. Decay also enlarges the carbon budget for given T̄. However, while this is true in and of itself, the saturation of carbon sinks, which our model implicitly accounts for, has the opposite effect; additional emissions today saturate the carbon sinks earlier. Saturation of carbon sinks reduces the carbon budget for given T̄. Lemoine and Rudik (2017) have argued for a different kind of augmented Hotelling rule, in case there is a substantial delay between emissions and warming, as suggested by the DICE model. Such a delay enlarges the carbon budget for given T̄. But Section 2 showed that DICE is too slow to respond to CO2 emissions, and in our climate model there is only a short delay between emissions and warming. This short delay, together with the saturation of carbon sinks, more or less exactly offsets the effect of decay of atmospheric CO2. Consequently the simple Hotelling rule is in fact appropriate.

Appendix E shows that the rate of emissions reduction must be faster on the cost-effective path than on the cost-benefit path. Because both paths must result in the same cumulative emissions, the cost-effective path must therefore begin with higher emissions, but eventually cross the constrained cost-benefit path and reach zero emissions faster.

Fig. 8. ... °C when the discounted sum of total abatement and damage costs are minimised, compared with when only abatement costs are minimised, and when temperature is unconstrained but optimal peak warming is 2 °C (high γ).

Proposition 5. [Cost-effective emissions abatement is lower initially, but higher eventually] Compared with the emissions path that maximises net benefits, subject to the emissions constraint, the cost-effective emissions path has higher emissions initially, but emissions fall to zero earlier.

Fig. 8 shows the difference in the cost-benefit and cost-effective emissions paths, for central parameter values. We also include for illustration an unconstrained, welfare-maximising emissions path, where the damage function coefficient γ is solved backwards so that optimal peak warming is 2 °C. Initial emissions on the cost-effective path are about 44% higher than on the constrained cost-benefit path, but the rate of emissions reduction is always higher and the two paths cross after about 50 years. Finally, observe how low and flat the emissions path is when optimal peak warming is 2 °C; initial emissions are about 31% lower than on the constrained cost-benefit path.

Conclusions
In this paper we have built a model of optimal CO2 emissions by exploiting recent advances in climate science, which have identified a near-instantaneous and quasi-linear warming response to cumulative CO2 emissions, and combining them with reduced-form representations of climate damages and the costs of CO2 emissions abatement, which are capable of capturing the stylised facts of the large applied literature on each topic.
The model is surprisingly simple and yields closed-form solutions for optimal peak warming, optimal emissions along the transition to peak warming and optimal carbon prices, including under a temperature constraint that is consistent with the Paris Agreement. We draw five conclusions:

1. Optimal peak warming depends on: the utility discount rate; the elasticity of marginal utility; population growth; productivity growth; the marginal cost of abatement at zero emissions; the transient climate response to cumulative carbon emissions; and the damage function coefficient. Moreover optimal peak warming has a unit elasticity with respect to the last three of these parameters, and an elasticity of around one or more with respect to most of the others. Large uncertainty about some of these parameters therefore means there is large uncertainty about optimal peak warming.

2. Even if optimal peak warming is high, optimal transient warming over the coming centuries is not. The transition is slow, because of the stock-flow nature of CO2-induced warming. If optimal peak warming is 3.4 °C, optimal transient warming one century from now is only 1.7 °C.

3. The optimal carbon price initially grows faster than aggregate output, converging to the same rate in the long run. The underlying reason is that damages are a convex function of cumulative emissions, which is amplified by the saturation of carbon sinks. For central parameter values, we calculate that the optimal carbon price grows 0.5 percentage points faster than the economy initially.

4. The optimal carbon price under a binding temperature constraint comprises the social cost of carbon, plus a Hotelling premium. If we take account of damages, then we should abate emissions more quickly than if we simply meet the temperature constraint at the lowest discounted abatement cost. This effect is quantitatively large.

5. When the objective is to minimise abatement costs alone, the optimal carbon price follows the simple Hotelling rule, not various kinds of augmented Hotelling rule, as in previous work. This is because the small delay between CO2 emissions and warming, together with the saturation of carbon sinks, more or less exactly offsets the effect of decay of atmospheric CO2.
Finally, our paper has generated many points of comparison with the literature, particularly other analytical IAMs. We synthesise these points of comparison in Table 4, with a focus on rules for optimal carbon price growth and the cumulative emissions budget. The rate of decay of atmospheric CO 2 is denoted . The results are independent of the shape of the MAC curve, and the damage functions in the cost-benefit models are all virtually equivalent (assuming a unit elasticity of marginal damages with respect to income), so the differences between the pricing rules and cumulative emissions budgets come down to features of the climate system. The table highlights the crucial role of feedback from the saturation of carbon sinks to the decay of atmospheric CO 2 , which is not present in other models and is a key driver of warming being linearly proportional to cumulative emissions.

A. Equilibrium in a decentralised economy
Competitive firms maximise profit, taking T and wage payments wL as given. Here τE are emissions tax payments, with τ the emissions tax, iK are interest payments on household savings 18 and δK is depreciation of capital, with δ the depreciation rate. L_0 e^{(n+g)t} represents effective labour. The representative household maximises utility subject to an aggregate budget constraint. With L = L_0 e^{nt} and k̂ = K/(L_0 e^{(n+g)t}), the household's budget constraint is the same as the equation of motion for capital in the social planner's problem. The government hands back the income from an emissions tax as a lump-sum transfer. Household utility maximisation yields the Ramsey rule Q_K − δ = ρ − n + η ċ/c, where ρ is the pure rate of time preference and η the elasticity of marginal utility. Profit maximisation ensures that net marginal productivity equals the yield paid on capital, Q_K − δ = i. Firms choose emissions that maximise profits. If the government sets a Pigouvian emissions tax based on the shadow price of cumulative emissions, with that shadow price satisfying (12)-(14), the decentralised economy will follow the same emissions path as the social planner's solution.

18 iK can be thought of as both 'normal' interest and dividend payments, while Π represents extra-ordinary profits, such as resource rents or oligopoly rents, unrelated to the marginal productivity of capital.

B. The transition to stationary cumulative emissions
Convergence to S* is dictated by the complementary function. We may assume b > 0 and hence the characteristic roots are real.
In order to satisfy the transversality condition on cumulative emissions, lim_{t→∞} e^{(n−ρ)t} S = 0, where ρ is the pure rate of time preference, S may not increase at a rate larger than ρ − n. Applying l'Hôpital's rule and substituting the result into the state Eq. (13) yields a ratio whose denominator must be positive, since T is always positive. Hence the transversality condition is violated if E exceeds a threshold. If k_2 > 0, cumulative emissions would be on an explosive increasing path, leading to negative marginal productivity of emissions and violating the transversality condition. Consequently k_2 = 0. The initial condition on cumulative emissions S_0 implies k_1 = S_0 − c/b, which pins down the transition to S*.
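The steps of this appendix can be written out as a short derivation. It is a sketch under stated assumptions: it starts from the emissions equation Ė = aE + bS − c quoted in Appendix E, uses E = Ṡ, and labels the two characteristic roots r_1 and r_2, a notation introduced here rather than taken from the paper.

```latex
% Second-order equation for cumulative emissions, using E = \dot{S}
\ddot{S} - a\,\dot{S} - b\,S = -c

% Particular (stationary) solution
S^{*} = \frac{c}{b}

% Characteristic equation of the complementary function and its roots;
% b > 0 guarantees real roots with r_1 < 0 < r_2
r^{2} - a\,r - b = 0,
\qquad
r_{1,2} = \tfrac{1}{2}\left(a \mp \sqrt{a^{2} + 4b}\right)

% General solution
S(t) = \frac{c}{b} + k_{1}\,e^{r_{1} t} + k_{2}\,e^{r_{2} t}

% Transversality rules out the explosive root (k_2 = 0);
% the initial condition S(0) = S_0 gives k_1 = S_0 - c/b
S(t) = \frac{c}{b} + \left(S_{0} - \frac{c}{b}\right) e^{r_{1} t},
\qquad
E(t) = \dot{S}(t) = r_{1}\left(S_{0} - \frac{c}{b}\right) e^{r_{1} t}
```

Since r_1 is negative and S_0 < c/b, emissions are positive and decline monotonically towards zero, consistent with the description of the transition in the main text.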

C. The optimal path in a model without delay
The model without delay has Eqs. (8)-(6) in common, but the climate model and its relationship with damages are now different. Because warming is an instantaneous function of cumulative emissions, it is simply the case that T = ζS, with ζ the TCRE. Hence we can write damages as a direct function of cumulative emissions and dispense with a state variable in the Hamiltonian.
Taking the derivative with respect to time of the necessary condition (35), substituting it into (36) and applying the assumption of balanced growth gives us an expression for the evolution of emissions which, after following the same steps as in Section 3, eventually delivers T* = φ[ρ − n + (η − 1)g]/(ζγ), where φ is the marginal cost of zero emissions, ρ the pure rate of time preference, η the elasticity of marginal utility and γ the damage function coefficient.

D. Model comparison
In this paper we have shown that exact solutions can be obtained for the optimal path of CO 2 emissions and warming in a quite general framework, albeit we have to take one of two shortcuts. Either we take into account the short delay between cumulative emissions and associated warming of the atmosphere, which on the other hand requires making Assumptions 1 and 2, or we ignore the short delay.
Here we compare the performance of these two simplified analytical models with the numerical solution of the 'full' model. The full model comprises discrete-time equivalents of Eqs. (2)-(6), a five-year time step in the interests of rapid computation, and a finite model horizon, where the terminal period is chosen to be far enough in the future (1000 years) that it does not exert a discernible effect on the optimal path on a decision-relevant timescale (which we take to be 250 years). Optimisation proceeds by choosing the emissions path from t = 0 so as to maximise W = Σ_{t=0}^{∞} e^{−(ρ−n)t} u(c_t), where ρ is the pure rate of time preference, assuming a constant savings rate and constant productivity growth g = 2%, but allowing climate damages and abatement to feed back on growth. As Fig. 9 shows, the solutions of the three models are very close. After 50 years, the difference between the solutions is at most 0.01 °C (or 1%), while in 100 years' time it is 0.02 °C (or 1.6%).

Figure 9
The optimal path of T in the simplified model with an analytical solution and in the full model with a numerical solution.

E. Maximising welfare subject to the temperature constraint
We add the inequality constraint that S ≤ S̄, where ζS̄ = T̄ and ζ is the TCRE, to the model that has an instantaneous temperature response to emissions. The necessary conditions for a maximum of the current value Lagrangian include an inequality condition on the constraint (≤ 0, holding with equality when S < S̄).
The constrained problem results in a modified differential equation for cumulative emissions. The constraint binds if S̄ < c/b. We define t̄ as the time when the constraint binds. Note that E = 0 at t̄, because the costate variable is required to be continuous. This prevents a discontinuous fall in emissions from taking place at t = t̄. Until t = t̄, the multiplier on the constraint is zero, so the optimal path of cumulative emissions again derives from the general solution (26) to the differential Eq. (24). To find k_1, k_2 and t̄ we have a system of three boundary conditions. The system has an analytical solution using an approximation at t = t̄. The approximation is based on the insight that, at t = t̄, the exponent of the first term is much smaller than unity, while the exponent of the second term is much larger than unity. Solving this system of equations gives k_1, k_2 and t̄.

When damages are ignored and the problem is to meet the constraint S̄ at minimum total discounted abatement cost, Eq. (45) becomes a differential equation in E alone, integration of which allows us to obtain a general solution for cost-effective emissions. On the cost-effective emissions path Ė_ce = aE − c, whereas on the constrained cost-benefit path Ė_cb = aE + bS − c. Since bS is positive, the rate of emissions reduction is faster on the cost-effective path. Because both paths must result in the same cumulative emissions, the cost-effective emissions path must begin with higher emissions, but eventually cross the constrained cost-benefit path and reach zero emissions faster (Proposition 5). Note that for a general damage function, the term bS in the differential equation for Ė_cb is replaced by marginal damage as a proportion of output, Q_S/Q, divided by the slope parameter of the MAC function. Therefore Proposition 5 holds for any damage function that has positive damages over the whole path.
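The cost-effective path can be made concrete by integrating Ė_ce = aE − c. The sketch below assumes that emissions reach zero exactly at t̄ (continuity at the time the constraint binds, as stated above for the cost-benefit path) and that cumulative emissions up to t̄ exhaust a given remaining budget. Under those assumptions E(t) = (c/a)[1 − exp(a(t − t̄))], and t̄ can be found numerically. The values of a, c and the remaining budget are hypothetical.

```python
import math


def ce_emissions(t: float, t_bar: float, a: float, c: float) -> float:
    """Solution of dE/dt = a*E - c with E(t_bar) = 0; zero after t_bar."""
    if t >= t_bar:
        return 0.0
    return (c / a) * (1.0 - math.exp(a * (t - t_bar)))


def ce_cumulative(t_bar: float, a: float, c: float) -> float:
    """Closed-form integral of ce_emissions over [0, t_bar]."""
    return (c / a) * (t_bar - (1.0 - math.exp(-a * t_bar)) / a)


def solve_t_bar(budget: float, a: float, c: float,
                hi: float = 10_000.0, tol: float = 1e-9) -> float:
    """Bisection on t_bar so that cumulative emissions equal the budget.

    ce_cumulative is increasing in t_bar, so bisection is sufficient.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ce_cumulative(mid, a, c) < budget:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


A = 0.013        # hypothetical value of a
C = 0.65         # hypothetical value of c, so that c/a = 50 GtCO2/yr
BUDGET = 1800.0  # hypothetical remaining budget in GtCO2

T_BAR = solve_t_bar(BUDGET, A, C)
print(f"constraint binds after {T_BAR:.0f} years")
for year in (0, 25, 50):
    print(f"t = {year:2d}: emissions {ce_emissions(year, T_BAR, A, C):.1f} GtCO2/yr")
```

On this path emissions start below c/a and fall at an accelerating rate until they hit zero at t̄. Solving the constrained cost-benefit path would additionally require the bS term and the three boundary conditions described above, which this sketch does not attempt.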