LINKING PERMIT MARKETS MULTILATERALLY

We formally study the determinants, magnitude and distribution of efficiency gains generated in multilateral linkages between permit markets. We provide two novel decomposition results for these gains, characterize individual preferences over linking groups and show that our results are largely unaltered with strategic domestic emissions cap selection or when banking and borrowing are allowed. Using the Paris Agreement pledges and power sector emissions data of five countries which all use or considered using both emissions trading and linking, we find linking can generate annual efficiency gains of up to US$3.26 billion, split roughly equally between effort and risk sharing.

« [E]conomic theory suggests [carbon] markets should merge, and over time they probably will merge because of benefits of consolidation, including stability and lower cost. ... [A recent study on linking found] benefits accrued for all parties as the market grew, though some parties benefitted more than others ... Expanding the partnership can be welfare improving in total but it can have distributional effects ... [That is,] different countries may benefit more or less depending on the mix of buyers and sellers who join any given market. »

any linkage group as simple functions of aggregate gains in all its internal bilateral linkages, thereby representing a linkage group as the union of its building blocks.
These decompositions are complementary. The decomposition into effort and risk sharing offers a compact and intuitive interpretation of individual efficiency gains as a function of the expectation and variance of the autarky-linking price difference. Yet, it is unclear prima facie how efficiency gains, especially those due to risk sharing, relate to jurisdictions' characteristics. In this respect, the decomposition into internal bilateral links enables us to easily compute the efficiency gains generated by arbitrary linkage groups, which constitutes a pivotal tool for a quantitative illustration of our model. Additionally, it allows us to tease out and formally analyze the determinants of linkage gains and preferences.
Specifically, to study the efficiency gains from linking ETSs multilaterally and under uncertainty we start from a standard framework featuring permit demand shocks à la Weitzman (1974) and Yohe (1978). Our benchmark model is set up in a static environment where domestic emissions caps are assumed exogenously given and fixed to isolate the efficiency gains from linkage. The benchmark model abstracts from endogenous selection of domestic caps and intertemporal permit trading. We formally analyze the implications of allowing them in two extensions to our benchmark model below and show that our results continue to hold.
Our bilateral decomposition result allows us to rank groups from the perspective of individual jurisdictions and characterize the aggregate gains from the union of disjoint groups analytically. In turn, we emphasize why the conditions for the global market to be the most preferred group universally are unlikely to be satisfied in practice and we show that jurisdictional preferences for smaller linkage groups cannot be aligned without politically unpalatable compensatory monetary transfers. Additionally, we clarify the relationship between autarky and linking permit prices. In line with one's intuition, we show that relative to autarky linkage reduces price volatility on average though not necessarily for each individual entity.
We provide a precise characterization of this effect.
We illustrate the quantitative implications of our model by focusing on all possible linkages across ETSs covering the CO2 emissions from the power sectors of five real-world jurisdictions which all use or have considered both emissions trading and linking. Specifically, we calibrate our model to Australia, Canada, the EU, South Korea and the USA under the assumption that each jurisdiction implements its Paris Agreement pledge. We find that the linkage group which includes all five jurisdictions generates aggregate efficiency gains of $3.26 billion (constant 2005 US$) per annum which are split approximately equally between effort sharing, $1.58 billion, and risk sharing, $1.68 billion. Despite generating the largest aggregate gains, we observe that this linkage group is not the most preferred option unanimously. In fact, it is not the most preferred option for any individual jurisdiction. For instance, the USA would gain the most in a linking group with Australia and Europe. This three-jurisdiction group would also lead to lower price variability than in the group where all jurisdictions are linked.
How are these results altered if jurisdictions anticipate the option of future linking when choosing their domestic emissions caps, or if unrestricted intertemporal permit trade is allowed? First, we endogenize domestic cap selection based on self interest and in anticipation of linking à la Helm (2003). We derive closed-form solutions for the induced strategic and damage welfare impacts from linkage. The signs and magnitudes of these impacts are ambiguous and depend on the modeling structure and parameter distributions, as the subsequent literature, e.g. Carbone et al. (2009) and Gersbach & Winkler (2011), attests. Crucially, they exist independently of the efficiency gains we focus on here, justifying the omission of these effects from the benchmark model. Second, we show in a multi-period setting how the introduction of unrestricted intertemporal permit trading alters, but crucially does not eliminate, the efficiency gains due to linking. In our quantitative illustration we find that allowing for unrestricted intertemporal trading reduces the effort- and risk-sharing gains by about 30% and 60%, respectively.

Throughout we abstract from economic and political costs of linking which could preclude linkages that are otherwise beneficial. For example, large and persistent differences in jurisdictional ambition levels imply that some jurisdictions are net permit buyers in transactions that are mutually beneficial but nonetheless trigger ongoing financial transfers. Both the financial transfers in the buying jurisdictions and the persistently stricter-than-cap emission levels in the selling jurisdictions can face domestic political resistance. In fact, the balance between the efficiency gains and linkage costs may be one reason why some jurisdictions are already linked (e.g. California and Québec) while other links are expected to take a long time to emerge (e.g. the EU and the Chinese national ETS).
In this paper we exclusively study the efficiency gains not because we think economic and political costs are negligible but because the efficiency gains provide a strong incentive for jurisdictions to overcome them.
First and foremost, our paper is related to the literature on the economics of linking which has primarily emphasized three sources of gains from linking agreements, namely price convergence, a cost-effective reallocation of abatement efforts and a reduction of price volatility (Stevens & Rose, 2002; Flachsland et al., 2009; Fankhauser & Hepburn, 2010; Pizer & Yates, 2015; Ranson & Stavins, 2016; Doda & Taschini, 2017; Quemin & de Perthuis, 2018; Rose et al., 2018). Our two decomposition results allow us to formalize and refine these arguments 4 in a multilateral setup under uncertainty. Specifically, we offer a precise characterization of both effort-sharing and risk-sharing gains from linkage, qualifying the results in Newell & Stavins (2003) and Caillaud & Demange (2017), who respectively studied efficiency gains in using market-based instruments relative to command-and-control policies, and linking disjoint ETSs. Additionally, we utilize our bilateral decomposition result to get a better sense of linkage preferences and we further characterize permit price properties.
While our work is framed in the context of linking permit markets, it also relates to the use of efficiency-improving trading ratios within permit markets (Holland & Yates, 2015). It is similar in spirit to the multinational production-location decision studied in de Meza & van der Ploeg (1987) and the choice of decentralization in permit markets analyzed by Yates (2002). Additionally, our results can have implications for interconnections between other types of supply-control programs with transferable licenses (e.g. production or fishery quotas) and international trade (e.g. cross-border electricity trading or energy unions). For instance, our paper formalizes some risk-sharing features attributable to permit transferability that were first highlighted in a more general context by Krishna & Tan (1999). In this respect, it also relates to several recent studies focusing on efficient risk sharing through international finance (Callen et al., 2015) or power interconnections (Antweiler, 2016).
The remainder is organized as follows. Section 2 presents the model and discusses the theoretical results. Section 3.1 provides a qualitative illustration in a three-jurisdiction world. Section 3.2 contains a calibrated quantitative illustration. Section 4 introduces two extensions: endogenous cap selection and intertemporal trading. Section 5 concludes. All numbered tables and figures are provided at the end. There are two appendices dealing with the analytical derivations and proofs (A) and the description of our calibration methodology (B).

Economic environment
To keep the model parsimonious and within the canonical framework, we consider a standard model of competitive markets for emission permits designed to regulate uniformly-mixed pollution in several jurisdictions in the manner of Weitzman (1974) and Yohe (1978), or more recently as in Hoel & Karp (2002), Newell & Pizer (2003, 2008) and Habla & Winkler (2018).
In practice, the canonical framework analyzed in these papers and others implies that we make three assumptions. First, markets for permits and for other goods are separable and do not interact. Second, jurisdictions' benefits from emissions are expressed as quadratic functional forms which can be viewed as local approximations of general specifications and were shown to trace their real-world counterparts well (Klepper & Peterson, 2006; Böhringer et al., 2014). Third, uncertainty is introduced in the form of additive shocks affecting jurisdictions' unregulated emission levels. Our benchmark model is static and takes jurisdictional emissions caps as fixed and independent of the decision to link. In Section 4, we show that our key results continue to hold in two extensions where we (1) endogenize domestic cap selection based on self interest in anticipation of linking à la Helm (2003) and (2) allow for intertemporal permit trading.

Aggregate benefits from emissions in jurisdiction $i \in I$ are a function of the jurisdiction-wide emissions level $q_i \geq 0$ and of the random variable $\theta_i$ such that

$$B_i(q_i; \theta_i) = (\beta_i + \theta_i)\, q_i - \frac{q_i^2}{2\gamma_i}, \tag{1}$$

where the parameters $\beta_i > 0$ and $\gamma_i > 0$ control the intercept and slope of i's linear marginal benefit schedule, respectively. 1 Specifically, the parameter $\gamma_i$ reflects i's abatement technology at the margin, hereafter technology for short. Thus, when comparing two jurisdictions i and j, $\gamma_i > \gamma_j$ means that i has access to a lower-cost abatement technology than j.
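Jurisdiction i's laissez-faire emissions maximize its benefits and are given by

$$\tilde{q}_i = \gamma_i(\beta_i + \theta_i). \tag{2}$$

The shock $\theta_i$ thus affects i's laissez-faire emissions. For analytical convenience and without loss of generality, we assume that shocks are mean-zero with constant variance and that they may be correlated across jurisdictions. Specifically, for any pair (i, j) we let

$$E\{\theta_i\} = 0, \quad V\{\theta_i\} = \sigma_i^2, \quad \mathrm{Cov}\{\theta_i, \theta_j\} = \rho_{ij}\sigma_i\sigma_j \quad \text{with} \quad \rho_{ij} \in [-1, 1]. \tag{3}$$

For instance, $\theta_i > 0$ may reflect a favorable shock that increases i's benefits from emissions and, therefore, its laissez-faire emissions relative to baseline emissions $\bar{q}_i = E\{\tilde{q}_i\} = \gamma_i\beta_i$.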
1 Jurisdiction i's benefits correspond to the aggregate benefits accruing to all firms located within its boundaries. Indeed, covered firms are all united by a uniform price on emissions, which causes their marginal benefits to equalize. By horizontal summation, individual marginal benefit curves can thus be combined into one aggregate marginal benefit curve. Therefore, only the efficiency side of linking is covered here and the intra-jurisdictional distributional aspects are outside the scope of the paper.

Emissions caps
The emissions cap profile $(\omega_i)_{i \in I}$ is exogenous and fixed. Having domestic caps independent of the decision to link anchors the aggregate level of emissions and rules out strategic spillovers. This allows us to (1) have well-defined autarky outcomes that serve as references throughout, (2) isolate the efficiency gains from linkage, and (3) compare these gains across linkages and jurisdictions in a meaningful way. We later relax this assumption in Section 4.1 and discuss its implications. For clarity, we express caps as proportional to technology by an ambition parameter such that

$$\omega_i = \alpha_i\gamma_i \quad \text{with} \quad \alpha_i \in (0, \beta_i), \tag{4}$$

which implies that jurisdictional caps are all, but not equally, stringent relative to baseline. In particular, notice the negative relationship between $\alpha_i$ and the level of ambition implicit in i's cap.

Autarky equilibria
Under autarky, jurisdictions comply with their own caps. We assume that $\theta_i > \alpha_i - \beta_i$ for all i and shock realizations so as to focus on interior autarky equilibria exclusively. That is, there are weak restrictions on individual shocks such that domestic caps are always binding. Specifically, autarky permit prices are positive and read

$$p_i = \bar{p}_i + \theta_i, \tag{5}$$

where $\bar{p}_i = \beta_i - \alpha_i > 0$ denotes i's expected autarky price; notice that $\bar{p}_i$ is lower for jurisdictions with higher $\alpha_i$. 2 First, note that for a positive (resp. negative) shock realization $\theta_i$, i's autarky price is above (resp. below) $\bar{p}_i$. Second, note that when autarky prices differ, whether due to differences in ambition measured by $\bar{p}_i$ or to shock realizations, the aggregate abatement effort is not efficiently allocated among jurisdictions. In particular, cost-efficiency could be improved by shifting some abatement away from relatively high-ambition (resp. high-shock) to low-ambition (resp. low-shock) jurisdictions until autarky price differentials are eliminated.
We now characterize and quantify how linkage performs such a function.
2 Lecuyer & Quirion (2013) and Goodkind & Coggins (2015) provide explicit treatments of corner solutions in related contexts and demonstrate they can be of importance. Appendix B describing the calibration for our quantitative illustration shows that the interior equilibria assumption is innocuous for our quantitative results since $\bar{p}_i > 2\sigma_i$ for all $i \in I$ we study. That is, assuming the shocks are normally distributed, zero-price corners occur with less than 2.5% probability in autarky and, a fortiori, under linkage.

Multilateral linkage and market equilibrium
Let $G \subseteq I$ be a non-empty subset of I. We call G a group and G-linkage the linked permit market between all jurisdictions in group G. An interior G-linkage equilibrium consists of the (|G|+1)-tuple $(p_G, (q_{G,i})_{i \in G})$, where $p_G$ is the equilibrium permit price in the linked market and $q_{G,i}$ denotes jurisdiction i's equilibrium level of emissions. 3 The equilibrium is characterized by the equalization of marginal benefits across jurisdictions in G and market clearing, that is

$$\beta_i + \theta_i - \frac{q_{G,i}}{\gamma_i} = p_G \ \text{ for all } i \in G \quad \text{and} \quad \sum_{i \in G} q_{G,i} = \Omega_G, \tag{6}$$

where $\Omega_G = \sum_{i \in G} \omega_i$ denotes G's cap. Cost-efficiency requires that any jurisdiction abates in proportion to its own technology, i.e. $\tilde{q}_i - q_{G,i} = \gamma_i p_G$. In particular, the G-linkage equilibrium price can be expressed as the technology-weighted average of autarky prices, that is

$$p_G = \Gamma_G^{-1} \sum_{i \in G} \gamma_i p_i, \tag{7}$$

where $\Gamma_G = \sum_{i \in G} \gamma_i$ measures G's technology. Additionally, jurisdictional net permit demands are proportional to technology and the difference between the autarky and prevailing linking prices, that is

$$q_{G,i} - \omega_i = \gamma_i (p_i - p_G). \tag{8}$$

In particular, jurisdiction i is a net permit importer (resp. exporter) under G-linkage provided that $p_i > p_G$ (resp. $p_i < p_G$), i.e. the linking price is lower (resp. higher) than its autarky price. Ceteris paribus, this shows that G-linkage is observationally equivalent to an increase (resp. decrease) in i's effective cap relative to autarky.
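As a numerical illustration of the equilibrium just described, the following Python sketch computes autarky prices, the linking price and net permit demands for a hypothetical three-jurisdiction group. It assumes marginal benefits of the form β_i + θ_i − q_i/γ_i and caps equal to α_i γ_i, which is our reading of the quadratic setup; the function name and all parameter values are illustrative and are not taken from the calibration.

```python
import numpy as np

def linkage_equilibrium(gamma, beta, alpha, theta):
    """Return (autarky prices, linking price, net permit demands, emissions).

    Assumption: marginal benefits are beta_i + theta_i - q_i / gamma_i
    and caps are alpha_i * gamma_i.
    """
    gamma, beta, alpha, theta = (np.asarray(x, dtype=float)
                                 for x in (gamma, beta, alpha, theta))
    p_aut = beta - alpha + theta                     # marginal benefit at the cap
    p_link = np.sum(gamma * p_aut) / np.sum(gamma)   # technology-weighted average
    net_demand = gamma * (p_aut - p_link)            # > 0 for permit importers
    emissions = gamma * alpha + net_demand           # cap plus net imports
    return p_aut, p_link, net_demand, emissions

# Hypothetical parameters for three jurisdictions
gamma = np.array([1.0, 2.0, 3.0])
beta = np.array([10.0, 12.0, 9.0])
alpha = np.array([6.0, 5.0, 4.0])
theta = np.array([0.5, -1.0, 0.2])

p_aut, p_link, net_demand, emissions = linkage_equilibrium(gamma, beta, alpha, theta)
print("autarky prices:", p_aut)
print("linking price:", p_link)
print("net permit demands:", net_demand)
```

In this example the second jurisdiction has the highest autarky price and is therefore the only net importer, while net demands sum to zero so that aggregate emissions stay at the group cap.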

Efficiency gains in multilateral linkages
Because aggregate emissions are invariant, the welfare impacts from linkage only stem from an efficiency improvement. 4 Specifically, the economic efficiency gains accruing to i under G-linkage denoted $\delta_{G,i}$ correspond to the difference between i's benefits under G-linkage (inclusive of proceeds from permit trading in the linked market) and autarky, that is

$$\delta_{G,i} = B_i(q_{G,i}; \theta_i) - p_G\,(q_{G,i} - \omega_i) - B_i(\omega_i; \theta_i) = \frac{\gamma_i}{2}(p_i - p_G)^2. \tag{9}$$

It is a well-known result that with fixed caps, linkage is mutually beneficial, i.e. efficiency gains are always non-negative. We characterize these gains further in the following proposition.
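Proposition 1. Under G-linkage, the expected efficiency gains accruing to jurisdiction $i \in G$ can be decomposed into effort- and risk-sharing gains, namely

$$E\{\delta_{G,i}\} = \underbrace{\frac{\gamma_i}{2}(\bar{p}_i - \bar{p}_G)^2}_{\text{effort sharing}} + \underbrace{\frac{\gamma_i}{2} V\{p_i - p_G\}}_{\text{risk sharing}}, \tag{10}$$

where $\bar{p}_G = E\{p_G\}$ denotes the expected G-linkage price.

Proof. Relegated to Appendix A.1.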
Jurisdiction i's expected efficiency gains from G-linkage are proportional to the expectation of the square of the difference in autarky and G-linkage prices, i.e. the square of the distance in autarky-linking prices. 5 Crucially, efficiency gains can be decomposed into two non-negative components. 6 The effort-sharing component is proportional to the square of the expected autarky-linking price wedge, relates to the intra-group variation in domestic ambition levels (i.e. expected autarky prices) and is independent of the shock structure. Intuitively, the larger this wedge, the larger the gains associated with the equalization of jurisdictional marginal benefits on average. In practice, however, significant disparities in expected autarky prices can compromise the political feasibility of a link for two reasons. First, they imply sizeable, persistent and politically-unpalatable monetary transfers associated with permit flows across jurisdictions. Second, they may connote different preferences in terms of environmental ambition or role of the carbon price signal as a domestic climate policy instrument.
The risk-sharing component is proportional to the variance of the autarky-linking price wedge, relates to jurisdictional and G-wide shock characteristics, and is independent of jurisdictions' ambition levels. 7 That is, provided realized shocks differ across partnering systems, linking induces a strictly positive gain compared to the case without uncertainty, which is a strict Pareto-improvement due to risk pooling. Intuitively, controlling for the intra-group variation in expected autarky prices, the larger the ex-post wedge in autarky and linking prices, the larger the gains due to risk sharing. For instance, all else equal, i will prefer to be in linkage groups where the price happens to be high w.r.t. its expectation when i's (counterfactual) domestic price would have been low w.r.t. its expectation, and vice versa.
Moreover, because the G-linkage price is the technology-weighted average of autarky prices in members of G, all else equal, it is primarily driven by jurisdictions with higher γ's. Similarly, for jurisdictions of similar technology, it is largely determined by those jurisdictions whose permit demand is highly variable. Therefore, only considering the risk-sharing component of gains, one expects that high-γ and high-σ jurisdictions may prefer to link with several jurisdictions to augment their autarky-linking price distances. By contrast, low-σ (resp. low-γ) jurisdictions may prefer to link exclusively with a single low-σ (resp. high-γ) jurisdiction, for otherwise the influence of that jurisdiction on the link outcome is likely to be mitigated. We further discuss the complex dependence of linkage preferences on the correlation coefficients in the next section and illustrate it using a qualitative example in Section 3.1.
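The decomposition into effort- and risk-sharing components can be sketched numerically. The Python snippet below is a minimal illustration, assuming that individual gains equal (γ_i/2)(p_i − p_G)² and that the linking price is the technology-weighted average of autarky prices; the function name `expected_gains` and the two-jurisdiction parameter values are hypothetical.

```python
import numpy as np

def expected_gains(gamma, p_bar, cov):
    """Return per-jurisdiction (effort-sharing, risk-sharing) expected gains.

    Assumption: gains are (gamma_i / 2) * (p_i - p_G)^2, so their expectation
    splits into a squared-mean term and a variance term.
    """
    gamma = np.asarray(gamma, dtype=float)
    p_bar = np.asarray(p_bar, dtype=float)
    cov = np.asarray(cov, dtype=float)      # covariance matrix of the shocks
    w = gamma / gamma.sum()                 # technology weights
    effort = 0.5 * gamma * (p_bar - w @ p_bar) ** 2
    dev = np.eye(len(gamma)) - w            # row i holds e_i - w
    wedge_var = np.einsum("ij,jk,ik->i", dev, cov, dev)  # V{p_i - p_G}
    risk = 0.5 * gamma * wedge_var
    return effort, risk

# Two hypothetical jurisdictions with independent unit-variance shocks
effort, risk = expected_gains([1.0, 1.0], [4.0, 6.0], [[1.0, 0.0], [0.0, 1.0]])
print("effort-sharing gains:", effort)
print("risk-sharing gains:", risk)

# Perfectly correlated shocks of equal variance leave no risk to share
_, risk_identical = expected_gains([1.0, 1.0], [4.0, 6.0], [[1.0, 1.0], [1.0, 1.0]])
print("risk-sharing gains with identical shocks:", risk_identical)
```

The effort-sharing term depends only on expected autarky prices, whereas the risk-sharing term depends only on the shock covariance structure, which mirrors the independence properties discussed above.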

Bilateral decomposition of gains in multilateral linkages
Equation (10) offers a compact and intuitive interpretation of jurisdictional gains in terms of autarky-linking price distance. This clarifies the behavior of the effort-sharing component, but it remains unclear prima facie how the risk-sharing component relates to jurisdictional characteristics. To illuminate this further, we unpack Equation (10) and, to focus momentarily on the determinants of the risk-sharing component, we assume identical ambition across jurisdictions so that autarky-linking price wedges arise only due to shocks, i.e. $p_i - p_G = \theta_i - \hat{\Theta}_G$, where $\hat{\Theta}_G = \Gamma_G^{-1}\sum_{i \in G}\gamma_i\theta_i$ is the technology-weighted average shock. Substituting this into Equation (9) and using the definition of $\hat{\Theta}_G$, we obtain for a bilateral link between i and j

$$\delta_{\{i,j\},i} = \frac{\gamma_i\gamma_j^2}{2(\gamma_i + \gamma_j)^2}(\theta_i - \theta_j)^2. \tag{11}$$

Expanding the above and taking expectations then yields

$$E\{\delta_{\{i,j\},i}\} = \frac{\gamma_i\gamma_j^2}{2(\gamma_i + \gamma_j)^2}\left(\sigma_i^2 + \sigma_j^2 - 2\rho_{ij}\sigma_i\sigma_j\right), \tag{12}$$

so that the aggregate expected gains of the pair amount to $\gamma_i\gamma_j(\sigma_i^2 + \sigma_j^2 - 2\rho_{ij}\sigma_i\sigma_j)/(2(\gamma_i + \gamma_j))$.

Intuitively and as described further in Doda & Taschini (2017), the aggregate risk-sharing gains from {i, j}-linkage are (1) positive as long as jurisdictional shocks are imperfectly correlated or jurisdictional volatility levels differ, for otherwise the two jurisdictions are identical in terms of shock characteristics, (2) increasing in both jurisdictional volatilities and technology parameters, (3) higher the more weakly correlated jurisdictional shocks are, and (4) for a given group's technology, maximal when jurisdictions have identical technology. Additionally, note that aggregate gains are apportioned between jurisdictions in inverse proportion to technology parameters. This is so because, for a given volume of trade, the distance between the autarky and linking prices is greater in the higher-cost technology jurisdiction.
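The properties of bilateral risk-sharing gains listed above can be checked with a few lines of Python. The closed form used here, γ_i γ_j (σ_i² + σ_j² − 2ρ_ij σ_i σ_j) / (2(γ_i + γ_j)), is what the quadratic setup with identical ambition implies under our reading; the function names and the numbers are hypothetical.

```python
def bilateral_risk_gain(gamma_i, gamma_j, sigma_i, sigma_j, rho):
    """Aggregate expected risk-sharing gains in the {i, j} link (assumed closed form)."""
    wedge_var = sigma_i ** 2 + sigma_j ** 2 - 2.0 * rho * sigma_i * sigma_j
    return gamma_i * gamma_j * wedge_var / (2.0 * (gamma_i + gamma_j))

def gain_shares(gamma_i, gamma_j):
    """Shares of the aggregate gain accruing to i and j (inverse to technology)."""
    total = gamma_i + gamma_j
    return gamma_j / total, gamma_i / total

# (1) no gains when shocks are perfectly correlated and equally volatile
print(bilateral_risk_gain(1.0, 1.0, 1.0, 1.0, 1.0))
# (3) gains rise as the correlation falls
print(bilateral_risk_gain(1.0, 1.0, 1.0, 1.0, 0.0),
      bilateral_risk_gain(1.0, 1.0, 1.0, 1.0, -0.5))
# (4) for a given total technology, gains peak at identical technology
print(bilateral_risk_gain(2.0, 2.0, 1.0, 1.0, 0.0),
      bilateral_risk_gain(1.0, 3.0, 1.0, 1.0, 0.0))
# the jurisdiction with the higher-cost technology (lower gamma) gets the larger share
print(gain_shares(1.0, 3.0))
```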
Returning to the general case of any G-linkage, we could pursue a similar approach to compute E{δ G,i } as i's expected gains from a bilateral linkage between i and G\{i}. However, the nature of the entity G\{i} becomes exceedingly complex as the cardinality of G increases.
In this respect, one of our contributions is to recognize that bilateral linkages constitute the building blocks of the multilateral linkage analysis. Specifically, in a given linkage group, we show that it is more convenient to express the associated quantities as a function of the group's internal bilateral linkage quantities. With the tacit convention that ∆ {i,i} = 0 for any i, we can state the following proposition.

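Proposition 2. G-linkage gains (inclusive of both effort- and risk-sharing components) accruing to jurisdiction $i \in G$ can be written as a function of the aggregate gains in all bilateral linkages within G,

$$E\{\delta_{G,i}\} = \Gamma_G^{-2} \sum_{j \in G \setminus \{i\}} \Big[ \Gamma_{G \setminus \{i\}} (\gamma_i + \gamma_j)\, \Delta_{\{i,j\}} - \frac{\gamma_i}{2} \sum_{k \in G \setminus \{i\}} (\gamma_j + \gamma_k)\, \Delta_{\{j,k\}} \Big], \tag{14}$$

where $\Delta_{\{i,j\}}$ denotes the aggregate expected gains from {i, j}-linkage. The number of such internal bilateral links is triangular and equals $\binom{|G|+1}{2}$.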
Proposition 2 helps us tease out jurisdictional linkage preferences. Specifically, jurisdiction i is better off linking with sets of jurisdictions such that on the one hand, the aggregate gains in bilateral links between i and each jurisdiction in these sets are high, and on the other hand, the aggregate gains in bilateral links internal to these sets are low. Referring to the above description of the determinants of the risk-sharing gains in bilateral links, these desirable sets, from the perspective of i, should comprise jurisdictions that are similar to each other, with higher σ and γ than i, and negatively correlated with i. At the extreme and considering only the risk-sharing component of gains, i would ideally like to link with as many replicas of its most preferred bilateral linking partner as possible.
Additionally, summing Equation (14) over all $i \in G$ gives

$$\Delta_G = \sum_{i \in G} E\{\delta_{G,i}\} = (2\Gamma_G)^{-1} \sum_{(i,j) \in G \times G} (\gamma_i + \gamma_j)\, \Delta_{\{i,j\}}. \tag{15}$$

In words, the aggregate G-linkage gains can be written as a technology-weighted sum of all gains from bilateral linkages within G. This decomposition result permits a more practical formulation and quantification of gains generated by an arbitrarily large group. Moreover, it allows us to provide an intuitive description of the efficiency gains in linking disjoint groups of linked jurisdictions. Specifically, let $G' \subset G$ and $G''$ be the complement of $G'$ in G, i.e. $G = G' \cup G''$ and $G' \cap G'' = \emptyset$. Then, we can express the aggregate gains in G as a function of those in $G'$ and $G''$ by unpacking Equation (15), that is

$$\Delta_G = \Gamma_G^{-1} \Big( \Gamma_{G'} \Delta_{G'} + \Gamma_{G''} \Delta_{G''} + \sum_{(i,j) \in G' \times G''} (\gamma_i + \gamma_j)\, \Delta_{\{i,j\}} \Big). \tag{16}$$

Note that the third term in the parenthesis captures the interaction among jurisdictions in $G'$ and $G''$, which is precisely the quantity we want to isolate. To do so, we denote the aggregate gains of merging groups $G'$ and $G''$ by $\Delta_{\{G',G''\}}$ and define them by

$$\Delta_{\{G',G''\}} = \Delta_G - \Delta_{G'} - \Delta_{G''}. \tag{17}$$

With this definition, Appendix A.3 shows that

$$\Delta_{\{G',G''\}} = \frac{\Gamma_{G'}\Gamma_{G''}}{2\Gamma_G}\, E\{(p_{G'} - p_{G''})^2\}, \tag{18}$$

which is non-negative given the mutually beneficial nature of linkage with fixed caps. That is, the aggregate expected gains from the union of disjoint groups is no less than the sum of the separate groups' aggregate expected gains. 8 This implies the standard result that I-linkage (the global market) is the linkage arrangement that is conducive to the highest aggregate cost savings in complying with the aggregate cap $\Omega_I$.
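To make the building-block logic concrete, the Python sketch below computes the aggregate gains of a group in two ways, directly from technology-weighted autarky-linking price wedges and from the group's internal bilateral links, and then evaluates the gain from merging two disjoint groups. It assumes individual gains of (γ_i/2)(p_i − p_G)² and bilateral aggregate gains of γ_i γ_j E{(p_i − p_j)²} / (2(γ_i + γ_j)), which is our reading of the model; all names and numbers are hypothetical.

```python
import itertools
import numpy as np

def pair_gain(i, j, gamma, p_bar, cov):
    """Aggregate expected gains in the bilateral link {i, j} (assumed closed form)."""
    wedge_sq = (p_bar[i] - p_bar[j]) ** 2 + cov[i, i] + cov[j, j] - 2.0 * cov[i, j]
    return gamma[i] * gamma[j] * wedge_sq / (2.0 * (gamma[i] + gamma[j]))

def group_gain_from_pairs(group, gamma, p_bar, cov):
    """Aggregate gains as a technology-weighted sum over internal bilateral links."""
    group = list(group)
    weighted = sum((gamma[i] + gamma[j]) * pair_gain(i, j, gamma, p_bar, cov)
                   for i, j in itertools.combinations(group, 2))
    return weighted / gamma[group].sum()

def group_gain_direct(group, gamma, p_bar, cov):
    """Aggregate gains computed from E{(p_i - p_G)^2} for each member."""
    group = list(group)
    g, pb = gamma[group], p_bar[group]
    sub_cov = cov[np.ix_(group, group)]
    w = g / g.sum()
    dev = np.eye(len(group)) - w            # row i holds e_i - w
    second_moment = (pb - w @ pb) ** 2 + np.einsum("ij,jk,ik->i", dev, sub_cov, dev)
    return 0.5 * np.sum(g * second_moment)

# Hypothetical four-jurisdiction world with correlated shocks
gamma = np.array([1.0, 2.0, 3.0, 1.5])
p_bar = np.array([4.0, 6.0, 5.0, 9.0])
factor = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0, 0.0],
                   [-0.3, 0.2, 1.0, 0.0],
                   [0.1, -0.4, 0.3, 1.0]])
cov = factor @ factor.T                     # a valid covariance matrix

everyone = [0, 1, 2, 3]
via_pairs = group_gain_from_pairs(everyone, gamma, p_bar, cov)
direct = group_gain_direct(everyone, gamma, p_bar, cov)
merging_gain = (via_pairs
                - group_gain_from_pairs([0, 1], gamma, p_bar, cov)
                - group_gain_from_pairs([2, 3], gamma, p_bar, cov))
print(via_pairs, direct, merging_gain)
```

Both routes return the same aggregate, and the merging gain is non-negative, in line with the superadditivity property stated above.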

Risk-sharing and permit price properties under linkage
The G-linkage price $p_G = \bar{p}_G + \hat{\Theta}_G$ is composed of two terms. The former, $\bar{p}_G$, is commensurate with the stringency of the group-wide cap relative to its baseline emissions. It measures the marginal cost of abatement when the group-wide expected abatement effort is allocated cost-efficiently. The latter, $\hat{\Theta}_G = \Gamma_G^{-1}\sum_{i \in G} \gamma_i\theta_i$, quantifies the price impact due to the variability of the stringency of the group's cap relative to laissez-faire emissions that would be consistent with a profile of realized shocks. Indeed, given $(\theta_i)_{i \in G}$, the quantity $\sum_{i \in G} \gamma_i\theta_i$ measures the difference in the group's laissez-faire and baseline emissions. Then, dividing it by the group-wide technology $\Gamma_G$ gives the corresponding price impact.
Next, we characterize the features of linkage in terms of risk-sharing by analyzing the properties of the linking permit price variability. We say that a partition P′ of I is coarser than partition P if P′ can be obtained from P by some sequence of linkages between groups in P.
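With this terminology we can then state the following proposition.

Proposition 3. Linkage reduces permit price volatility on average in groups and partitions, but not necessarily for each of their member jurisdictions. That is,
(a) Linkage diversifies risk since for any group G and partitions (P, P′) with P′ coarser than P,

$$V\{p_G\}^{1/2} \leq \Gamma_G^{-1} \sum_{i \in G} \gamma_i V\{p_i\}^{1/2} \quad \text{and} \quad \sum_{G \in P'} \Gamma_G V\{p_G\}^{1/2} \leq \sum_{G \in P} \Gamma_G V\{p_G\}^{1/2}.$$

(b) Linkage does not necessarily reduce price volatility for every member jurisdiction, nor whenever a group is enlarged. In particular, relative to autarky, linkage always reduces price volatility in higher volatility jurisdictions but may increase it in lower volatility jurisdictions.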

Statement (a) indicates that linkage improves shock absorption and reduces price volatility on average relative to autarky. In a given group, the linking price volatility is smaller than the technology-weighted average of autarky price volatilities. That is, the variability of the group's cap stringency is less than the one implied by its members' individual cap stringencies taken together. Importantly, this property extends to partitions: the coarser a partition, the more diversified the domestic shocks on average. Obviously, on the flip side, linking implies that relative to autarky jurisdictional emission levels are uncertain and contingent on own and linkage partners' shock realizations. This, however, can be desirable as it introduces some responsiveness in domestic caps much like a hybrid instrument does. 9

Although linkage-induced diversification guarantees that price volatility is reduced on average in a group, Statement (b) indicates that (1) enlarging a group does not always imply lower price variability, which would be true only if domestic shocks were independent and (2) not every member jurisdiction necessarily experiences a reduction in price volatility w.r.t. autarky.
On the one hand, relatively volatile jurisdictions always experience reduced price volatility w.r.t. autarky as domestic shocks are spread over a thicker market and thus better cushioned.
On the other hand, because linkage also creates exposure to foreign shocks, relatively stable jurisdictions may face higher volatility w.r.t. autarky. However, we emphasize that linkage is always preferred to autarky even when it leads to higher price volatility domestically, i.e. despite that some jurisdictions might 'import' volatility as a result of the link.
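A two-jurisdiction example illustrates how a stable jurisdiction can 'import' volatility even though volatility falls on average. The Python sketch below assumes that the stochastic part of the linking price is the technology-weighted average of domestic shocks and measures volatility by the standard deviation; the function name and parameter values are hypothetical.

```python
import numpy as np

def linking_price_sd(gamma, sigma, corr):
    """Standard deviation of the linking price (assumed weighted-average shock)."""
    gamma = np.asarray(gamma, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    cov = np.outer(sigma, sigma) * np.asarray(corr, dtype=float)
    w = gamma / gamma.sum()                 # technology weights
    return float(np.sqrt(w @ cov @ w))

# Equal technologies, independent shocks, one stable and one volatile jurisdiction
gamma = np.array([1.0, 1.0])
sigma = np.array([1.0, 3.0])
sd_link = linking_price_sd(gamma, sigma, np.eye(2))
weighted_autarky_sd = float(gamma @ sigma / gamma.sum())

print("linking price volatility:", sd_link)
print("technology-weighted autarky volatility:", weighted_autarky_sd)
```

Here the linking price is less volatile than the technology-weighted average of autarky prices and than the volatile jurisdiction's autarky price, yet more volatile than the stable jurisdiction's autarky price.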

Qualitative illustration
In this section we illustrate our theoretical results in a stylized setup with three jurisdictions i, j and k. Taking jurisdiction i's perspective, we compare its linkage options graphically in

In this case, the 45° line depicts the indifference frontier along which {i, j} and {i, k} generate the same risk-sharing gains for i. Above the frontier i prefers to link with k because k has a lower-cost abatement technology than j does. All else constant, deviations from SUB such as $\sigma_i = \sigma_j < \sigma_k$ or $\rho_{ij} = 0 > \rho_{ik}$ distort the indifference frontier to the dashed curve. These deviations imply that k is i's preferred partner in a larger region of the $\{\gamma_j, \gamma_k\}$-space.
In Panel 1b we revert back to SUB but now allow for the formation of {i, j, k} in addition to the bilateral links just discussed. First, observe that at the point of identical technologies, i prefers {i, j, k} to the bilateral linkages. This is to be expected because with j and k ex ante identical, {i, j, k} is twice as large as the bilateral groups i could form and therefore offers more abatement opportunities ex post. 10 Now note that i's indifference point between {i, j, k} and bilateral linkages (denoted by a diamond) implies γ i < γ j = γ k . Indeed, given the restrictions implicit in SUB, it must be that j and k can each offer sufficiently cheaper abatement opportunities to i to render bilateral linkages at least as rewarding as {i, j, k}.
Finally, it is informative to characterize j and k's linkage preferences in the same {γ j , γ k }-

Quantitative illustration
In this section we explore our model quantitatively by considering linkages between hypothetical ETSs regulating the carbon dioxide emissions from the power sector of five real-world jurisdictions with different levels of ambition. 12 We assume annual compliance without any permit banking and borrowing across compliance periods. This implies that the per-annum

Equipped with these caps and MACCs, we compute the expected autarky permit prices using our model, which range from $27.1/tCO2 in AUS to $113.7/tCO2 in CAN. The annual baselines (q̄_i), emission caps (ω_i) and corresponding expected autarky permit prices (p̄_i) are reported in Table 1, which also contains the linear intercepts (β_i) and technology coefficients (γ_i) we calibrate with a linear interpolation of MACCs in the vicinity of domestic caps. 14

We calibrate the shock properties using the residuals from the regression of historical emissions on time and time squared with data from the International Energy Agency. These shocks capture the net effect of stochastic factors that may influence emissions and their associated benefits, e.g. business cycles, TFP shocks, jurisdiction-specific events, changes in prices of factors of production, weather fluctuations, etc. Table 2 provides the volatility of the autarky permit prices as measured by the coefficient of variation, as well as the pairwise shock correlations implied by our theory. We note that there is large cross-jurisdiction variation in autarky price variability and that there are instances where the correlation between shocks is negative (e.g. KOR and EUR) or effectively zero (e.g. KOR and CAN).

In 5J the aggregate effort-sharing gains amount to $1.58 billion, and those associated with risk sharing are $1.68 billion, totalling $3.26 billion. Risk sharing is the dominant source of gains in all jurisdictions but AUS.
At $1.38 billion AUS's effort-sharing gains account for almost 90% of aggregate effort-sharing gains. This is not surprising because the expected autarky-linking price wedge in AUS is the largest ($27.1 vs $86.5 per tCO2). Conversely, EUR captures the largest risk-sharing gains which amount to $0.62 billion or just over a third of the aggregate risk-sharing gains.

First, observe that 5J is not the group that generates the largest gains for USA. In light of the previous section, we conclude that 5J will therefore not emerge naturally for these five jurisdictions, even though it would generate the largest gains in aggregate. Neither is it the case that 5J delivers the lowest price volatility for USA which obtains in the bilateral link with EUR. In fact, USA permit price volatility may increase relative to its autarky level (horizontal line in the middle panel). However, we emphasize that in our model an increase in permit price volatility relative to autarky does not have any negative implications, which for many jurisdictions in the real world can be an important consideration.

14 The parameter γ_i compounds the productivity of i's abatement technology and i's volume of regulated emissions. As such, comparing the ratios γ_i/q̄_i can give us a sense of the ordering of the volume-adjusted costs of abatement opportunities at the margin in the vicinity of the domestic caps. For instance, Table 1 shows that AUS has the cheapest abatement opportunities whereas the most expensive ones are in EUR.
15 The small squares are an exception, e.g. KOR's effort-sharing gains in 5J, and indicate gains too small to be visible in the graph.

Discussion
Second, there is no monotonic relationship between the magnitude of efficiency gains and the cardinality of a group. For example, adding EUR to {AUS,USA} increases USA's efficiency gains while adding KOR or CAN decreases them. Third, linkage preferences are not aligned. While USA would gain the most from adding EUR to {AUS,USA}, AUS would rather have KOR or CAN join the bilateral group next, as either would benefit AUS more.
Finally, the bottom panel illustrates the large variation in the two components of gains across the groups that include USA, which are ordered so that risk-sharing gains decline along the x-axis. In all groups where AUS is a member, USA enjoys significant effort-sharing gains driven by the large difference in expected autarky prices between AUS and the others. In groups with a greater number of members, USA effort-sharing gains tend to be lower as they are more diluted across jurisdictions relative to {AUS,USA}. Risk-sharing gains also vary significantly across all groups. USA efficiency gains consist almost exclusively of risk-sharing gains in groups that do not include AUS (e.g. {EUR,KOR,USA}), and risk-sharing gains may even exceed effort-sharing gains in groups that do include it (e.g. {AUS,EUR,USA}). These observations underline the need for a model to evaluate the efficiency gains from linking ETSs multilaterally.
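The group comparisons discussed above can be mimicked with a minimal sketch. It assumes that individual gains take the quadratic form \(\gamma_i (p_i - p_G)^2/2\), so that expected gains split into an effort-sharing part driven by the expected price wedge and a risk-sharing part driven by its variance, and that the linking price is the \(\gamma\)-weighted average of autarky prices as in Equation (7). All parameter values and jurisdiction labels are hypothetical and are not taken from Tables 1 and 2.

```python
from itertools import combinations

# Hypothetical three-jurisdiction example; none of these numbers come from the paper.
gamma = {"A": 2.0, "B": 1.0, "C": 3.0}    # flexibility coefficients gamma_i
pbar = {"A": 30.0, "B": 90.0, "C": 60.0}   # expected autarky prices
sigma = {"A": 5.0, "B": 10.0, "C": 8.0}    # std. dev. of autarky price shocks
rho = {("A", "B"): 0.2, ("A", "C"): -0.1, ("B", "C"): 0.5}  # shock correlations


def cov(i, j):
    """Covariance between the price shocks of jurisdictions i and j."""
    if i == j:
        return sigma[i] ** 2
    r = rho[(i, j)] if (i, j) in rho else rho[(j, i)]
    return r * sigma[i] * sigma[j]


def expected_gains(i, group):
    """Return (effort_sharing, risk_sharing) expected gains of member i in group."""
    total = sum(gamma[j] for j in group)
    w = {j: gamma[j] / total for j in group}        # gamma-weights in the linking price
    pbar_link = sum(w[j] * pbar[j] for j in group)  # expected linking price
    var_link = sum(w[j] * w[k] * cov(j, k) for j in group for k in group)
    cov_i_link = sum(w[j] * cov(i, j) for j in group)
    var_wedge = sigma[i] ** 2 - 2.0 * cov_i_link + var_link  # V{p_i - p_G}
    effort = 0.5 * gamma[i] * (pbar[i] - pbar_link) ** 2
    risk = 0.5 * gamma[i] * var_wedge
    return effort, risk


def rank_groups(i):
    """Rank all groups containing i by i's total expected gains, largest first."""
    others = [j for j in gamma if j != i]
    groups = []
    for size in range(1, len(others) + 1):
        for combo in combinations(others, size):
            groups.append((i,) + combo)
    return sorted(groups, key=lambda g: sum(expected_gains(i, g)), reverse=True)


for g in rank_groups("A"):
    effort, risk = expected_gains("A", g)
    print(g, round(effort, 2), round(risk, 2))
```

Running `rank_groups` for each jurisdiction in turn gives a quick way to see whether members agree on their most preferred group under a given parameterization.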

Linking with endogenous cap selection
Our analysis of linkage in Section 2 assumes away strategic cap selection and takes domestic caps as given. This can be justified by reference to the domestic political-economy constraints that emerge from the complex internal negotiation processes which must render the resulting policies acceptable to a host of actors with divergent interests (Flachsland et al., 2009;Marchiori et al., 2017). Deviating from one's cap is therefore costly. However, one may contend that the prospects of inter-jurisdictional permit trading will drive regulators to set their caps in anticipation of linking based on self interest.
In this case, Helm (2003) showed that jurisdictions which expect to be net sellers (resp. buyers) of permits on the linked market have an incentive to inflate (resp. reduce) their caps relative to autarky to maximize their gains from linking. This strategic aspect and attendant shift in aggregate emissions and damages imply additional welfare impacts which in turn could compromise the feasibility of linkage, and autarky may even welfare-dominate linkage.
At a minimum, jurisdictional linkage preferences may be altered. Below we analyze how endogenizing cap selection affects our analysis of linkage.
In what follows, we make the conventional assumption that marginal damages are constant and let \(\eta_i\) denote i's marginal damage (Pizer, 2002; Newell & Pizer, 2003). This assumption is consistent with damages being determined by the global cumulative emissions since the beginning of the Industrial Revolution (Allen et al., 2009; Allen, 2016). Here, it implies that jurisdictional reaction functions are orthogonal. We also assume that domestic caps are selected non-cooperatively with Cournot-Nash conjectural variations. 16 Under autarky, jurisdiction i sets its cap \(\omega_{A,i}\) to maximize its benefits net of damages, which simply yields \(\omega_{A,i} = \gamma_i(\beta_i - \eta_i)\), i.e. \(\bar p_{A,i} = \eta_i\). This reflects the weak form of the international free-riding problem, i.e. the intercepts of the reaction functions imply higher emission levels than in the global optimum, as i does not internalize the negative externality inflicted by its emissions upon others. 17 Socially efficient caps satisfy the Lindahl-Samuelson condition, are lower than the Cournot-Nash ones and imply that all jurisdictions face the same price \(\bar p_I = \sum_{i \in I} \eta_i\) in expectation, which is congruent with a global social cost of carbon (Kotchen, 2018).
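As a minimal numerical sketch of the gap between non-cooperative and socially efficient caps, the snippet below evaluates both for three hypothetical jurisdictions. It assumes that the expected autarky price is \(\beta_i - \omega_i/\gamma_i\), which is the relation implied by the cap formula above; the function names and all parameter values are illustrative only.

```python
# Hypothetical parameters for three jurisdictions (not calibrated values).
gamma = [2.0, 1.0, 3.0]     # flexibility coefficients gamma_i
beta = [100.0, 120.0, 90.0]  # linear intercepts beta_i
eta = [5.0, 10.0, 15.0]      # constant marginal damages eta_i


def cournot_nash_caps(gamma, beta, eta):
    """Autarky caps: each jurisdiction equates its expected price to its own eta_i."""
    return [g * (b - e) for g, b, e in zip(gamma, beta, eta)]


def efficient_caps(gamma, beta, eta):
    """Efficient caps: every jurisdiction faces the sum of marginal damages."""
    social_cost = sum(eta)
    return [g * (b - social_cost) for g, b in zip(gamma, beta)]


nash = cournot_nash_caps(gamma, beta, eta)
efficient = efficient_caps(gamma, beta, eta)
for i, (n, e) in enumerate(zip(nash, efficient)):
    print(f"jurisdiction {i}: Cournot-Nash cap {n:.1f}, efficient cap {e:.1f}")
```

With positive marginal damages in more than one jurisdiction, each efficient cap lies below its Cournot-Nash counterpart, which is the weak form of free riding described in the text.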
Under G-linkage, endogenizing cap selection is congruent with a two-stage game where jurisdictions set their caps at stage one and inter-jurisdictional permit trading occurs at stage two, which is typically solved for subgame-perfect Nash equilibria using backward induction (D'Aspremont et al., 1983; Barrett, 1994; Carraro & Siniscalco, 1993; Helm, 2003). As shown in Appendix A.6, jurisdiction i's cap in anticipation of G-linkage becomes \(\omega_{G,i} = \omega_{A,i} + (\Gamma_G - \gamma_i)(\eta_G - \eta_i)\), where \(\eta_G = \sum_{i \in G} \eta_i/|G|\) is the average marginal damage in G. Under the prospects of forming a linkage group the weak form of the free-riding problem is magnified (resp. mitigated) for relatively low-damage (resp. high-damage) jurisdictions and in turn, inter-jurisdictional permit trading has an ambiguous effect on aggregate pollution relative to autarky since  (2017) show that linkage increases aggregate emissions relative to autarky absent and present trade in other goods, respectively. Using a computable general equilibrium model, Carbone et al. (2009) show that the opposite situation is more likely to occur.
16 If caps are selected cooperatively within a group, the prospects of inter-jurisdictional trading are inconsequential for cap selection (Carbone et al., 2009). Our results would be qualitatively similar under alternative conjectural variations because marginal damages are constant (MacKenzie, 2011; Gelves & McGinty, 2016).
17 Due to the linearity of damages our framework does not capture its strong form, i.e. the crowding-out of domestic abatement efforts (reaction functions are negatively sloped with quadratic damages) which will always be strategic substitutes in a pure emissions game. In the context of an international market for permits, Holtsmark & Midttømme (2015) are able to transform domestic abatement efforts into strategic complements by tying the dynamic emissions game to the dynamics of (investments in) renewables. Caparrós & Péreau (2017) and Heitzig & Kornek (2018) analyze sequential linkage processes with strategic cap selection.
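The cap that jurisdiction i selects in anticipation of G-linkage can be traced with a short first-order-condition argument. The sketch below assumes quadratic benefits with marginal benefit \(\beta_i + \theta_i - q/\gamma_i\) (the form implied by the autarky cap \(\omega_{A,i} = \gamma_i(\beta_i - \eta_i)\)), price-taking behavior at stage two, and regulators who take the caps of others as given. It is meant as a plausibility check and may differ in presentation from the derivation in Appendix A.6.

```latex
\begin{align*}
p_G &= \Gamma_G^{-1}\Big(\sum_{j\in G}\gamma_j(\beta_j+\theta_j)-\sum_{j\in G}\omega_j\Big)
  && \text{stage-two market clearing, so } \partial p_G/\partial\omega_i=-1/\Gamma_G \\
0 &= E\Big\{p_G-\frac{\omega_i-q_{G,i}}{\Gamma_G}\Big\}-\eta_i
  && \text{stage-one first-order condition, using } B_i'(q_{G,i})=p_G \\
\bar p_G &= \frac{1}{|G|}\sum_{j\in G}\eta_j=\eta_G
  && \text{sum over } i\in G \text{; expected net trades sum to zero} \\
\omega_{G,i} &= \gamma_i(\beta_i-\eta_G)+\Gamma_G(\eta_G-\eta_i)
  = \omega_{A,i}+(\Gamma_G-\gamma_i)(\eta_G-\eta_i)
\end{align*}
```

Under these assumptions, expected net sales are \(\omega_{G,i} - E\{q_{G,i}\} = \Gamma_G(\eta_G - \eta_i)\), which is non-negative exactly when \(\eta_i \le \eta_G\), and the change in aggregate caps relative to autarky is \(\sum_{i \in G} \gamma_i(\eta_i - \eta_G)\), whose sign is ambiguous.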
The equilibrium market price under G-linkage with endogenous cap selection reads \(p^*_G = \eta_G + \Gamma_G^{-1} \sum_{i \in G} \gamma_i \theta_i\). (20) Note that the G-linkage prices with fixed and endogenous caps in Equations (7) and (20) are identical up to a shift in their deterministic parts from \(\sum_{i \in G} \gamma_i \bar p_i/\Gamma_G\) to \(\sum_{i \in G} \bar p_{A,i}/|G|\) and that i is a net seller in expectation if and only if \(\eta_i \le \eta_G\). Because endogenous cap selection does not alter price variability, it will a fortiori not affect risk-sharing gains from linkage. Specifically, Helm (2003) shows that with endogenous caps the welfare impacts from linkage can be decomposed into three components, namely the efficiency gains from inter-jurisdictional trading, the strategic effect as measured by the market value of the difference in cap choices under autarky and linking, and the damage effect of changes in aggregate emissions. In the following proposition, we offer a precise analytical characterization of these three components.

Proposition 4. With endogenous cap selection, the expected welfare impacts from G-linkage in jurisdiction i can be decomposed into three components
Proof. Relegated to Appendix A.6.
As with exogenous caps in Proposition 1, efficiency gains have effort-sharing and risk-sharing subcomponents, which are both non-negative. Note that the latter is independent of cap selection which justifies the choice of considering exogenous caps in Section 2. That said, the interplay between the three welfare components is intricate and the latter two effects can be positive or negative. The strategic effect is positive i.f.f. η i < η G while the damage effect is on the stochastic properties of permit prices is also unaltered, but that Proposition 2, which provides an alternative formulation of individual efficiency gains, no longer holds.

Linking with banking and borrowing
Most if not all emissions trading systems allow for some form of intertemporal trading, that is banking issued permits for future compliance or borrowing future permits for present compliance. In Section 2, we abstracted from banking and borrowing when characterizing efficiency gains due to linking. By providing emitters with the opportunity to rearrange emissions over time, intertemporal trading can in principle reduce the price variability under autarky which in turn should shrink the risk pooling potential left over to linkage. In this section, we quantify the size and determinants of the efficiency gains due to linking with unrestricted intertemporal trading. We find that banking and borrowing does not eliminate the efficiency gains due to linking, and in some cases may increase them. The result turns on the persistence of shocks over time, the discount factor and the planning horizon.
For simplicity, we consider a stylized model of unrestricted banking and borrowing, which abstracts from constraints on the amount of permits that can be banked or borrowed. Without loss of generality, we assume that jurisdictions apply the same discount factor λ and that their benefit functions are time invariant. Additionally, given our discussion in the previous section, we revert to exogenous caps and further assume they are constant over time. Allowing for intertemporal trading alters market equilibrium permit prices but crucially not the definition of per-period linkage gains in Equation (14). In a given group G and period t we denote by \(\hat p_{G,t}\) and \(p_{G,t}\) the prices with and without intertemporal trading, respectively. 20 Substituting them into (14) then gives the corresponding linkage gains, which we respectively denote by \(\hat\delta_{i,G,t}\) and \(\delta_{i,G,t}\). That is, allowing for intertemporal trading alters, but does not neutralize, the efficiency gains from inter-jurisdictional trading.
Below, we characterize \(\hat\delta_{i,G,t}\) as well as the ordering of \(\hat\delta_{i,G,t}\) and \(\delta_{i,G,t}\).
Consider two adjacent time periods t and t+1. We let \(\theta_{i,t}\) and \(\theta_{i,t+1}\) denote the corresponding shocks in jurisdiction i and assume that unconditional expectations are normalized to zero, i.e. \(E\{\theta_{i,t}\} = E\{\theta_{i,t+1}\} = 0\). To specify the expectation of \(\theta_{i,t+1}\) conditional on \(\theta_{i,t}\) we assume that the joint distribution of \((\theta_{i,t}, \theta_{i,t+1})\) follows a standard AR(1) process. That is, using the shorthand notation \(E_t\{\cdot\}\) to denote expectation conditional on all information available at time t, \(E_t\{\theta_{i,t+1}\} = \phi_i \theta_{i,t}\), where \(\phi_i\) is the persistence parameter.
Under autarky, the permit price in jurisdiction i in period t without intertemporal trading is simply given by Equation (5), i.e. \(p_{i,t} = \bar p_i + \theta_{i,t}\). With intertemporal trading, the price, which we denote by \(\hat p_{i,t}\), satisfies the standard no-arbitrage condition with discounting and uncertainty (Samuelson, 1971; Schennach, 2000) \(\hat p_{i,t} = \lambda E_t\{\hat p_{i,t+1}\}\). (22) That is, the discounted permit price is a martingale. Additionally, invoking the tower rule, Equation (22) can be chained over time with any given horizon of length h ∈ N yielding \(\hat p_{i,t} = \lambda^h E_t\{\hat p_{i,t+h}\}\). (23) We can solve for the period-t equilibrium price with intertemporal trading over the horizon h with Equation (23) and overall market closure at date t + h yielding \(\hat p_{i,t} = \frac{h+1}{\Lambda} \bar p_i + \frac{\Phi_i}{\Lambda} \theta_{i,t}\), (24) where \(\Phi_i = \sum_{z=0}^{h} \phi_i^z\) and \(\Lambda = \sum_{z=0}^{h} \lambda^{-z}\), and which reduces to \(\hat p_{i,t} = p_{i,t}\) only when \(\phi_i = 1\) and λ = 1. When λ < 1, the deterministic part of \(\hat p_{i,t}\) is smaller than that of \(p_{i,t}\) due to temporal effort sharing. In practice, some abatement is postponed because (h + 1)/Λ decreases with h and \(\lambda^{-1}\). 21 Not surprisingly, intertemporal trading reduces price variability because (3) the lower the discount factor, the less marked the price impact of shocks today as firms prefer to pass on more of the shocks to future periods. Only in the limit as h → ∞ or λ → 0 is price variability nil and the intertemporally tradable quantity instrument closely mimics the outcomes of a price instrument. 23 That is, unrestricted banking and borrowing alone cannot in general absorb all contemporaneous price variability.
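The factors \((h+1)/\Lambda\) and \(\Phi_i/\Lambda\) that govern the price with intertemporal trading can be traced with the following sketch. It assumes linear marginal benefits \(\beta_i + \theta_{i,t} - q/\gamma_i\), constant caps, no bank inherited at date t, and market closure imposed in expectation over the horizon; the hat notation for the price with intertemporal trading is an illustrative choice. It conveys the intuition only, and Appendix A.7 remains the reference for the formal argument.

```latex
\begin{align*}
E_t\{\hat p_{i,t+z}\} &= \lambda^{-z}\,\hat p_{i,t}, \qquad
E_t\{\theta_{i,t+z}\} = \phi_i^{z}\,\theta_{i,t}, \qquad z=0,\dots,h \\
q_{i,t+z} &= \gamma_i\big(\beta_i+\theta_{i,t+z}-\hat p_{i,t+z}\big) \\
\sum_{z=0}^{h} E_t\{q_{i,t+z}\} &= (h+1)\,\omega_i \\
\gamma_i\big[(h+1)\beta_i+\Phi_i\,\theta_{i,t}-\Lambda\,\hat p_{i,t}\big] &= (h+1)\,\omega_i \\
\hat p_{i,t} &= \frac{h+1}{\Lambda}\,\bar p_i+\frac{\Phi_i}{\Lambda}\,\theta_{i,t},
\qquad \bar p_i=\beta_i-\omega_i/\gamma_i
\end{align*}
```

The first line combines the martingale property of discounted prices with the AR(1) shock process, and the last line follows by solving the expected market-closure condition for the period-t price.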
24 Similarly, the G-linkage equilibrium price in period t without intertemporal trading is given by Equation (7), i.e. \(p_{G,t} = \bar p_G + \sum_{i \in G} \gamma_i \theta_{i,t}/\Gamma_G\), whereas with intertemporal trading it reads \(\hat p_{G,t} = \frac{h+1}{\Lambda} \bar p_G + \sum_{i \in G} \gamma_i \Phi_i \theta_{i,t}/(\Lambda \Gamma_G)\). (26) This implies that the static analysis of efficiency gains in Section 2 remains valid with intertemporal trading if shocks are rescaled by \(\Phi_i/\Lambda\), and expected prices by \((h+1)/\Lambda\), to account for optimal, unlimited banking and borrowing. We can then state the following proposition.

Proposition 5. With unrestricted intertemporal trading over a finite time horizon of length h, the efficiency gains due to G-linkage accruing to jurisdiction i in any period t amount to
21 Similarly to linking, intertemporal trading generates effort- and risk-sharing gains. In a deterministic setting, temporal vs. spatial effort-sharing gains have been analyzed, see e.g. Stevens & Rose (2002).
22 This is not mathematically precise but conveys the core intuition. See Appendix A.7 for details.
23 In a seminal paper comparing price and quantity instruments for stock pollutants, Newell & Pizer (2003) hint at this result (see their footnote 7) but do not develop it formally as their analysis abstracts from banking and borrowing. Extensions with intertemporal trading further quantify this result, see equation 9 in Fell et al. (2012), or Newell et al. (2005) and Pizer & Prest (2016) for a particular focus on policy updating.
24 Moreover, comparing Equations (7) and (24) shows that, from the perspective of i, intertemporal trading is observationally equivalent to linking with h uncorrelated replicas of i whose individual shocks are given by \(\{\theta_{i,t} \sum_{s=0}^{z} \phi_i^s / \sum_{s=0}^{z} \lambda^{-s}\}_{z=1,\dots,h}\). In other words, we quantify how time periods and jurisdictions are observationally equivalent 'divisions' in pollution permit markets as first analyzed in Yates (2002).

Proof. Relegated to Appendix A.7.
Proposition 5 extends Proposition 1 to a dynamic setup where, in addition to linking, unrestricted intertemporal trading within horizon h is allowed. Although effort-sharing gains due to linking always decline when intertemporal trading is allowed, risk-sharing gains can decrease or increase, resulting in non-negative efficiency gains which may be lower or higher than in the case with no intertemporal trading. As further discussed in Appendix A.7, the ordering of \(E\{\hat\delta_{i,G,t}\}\) and \(E\{\delta_{i,G,t}\}\), the expected gains with and without intertemporal trading, depends on the complex interaction between the time horizon h, the discount factor λ and the shock properties \(\{\sigma_i, \rho_{ij}, \phi_i\}_{i,j \in G}\).
In our quantitative example, we argue that the efficiency gains due to linking are attenuated but not eliminated when intertemporal permit trading is allowed. To that end, we first note that when λ < 1 and shocks are similarly persistent across jurisdictions, i.e. \(\phi_i \approx \phi < 1\) for all i ∈ G, efficiency gains are always attenuated by intertemporal trading and the ratios of effort- and risk-sharing gains with and without intertemporal permit trading are given by and producers typically hedge production up to three years ahead, so h = 3 seems a reasonable first-pass value. 25 Finally, we take λ = 0.9 for the discount factor. Plugging these values into Equation (28) we find that intertemporal trading eats away about 30% and 60% of the effort- and risk-sharing gains presented in Section 3.2, respectively. 26 Notwithstanding the attenuation in efficiency gains, we note that under unrestricted banking and borrowing, Proposition 2 is unaltered and Proposition 3 holds up to the shock rescaling above.
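The order of magnitude of this attenuation can be checked with a minimal sketch. It assumes that, with a common persistence parameter, effort-sharing gains scale by \(((h+1)/\Lambda)^2\) (as stated in Appendix A.7) and risk-sharing gains by \((\Phi/\Lambda)^2\) (the square of the shock rescaling factor). The values h = 3 and λ = 0.9 are those used in the text, whereas the persistence value 0.8 is purely hypothetical.

```python
def discount_sum(lam, h):
    """Lambda = sum over z = 0..h of lam**(-z)."""
    return sum(lam ** (-z) for z in range(h + 1))


def persistence_sum(phi, h):
    """Phi = sum over z = 0..h of phi**z."""
    return sum(phi ** z for z in range(h + 1))


def attenuation_ratios(h, lam, phi):
    """Ratios of gains with vs. without intertemporal trading (effort, risk)."""
    big_lambda = discount_sum(lam, h)
    effort_ratio = ((h + 1) / big_lambda) ** 2
    risk_ratio = (persistence_sum(phi, h) / big_lambda) ** 2
    return effort_ratio, risk_ratio


# h and lam follow the text; phi = 0.8 is an assumed, illustrative persistence.
effort_ratio, risk_ratio = attenuation_ratios(h=3, lam=0.9, phi=0.8)
print(f"effort-sharing gains lost: {1 - effort_ratio:.0%}")
print(f"risk-sharing gains lost:   {1 - risk_ratio:.0%}")
```

With h = 3 and λ = 0.9 the effort-sharing loss comes out at roughly 28%, in line with the constant-cap figure quoted in footnote 26; the risk-sharing loss depends on the assumed persistence.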
Finally, we highlight some of the differences between the stylized theory of intertemporal trading just analyzed and how it operates in practice. First, our theory assumes unrestricted intertemporal trading. In reality, borrowing is almost never authorized and banking can be limited, either by regulation via holding limits or due to firm-level internal or managerial constraints. Fell et al. (2012) show that these constraints matter: as soon as they are expected to bind, banking offers little flexibility in smoothing out shocks. 27 Second, observed price dynamics in the EU ETS and elsewhere suggest that banking strategies by firms are not optimal, which might inter alia be caused by regulatory uncertainty (Salant, 2016; Fuss et al., 2018).
25 See e.g. Neuhoff et al. (2012) and Schopp et al. (2015) and references therein in the case of the EU ETS. This 3-year hedging is typically incomplete as producers keep opportunities open to exploit changes through time, which means efficiency gains are likely to be reduced by less than what Equation (28) measures.
26 In practice, caps are declining over time, typically at a rate of 2% per annum. When this is the case, intertemporal trading implies that effort-sharing gains decrease by 32% relative to 28% with constant caps.
Third, firms may be rationally short-sighted for hedging purposes or because they are poorly informed about future supply and demand conditions (Neuhoff et al., 2012;Schopp et al., 2015;Quemin & Trotignon, 2019). Last but not least, some risk may be jurisdiction-specific and so not diversifiable using intertemporal trading. Although a more comprehensive treatment of these considerations is in order, they can be thought of as impinging on intertemporal trading opportunities, de facto leaving more scope for inter-jurisdictional trading.

Conclusion
In this paper we advance the frontier of research on the integration of permit markets by proposing a general model to describe and analyze multilaterally-linked ETSs formally. In our model, efficiency gains and permit prices in any linkage group are well-defined objects and we study their analytical properties. First, we identify the two independent components which constitute the efficiency gains in any multilateral linkage, namely the effort-and risk-sharing components. The former is determined by the inter-jurisdictional variation in domestic ambition levels and the latter is driven by the nature of the uncertainty affecting the demand for permits in individual jurisdictions. Second, we decompose any multilateral linkage into its internal bilateral linkages. That is, we characterize aggregate and individual gains in any linkage group as a weighted average of the aggregate gains in all bilateral links that can be formed among its constituents. This decomposition formula is a practical tool to compute the gains generated in arbitrary linkage groups. It further allows us to rank groups from the perspective of individual jurisdictions and characterize the aggregate gains from the union of disjoint groups analytically. Third, we clarify the relationship between autarky and linking prices and show that relative to autarky, linkage reduces price volatility on average but not necessarily for individual entities. Finally, we show that our key findings hold when domestic caps are selected strategically or when unrestricted intertemporal trading is allowed.
In other words, risk-sharing gains from linkage are independent of cap selection and remain substantial even when banking and borrowing is permitted.
Linkages between ETSs have an important role to play in the successful, cost-effective implementation of the Paris Agreement. A quantitative application calibrated to five jurisdictions with similar levels of development and which all use, or have considered, both emissions trading and linking, illustrates that our model can be used to gauge the magnitude and analyze the distribution of efficiency gains from linkage. Specifically, we calibrate our model to the power sector CO2 emissions of Australia, Canada, the EU, South Korea and the USA under the assumption that each jurisdiction implements its Paris Agreement pledges. In the five-jurisdiction linkage the aggregate effort-sharing gains amount to $1.58 billion (constant 2005US$) and risk-sharing gains are $1.68 billion, totalling $3.26 billion per annum relative to autarky. This provides evidence on the practical relevance of our theoretical findings and shows how our model can readily be used for policy-oriented applications.

Tables

Table 1: Annual baseline emissions (\(\bar q_i\), 10^6 tCO2) and annual emissions caps (\(\omega_i\), 10^6 tCO2) obtained from Enerdata. Calculated expected autarky permit prices (\(\bar p_i\), 2005US$/tCO2), calibrated flexibility coefficients (\(\gamma_i\), 10^3 (tCO2)^2/2005US$), linear intercepts (\(\beta_i\), 2005US$/tCO2) and ambition coefficients (\(\alpha_i = \omega_i/\gamma_i\), 2005US$/tCO2) obtained using Enerdata data.

A.1 Proof of Proposition 1 (effort- and risk-sharing gains)
Recalling the definition of i's efficiency gains from G-linkage in Equation (9), we have where the third and fifth equalities obtain via the first-order condition in Equation (6) and the net permit demand in Equation (8), respectively. Taking expectations and observing We propose an alternative interpretation of efficiency gains in terms of reduction in emissions control costs. Letã i =q i −ω i > 0 and ∆B i denote i's domestic abatement level and associated foregone benefits due to compliance with i's binding cap under autarky, respectively. That where the last equality follows from ω i =q i −ã i andq i = γ i (β i + θ i ). By convexity of ∆B i , Jensen's inequality implies that an increase in uncertainty about laissez-faire emissions (and corresponding cap stringency) raises the expected policy costs under autarky. Specifically, because θ i is mean-zero, these autarky costs can be decomposed as where the first term measures costs under certainty, which are proportional to i's ambition level, and the second term captures the increase in costs due to uncertainty, which is proportional to the shock variance. By the same token, the aggregate expected policy costs under In words, given caps, linkage induces a cost-efficient reduction in the group's expected policy costs by (1) spreading the expected aggregate abatement effort in proportion to each member's technology and (2) improving the absorption of shocks within the linked system. Hence the effort-and risk-sharing gains.
It is useful to note that the two following identities hold true Using these identities and rearranging the sums in Equation (A.5), we obtain that Regrouping terms by bilateral linkages, Equation (A.9) rewrites (A.10) By symmetry, i.e. ∆ {i,j} = ∆ {j,i} , Equation (A.10) finally coincides with Equation (15).
As a side note, because variance is a symmetric bilinear operator, it holds that Intuitively, although it is clear that I = arg max G⊆I E{∆ G }, there is no reason that forming larger groups reduces volatility of gains and a fortiori that I = arg min G⊆I V{∆ G }.

A.3 Proof of Equation (17)
With G and G in I such that G ⊂ G and with G = G\G , expanding Equation (15) gives (A.12) The aggregate gains from linking G and G are ∆ By transposing Equation (13a) from two singletons to two groups, it holds that

A.4 Proof of Proposition 3 (linking price properties)
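For any G ⊆ I, first note that price volatilities satisfy \(V\{p_G\}^{1/2} \le \Gamma_G^{-1} \sum_{i \in G} \gamma_i V\{p_i\}^{1/2}\) with a strict inequality provided that there exists \((i, j) \in G^2\) such that \(\rho_{ij} < 1\). Indeed, \(V\{p_G\} = \Gamma_G^{-2} \sum_{(i,j) \in G^2} \gamma_i \gamma_j \rho_{ij} \sigma_i \sigma_j \le \big(\Gamma_G^{-1} \sum_{i \in G} \gamma_i \sigma_i\big)^2\). (A.15) Note that we have a similar inequality for price variances. Indeed, it jointly holds that and observe that the inequality holds strictly when there exists \((i, j) \in G^2\) such that \(\rho_{ij} < 1\) and/or \(\sigma_i \ne \sigma_j\).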
It suffices to establish the rest of Statement (a) for a unitary linkage since the proof extends to a more general case by transitivity over the relevant sequence of unitary linkages. Thus, let P = {G 1 , . . . , G z } and assume w.l.o.g. that P = {G 1 ∪ G 2 , G 3 , . . . , G z }. Then, it holds that The Cauchy-Schwarz inequality gives |Cov{p G 1 ; p G 2 }| ≤ V{p G 1 } 1/2 V{p G 2 } 1/2 and concludes.
We now turn to Statement (b). Note that it is sufficient to verify the claim on jurisdictional price variability as a result of linkage for bilateral links -the argument naturally extends to multilateral links. Then, by applying Equation (A.16a) to {i, j}-linkage it holds that For a given triple (σ i , σ j , ρ ij ), {i, j}-linkage effectively reduces volatility in the low-volatility jurisdiction provided that the high-volatility jurisdiction's γ is relatively not too large.
Finally, to establish the claim on price convergence in probability, we let G be ordered such that \(\gamma_1 \le \dots \le \gamma_m\), and denote \(\bar\sigma = \max_{i \in G} \sigma_i\). Fix ε > 0. Then, it holds that where the first inequality is Chebyshev's inequality and the second follows by construction.
Since \(\gamma_m\) and \(\bar\sigma\) are finite, only when the second term in the above bracket is nil (i.e. shocks are independent) does it hold that \(p_G\) converges in probability towards \(\bar p_G\) as |G| tends to infinity, that is \(\lim_{m \to +\infty} P(|\Theta_G - E\{\Theta_G\}| > \varepsilon) = 0\), i.e. \(\lim_{m \to +\infty} P(|\Theta_G - E\{\Theta_G\}| \le \varepsilon) = 1\).
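The volatility bound and the role of shock independence in the convergence claim can be illustrated numerically. The sketch below computes the standard deviation of the linking price as the \(\gamma\)-weighted combination of correlated autarky shocks and compares it with the \(\gamma\)-weighted average of autarky volatilities. The identical-jurisdiction setup, the helper names and all numbers are hypothetical.

```python
import math


def equicorrelated(m, r):
    """m-by-m correlation matrix with common off-diagonal correlation r."""
    return [[1.0 if i == j else r for j in range(m)] for i in range(m)]


def linked_price_std(gamma, sigma, rho):
    """Std. dev. of the gamma-weighted average of autarky price shocks."""
    n = len(gamma)
    total = sum(gamma)
    var = sum(
        gamma[i] * gamma[j] * rho[i][j] * sigma[i] * sigma[j]
        for i in range(n)
        for j in range(n)
    ) / total ** 2
    return math.sqrt(var)


def weighted_autarky_std(gamma, sigma):
    """Gamma-weighted average of autarky price volatilities (the upper bound)."""
    return sum(g * s for g, s in zip(gamma, sigma)) / sum(gamma)


# Identical hypothetical jurisdictions: volatility vanishes only without correlation.
for m in (2, 10, 100):
    g, s = [1.0] * m, [10.0] * m
    independent = linked_price_std(g, s, equicorrelated(m, 0.0))
    correlated = linked_price_std(g, s, equicorrelated(m, 0.25))
    print(m, round(independent, 3), round(correlated, 3), weighted_autarky_std(g, s))
```

In this symmetric example the linking price volatility shrinks with group size when shocks are uncorrelated, but stays bounded away from zero under a positive common correlation.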

A.5 Proof of the non-alignment of preferences
We prove the following claim: Without inter-jurisdictional monetary transfers, jurisdictional linkage preferences are not aligned in the sense that (a) I-linkage may or may not be the most preferred linkage group for all jurisdictions in I; (b) any G ⊂ I cannot be the most preferred linkage group for all jurisdictions in G.
Fix G ⊂ I. Let G ⊃ G be a proper superset of G and denote by G = G ∩ G the complement of G in G. By way of contradiction, assume that E{δ G ,i } ≥ E{δ G,i } holds for all i ∈ G , with at least one inequality holding strictly. By summation over (A.23) and contradicts superadditivity, which requires the above expression to be non-negative.
That is, G cannot be the most weakly preferred linkage coalition for all jurisdictions thereof.

A.6 Proof of Proposition 4 (endogenous cap selection)
Let D i denote i's damage function with M D i = η i constant and positive. For any partition P of I we let Ω −i P = j∈I\{i} ω P,j where ω P,j is j's cap given P. Let also A = {{1}, . . . , {n}} denote complete autarky. The autarkic Cournot-Nash caps satisfy, for all i ∈ I By identification with Equations (4) and (5) we find jurisdictional ambition parameters and expected autarky permit prices to be α Jurisdictional regulators can anticipate linkage when selecting their caps. This situation is congruent with a two-stage game where regulators set caps at stage one and permit trading between linked markets occurs at stage two. We solve this game using backward induction and focus on subgame perfect Nash equilibria. Fix a partition P of I. Crucially, because reaction functions are orthogonal, individual cap-setting decisions in any G ∈ P will only be affected by the perspective of G-linkage but not by what happens outside G.

Stage 2: Inter-jurisdictional permit trading and emissions choices.
Take any G ∈ P. Given cap and realized shock profiles (ω i ) i∈G and (θ i ) i∈G , Equations (6) and (7) respectively give the equilibrium emission level in i q * G,i and permit price p * from linkage into efficiency gains from inter-jurisdictional permit trading, strategic effect due to domestic cap selection in anticipation of linkage, and damage effect, that is After standard computations, we find each of these components to be worth

A.7 Proof of Proposition 5 (intertemporal trading)
We take the perspective of a group which might be degenerate, i.e. a single jurisdiction. The group-wide shock and benefit parameters (linear intercept and slope) obtain by horizontal summation of the individual marginal benefit schedules. In the following, we drop the group index for clarity and without loss of generality.
We adopt a dynamic programming approach and assume that time t runs in [1; T ] where T is the date at which the problem effectively ends. Let b t ≷ 0 denote the volume of the permit bank at time t (a negative bank corresponds to borrowing) with b 0 = 0 and b T ≥ 0. At each time t the group emits q t = ω t + b t−1 − b t and faces the recursive optimization problem where b t is the control variable to simplify taking derivatives, and λ denotes the discount factor. The first-order condition reads and the envelope theorem yields so that we obtain the standard result that the discounted equilibrium marginal benefit (i.e. the discounted permit price) is a martingale via the no-arbitrage condition under uncertainty Let h = T − t denote the number of future periods at time t. The period-t equilibrium price with intertemporal trading p t obtains through chaining the optimal law of motion across two adjacent periods in Equation (A.37) over the remaining h periods Alternatively, the series of equilibrium prices {p t } t can be obtained recursively through backward induction.
To simplify computations and without loss of generality, we assume that β t = β t+1 , γ t = γ t+1 and ω t = ω t+1 for all t. Then, solving Equation (A.38) with period-T market clearing gives Equation (24) after some rearrangement. Next, Equation (26) follows thanks to the linearity of both the group-wide shock in the individual shocks and the group's expected price in the individual expected prices. All our results can be extended to incorporate time varying caps and benefit functions, as well as heterogeneity in discounting.
Equation (27) obtains by computing \(\hat\delta_{i,G,t}\), the period-t gains with intertemporal trading, and taking expectation. Comparing Equations (10) and (27), we have \(E\{\hat\delta_{i,G,t}\} = E\{\delta_{i,G,t}\}\) only when h = 0 or when \(\phi_i = 1\) for all i ∈ G and λ = 1. When h ≥ 1, it is typically the case that \(E\{\hat\delta_{i,G,t}\} \ne E\{\delta_{i,G,t}\}\) and their ordering will depend on the values of the jurisdiction-specific persistence parameters, the common discount factor and the length of the time horizon. We note that when h → ∞ or λ → 0, intertemporal permit trade attenuates the efficiency gains due to linking towards zero, i.e. \(E\{\hat\delta_{i,G,t}\} \to 0\).
In particular, given an arbitrary λ < 1 there exists a threshold value of h above which \(E\{\hat\delta_{i,G,t}\} < E\{\delta_{i,G,t}\}\) holds unambiguously. When h is small the ordering of \(E\{\hat\delta_{i,G,t}\}\) and \(E\{\delta_{i,G,t}\}\) depends on the complex interaction between h and shock properties \(\{\sigma_i, \rho_{ij}, \phi_i\}_{i,j \in G}\). We explore this analytically below in the case of a bilateral link.
The aggregate and jurisdictional risk-sharing gains due to {i, j}-linkage are proportional to with or without intertemporal trading, respectively, where Φ i = h z=0 ϕ z i and Λ = h z=0 λ −z .
Let prime denote the partial derivative w.r.t. ϕ i , then we have . In general, RS ≷ 0 which crucially depends on the behavior of and the interaction between the series Φ i , Φ j , and Φ i . The former, Φ i , is equal to h + 1 when ϕ i = 1; positive and increasing with ϕ i and h when ϕ i ∈ (0; 1); equal to 1 when ϕ i = 0; non-monotonic in ϕ i and h but non-negative for ϕ i ∈ (−1; 0); alternating between 0 and 1 when ϕ i = −1. Its partial derivative, Φ i , is positive and increasing with ϕ i and h when ϕ i > 0; equal to 1 for all h when ϕ i = 0; non-monotonic in ϕ i and h but non-negative for ϕ i ∈ [−0.5; 0); of alternate sign (negative for h ≥ 3 and odd) below some threshold h value and then always positive when ϕ i ∈ (−1; −0.5); of alternate sign (negative for all h odd) when ϕ i = −1. In the two-period no-discount case h = λ = 1, (A.40) simplifies Similarly, it is not straightforward to compare RS and RS. Indeed, The behavior of Φ 2 i is like that of Φ i . The series Λ is equal to h + 1 at most when λ = 1 for all h; increasing with h; decreasing with λ. By contrast, it is straightforward to show that allowing for intertemporal trading always reduces the effort-sharing gains from linkage by a fraction ((h + 1)/Λ) 2 . For instance, in the two-period no-discount case h = λ = 1 with ϕ i = 1, effort sharing is unaltered and (A.41) simplifies to RS ≥ RS ⇔ ρ ij (3 + ϕ j )σ i ≥ σ j .
Finally, we clarify the mathematical statement in footnote 22 by specifying the behavior of the ratio \(\Phi_i/\Lambda\). It is equal to 1 when \(\phi_i = 1\) and λ = 1; increasing with \(\phi_i\) and λ and decreasing with h when \(\phi_i \ge 0\); essentially increasing with \(\phi_i\) and λ and decreasing with h when \(\phi_i < 0\), although it may be locally non-monotonic for small h values and large λ values.

B Calibration methodology
This appendix describes the calibration of jurisdictional annual emission caps (\(\omega_i\)), baseline emissions (\(\bar q_i\)), volume-adjusted technologies (\(\gamma_i\)) and linear intercepts (\(\beta_i\)) based on proprietary data we obtained from Enerdata; and the calibration of price shock volatilities (\(\sigma_i\)), the pairwise correlations across jurisdictions (\(\rho_{ij}\)) and the AR(1) shock persistences (\(\phi_i\)) based on IEA data on historical power sector emissions. In our quantitative illustration we focus on five jurisdictions with similar levels of development and which all use, or have considered using, both emissions trading and linking: Australia (AUS), Canada (CAN), the European Union (EUR), South Korea (KOR) and the United States (USA).
We obtained annual emission caps and MACCs of the power sectors from Enerdata. First, Enerdata models emission caps consistent with three possible scenarios. The Ener-Brown scenario describes a world with durably low fossil fuel energy prices. The Ener-Blue scenario provides an outlook of energy systems based on the achievement of the 2030 targets defined in the NDCs as announced at COP 21. The Ener-Green scenario explores the implications of more stringent energy and climate policies to limit the global temperature increase to around 1.5-2 °C by the end of the century. We selected the scenario with annual emission caps consistent with the Paris INDCs (Ener-Blue scenario).
Second, Enerdata also generates MACCs and annual emission baselines using the Prospective Outlook on Long-term Energy Systems (POLES) model. MACCs are available for four time periods (2025, 2030, 2035 and 2040). We selected the emission baselines and MACCs available for 2030. Using these annual caps and MACCs, we compute the expected autarky permit prices, which range from 27.1 $/tCO2 in AUS to 113.7 $/tCO2 in CAN. All monetary quantities are expressed in constant 2005 US$. A linear interpolation of the MACCs around domestic caps gives the linear intercept β i and the inverse of the slope γ i , reported in Table 1.
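The interpolation step can be illustrated with a minimal sketch. It assumes the MACC is tabulated as (emissions, marginal cost) pairs and that marginal cost falls strictly as emissions rise; the function name `linearize_macc` and the numbers in the usage note are hypothetical, and the exact interpolation procedure applied to the Enerdata curves is not specified in the text.

```python
def linearize_macc(points, cap):
    """Linearize a tabulated MACC around the cap (illustrative sketch only).

    points: (emissions, marginal_cost) pairs; assumed strictly decreasing in cost.
    cap:    domestic emissions cap, in the same units as emissions.
    Returns (beta, gamma, price), where the local line is
    marginal_cost = beta - emissions / gamma.
    """
    pts = sorted(points)
    # Find the two adjacent tabulated points that bracket the cap.
    for (q_lo, p_lo), (q_hi, p_hi) in zip(pts, pts[1:]):
        if q_lo <= cap <= q_hi:
            slope = (p_hi - p_lo) / (q_hi - q_lo)  # negative by assumption
            gamma = -1.0 / slope                   # inverse of the slope
            beta = p_lo + q_lo / gamma             # linear intercept
            price = beta - cap / gamma             # expected autarky price at the cap
            return beta, gamma, price
    raise ValueError("cap lies outside the tabulated MACC range")
```

With the made-up points (80, 60), (100, 40), (120, 20) and a cap of 110, the sketch returns β = 140, γ = 1 and a price of 30, so the implied baseline γβ = 140 is the emissions level at which the linearized price reaches zero.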
The shock characteristics are calibrated using historical time series of CO2 emissions from the jurisdictional power sectors. We obtain annual data covering 1972-2015 from the International Energy Agency. We denote observed emissions from jurisdiction i in year t by e i,t .
We identify historical emission levels with laissez-faire emissions, i.e. we assume that no or relatively lax regulations on CO 2 emissions were in place prior to 2015.
In Equation (2), laissez-faire emissions q̃ i comprise a constant term, the baseline q̄ i = γ i β i , and a variable term, q̃ i − q̄ i = γ i θ i . Assuming the latter is small enough relative to the former, we obtain the following linear Taylor approximation for the natural logarithm of laissez-faire emissions: ln(q̃ i ) ≈ ln(q̄ i ) + (q̃ i − q̄ i )/q̄ i . (B.1) We associate the variable term in (B.1) with the residual from the regression of ln(e i,t ) on time and the square of time. In other words, we use log-quadratic detrending to decompose ln(e i,t ) into trend and cyclical components (Uribe & Schmitt-Grohé, 2017). This is consistent with our interpretation of variations in marginal benefits of emissions as being driven by business cycles, TFP shocks, changes in the prices of factors of production, jurisdiction-specific events, weather fluctuations, etc.
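As an illustration of the detrending step, the following sketch regresses the log of an emissions series on a constant, time and time squared, and returns the residuals. It is a minimal standard-library implementation under our own assumptions: the function names are hypothetical, time is rescaled to [0, 1] purely for numerical conditioning (which leaves the residuals unchanged), and the series must contain at least three positive observations.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def log_quadratic_residuals(emissions):
    """Residuals from an OLS regression of ln(e_t) on 1, t and t^2."""
    n = len(emissions)
    y = [math.log(e) for e in emissions]
    # Rescaled time index; spans the same regressors as raw t and t^2.
    rows = [[1.0, t / (n - 1), (t / (n - 1)) ** 2] for t in range(n)]
    # Normal equations (X'X) b = X'y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    coef = solve_linear_system(xtx, xty)
    return [yi - sum(c * x for c, x in zip(coef, r)) for r, yi in zip(rows, y)]
```

Because the regression includes a constant, the returned residuals sum to zero up to rounding error, which is the sense in which the cyclical component is mean zero by construction.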
Specifically, we denote the residuals from this regression by ε i,t . To calibrate shock characteristics, we assume that the {ε i,t }'s provide information about the distributions of the underlying shocks θ i . Then, given our modeling framework and the approximation in (B.1), ε i,t is related to a draw from the distribution of θ i by ε i,t = γ i θ i /q̄ i = θ i /β i . (B.2) Note that the {ε i,t }'s are mean zero by construction. We compute the standard deviation of θ i consistent with the model using σ i = β i σ(ε i,t ), (B.3) and the standard deviation of domestic laissez-faire power-sector emissions is simply obtained by rescaling, γ i σ i . Table 3 below reports the standard deviations of autarky permit prices (σ i ) and the normalized standard deviations of laissez-faire emissions (σ(ε i,t ) = γ i σ i /q̄ i ). The table also includes the estimated persistence parameter ϕ i when an AR(1) model is fitted to {ε i,t }. We used the estimated ϕ i 's to argue for the validity of the rule of thumb in Equation (28) discussed in the intertemporal trading extension in Section 4.2. Note that price shock variabilities are roughly such that p̄ i > 2σ i and β i > p̄ G + 2V{Θ G }^(1/2) for any jurisdiction i and any possible group in our sample, i.e. zero-price and zero-emissions corners can safely be neglected. 29 Therefore, our focus on interior autarky and linking market equilibria is of negligible consequence for our analysis of linkage gains.
Finally, we calibrate the pairwise correlation between shocks in i and j using ρ ij = Corr(β i ε i,t , β j ε j,t ). (B.4) The ρ ij 's, reported in Table 2, can be positive, negative or approximately zero.
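The mapping from detrended residuals to shock characteristics can be sketched as follows. The function names are hypothetical, the sample (n − 1) standard deviation and the no-intercept least-squares AR(1) estimator are our own assumptions about estimation details the text does not spell out, and the inputs in the usage note are made-up numbers rather than calibrated values.

```python
import math

def sample_std(x):
    """Sample standard deviation with the n - 1 divisor (an assumption)."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (len(x) - 1))

def correlation(x, y):
    """Pearson correlation between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def price_shock_volatility(residuals, beta):
    """sigma_i: standard deviation of the implied shocks beta_i * residual."""
    return beta * sample_std(residuals)

def ar1_persistence(residuals):
    """phi_i: least-squares slope of residual_t on residual_{t-1}, no intercept."""
    num = sum(a * b for a, b in zip(residuals[1:], residuals[:-1]))
    den = sum(b * b for b in residuals[:-1])
    return num / den

def shock_correlation(res_i, beta_i, res_j, beta_j):
    """rho_ij: correlation between the implied shocks of jurisdictions i and j."""
    shocks_i = [beta_i * e for e in res_i]
    shocks_j = [beta_j * e for e in res_j]
    return correlation(shocks_i, shocks_j)
```

For example, calling `price_shock_volatility(res, 100.0)` on residuals with a standard deviation of 0.03 gives a price volatility of 3 in the units of β. Since the intercepts β i are positive, scaling by them leaves the correlation unchanged, so ρ ij coincides with the correlation of the raw residuals.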
29 Note that a sufficient condition for the second type of inequalities to hold is β i > p̄ i + 2σ i for all i.

We also note that this large variation in inter-jurisdictional correlation is to be expected.
To see why, note that emissions of jurisdictions whose economies are tightly interconnected through trade and financial flows will likely move together, especially if jurisdictions' emissions are procyclical. If the economic links between jurisdictions are weak and/or they are geographically distant, one would expect a low level of correlation. Finally, if a jurisdiction's business cycles are negatively correlated with those of others, observing negative correlations in emissions fluctuations would not be surprising either. These conjectures are consistent with empirical studies such as Calderón et al. (2007), which provide evidence on international business cycle synchronization and trade intensity, and Doda (2014), which analyzes the business cycle properties of emissions. In addition, Burtraw et al. (2013) suggest that demand for permits may be negatively correlated over space due to exogenous weather shocks.
We highlight the following three points regarding our calibration strategy and results. First, we assume that the pair characteristics are not affected by the recent introduction of climate change policies. Some emitters in some of the jurisdictions in our sample are regulated under these policies. We argue that any possible effect would be limited because these policies have not been particularly stringent, affect only a portion of each jurisdiction's emissions, and do so only in the last few years of our sample. Second, we use the log-quadratic filter to decompose the observed emissions series into its trend and cyclical components. Not surprisingly, the calibrated shock characteristics change quantitatively when we instead use the band-pass filter recommended by Baxter & King (1999), the random walk band-pass filter recommended by Christiano & Fitzgerald (2003) or the Hodrick-Prescott filter as detrending procedures. However, our conclusions are qualitatively similar, so we restrict our attention to the simple and transparent log-quadratic detrending. Third, we take the calibrated ρ ij 's at face value in our computations rather than setting insignificant correlations to zero; doing the latter does not alter the results in a meaningful way.