“No fences make bad neighbors” but markets make better ones: cap-and-trade reduces cross-border SO2 in a natural experiment

The Clean Air Interstate Rule (CAIR) was a regional cap-and-trade program announced in 2005 which covered 27 eastern US states and sought to reduce sulfur dioxide emissions from coal-fired power plants. The rule was later vacated after a court found that the non-targeted design of the program did not comply with the Clean Air Act provision to regulate interstate air pollution. Using a custom air pollution dispersion model, I calculate the interstate SO2 pollution from 493 coal-fired power plants across the United States between 1997 and 2020. In a difference-in-differences setup with plants not covered by CAIR in the control group, I estimate the treatment effect of the program on overall and cross-border SO2 emissions. I find that the program reduced overall emissions by 24% and reduced the risk that a plant violates air quality standards across state borders by 2–4%. I report evidence of heterogeneous treatment effects where the reduction in overall emissions attributed to CAIR is lower among plants transporting SO2 in excess of 1% of the National Ambient Air Quality Standards to another state.


Introduction
A key consideration in any attempt at regulating air pollution is its ability to effortlessly cross administrative and legal boundaries. A comprehensive theory of cross-border externalities was proposed as early as Montgomery (1972), who showed that the abatement effort mandated by the regulator ought to be higher for upwind sources that contribute to ambient pollution in downwind receptor regions. Indeed,

Background
The Clean Air Act is the United States' primary federal law to reduce nationwide air pollution. Initially enacted in 1963, the law, henceforth CAA, has been praised as a success of early U.S. environmental policy, for example in terms of health outcomes (Chay and Greenstone 2003). A collection of major amendments to the law came into force in 1990 and included tradeable permits in NOx and SO2. A cap-and-trade system under Title IV of the CAA, also known as the Acid Rain Program (ARP), regulates acidifying pollutants, mainly from coal-burning power plants, by allocating permits to emitters and allowing reallocation via auction to improve economic efficiency (McCubbin 2009). Allowances under Title IV are regulated by the Environmental Protection Agency (EPA) under §7408(a) of the Clean Air Act. The Acid Rain Program has involved two phases, beginning in 1995 and 2000 respectively. Title IV also requires sources to install a continuous emission monitoring system (CEMS) and annually report emissions to the EPA and state regulators (Ellerman et al. 2000).

In Phase I, individual emissions limits were assigned to the 263 most SO2-intensive generating units at 110 plants operated by 61 electric utilities and located largely at coal-fired power plants east of the Mississippi River. After January 1, 1995, these utilities could emit sulfur dioxide only if they had adequate allowances to cover their emissions.
During Phase I, the EPA allocated each affected unit, on an annual basis, a specified number of allowances related to its share of heat input during the baseline period (1985-1987). By Phase II, almost all coal-fired power plants were covered by the system. If trading permits represents a carrot in the system, the stick is a penalty of $2000 per ton of emissions that exceed any year's allowances and a requirement that such excesses be offset the following year (Stavins 2003). The program is largely considered successful: it is estimated that between 1990 and 2008, the majority of reductions in U.S. air pollution was due to changes in environmental regulation (Shapiro and Walker 2018).
The federal CAA regulates individual states' emissions via the National Ambient Air Quality Standards (NAAQS), under which states are responsible for maintaining caps on ambient concentrations of air pollutants. The NAAQS for SO2 is 75 ppb, measured as the 99th percentile of 1-h daily maximum concentration, averaged over three years. The EPA requires that individual states submit so-called State Implementation Plans (SIPs) detailing how they will comply with the national standards for each pollutant set under §7408 (Potoski 2001). Building on the success of the Acid Rain Program, the EPA in 2005 introduced the Clean Air Interstate Rule (CAIR), which mandated that states and the federal government work together to address regional pollution.
Constructed upon the previous pollution credit programs in the ARP, CAIR created a regional trading program to reduce interstate pollution (Pleune 2006). The EPA determined which states would participate in the regional program based on a determination of "significant contribution" to nonattainment of NAAQS for a downwind state (Glasgow and Zhao 2017). However, individual plants were not designated as being at high or low risk of making significant contributions, and no such designation exists yet. The 1990 amendments to the CAA also added provisions specifically to combat externalities due to spatial diffusion of air pollutants. This "Good Neighbor" provision states that an "upwind" state may be ruled in violation of Title IV if pollutants from point sources move to "downwind" states in such quantities that they impede the ability of the downwind state to meet its allowances under §7408 and its implementation plans (Gerrish 2020; McCubbin 2009). Although EPA found that out-of-state sources would cause non-attainment in 2010 (the States' deadline under the CAA for reaching attainment), EPA determined that it would not be feasible to reduce the out-of-state emissions by that time.
Instead, CAIR required the reduction to be implemented in two phases. States would implement the first phase of reductions by 2009 for NOx or 2010 for SO2. A second set of reductions would bring the level of out-of-state contributions to air quality non-attainment to an acceptable level by 2015. After a downwind state has filed a complaint of a Good Neighbor violation under Section 126, EPA has 60 days to respond. If EPA determines action is necessary, the upwind state must address the emissions in its SIP, effectively reducing the permits its emitters are allowed to use. Failure to do so could, if the Good Neighbor provision is enforced, make the violating firm liable to pay the $2000 penalty per excess tonne of SO2. Since Title IV does not allow plants to borrow permits from future allocations (Schennach 2000), plants in the upwind state must either invest in abatement or buy permits at auction.

The collapse of CAIR: North Carolina v. EPA
An additional event on the timeline of interstate SO2 regulation is of particular note. In the 2008 case North Carolina v. EPA, the D.C. Circuit Court of Appeals ruled in favor of the state and a number of electric utilities, finding that CAIR had a number of flaws and that, because the EPA had adopted it as one integral action, the rule in its entirety must be vacated and remanded to the EPA. The court's opinion was that CAIR could not properly respect the "Good Neighbor" provision requiring sources to take responsibility for their contribution to nonattainment of NAAQS in the downwind state. One flaw found by the court was in CAIR's trading programs for SO2, which it said essentially amounted to a "regionwide approach" which failed to prohibit sources "within the State from contribut[ing] significantly to nonattainment in any other State…" (Kruse 2009) because sources could purchase enough SO2 allowances to cover current emissions, resulting in no change (Tait 2009).
The result of the cap-and-trade system, North Carolina and a number of downwind power companies argued, is that downwind states and firms can do very little in terms of policy to address nonattainment of NAAQS, if significant contributions to ambient air pollution come from out-of-state sources that can buy permits to make up the difference. As summarized in Kruse (2009), the D.C. Circuit decided that the CAIR trading program went beyond the mandate of the CAA because the regional program did not address sources from one specific state contributing to nonattainment in another specific state.
In 2011, the Obama administration announced the Cross-State Air Pollution Rule (CSAPR) which replaced CAIR in 2015 and involves the same eastern states. CSAPR attempted to address the legal issues in CAIR by allowing only within-state trade in permits (Chan et al. 2012). As of 2021, there have been several Section 126 petitions: Between 2016 and 2018, Connecticut, Delaware, Maryland, and New York each petitioned EPA to regulate pollution from an upwind state. EPA denied all four petitions. Delaware, Maryland, and New York challenged those denials in court. In 2020, the D.C. Circuit denied Delaware's petition, granted Maryland's petition in part, and vacated EPA's denial of New York's petition (returning the petition to EPA for reconsideration) (Gerrish 2020). The unwillingness of the federal regulator to grant Section 126 petitions may be interpreted by emitters as a signal that violations are unlikely to be proven (Harstad and Eskeland 2010) (Fig. 1).

Theoretical foundation
This article contributes to an ongoing empirical literature on the effectiveness of cap-and-trade programs (Barreca et al. 2021; Chan and Morrow 2019; Glasgow and Zhao 2017) by focusing on the less studied aspect of cross-border pollution (Chen et al. 2022) and by combining causal inference with geophysical modelling. In formulating an initial hypothesis, and throughout the remainder of this article, I make limited assumptions about the way firms respond to changes in the expected cost of polluting the air.
The natural experiment takes place in an economy with one environmental regulator and many polluting power plants. Plants are distributed across several regions, each with administrative borders and responsibility for maintaining limits on pollution set by the regulator. In the standard cap-and-trade model, and in the absence of interstate pollution rules, the regulator determines ambient air quality standards according to its own evaluations of the social damage function, then introduces an emissions cap to achieve the ambient standards. Once the EPA allocated emission permits to coal-fired power plants and allowed trading in permits between plants, it effectively introduced a market price for SO2 emissions (Montgomery 1972; Xepapadeas 1992a). The regulator does not know the firm's abatement cost and so initial allowances were not allocated based on the marginal abatement cost but on its share of heat input (Stavins 2003). To enforce compliance, the Clean Air Act allows the EPA to impose a fine of $2000 per tonne of SO2 in excess of the cap. When the representative firm is a price-taker on the permit market, it chooses its abatement level such that its marginal abatement cost equals the market price for permits. This result is trivial, as when the permit price exceeds the marginal abatement cost the firm would rather abate another tonne of emissions, and vice versa. When the emissions cap is reduced as anticipated by CAIR-states following its announcement in 2005, average abatement costs rise and with them the permit price. Irrespective of its compliance status, the firm will stop investing in abatement once the marginal abatement cost equals the market price of permits. This is because the marginal abatement cost rises with the abatement effort, while the price for permits does not depend on the individual firm's choices (Stranlund and Chavez 2000).
On the issue of market power in the permit market, Hintermann (2017) shows that price manipulation by dominant firms primarily results in pass-through of abatement costs onto consumers and taxpayers.
Overall, a reduction in the emissions cap is still expected to increase the price for permits. Accordingly, granted only the assumption that the threat of penalties for noncompliance with the CAIR caps is credible, we make the following proposition:
Proposition I: An increase (decrease) in the market price of permits results in a decrease (increase) in average emissions across power plants.
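The price-taking firm's decision behind Proposition I can be illustrated with a minimal sketch. It assumes a hypothetical linear marginal abatement cost curve, MAC(a) = mac_slope × a, which is not taken from this article; the function names and all numerical values are illustrative only.

```python
def optimal_abatement(permit_price, mac_slope, baseline_emissions):
    """Abatement (tonnes) at which marginal abatement cost equals the permit price.

    Assumes a hypothetical linear curve MAC(a) = mac_slope * a.
    """
    if mac_slope <= 0:
        raise ValueError("mac_slope must be positive")
    # Interior solution: MAC(a*) = permit_price  =>  a* = permit_price / mac_slope
    a_star = permit_price / mac_slope
    # A plant cannot abate more than it emits.
    return min(a_star, baseline_emissions)


def emissions(permit_price, mac_slope, baseline_emissions):
    """Emissions remaining after the firm abates until MAC equals the permit price."""
    return baseline_emissions - optimal_abatement(
        permit_price, mac_slope, baseline_emissions
    )


if __name__ == "__main__":
    # Higher permit prices lead to lower emissions under these assumptions.
    for price in (100, 200, 400):
        print(price, emissions(price, mac_slope=0.5, baseline_emissions=1000))
```

Under this assumed cost curve, emissions fall monotonically as the permit price rises until the plant has abated everything it emits.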

The firm's response to cross-border pollution
Suppose now that a firm in state Upwind, due to exogenous geographic conditions, exports some share δ ∈ [0, 1] of its pollution to state Downwind. Supported by historical accounts of the CAIR period (Glasgow and Zhao 2017; Schmalensee and Stavins 2013), we assume that states did not reliably penalize emissions from sources outside their borders. In this setting, theory predicts that as a larger share of pollution is transported out of Upwind (δ tends towards 1), a higher permit price is required for the Upwind firm to switch from permits to abatement. This is because in the event of noncompliance of amount ∆ tonnes above its allocated emission cap, the firm only expects to be penalized for (1 − δ)∆. Granted the assumption that externalities affecting downwind states do not affect the enforcement of State Implementation Plans, we state the second proposition:
Proposition II: The firm does not expect to be fined for emissions exceeding its allowances if the excessive pollution is transported out of the state in which it operates.
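As a worked illustration of this reasoning, the sketch below computes the fine a firm would expect when only the in-state share of its excess emissions is penalized. The $2000 per tonne figure comes from the text; the function name and the example values of δ and ∆ are hypothetical.

```python
PENALTY_PER_TONNE = 2000  # USD per excess tonne of SO2, as stated in the text


def expected_penalty(excess_tonnes, delta):
    """Expected fine when a share `delta` of excess emissions leaves the state.

    Only the in-state share (1 - delta) of the excess is assumed to be penalized.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return PENALTY_PER_TONNE * (1.0 - delta) * excess_tonnes


if __name__ == "__main__":
    # The expected fine shrinks to zero as delta tends towards 1.
    for delta in (0.0, 0.25, 1.0):
        print(delta, expected_penalty(100, delta))
```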

Gaussian dispersion modelling
To quantify downwind SO2 dispersion from each coal-fired power plant I develop GAUSSMOD, a three-dimensional Gaussian dispersion model, in Python 3.6. The Gaussian model is one of the simplest dispersion models for point-source air pollutants. The plume dispersion equations featuring Gaussian distributed dispersion were first derived in Sutton (1947) and have become increasingly popular. With the advent of stringent environmental control regulations, there was an immense growth in the use of air pollutant plume dispersion calculations between the late 1960s and today (Zannetti 2013).
Gaussian models are popular because they are mathematically tractable, easy to implement, and rely on widely available data. They offer advantages over simple trajectories used in e.g. Heo et al. (2023) because they allow for estimation of cross-border concentrations (Fig. 2).
In this paper, I implement the Gaussian model from Abdel-Rahman (2008) and U.S. EPA (1989) and apply it to SO2 emissions. In the plume dispersion equation (Eq. 1), $f = \exp\left(-y^2 / 2\sigma_y^2\right)$ is the crosswind dispersion parameter, and a corresponding term gives the vertical dispersion. C is the concentration of emissions, in g/m³, at any receptor located x meters downwind from the emission source, y meters crosswind from the emission plume centerline, and z meters above ground level. $\sigma_y$ is the horizontal standard deviation of emissions dispersion, while $\sigma_z$ is the standard deviation in the vertical. $\sigma_y$ and $\sigma_z$ are functions of the atmospheric stability class (i.e. a measure of the turbulence in the ambient atmosphere) and of the downwind distance to the receptor. The two most important variables affecting the degree of pollutant emission dispersion obtained are the height of the emission source point and the degree of atmospheric turbulence.
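For orientation, the following is a minimal sketch of the textbook Gaussian plume equation with reflection at the ground. It is not the GAUSSMOD source code: the emission rate Q, the wind speed u, and the exact form of the vertical term are assumptions based on the standard formulation, and the function name is hypothetical.

```python
import math


def plume_concentration(Q, u, y, z, H_e, sigma_y, sigma_z):
    """Concentration (g/m^3) from the textbook Gaussian plume equation.

    Q: emission rate (g/s); u: wind speed (m/s); y, z: crosswind and vertical
    receptor coordinates (m); H_e: effective plume centerline height (m);
    sigma_y, sigma_z: dispersion standard deviations (m) evaluated at the
    receptor's downwind distance.
    """
    # Crosswind dispersion term f, as defined in the text.
    f = math.exp(-y**2 / (2.0 * sigma_y**2))
    # Vertical term (assumed form): direct plume plus its reflection at the ground.
    g = (math.exp(-(z - H_e) ** 2 / (2.0 * sigma_z**2))
         + math.exp(-(z + H_e) ** 2 / (2.0 * sigma_z**2)))
    norm_y = sigma_y * math.sqrt(2.0 * math.pi)
    norm_z = sigma_z * math.sqrt(2.0 * math.pi)
    return (Q / u) * (f / norm_y) * (g / norm_z)
```

In this form the concentration scales linearly with the emission rate Q and, for a receptor near the ground, falls as the effective height H_e increases.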
The more turbulence, the better the degree of dispersion. For a description of the six stability classes A-F used in this model, which depend on wind speed and cloud cover, see Pasquill (1961). In the equations for $\sigma_y$ and $\sigma_z$ (Eq. 2), I, J, and K are coefficients that depend on the stability class at the stack location (Seinfeld and Pandis 2016, Ch. 18). Equation 2 shows that both crosswind dispersion and vertical dispersion are functions of distance downwind from the pollution source, with lower concentration in both dimensions further from the smokestack. Equation 1 also shows that the concentration at ground level can be reduced by increasing the height of the smokestack H.
The effective height $H_e$ of the smoke centerline is the sum of the stack height and the plume rise at a given distance x from the smokestack. The plume rise is determined by the downwind horizontal distance from the stack and the buoyancy factor, which describes the upward force exerted by the gas on the air above (Beychok 2005). The buoyancy factor F depends on $T_g - T_a$, the temperature difference between the exit gas and the surrounding air. Because hot gases rise faster, a large temperature gradient between the sulfur dioxide and ambient air will allow the pollutant to rise higher before the temperatures equalize and wind speed and direction dominate as drivers of plume trajectories. A high gas exit velocity has the same effect (Beychok 2005). The model uses the plume rise equation from Briggs (1982), where the plume rise is $\Delta h = 1.6\,F^{1/3} x^{2/3} u^{-1}$, with $u$ the wind speed, and thus the effective stack height is $H_e = H_s + \Delta h$ (Eq. 3; Table 1).
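The sketch below strings these pieces together under stated assumptions. The log-quadratic form for the dispersion parameters and the expression for the buoyancy factor are common textbook versions and are assumed here rather than taken from this article; the plume rise divides by the wind speed u, as in the standard Briggs formula. All function names are hypothetical, and the coefficients I, J, and K must be supplied by the user for the relevant stability class.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def sigma(x, I, J, K):
    """Dispersion standard deviation at downwind distance x.

    Assumed textbook form: sigma = exp(I + J*ln(x) + K*ln(x)^2), with I, J, K
    depending on the stability class.
    """
    ln_x = math.log(x)
    return math.exp(I + J * ln_x + K * ln_x**2)


def buoyancy_flux(v_s, r, T_g, T_a):
    """Buoyancy factor F from exit velocity v_s (m/s), stack radius r (m),
    exit gas temperature T_g (K) and ambient temperature T_a (K).

    This specific expression is an assumption (a common Briggs-type form).
    """
    return G * v_s * r**2 * (T_g - T_a) / T_g


def plume_rise(F, x, u):
    """Plume rise 1.6 * F^(1/3) * x^(2/3) / u at downwind distance x (m)."""
    if F <= 0:
        return 0.0  # no buoyant rise when the exit gas is not warmer than the air
    return 1.6 * F ** (1.0 / 3.0) * x ** (2.0 / 3.0) / u


def effective_height(H_s, F, x, u):
    """Effective stack height: physical stack height plus plume rise."""
    return H_s + plume_rise(F, x, u)
```

A hotter exit gas or a higher exit velocity raises F and therefore the effective height, while stronger wind lowers the plume rise.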

Data
The raw data used in this article are exclusively from publicly available sources. Replication code and documentation, including the source code for the dispersion model GAUSSMOD, are made available as supplementary material. Hourly data on wind speed, wind direction, ambient temperature and cloud cover were obtained from the Global Historical Climatology Network (Menne et al. 2012).
The hourly 30-year normals dataset includes 1991-2020 averages for every hour, totalling 8760 h. After incomplete time series had been removed, complete records remained for 423 weather stations across the continental United States. The normals are constructed from hourly observations, and quality assurance checks are routinely applied to the full dataset, although Menne et al. (2012) acknowledge that the data are not homogenized to account for artifacts associated with the various eras in reporting practice at any station (i.e., for changes in systematic bias). Hourly data were aggregated into 12-h daytime (07.00-18.59) and night-time (19.00-06.59) averages.
Normals in wind direction, speed and cloud cover over a 30-year period were used because they are the most indicative of hourly variation in these variables across any given year (Arguez et al. 2012). To account for climate trends, observed air temperature daily time series were used instead of normals following Leppert et al. (2021). Daytime temperature was calculated as a weighted average of maximum and minimum temperatures (0.75 * TMAX + 0.25 * TMIN) and night-time as 0.25 * TMAX + 0.75 * TMIN.
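The day and night aggregation described above can be written down directly. The weights and the hour boundaries come from the text; the function names are hypothetical.

```python
def is_daytime(hour):
    """True for hours in the 07.00-18.59 daytime window (hour given as 0-23)."""
    return 7 <= hour <= 18


def day_night_temperature(tmax, tmin):
    """Daytime and night-time temperature as weighted averages of TMAX and TMIN."""
    daytime = 0.75 * tmax + 0.25 * tmin
    night_time = 0.25 * tmax + 0.75 * tmin
    return daytime, night_time
```

For example, a day with TMAX = 20 and TMIN = 10 yields a daytime temperature of 17.5 and a night-time temperature of 12.5 in the same units.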
Data on plant characteristics were obtained from the U.S. Energy Information Administration which publishes data collected from all coal-fired power utilities in annual EIA-767 and EIA-923 surveys. The surveys include data on net generation, heat input, stack height, stack radius, mean exit gas velocity, and mean exit gas temperature. The environmental compliance form also provides self-reported plant-level spending on flue gas desulfurization (FGD). While self-reports come with the usual caveats, EIA form data have been used in previous research on coal-fired utilities' emissions accounting (Quick 2014) and remain the most comprehensive publicly available reports.
Data on annual SO2 emissions and permit holdings for coal-fired power plants across the CAIR/CSAPR region were collected from the Air Markets Program data supplied by the U.S. EPA. Plant-level emissions data are available from the conception of the Acid Rain Program in 1995 through to today and include values from firms' own reports as well as EPA monitoring. Utility codes are consistent across EIA and EPA datasets and allow me to track individual utilities through changes in the surveys over the years.
Emissions, net generation, and operational flue gas desulfurization spending (filters, scrubs, sorbent, and labor) have missing entries as completed surveys are not received by the EIA for every utility in every year. There is a small discontinuity in 2007 when the EIA-923 form superseded the EIA-906, EIA-920, FERC-423 and EIA-423. This change improved coverage. Schedule 2 of the EIA-923 collects the plant level fuel receipts and cost data previously collected on the FERC and EIA forms.
Environmental Economics and Policy Studies (2023) 25:407-433
Several approaches exist to deal with missing data. The researcher might collect more data themselves, drop observations containing missing data in at least one variable from the sample, or use one among a number of imputation methods (Little and Rubin 2019). As the first option is not feasible and the second presents an avoidable loss of power, I begin with imputation and compare the summary statistics with the complete analysis data, where entries containing missing data are removed. Missing values were imputed based on the remaining plant characteristics while accounting for plant and year fixed effects using multivariate imputation with the R MICE package. Multivariate imputation is commonly used in survey data and can provide smaller variance than alternative methods with small sample sizes (< 10,000) (Yadav and Roychoudhury 2018).
The Multiple Imputation by Chained Equations (MICE) algorithm is implemented in four steps: (1) Missing values are imputed with a simple method such as imputing the mean. (2) These placeholder mean imputations are returned to missing, one variable at a time. (3) The non-missing observations of the variable currently in step (2) are regressed on the other variables. (4) Regression coefficients for each predictor are used to impute missing values in the variable, which is then itself used as a predictor in case of further variables containing missing data (Van Buuren and Groothuis-Oudshoorn 2011). Table 2 shows summary statistics from the imputed dataset next to the complete data. Comparing means and medians shows that distributions for several variables are skewed towards zero. Following suggestions in Little and Rubin (2019) I therefore use predictive mean matching in step (1) which is implicit and does not require specifying the distribution of the target variable. A Jarque-Bera test rejects a normal distribution for all variables in both samples (p-values < 0.01). Deviation from normality does not in itself invalidate regression analysis but may be exaggerated by outliers in the sample and should be handled with care in model specification. Figure 3 shows scatterplots of four covariates against SO2 emissions. I plot a log-log specification which best fits the linear model given the distributions of covariates. Figure 3 shows that the imputed sample (red) contains more outlier observations. Specifically, they arise from imputed zeros in unobserved emissions data. Weighing the risk of overstating standard errors using the imputed sample against the modest loss of power (8452 versus 8557 observations), I proceed with the smaller sample without imputation.
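The four steps can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the R mice implementation used in the article: it handles exactly two variables, uses one-predictor least squares, imputes the regression prediction rather than applying predictive mean matching, and produces a single imputed dataset. The function names and the column names in the usage note are hypothetical.

```python
def _ols(xs, ys):
    """Slope and intercept of a one-predictor least-squares fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def chained_imputation(data, n_iter=10):
    """Impute None entries in a dict holding two equal-length columns."""
    names = list(data)
    missing = {k: [v is None for v in data[k]] for k in names}
    filled = {}
    # Step 1: placeholder imputation with the observed mean of each variable.
    for k in names:
        observed = [v for v in data[k] if v is not None]
        mean = sum(observed) / len(observed)
        filled[k] = [mean if v is None else v for v in data[k]]
    for _ in range(n_iter):
        for target in names:
            if not any(missing[target]):
                continue
            other = names[1 - names.index(target)]
            # Steps 2-3: treat the target's placeholders as missing again and
            # regress its observed values on the other variable.
            rows = [i for i, m in enumerate(missing[target]) if not m]
            slope, intercept = _ols([filled[other][i] for i in rows],
                                    [filled[target][i] for i in rows])
            # Step 4: replace the missing target values with predictions; the
            # updated column then serves as predictor for the other variable.
            for i, m in enumerate(missing[target]):
                if m:
                    filled[target][i] = intercept + slope * filled[other][i]
    return filled
```

Calling `chained_imputation({"heat": [1.0, 2.0, 3.0, 4.0], "so2": [2.0, 4.0, None, 8.0]})` fills the missing entry from the fitted line through the observed pairs.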

Method
This paper aims to estimate the effect of tightening the cap on SO2 emissions on overall and cross-border pollution. According to Proposition I, the announcement of CAIR should increase the market price for permits as affected firms scramble to comply with the lowered emission cap. Compliance was incentivized via a $2000 fine per excess tonne of SO2 and the enforcement mechanism involved mandatory installation of CEMS and emission reporting (Ellerman et al. 2000). Higher permit prices relative to marginal abatement costs are expected to increase abatement in CAIR-states compared with unaffected emitters; the cost of flue gas desulfurization, such as limestone wet scrubbers, has declined throughout the study period for both treatment and control groups (Chestnut and Mills 2005). Equation (1) states that SO2 dispersion correlates positively with emission rates. I can therefore state in conjunction with Proposition I the first null hypothesis:
Hypothesis I: The announcement of CAIR caused no change in average cross-border SO2 emissions from the power sector.
Rejecting hypothesis I would confirm that emission rates are important drivers of cross-border pollution, possibly alongside time-invariant factors like the locations of point-sources. The 2008 North Carolina v. EPA ruling established that interstate trade in permits between sources invalidates protection against cross-border pollution. A separate enforcement mechanism exists via the Good Neighbor provision wherein downwind states can petition the EPA to penalise cross-border sources. However, as emphasized in Harstad and Eskeland (2010), the reluctance of the EPA to grant Section 126 petitions calls into question the likelihood of penalties. Based on Proposition II that firms do not expect to be fined for excess emissions that are transported out of their home state, I formulate the following null hypothesis:
Hypothesis II: Plants contributing cross-border transport of SO2 emissions did not respond differently to the CAIR announcement.
Rejecting hypothesis II would provide evidence of moral hazard in the permit market, where interstate polluters are less incentivized to comply with emission caps.
In a natural experiment with electric utilities covered by CAIR in the treatment group and remaining ARP utilities as controls, inference relies first on identifying upwind power plants and estimating their cross-border emissions. I do this by feeding hourly data on SO2 emission rates and local weather conditions for coal-fired power plants in 27 eastern states into a custom Gaussian air dispersion model, GAUSSMOD.

Defining cross-border pollution
The cross-border SO2 is defined as the average SO2 concentration (µg/m³) dispersed from a given plant outside of the state in which it is located. Based on heat input, stack flue characteristics and local weather conditions, GAUSSMOD calculates the concentration measured at ground level (1.5 m) where health impacts are typically measured (World Health Organization 2006). Dispersion is calculated across a 50,000 m² area around the plant, with a resolution of 1000 m² following De Kluizenaar et al. (2001). Figure 4 displays the average daily SO2 dispersion for two large coal-fired power plants, Barry Electric Generating Plant in Alabama and George Neal South Power Plant in Iowa. Over an average day, pollution from George Neal is transported across the Iowa-Nebraska border. Figure 4 illustrates how location and weather trends affect the problem of cross-border pollution.
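One way to read this definition is as an average over the receptor grid cells that fall outside the plant's home state. The sketch below assumes that reading and assumes each grid cell has already been assigned a state code; the function name and the example data are hypothetical.

```python
def cross_border_mean(cells, home_state):
    """Mean concentration over receptor cells outside the plant's home state.

    cells: iterable of (state_code, concentration) pairs, one per grid cell.
    Returns 0.0 when no cell of the grid lies outside the home state.
    """
    outside = [conc for state, conc in cells if state != home_state]
    return sum(outside) / len(outside) if outside else 0.0


if __name__ == "__main__":
    # Hypothetical four-cell grid around a plant located in Iowa.
    grid = [("IA", 4.0), ("IA", 2.0), ("NE", 1.0), ("NE", 3.0)]
    print(cross_border_mean(grid, "IA"))
```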

Causal identification and estimation
Difference-in-differences (DD) is a method designed to estimate the causal impact of a policy on some outcome, such as cross-border pollution. It is known as a quasi-experimental method, because it attempts to approximate randomized controlled experiments, arguably the gold standard of empirical science, using observational data outside of a controlled lab setting. It requires observations from before and after some policy intervention, from the treatment group and unaffected controls.
CAIR raised the price of permits for SO2 emissions by reducing the supply relative to the nationwide Acid Rain Program via a new regional cap-and-trade program (Shouse 2018). An increase in the permit price is expected to cause an increase in abatement, because power companies are willing to accept a higher abatement cost. The increase in the permit price following the announcement of CAIR in 2005 produces the first difference in DD. Crucially, CAIR was a regional program covering power plants in 27 states. Plants covered by the rule are labelled as treated, while remaining plants serve as a control group. These groups produce the second difference in DD. All coal-fired power plants within the CAIR region are treated at the same time and, in the absence of staggered treatment (Goodman-Bacon 2021), I use the canonical two-way fixed effects difference-in-differences model with a panel of plants i and years t (Eq. 4). In Eq. (4), the index k for outcome e denotes (a) total SO2 emissions, (b) cross-border SO2 emissions, and (c) CO2 emissions as a robustness check. $G_i$ is a dummy variable taking the value 1 if plant i is covered by CAIR, and zero otherwise. $CAIR_t$ is a dummy variable taking the value 1 when year t is in the post-CAIR years and zero otherwise. $X_{it}$ is a vector of n covariates. As noted in Schmalensee and Stavins (2013), initial allocation of annual allowances to firms under the Acid Rain Program was based on heat input. Greater heat input is therefore expected to be associated with higher emissions.
Similarly, I control for the number of permits held by the firm, where high emitters are expected to hold more permits. Further control variables are net electricity generation, total operation time across a plant's generators, and desulfurization spending. Figure 3 shows that a log-log specification in sulfur, heat input, generation and operating time produces the best linear model fit. $\beta_{DD}$ is the double-difference estimator and the coefficient of interest. It is the difference in average outcome in the treatment group before and after treatment, minus the difference in average outcome in the control group before and after treatment. It can be interpreted as the average treatment effect on CAIR states if, without the policy, the outcome would have evolved in parallel in the treatment and control groups. This is the parallel trends assumption (Donald and Lang 2007) which I will discuss in detail shortly. If $\beta_{DD}$ is significantly different from zero, hypothesis I is rejected.
To test hypothesis II, Eq. (4) is extended in Eq. (5) with a triple differences model (Kellogg and Wolff 2008) where the DD variable is interacted with a dummy variable $C_{it}$ indicating if the maximum cross-border SO2 from plant i in year t exceeds 1% of NAAQS, or 0.75 ppb. This is the screening threshold to identify states with sources that may contribute significantly to air quality problems in downwind states (Shouse 2018; U.S. EPA 2019). I do this to test for heterogeneous treatment effects between plants that contribute meaningfully to downwind cross-border pollution and those that do not, following similar experimental designs in e.g. Berck et al. (2016). In the triple differences (DDD) setup, following the reasoning in Gruber (1994), I compare the double difference among plants that are interstate polluters (max cross-border SO2 > 0.75 ppb) against the double difference among plants that are not. The coefficient of interest $\beta_{DDD}$ tells us the difference in the treatment effect between cross-border polluters and others.
An estimate of $\beta_{DDD}$ statistically different from zero rejects hypothesis II. The assumptions established in Sect. 3 predict $\beta_{DDD} > 0$ due to moral hazard. The identifying assumption of this DDD estimator is fairly weak: I have previously established that there is no change in policy between $C_{it} = 1$ and $C_{it} = 0$ due to the insufficiency of CAIR to penalize cross-border pollution. Like the double difference setup, it also requires that there be no contemporaneous shock that affects the relative outcomes of the treatment group in the same state-years as the law.
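The verbal definition of the double difference can be checked with a small sketch that works on group means only. It deliberately omits the covariates and the plant and year fixed effects of the regression model, so it illustrates the estimand rather than the estimation procedure; the function name and the example numbers are hypothetical.

```python
from statistics import mean


def double_difference(rows):
    """2x2 double difference of mean outcomes.

    rows: iterable of (treated, post, outcome), with treated and post in {0, 1}.
    Returns (treated post - treated pre) - (control post - control pre).
    """
    cells = {(g, t): [] for g in (0, 1) for t in (0, 1)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


if __name__ == "__main__":
    sample = [
        (1, 0, 10.0), (1, 0, 12.0),  # treated plants, before the announcement
        (1, 1, 6.0), (1, 1, 8.0),    # treated plants, after the announcement
        (0, 0, 9.0), (0, 0, 11.0),   # control plants, before
        (0, 1, 8.0), (0, 1, 10.0),   # control plants, after
    ]
    print(double_difference(sample))
```

In the example, treated plants fall by 4 units on average and controls by 1, so the double difference is -3.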
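The triple difference can be illustrated in the same spirit: compute the double difference separately for plants above and below the screening threshold and subtract. As before, this sketch uses raw group means without covariates or fixed effects, and the function names and numbers are hypothetical; the 75 ppb standard and the 1% screening share come from the text.

```python
from statistics import mean

NAAQS_SO2_PPB = 75.0
SCREENING_THRESHOLD_PPB = NAAQS_SO2_PPB / 100.0  # 1% of NAAQS, i.e. 0.75 ppb


def is_interstate_polluter(max_cross_border_ppb):
    """Dummy C: maximum cross-border SO2 exceeds 1% of the NAAQS."""
    return max_cross_border_ppb > SCREENING_THRESHOLD_PPB


def _double_difference(rows):
    """2x2 double difference of mean outcomes for (treated, post, outcome) rows."""
    cells = {(g, t): [] for g in (0, 1) for t in (0, 1)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


def triple_difference(rows):
    """DD among interstate polluters minus DD among all other plants.

    rows: iterable of (treated, post, cross_border_flag, outcome).
    """
    dd_polluters = _double_difference([(g, t, y) for g, t, c, y in rows if c])
    dd_others = _double_difference([(g, t, y) for g, t, c, y in rows if not c])
    return dd_polluters - dd_others
```

A positive value means that the reduction attributed to the policy is smaller among interstate polluters than among other plants.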

Addressing selection bias and parallel trends
This is unsurprising as the CAIR region sought to address SO2 pollution from the worst emitters. Although Heckman et al. (1996) suggest that the two-by-two treatment group and time interaction is robust to selection bias, the double- and triple-difference estimators only recover the true causal effect of the policy of interest when there are no concomitant (simultaneously occurring) trends that differentially affect the treatment and control groups (Wooldridge 2007). A robustness test following Jia et al. (2021), with a sample of plants matched on treatment-assignment propensity scores, is reported in Appendix A.
In this case, concomitant treatment effects could arise from policies and economic trends that differentially (dis)incentivize pollution between CAIR states and outside. To test for concomitance bias, I also estimate a variant of Eqs. (4) and (5) with CO2 emissions as the outcome variable. CO2 emissions result from the same coal burning process as do SO2 emissions and are perfectly correlated absent any abatement. However, CO2 emissions were not differentially regulated in the two regions as part of CAIR. If no CAIR-related treatment effect can be observed for carbon emissions, the concomitance hypothesis can be more confidently rejected.

Results
Event studies (Fig. 7) on the three main outcome variables (sulfur, cross-border sulfur, and carbon emissions) show that any observable pre-trends are not statistically significant. Zero (or parallel) pre-trends suggest that future coverage by CAIR does not have an effect on the outcomes. These results support the hypothesis that in the counterfactual absence of CAIR, emissions would not have evolved differently between power plants in the two sets of states. While no definitive proof of the counterfactual exists, event studies showing zero or parallel pre-trends have often been used to support this hypothesis, including by Barreca et al. (2021) and Fowlie et al. (2018). Figure 7 indicates a clear negative treatment effect for overall sulfur emissions, which suggests benefits on top of the Acid Rain Program reductions acknowledged in Chay and Greenstone (2003) just before CAIR was announced, and more recently in Barreca et al. (2021). Moving on to carbon emissions, the event study shows no significant treatment effect from CAIR. While lag means trend downward following the CAIR announcement, they never fall outside the 95% confidence interval around the null. This provides more convincing evidence that no concomitant effects were differentially affecting wider abatement decisions among firms in the CAIR region that could also have influenced sulfur emissions.
Finally, Fig. 7 shows the event study for our primary outcome of interest, which is a dummy variable indicating whether cross-border sulfur calculated with GAUSSMOD exceeds 1% of the NAAQS. The event study again shows a negative but less pronounced treatment effect from CAIR, where the announcement lowers the average probability that a treated plant transports at least 0.75 ppb to another state. Table 3 displays the regression estimates for the outcomes k in Eq. (4). Unless otherwise specified, the models are estimated using the "feols" command in the R "fixest" package for fixed effects OLS with heteroskedasticity-robust standard errors clustered at the plant level (Berge et al. 2018). Sulfur, carbon, heat input, generation and operation time are log-transformed to better fit the linear model (a ca 0.1 improvement in R² versus the original data). As suggested by the event studies, model (1) results in a significant difference-in-differences estimate of −0.24, interpreted as a ca 24% reduction of sulfur emissions in CAIR states as a result of the policy. When the parallel trends assumption holds, the difference-in-differences estimator can be interpreted as the ATT (Kahn-Lang and Lang 2020) and is widely used for program evaluation.
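Because the outcome is log-transformed, the coefficient of −0.24 is measured in log points, and reading it as roughly 24% is the usual small-coefficient approximation. The sketch below shows the standard conversion 100·(exp(β) − 1) for a dummy regressor in a log-linear model. The function name is hypothetical, and the conversion is a general property of log-linear models rather than a calculation reported in the paper.

```python
import math

def log_points_to_percent(beta):
    """Exact percentage change implied by a coefficient on a dummy
    variable when the outcome is in natural logs."""
    return (math.exp(beta) - 1.0) * 100.0

beta_dd = -0.24                      # DD estimate reported for model (1)
approx_percent = beta_dd * 100.0     # log-point approximation
exact_percent = log_points_to_percent(beta_dd)

print(f"approximation: {approx_percent:.1f}%")
print(f"exact conversion: {exact_percent:.1f}%")
```

The two readings differ by a few percentage points at this magnitude, and the gap widens for larger coefficients.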
Model (2) shows Eq. (4) with carbon emissions as the outcome. This model was estimated to evaluate the risk of concomitant treatment effects interfering with the supposed causal effect of CAIR. The DD estimate for model (2) is −0.002 and is not statistically significant. The announcement of CAIR does not appear to have affected pollutants not regulated by CAIR itself. Models (3) and (4) are the main equations of interest. The outcome in model (3) is average annual cross-border SO2 (µg/m³). The DD estimate is −0.02 and statistically significant. The result implies that CAIR caused on average a 0.02 µg/m³ reduction in cross-border SO2, but it should be interpreted cautiously. Recent research, e.g. Boulton and Williford (2018), has raised concerns about OLS with so-called semicontinuous outcomes where the data contain a large proportion of zeros. Unlike zeros resulting from censoring (Tobin 1958), cross-border sulfur is highly skewed toward zero simply because many plants do not produce any cross-border pollution.
The literature explores two-part models (Duan et al. 1983) and binary logit or linear probability models (LPM) (Buntin and Zaslavsky 2004) as solutions. Because logit coefficients are less easily interpreted, and the drawbacks of LPM are irrelevant in a difference-in-differences setting (prediction is not an objective), model (4) estimates (3) with LPM. Its binary outcome takes the value one if the average cross-border SO2 concentration from a plant i in year t exceeds 0.75 ppb, or 1% of the NAAQS (U.S. EPA 2019), and zero otherwise. The treatment effect is −0.03 and significant. The interpretation is that on average CAIR reduced the probability that a plant contributed in excess of 1% of the NAAQS to another state by 3%.
Table 4 shows three specifications of the triple differences model designed to test hypothesis II. The triple difference estimator in model (1) is the regression coefficient for (G_i × CAIR_t × C_it) and is positive at 0.23. It suggests that the treatment effect from CAIR on average SO2 emissions was ca 23% smaller among plants that transported at least 1% of the NAAQS (0.75 ppb) across state boundaries. This result supports rejection of hypothesis II, as the reduction in emissions because of CAIR was less pronounced among plants that transport a meaningful amount of SO2 across state lines.
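The binary outcome in model (4) can be sketched as follows. Since model (3) is reported in µg/m³ while the threshold is stated in ppb, the example converts the threshold using the molar mass of SO2 and a molar volume at 25 °C and 1 atm. Those reference conditions are an assumption, as are all concentrations in `cells`. In a saturated two-by-two setting without covariates or fixed effects, the LPM interaction coefficient equals the difference-in-differences of exceedance shares computed here.

```python
SO2_MOLAR_MASS = 64.066  # g/mol
MOLAR_VOLUME = 24.45     # L/mol at 25 °C and 1 atm (assumed reference conditions)
THRESHOLD_PPB = 0.75     # 1% of the NAAQS, as stated in the text

def ppb_to_ugm3(ppb):
    """Convert an SO2 mixing ratio in ppb to µg/m³ under the assumed conditions."""
    return ppb * SO2_MOLAR_MASS / MOLAR_VOLUME

threshold_ugm3 = ppb_to_ugm3(THRESHOLD_PPB)

def exceeds(concentration_ugm3):
    """Binary outcome: 1 if the concentration is above the screening threshold."""
    return 1 if concentration_ugm3 > threshold_ugm3 else 0

# Hypothetical cross-border concentrations (µg/m³) by group and period.
# Many zeros mimic the semicontinuous outcome described in the text.
cells = {
    ("treated", "pre"): [0.0, 0.0, 2.5, 3.1],
    ("treated", "post"): [0.0, 0.0, 0.0, 2.2],
    ("control", "pre"): [0.0, 0.0, 0.0, 2.4],
    ("control", "post"): [0.0, 0.0, 0.0, 2.1],
}

def share(values):
    """Share of plant-years above the threshold."""
    return sum(exceeds(v) for v in values) / len(values)

did_share = (share(cells[("treated", "post")]) - share(cells[("treated", "pre")])) - (
    share(cells[("control", "post")]) - share(cells[("control", "pre")]))

print(f"threshold: {threshold_ugm3:.2f} µg/m³, DiD in exceedance share: {did_share:.2f}")
```

The resulting quantity is a change in a probability, so it is naturally read in percentage points of exceedance risk.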

Sensitivity analysis: distance to state border
Models (2) and (3) instead estimate the heterogeneous treatment effects among plants less than 10 and 20 km from a state border, respectively. A plant's proximity to a state border is strongly but not perfectly correlated with the likelihood of producing cross-border SO2 (correlation of 0.41). It is plausible that moral hazard incentives arise not primarily from the cross-border emissions themselves but from the proximity to another state. For example, polluters may be unaware of their cross-border contribution, which a monitoring system attached to the flue stack cannot estimate, and use distance to borders as a proxy. I therefore also report DDD estimates for these two groups in Table 4. For plants within 10 km of a border, the DDD coefficient for SO2 is positive and statistically significant at 0.33. Irrespective of cross-border SO2 emissions, the abatement effect from CAIR was less pronounced among plants within 10 km of the border. However, for model (3) the DDD coefficient is null. This heterogeneous treatment effect (potentially from moral hazard) does not appear to extend much beyond 10 km. These estimates arise from data further illustrated in Fig. 8, showing a smaller CAIR-associated treatment effect for plants closer to a state border. This is not due to plants close to the border starting off from a higher base rate of emissions. The correlation between emissions and border proximity is only −0.003 in the pre-CAIR period. Similarly, the post-CAIR reduction in average SO2 emissions is lower among treated plants that transport more than 50% of their emissions across state lines, and the divergence with the control group diminishes.

Discussion and conclusion
While the Clean Air Interstate Rule was a regional program, its cap-and-trade mechanism was not spatially targeted. Following a U.S. court ruling against the Environmental Protection Agency in 2008, CAIR was vacated partly on the grounds that its design did not adequately protect downwind states against cross-border pollution. North Carolina v. EPA held that the CAIR trading program went beyond the mandate of the Clean Air Act because the regional program did not address sources from one specific state contributing to nonattainment in another specific state. EPA designed CAIR to eliminate pollution from out-of-state sources as a group, as Kruse (2009) notes: "Pollution would be reduced regionally, but any state could buy enough credits to escape the requirement to reduce its impact on other states". However, if cross-border pollution primarily depends on overall emission rates, modelling CAIR on the successful Acid Rain Program may not have been a significant problem in practice.
In this article I have evaluated this hypothesis and provided evidence against the argument that CAIR was ineffective at reducing interstate pollution. Using a novel combination of atmospheric dispersion modelling and difference-in-differences analysis, I support previous findings that CAIR was indeed successful in reducing overall sulfur emissions from covered sources (20-30%) due to a temporary rise in the price of permits, but also report a reduction in cross-border sulfur concentrations and the number of sources that transported sulfur across state lines. CAIR caused an average 2.3-3.7% reduction in the risk of exceeding 1% of NAAQS in a downwind state.
I support previous evidence (Glasgow and Zhao 2017; Heo et al. 2023) that cross-border emissions are partly driven by geographic factors, most importantly the distance of the source from a state border, and also by annual weather trends: I find plants located several kilometers from a state border that nonetheless contribute to downwind sulfur pollution in other states. I add to this literature by quantifying cross-border pollution using a custom Gaussian dispersion model and showing that concentrations are universally below the national air quality standards (NAAQS) set by the EPA, although states around the former coal-mining belt of Kentucky, Indiana, Ohio, and West Virginia (see Fig. 6) share many high-emission sources along their borders.
By computing the contribution of cross-border pollution from each plant using GAUSSMOD, I uncover that moral hazard may have de-fanged CAIR for certain plants. The reduction in overall annual sulfur emissions caused by the CAIR announcement was weaker among affected plants that contributed more than 0.75 ppb of cross-border SO2 concentration. Additionally, this weaker treatment effect extends to plants within 10 km of a state border, even though fewer than 50% of them contribute to cross-border non-attainment. A possible mechanism to explain this phenomenon is the way SIPs (see Sect. 2) are applied. States submit SIPs to the EPA outlining their plans to achieve air quality targets within their state, and regulations in the SIPs are generally enforced by the state. While Section 126 petitions have increased over the past five years (Gerrish 2020), states may be less motivated to regulate pollution that leaves their borders.
However, my results also indicate that this moral hazard may be primarily driven by proximity to the state border, not knowledge about cross-border contributions itself. My results provide new nuance to the arguments that led to the vacatur of CAIR in the 2008 North Carolina v. EPA case. On the one hand, average cross-border SO2 declined as a result of CAIR. On the other hand, the decline was considerably smaller than that of overall emissions (2-4% versus 24%). In addition, SO2 emissions from plants that did contribute to cross-border concentrations appear less affected by CAIR, as were plants within 10 kilometers of a state border. Moral hazard can be prevented by monitoring not only emissions at the source but also cross-border transport, for example using the EPA's AERMOD dispersion model which inspired GAUSSMOD. A trading ratio can be applied to the permit market in which a purchasing plant faces a higher (lower) price reflecting the relatively higher (lower) propensity for cross-border pollution vis-a-vis the seller (Holland and Yates 2015).
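The trading-ratio idea can be illustrated with a minimal sketch. It assumes each plant has a scalar "downwind risk" score (for example, from a dispersion model) and sets the ratio to buyer risk over seller risk. This simple form, the function names and all numbers are hypothetical and are not taken from Holland and Yates (2015).

```python
def trading_ratio(buyer_risk, seller_risk):
    """Permits the buyer must surrender per unit of emissions when buying
    from this seller, under the assumed ratio of downwind risk scores."""
    if seller_risk <= 0:
        raise ValueError("seller risk must be positive for a finite ratio")
    return buyer_risk / seller_risk

def effective_price(permit_price, buyer_risk, seller_risk):
    """Price per unit of emissions faced by the buyer after applying the ratio."""
    return permit_price * trading_ratio(buyer_risk, seller_risk)

PERMIT_PRICE = 100.0  # hypothetical market price per permit

# A border plant (risk 0.5) buying from an interior plant (risk 0.25)
# pays more per unit of emissions; the reverse trade is cheaper.
border_buys = effective_price(PERMIT_PRICE, buyer_risk=0.5, seller_risk=0.25)
interior_buys = effective_price(PERMIT_PRICE, buyer_risk=0.25, seller_risk=0.5)

print(f"border plant buying: {border_buys:.0f}, interior plant buying: {interior_buys:.0f}")
```

With these invented scores, shifting emissions toward the higher-risk plant costs twice the market price, while shifting them away costs half, which is the direction of incentive described above.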
Acknowledging the geographic moral hazard problem is particularly important in settings where regional regulators have fewer incentives to collaborate. For example, Heo et al. (2023) find that transboundary air pollution from China significantly increases mortality and morbidity in South Korea. Even within China, Cai et al. (2016) find that provincial governments respond to pollution reduction mandates by shifting their enforcement efforts away from the most downstream county, from where pollution is directly transported into another province. A regional cap-and-trade program across East Asia or the ASEAN region would likely raise similar concerns about moral hazard. A permit market with spatially explicit trading ratios based on downwind risk might help manage these concerns.

Appendix A: matched controls
Matching of control plants to treated plants is done on pre-treatment 2004 observations of two time-invariant predictors: share of cross-border emissions and distance to state borders. These variables most strongly predicted assignment into the treatment group using a generalized linear probability model. Propensity score matching was performed using the "MatchIt" package in R. As not all treated plants could be matched to a suitably similar control, the sample in Table 5 is a smaller balanced panel of 188 plants across 24 years.
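The matching step can be illustrated with a greedy one-to-one nearest-neighbor match on the propensity score, without replacement. This is a Python sketch of the general technique, not the MatchIt implementation or the settings used for Table 5. The scores, plant labels and caliper are hypothetical.

```python
def match_nearest(treated, controls, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping plant id -> propensity score.
    caliper: maximum allowed score distance for a match (assumed value).
    Returns a dict mapping each matched treated id to its control id.
    """
    available = dict(controls)
    pairs = {}
    # Match the highest-scoring treated plants first, a common greedy ordering.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs[t_id] = c_id
            del available[c_id]  # each control is used at most once
    return pairs

# Hypothetical propensity scores for three treated and three control plants.
treated_scores = {"T1": 0.80, "T2": 0.55, "T3": 0.95}
control_scores = {"C1": 0.78, "C2": 0.50, "C3": 0.30}

matches = match_nearest(treated_scores, control_scores)
unmatched = sorted(set(treated_scores) - set(matches))

print("matched pairs:", matches)
print("unmatched treated plants:", unmatched)
```

In this invented example one treated plant has no control within the caliper and is dropped, which mirrors why a matched sample is smaller than the full panel.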