Estimating power plant CO2 emission using OCO-2 XCO2 and high resolution WRF-Chem simulations

Anthropogenic CO2 emission from fossil fuel combustion has major impacts on the global climate. The Orbiting Carbon Observatory 2 (OCO-2) observations have previously been used to estimate individual power plant emissions with a Gaussian plume model assuming constant wind fields. The present work assesses the feasibility of estimating power plant CO2 emission using high resolution chemistry transport model simulations with OCO-2 observation data. In the new framework, 1.33 km Weather Research and Forecasting-Chem (WRF)-Chem simulation results are used to calculate the Jacobian matrix, which is then used with the OCO-2 XCO2 data to obtain power plant daily mean emission rates, through a maximum likelihood estimation. We applied the framework to the seven OCO-2 observations of near mid-to-large coal burning power plants identified in Nassar et al (2017 Geophys. Res. Lett. 44, 10045–53). Our estimation results closely match the reported emission rates at the Westar power plant (Kansas, USA), with a reported value of 26.67 ktCO2/day, and our estimated value at 25.82–26.47 ktCO2/day using OCO-2 v8 data, and 22.09–22.80 ktCO2/day using v9 data. At Ghent, KY, USA, our estimations using three versions (v7, v8, and v9) range from 9.84–20.40 ktCO2/day, which are substantially lower than the reported value (29.17 ktCO2/day). We attribute this difference to diminished WRF-Chem wind field simulation accuracy. The results from the seven cases indicate that accurate estimation requires accurate meteorological simulations and high quality XCO2 data. In addition, the strength and orientation (relative to the OCO-2 ground track) of the XCO2 enhancement are important for accurate and reliable estimation. Compared with the Gaussian plume model based approach, the high resolution WRF-Chem simulation based approach provides a framework for addressing varying wind fields, and possible expansion to city level emission estimation.


Introduction
Our ability to predict future climate change requires an improved understanding of carbon dioxide (CO 2 ) emissions and uptake. Top-down inversion approaches that couple atmospheric observations of CO 2 with chemistry transport models have been widely used to estimate biospheric and oceanic CO 2 flux, with the fossil fuel emission often prescribed as a known quantity [2]. In such approaches, errors in the prescribed fossil fuel emission will lead to biases in the inversely estimated biospheric and oceanic fluxes [3][4][5], and large uncertainties exist in the fossil fuel emission inventory data [6,7].
For instance, inverse modeling of anthropogenic CO 2 emissions has been applied to large cities [8][9][10], and power plants [1,11]. These studies can be broadly grouped into two categories. The first category solves for long term average fossil fuel CO 2 emission by assimilating in situ CO 2 mixing ratio measurements. For instance, Nickless et al [10] solved for weekly average emissions on a 101×101 grid over Cape Town, South Africa, and Lauvaux et al [8] solved for 5-day average emissions on a 87×87 grid over Indianapolis, IN, USA.
The second category solves for short term emission rates through assimilating remotely sensed column average CO 2 (XCO 2 ) measurements. This category includes the Observation System Simulation Experiments (OSSE) studies by Pillai et al [12] and Broquet et al [13] using the simulated CarbonSat XCO 2 data [14], by Nassar et al [1] using OCO-2 [15] XCO 2 , and by Krings et al [11] using airborne XCO 2 measurements. These studies utilize the spatial coverage of XCO 2 to distinguish the CO 2 plume caused by localized anthropogenic sources from the background. This is done either by identifying the background XCO 2 before extracting the plume enhancement [1,11], or through simultaneous optimization of anthropogenic emissions and the background XCO 2 [12,13].
Because XCO 2 enhancement from localized anthropogenic sources (large cities and power plants) decreases rapidly with increasing pixel size [14], high spatial resolution imaging is critical for estimating the emission rates of these sources. Although OCO-2 is not an optimal platform for estimating localized anthropogenic emissions due to large gaps between its narrow swaths, a few studies have shown it is possible in certain situations [1,16].
The objective of this manuscript is to estimate the power plant emission rates through OCO-2 XCO 2 data and high resolution WRF-Chem simulations [17,18]. Our method is similar to that of Pillai et al [12] and Broquet et al [13], but differs in a key aspect: instead of Bayesian inversion, we use a maximum likelihood estimation, which obtains power plant CO 2 emissions from OCO-2 XCO 2 data alone without prior emission information.

Methods
In this section, we first describe the WRF-Chem modeling system, followed by the XCO 2 observation operator, Jacobian matrix calculation, OCO-2 XCO 2 data, and finally the maximum likelihood estimation approach.

WRF-Chem modeling system
We use WRF-Chem (version 3.6.1) [18] as the atmospheric transport model. For each inversion case, WRF-Chem is set up with one-way four-level nested domains with horizontal grid spacing of 36, 12, 4, and 1.33 km from the outermost to the innermost domains. Each inner domain is located approximately at the center of its parent domain. The innermost domain (d04) has 97 grid points in the north-south direction and 61 grid points in the east-west direction, and is centered approximately at the power plant location for each inversion case.
For all WRF-Chem simulations, all four of the model domains have 52 vertical levels between the surface and the model top of 100 hPa. For a column where the surface is at the sea level, 19 of the 52 vertical levels are below 2 km. WRF physics packages used for the simulations are: WSM3 microphysics [19], Dudhia shortwave [20] and RRTM longwave radiation [21], the Unified Noah Land Surface Model, the Kain-Fritsch cumulus scheme [22] (only applied to the 36 and 12 km domains), and ACM2 boundary layer scheme [23]. Positive-definite transport is applied to the advection of scalars and CO 2 . For each inversion case, the WRF-Chem simulation runs for 36 h, including a 12 h spin-up. Simulation results are saved at 15min intervals. The simulation result that is closest to the OCO-2 overpass time for a given case is used for its inversion. The WRF-Chem simulation start time, time used for inversion, and OCO-2 overpass time for the seven inversion cases are given in table 1.
Meteorological initial and lateral boundary conditions are from the National Centers for Environmental Prediction (NCEP) Climate Forecast System Version 2 (CFSv2) 1°×1°six hourly products [24]. CO 2 initial and lateral boundary conditions are from the Carbon-Tracker version 2017 (CT2017) [25] posterior mole fraction dataset. The CT2017 1°×1°three hourly posterior fluxes for biosphere, fossil fuel, biomass burning, and ocean are horizontally interpolated to the WRF-Chem domains using bicubic interpolation. They are processed into three hourly emissions files which are ingested by the WRF-Chem model during its integration. In our simulations, CO 2 is treated as a chemically inert tracer and its uptakes by the biosphere is through using the CarbonTracker biospheric fluxes.
2.2. OCO-2 XCO 2 observation operator WRF-Chem simulated three-dimensional CO 2 mixing ratios are used by the observation operator to calculate the modeled XCO 2 . We implement the XCO 2 observation operator following Connor et al [26] and O'Dell et al [27] as: where X m CO 2 is the modeled XCO 2 , X a CO 2 is the prior XCO 2 , a j CO , 2 are elements of the column averaging kernel, h j is the pressure weighting function, x a is the prior CO 2 vertical profile, and x m is the WRF-Chem simulated CO 2 profile interpolated to the OCO-2 XCO 2 pressure levels assuming CO 2 varies linearly with atmospheric pressure [27]. The prior CO 2 profile, prior XCO 2 , column averaging kernel, and pressure weighting function are taken directly from the OCO-2 XCO 2 data. An analysis of the column averaging kernel on emission estimations is given in section 1 of the supplementary material available online at stacks.iop. org/ERL/14/085001/mmedia. . 2 Here, f (e) is a base run and f (e+Δe) is a perturbed run of WRF-Chem. A base run is a 24 h simulation with all its CO 2 flux prescribed using the three hourly CT2017 CO 2 fluxes as described in section 2.2. A perturbed run is identical to its base run counterpart, except that a small extra CO 2 emission De is added to the WRF-Chem grid at the location of a given power plant. Note that while all the CT2017 based CO 2 fluxes are placed at the lowest level of the WRF-Chem grid, the perturbation Δe for power plant emissions is added to the vertical level which corresponds to the smoke stack height. Multiple stacks of a given power plant are treated as a single point source (see Nassar et al [1] for the details of the smoke stack heights). We modified the WRF-Chem code to add the power plant emissions to the appropriate vertical levels based on the model's terrain-following hydrostatic-pressure vertical coordinate. The details of this modification are given in section 6 of the supplementary material.

Maximum likelihood estimation
To obtain power plant emission values from the OCO-2 XCO 2 data alone without prior information, we use the maximum likelihood approach. A maximum likelihood estimation is obtained by minimizing the cost function: Hx y R Hx y , 3 T 1 where the state vector = [ ] b x e T consists of two components: the power plant emission rates vector e (section 2.3), and the background XCO 2 b. The background XCO 2 is similar to that used in Pillai et al [12] and Broquet et al [13]. y is the observation vector whose elements are XCO 2 values from OCO-2 footprints. In this work we use three versions of XCO 2 data: v7 [28], v8 [29], and v9 [30]. H relates the state vector x to the observation vector y, and is formed by the forward model Jacobian matrix K (section 2.3) and the unity: where the unity value represents the background XCO 2 influence on the modeled XCO 2 . R is the error covariance matrix (section 2.5). When the modelobservation residual error is assumed to follow a Gaussian distribution, the maximum likelihood estimationx is obtained as Section 2 of the supplementary material includes an analysis of the residual error of the maximum likelihood estimation.

Error covariance matrix R
The error covariance matrix R should include observational uncertainties and transport model uncertainties: We set the observation uncertainty covariance matrix R obs as s s where s XCO 2 is the XCO 2 posterior error (uncertainty) provided by the OCO-2 data. The correlation between the uncertainty of two OCO-2 footprints is calculated using where t i,j is the time difference between two individual OCO-2 footprints, t max is the maximal time difference, and corr max is the maximal error correlation. This configuration sets the XCO 2 error correlation to corr max for two footprints adjacent in time, and linearly scales to zero when the time difference increases to t max . We set t max =10.0 s and corr max =0.3 based on Worden et al [31]. Transport model uncertainty R is much more difficult to quantify [32,33], so we adopt a simple approach used in Deng et al [34] of inflating the observation error.
where c is a scaling factor. We set c=1.5 and examined its impacts on the estimation uncertainty in section 5.1 of the supplementary material.

Results and analysis
Nassar et al [1] identified seven OCO-2 overpasses that captured emission plumes from middle-to-large-sized coal power plants. We apply the method described in section 2 to these cases to estimate the power plants' daily mean CO 2 emission rate. When equation (5) is used for the estimation, there are different choices for H, y, and R. First, the K component of H represents the forward model Jacobian matrix. Because the WRF-Chem CO 2 transport process is not linear, the value of K varies with the choice of emission vector e when equation (2) is used. Second, there are different choices for the observation vector y from different OCO-2 data versions. Third, R varies with the scaling factor c of equation (10).
Because all of these changes will influence the maximum likelihood estimation results, we assess the emission estimation sensitivity to the above variation in H, y, and R. For the sensitivity to transport model non-linearity, we calculate the K (and thus H) for a set of four emission vectors e: for cases with one power plant (table 1), e is set to [1], [25] ] 100 100 T . We note that calculating K at different emission vectors e in our maximum likelihood estimation is similar to choosing different emission priors for a Bayesian inversion [12,13]. For a Bayesian inversion that seeks to minimize the cost func- . When the prior uncertainty covariance matrix B is set to infinity (or very large values), the prior x b becomes non-informative, and the posterior x a is numerically equivalent to the maximum likelihood estimate = --- [36]. In this regard, using K calculated at different e is similar to using different prior emissions in a Bayesian estimation. This results in a set of four H for each of the seven cases. For the sensitivity to the variation in observation data, we use the three latest versions (v7, v8, and v9) of OCO-2 XCO 2 to form the observation vector y. Between the v7 and v8 data, there are a large number of changes in the OCO-2 XCO 2 retrieval algorithm related to spectroscopy, aerosol treatment, the surface model, and prior meteorology [29]. The differences between the v8 and v9 data are mainly from an update in surface pressure estimation [30]. The three versions of OCO-2 XCO 2 used for the seven cases are shown in figures 1-7 of the supplementary material. This results in a set of three y for each of the seven cases. With the scaling factor set as c=1.5, the above combinations of the different H and y results in 12 estimation results for each case. We list these estimations for the seven cases in tables 2-4.
The 2015-12-04 OCO-2 observation over the Westar Jeffrey Energy Center in Kansas has ideal conditions for emission estimation: the wind direction is aligned with the OCO-2 orbit track and the OCO-2 data were acquired in the nadir mode. Figure 1(a) shows the outermost (d01) WRF-Chem simulation domain for the Kansas case, where the three red color boxes represent the three nested inner domains (d02, d03, and d04), respectively. Figure 1(b) is a zoomed-in view of the 1.33 ×1.33 km innermost domain. There are 444 OCO-2 footprints within this innermost domain and they are used for the estimation (the observation vector y has 444 elements). To show more details, figures 1(c), (d), and (e) are plotted over the the area of the innermost domain around the power plant and its CO 2 plume. The spatial range of these three figures are marked by the red colored box in figure 1(b). Figure 1(c) shows the modeled XCO 2 calculated by the forward model (WRF-Chem and observation operator), with horizontal wind fields at the power plant smoke stack height level superimposed. Figure 1(e) shows the sensitivity of the modeled XCO 2 to the variation of power plant emission rate, which is represented by the forward model Jacobian matrix K. As explained in the last paragraph, we calculate K at a set of four emission rates, and figures 1(c) and (e) are the results from using e= [25] ktCO 2 /day. We note that the modeled XCO 2 (as depicted in figure 1(c)) is not used by the maximum likelihood estimation (equation (5)), but K is used (as part of H matrix).
A visual comparison of figures 1(c)-(e) shows that the WRF-Chem wind field ( figure 1(c)) and the resulting advection generated a power plant CO 2 plume that matches well with the OCO-2 observed plume ( figure 1(d)). Table 2. Inversion results at the seven cases. In the table, the power plants emission rate units are ktCO 2 /day, and background XCO 2 variations are in ppm. The rows labeled with 'Report' are the reported emission, the rows labeled with 'N17' are the estimations from Nassar et al [1], the rows labeled with v7, v8, and v9 are estimations in this paper using OCO-2 XCO 2 v7, v8, and v9 data, respectively. Among the estimations using the three OCO-2 data versions, there exists a fair amount of differences. Because all estimations are based on the same set of WRF-Chem simulations, the differences in the estimations are attributed to the differences among different OCO-2 data versions, including the differences in XCO 2 values, uncertainty, retrieval pressure levels, pressure weighting function, and averaging kernel. For the results using each of the OCO-2 data versions, there are only small differences (∼1 ktCO 2 /day) in the estimation, indicating our estimation approach is only weakly sensitive to the transport model non-linearity. In the Kentucky 2015-08-13 case, the OCO-2 track is about 8 km west of the Ghent power plant (figure 2). Our estimated emission values (9.84-20.40 ktCO 2 /day) are substantially lower than the reported value of 29.17 ktCO 2 /day. These underestimations are primarily caused by the complex and inaccurate horizontal wind field simulation by WRF-Chem. While the plume was located largely north of the river ( figure 2(d)), the WRF-Chem simulations put the plume south of the river (figures 2(c) and (e)). In comparison, Nassar et al [1] achieved an estimation of 29.46±15.58 ktCO 2 . Their estimation applied a 22.5°T  wind direction adjustment to optimize the match between the Gaussian plume model and the plume detected by the OCO-2 data. We note that the simulated low wind speed (∼2 m s −1 ) in this case is more likely to be associated with variability in wind direction, as smaller scale features that are difficult to resolve exert greater influence in the absence of stronger winds.
In the Ohio 2015-07-30 case, two power plants, Gavin and Kyger Creek, are located within 2.5 km of each other (figure 3). We initially attempted to estimate the emission of the two power plants separately, but the resulting emissions show large oscillations with K calculated using different emission vectors e (results not shown). This is most likely because the K vector of the two power plants largely overlap, causing the estimation to be under-constrained. We redesigned the system to estimate the combined emission rates of two plants by using a single control variable in the state vector for both the plants. However, our estimations are still substantially lower than the reported value (50.54 ktCO 2 /day): 13.49-14.17 ktCO 2 /day using the v7 data, 21.18-22.29 ktCO 2 /day using the v8 data, and 30.95-32.52 ktCO 2 /day using the v9 data. That the v9 data results in the closest estimation to the reported value may indicate the improved data quality in this latest OCO-2 data version. Similar to the Kansas case, the spread in the emission estimation among those using the same OCO-2 data version are small (∼1 ktCO 2 /day).  Figure 1(a) shows the outermost WRF-Chem domain (d01), with the three red colored boxes representing the three inner domains (d02, d03, and d04). Figure 1(b) shows the innermost domain (d04), with a red colored box representing the spatial range of figures 2(c)-(e), and a black colored cross marking the location of the Westar power plant. Figure 1(c) shows the modeled XCO 2 by the forward model (WRF-Chem+observation operator), with horizontal wind field at the power plant smoke stack height superimposed. Figure 1(e) shows the sensitivity of the modeled XCO 2 to the Westar emission rate. Figure 1(d) shows the OCO-2 v9 XCO 2 . The estimation of Nassar et al [1] using the v7 data is 48.66 ktCO 2 /day, substantially higher than our results using v7 (13.49-14.17 ktCO 2 /day). There are a large number of OCO-2 footprints with elevated XCO 2 values south of the power plant plume in the v7 data, which are not in the v8 or v9 data (figure 3 of the supplementary material). These elevated XCO 2 values contribute to our higher background estimation (398.03 ppm) than that of Nassar et al [1] (396.64 ppm), thus leading to our lower emission estimation.
There are two OCO-2 overpasses (2014-10-23 and 2014-11-10, figures 4 and 5) over the Indian site, which has five power plants in its vicinity ( figure 4(b)). Like Nassar et al [1], we found that only emissions from the Sasan and Vindhyachal power plants were captured by the OCO-2 observations. Accordingly, the state vector is [e 1 , e 2 , b] T with e 1 and e 2 for the CO 2 emission rates of Sasan and Vindyachal, respectively (  [1] use the reported emission value of Vindhyachal to remove its plume from the OCO-2 XCO 2 before estimating the emission rate of the Sasan plant. Their study estimated the Sasan plant emission to be 63.93±9.98 ktCO 2 /day, and attributed the large estimation uncertainty to the CO 2 from the Vindhyachal plant. The ability of our method to estimate emissions of multiple power plants is an advantage that can be expanded to estimate urban emissions in the future. The India 2014-11-10 observation was acquired in glint mode and with wind largely perpendicular to the OCO-2 ground track ( figure 5(c)). Both factors limit the observation data constraints for emission estimation. Our estimations from the 2014-11-10 data are lower than from the 2014-10-23 nadir

Summary
We developed and tested a method for estimating power plant emissions using OCO-2 XCO 2 data and WRF-Chem simulations. This method is similar to the Bayesian inversion approach used by Pillai et al [12] and Broquet et al [13], in that both use a high resolution transport model to link emissions to XCO 2 observations. However, we use the maximum likelihood approach to estimate power plant emissions directly from the XCO 2 without prior power plant emission information. In this regard, our method is similar to that of Nassar et al [1], but we use WRF-Chem instead of the Gaussian plume model for CO 2 transport simulation. Compared with Nassar et al [1], our method provides a formal framework to address dynamic meteorological fields and possible expansion  to spatially more complex anthropogenic emission sources like urban areas. One disadvantage of our method is that it is less tolerant of observation data errors as its emission results using OCO-2 v7 data are less accurate compared with those of the plume model approach, which can avoid the visually detectable observation data error through separating background and plume enhancement footprints. It should also be pointed out that the narrow swath (10.3 km) and column averaged measurements (XCO 2 ) of OCO-2 pose challenges for accurate emission estimations.
To effectively utilize OCO-2ʼs fine footprint and minimize representation errors, we use 1.33 × 1.33 km horizontal grid spacing in the innermost domain of the WRF-Chem simulations to calculate the Jacobian matrix and modeled XCO 2 . However, high resolution modeling does not necessarily increase simulation accuracy, as demonstrated in Feng et al [37]. Following Deng et al [34], we inflate the OCO-2 XCO 2 observation errors to account for the transport model errors. A comprehensive treatment would require much more rigorous analyses, such as those conducted in [32,33,38,39]. Although a rigorous quantification of transport model error will give a more realistic representation of the true uncertainty of the posterior estimates, a complementary approach is used to improve the transport model accuracy. Such approaches include establishing a suitable WRF-Chem model configuration for each particular site through validations against meteorological measurements [40], assimilating meteorological observations during simulations [41,42], and using ensemble transport model simulations. In addition, variational assimilation approaches can be used to address the sensitivity of the estimation to the transport model non-linearity [43,44].
We have shown that the maximum likelihood estimation using high resolution WRF-Chem simulation is a viable method for quantifying power plant CO 2 emissions from OCO-2 under certain conditions. The ability to quantify CO 2 emissions using existing and future spaceborne XCO 2 observations at the scale of an individual facility, an urban area or at the national level is currently an area of significant scientific interest, but also has important implications for future climate change mitigation policies including the transparency framework of the United Nations Paris Agreement. Efforts are currently underway by Europe [45,46] and multiple individual nations to understand the requirements and thus design spaceborne systems for anthropogenic CO 2 monitoring. The Committee on Earth Observation Satellites (CEOS) is reviewing the state of science in the field, assessing the technical requirements for monitoring, and advising space agencies on the way forward to develop an internationally coordinated constellation for this objective [47]. While at present, the observational coverage of space-based XCO 2 data is the greatest limitation prohibiting routine CO 2 emission monitoring, with recent new missions and more to come in the future as space agencies work toward establishing the system envisioned in the CEOS report [47], it is important for modeling methods to advance in parallel to demonstrate the value of the improved observations and make optimal use of them when they become available.
for making the NCEP/CFSR and CarbonTracker data available for the public. We thank the WRF-Chem development team for making the code available for the public. We thank Dr Chin-I Cheng of the Central Michigan University Statistical Consulting Center for her helpful discussion. This work was partially supported by a Central Michigan University Research Incentive Fund. This is contribution 124 of the Central Michigan University Institute for Great Lakes Research.