Improved spatial representation of a highly resolved emission inventory in China: evidence from TROPOMI measurements

Emissions from many sources are estimated as municipal district totals and spatially disaggregated onto grid cells using empirically selected spatial proxies such as population density, which might introduce biases, especially at fine spatial scales. Efforts have been made to improve the spatial representation of emission inventories by incorporating comprehensive point source databases (e.g. power plants, industrial facilities) into emission estimates. Satellite-based observations from the TROPOspheric Monitoring Instrument (TROPOMI), with unprecedented pixel sizes (3.5 × 7 km²) and signal-to-noise ratios, offer the opportunity to evaluate the spatial accuracy of such highly resolved emissions from space. Here, we compare the city-level NOx emissions from a proxy-based emission inventory, the Multi-resolution Emission Inventory for China (MEIC), with those from a highly resolved emission inventory, the Multi-resolution Emission Inventory for China - High Resolution (MEIC-HR), which contains nearly 100 000 industrial facilities, and evaluate both against NOx emissions derived from TROPOMI NO2 tropospheric vertical column densities (TVCDs). We find that the discrepancies in city-level NOx emissions between MEIC and MEIC-HR are influenced by the proportions of emissions from point sources and by NOx emissions per industrial gross domestic product (IGDP). The use of IGDP as a spatial proxy to disaggregate industrial emissions tends to overestimate NOx emissions in the MEIC for cities with lower industrial emission intensities or fewer industrial facilities. The NOx emissions of 70 cities are derived from one year of TROPOMI NO2 TVCDs using the exponentially modified Gaussian function. Cities with higher industrial point source emission proportions in MEIC-HR agree better with the satellite-derived emissions, indicating that integrating more point sources into the inventory would improve the spatial accuracy of emissions at the city scale. In the future, more effort should be devoted to incorporating accurate locations of emitting facilities to reduce uncertainties in fine-scale emission estimates and guide future policies.

evaluate the spatial representations of emission inventories.
The traditional and widely used method to develop an emission inventory is the proxy-based method, which estimates emission totals at the administrative level (e.g. countries or provinces) and allocates them to fine grids using spatial proxies (e.g. population density, nighttime light, road networks) [1-6]. The proxy-based method assumes that emissions are linearly correlated with the selected proxies, which might introduce biases since the energy consumption and emission level per unit of proxy are spatially heterogeneous [7-13]. Previous studies have shown that high-resolution (<0.25°) emissions generated by the proxy-based method tend to overestimate emissions in urban areas and underestimate emissions in suburban and rural regions [14]. Such biases were also found to propagate into uncertainties in high-resolution chemical transport models (CTMs) [15-19].
In the past few years, extensive and detailed bottom-up information has supported the development of highly resolved emission inventories [14, 16-18, 20-24], especially for the industrial sector. Emissions are estimated at the factory or even unit level, involving unit-specific activity data, emission factors and pollution removal efficiencies [14, 16-18, 20-24]. These data are distributed to fine grids in the form of point sources using exact geographical coordinates instead of spatial proxies, which allows emission disaggregation to high spatial resolution (1 × 1 km) [18]. Recent efforts indicate that emissions estimated from point sources exhibit better agreement with satellite observations [15, 24] and surface observations [14, 16, 18]. Sensitivity analysis through CTMs suggests that spatial biases are reduced significantly as the point source shares increase [14]. However, the discrepancies in city-level emissions between proxy-based and highly resolved emission inventories have rarely been studied and evaluated. Direct evaluation of emissions against independent data sources is needed.
In recent years, satellite remote sensing has become an important tool to validate emission inventories because of its high spatial resolution and wide spatial-temporal coverage. High-resolution measurements (e.g. from the Ozone Monitoring Instrument, OMI) provide an independent way to quantify emissions through Gaussian models [25-31], eliminating the need for CTMs and a priori emission inventories. Beirle et al [25] simultaneously derived the NOx lifetimes and emissions of isolated megacities from the downwind evolution of OMI NO2 in different wind directions, a method that was subsequently modified by Liu et al [28] to be applicable to polluted areas [24, 30, 31]. With the launch of the TROPOspheric Monitoring Instrument (TROPOMI) in 2017 [32], data with unprecedented pixel sizes (3.5 × 7 km²) and signal-to-noise ratios made it possible to resolve plumes even from a single overpass [33, 34]. Unlike OMI, which requires long-term (one year or more) data for NOx estimates, TROPOMI can quantify NOx emissions from seasonal or subseasonal data, and under extremely special conditions even data from a single day are sufficient [34]. Beirle et al [33] have demonstrated that NOx emissions from large point sources can be mapped at high spatial resolution based on TROPOMI measurements and the continuity equation.
In this work, we first compare the city-level NOx emissions from a proxy-based emission inventory and a highly resolved emission inventory that contains nearly 100 000 industrial facilities over China to study their differences and the related impact factors. We investigate how point source shares influence emission allocations from provinces to cities under the premise of equal emissions at the province level. Then we use TROPOMI NO2 tropospheric vertical column densities (TVCDs) and the exponentially modified Gaussian (EMG) function to derive NOx emissions from Chinese cities. Finally, the city-level NOx emissions from the two emission inventories are evaluated using the satellite-derived NOx emissions.

Emission inventories
The proxy-based emission inventory evaluated in this work is obtained from the Multi-resolution Emission Inventory for China (MEIC, www.meicmodel.org) [35, 36] for the year 2017. In MEIC, only the coal-fired power sector is estimated as point sources based on unit-level data [14, 21], while other sectors such as industry, transportation and residential are treated as mobile or area sources. The highly resolved emission inventory used here is the Multi-resolution Emission Inventory for China - High Resolution (MEIC-HR), which was compiled by Zheng et al [18] and subsequently updated from 2013 to 2017. The MEIC-HR is built under the framework of MEIC but contains a large number of industrial point sources (i.e. nearly 100 000 industrial facilities). Factory- or facility-level information is combined from three industrial datasets to estimate emissions for each industrial facility. More details about the input datasets and the data processing for MEIC-HR can be found in Zheng et al [18].
The magnitudes of the NOx emissions for each sector are the same at the province level in MEIC and MEIC-HR (table 1), which ensures that the city-level emission discrepancies between these two inventories result only from spatial allocations rather than emission magnitudes. As shown in table 1, anthropogenic NOx emissions from all sectors in 2017 are 20.08 Tg, of which 37% are industrial emissions. Four industrial subsectors (i.e. boiler, iron, cement and glass) use different emission allocation methods in MEIC and MEIC-HR: the MEIC relies on the county-level industrial gross domestic product (IGDP), while the MEIC-HR is mapped based on the exact locations of the industrial facilities. We define a variable, the industrial point source emission proportion (i.e. industrial PS proportion), as the ratio of point-based emissions from boiler, iron, cement and glass in the MEIC-HR to emissions from all sectors. Although emissions from the power sector are also point sources, they are exactly the same in the two emission inventories and are not considered in the point source proportion calculation, as we mainly focus on the differences brought by the four industrial subsectors. Emission disaggregation for other industry, power, residential and transportation is the same in MEIC and MEIC-HR.
Although population and IGDP are widely used in spatial allocations, it is worth noting that city-level NOx emissions from industrial sectors are not linearly correlated with IGDP (figure 1). IGDP data and population data used in this study are from the China City Statistical Yearbook 2017 [37]. The R² between industrial NOx emissions and IGDP is 0.17, and the R² between NOx emissions from the industrial sector and the total population is as low as 0.07. As a consequence, uncertainties caused by the linear hypothesis in proxy-based methods need to be evaluated.
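The two diagnostics used above, the industrial PS proportion and the R² between emissions and a proxy, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and all numerical values are hypothetical and are not taken from MEIC, MEIC-HR or the yearbook data, and R² is taken to be that of an ordinary least-squares line with a single predictor.

```python
from statistics import mean

def industrial_ps_proportion(point_emissions, total_emissions):
    """Share of a city's all-sector NOx emissions that comes from the four
    point-based industrial subsectors (boiler, iron, cement, glass)."""
    if total_emissions <= 0:
        raise ValueError("total emissions must be positive")
    return sum(point_emissions.values()) / total_emissions

def r_squared(x, y):
    """R^2 of an ordinary least-squares line y ~ x, which for a single
    predictor equals the squared Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical city, emissions in kt per year (illustrative values only).
city_points = {"boiler": 6.0, "iron": 9.0, "cement": 4.0, "glass": 1.0}
share = industrial_ps_proportion(city_points, total_emissions=50.0)
print(f"industrial PS proportion: {share:.0%}")

# Hypothetical IGDP (10^8 CNY) and industrial NOx (kt) for five cities.
igdp = [100.0, 200.0, 300.0, 400.0, 500.0]
nox = [30.0, 10.0, 45.0, 15.0, 40.0]
print(f"R^2 between industrial NOx and IGDP: {r_squared(igdp, nox):.2f}")
```

A low R² of this kind is what motivates testing the linear assumption behind proxy-based allocation rather than taking it for granted.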

Satellite and wind data
TROPOMI is a payload on Sentinel-5 Precursor (S5P), which was launched on 13 October 2017 by the European Space Agency. The instrument achieves daily global coverage with a swath width of 2600 km and crosses the equator on its ascending node at approximately 13:30 local time. Compared to OMI, the significant improvements of TROPOMI are an unparalleled spatial resolution of up to 3.5 × 7 km² (upgraded to 5.5 × 3.5 km² on 6 August 2019) and 1-5 times higher signal-to-noise ratios, which provide more detail at fine scales and improve the quality and reliability of the captured data. NO2 is one of the main trace gases measured by TROPOMI, whose NO2 processor incorporates many retrieval improvements [38].
In this study, we use NO2 TVCDs over China from February 2018 to February 2019 from the official TM5-MP-DOMINO version 1.0.0 L2 offline product (www.temis.nl/airpollution/no2col/data/tropomi/). Pixels with a quality assurance value below 0.75 are filtered out to remove scenes with a cloud radiance fraction greater than 0.5 and erroneous retrievals [38]; the NO2 TVCDs are then mapped onto a 0.06° × 0.06° grid. Wind fields below 500 m with a spatial resolution of 0.36° and a temporal resolution of six hours are taken from the ECMWF reanalysis (www.ecmwf.int/en/research/climate-reanalysis/era-interim). Following the method described in Liu et al [28], the six-hourly ECMWF wind fields are interpolated in time to match the TROPOMI overpass time, and horizontal wind components are averaged in the vertical direction, weighted by layer height.
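The preprocessing described above can be sketched as below. This is a simplified illustration rather than the processing chain actually used: it assumes each pixel is assigned to the grid cell containing its centre (no footprint area weighting), it assumes linear interpolation of the wind in time, and the function names and sample values are hypothetical.

```python
from collections import defaultdict
import math

def grid_tvcd(pixels, cell_deg=0.06, qa_min=0.75):
    """Average NO2 TVCDs onto a regular latitude/longitude grid.

    pixels: iterable of (lat, lon, tvcd, qa) tuples, one per satellite pixel.
    Returns {(row, col): mean TVCD}, where row and col index cells of size
    cell_deg counted from latitude 0 and longitude 0.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, tvcd, qa in pixels:
        if qa < qa_min:  # drop cloudy or otherwise unreliable retrievals
            continue
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[key] += tvcd
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

def interpolate_wind(t, t0, w0, t1, w1):
    """Linearly interpolate one wind component between two analysis times.
    Times are in hours; linear interpolation is an assumption of this sketch."""
    frac = (t - t0) / (t1 - t0)
    return w0 + frac * (w1 - w0)

# Hypothetical pixels: (lat, lon, TVCD in 10^15 molec cm^-2, qa value).
sample = [
    (30.01, 114.02, 4.0, 0.90),
    (30.02, 114.03, 6.0, 0.80),
    (30.02, 114.03, 100.0, 0.50),  # rejected: qa below 0.75
    (30.10, 114.02, 2.0, 1.00),
]
gridded = grid_tvcd(sample)
u_overpass = interpolate_wind(7.5, 6.0, 2.0, 12.0, 4.0)
print(gridded, u_overpass)
```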

Fitting the NOx emissions and lifetimes of cities
The EMG model is used to derive the NOx emissions and atmospheric lifetimes of cities following the method developed by Liu et al [28], which is designed for sources located in a heterogeneously polluted background and is therefore applicable to our work. Wind fields are divided into calm and windy conditions with a threshold of 2 m s⁻¹, as in Beirle et al [25] and Liu et al [28]. The NO2 spatial distributions under calm conditions represent the spatial patterns of NOx emissions. Changes in the NO2 spatial patterns from calm to windy conditions indicate how long NO2 survives in the atmosphere. The function for the lifetime fit is expressed as follows:

$$N(x) = a \cdot [e \otimes C](x) + b \qquad (1)$$

$$e(x) = \exp(-x / x_0) \ \text{for} \ x \geq 0, \quad e(x) = 0 \ \text{otherwise} \qquad (2)$$

where N(x) represents a model function for the observed NO2 line densities (i.e. NO2 per cm), which are calculated by integrating the mean NO2 TVCDs perpendicular to the wind direction. The removal of NO2 can be described simply as a first-order loss, and thus the truncated exponential decay function e(x) represents the chemical decay of NO2. C(x) represents the NO2 patterns under calm conditions, a and b account for systematic deviations between windy and calm conditions, x_0 is the e-folding distance downwind, and w is the mean wind speed. A nonlinear least-squares fit of N(x) is performed to derive a, b and x_0, and the e-folding distance x_0 is divided by the average wind speed w to obtain the lifetime τ. The emission rate is obtained in three steps according to the mass balance (equation (4)). First, the total NO2 mass A_NO2 under calm conditions around the centers of emitting sources is quantified (equation (5)). We use a correction factor f(σ_i) based on the fitted width of the Gaussian plume to scale A_NO2 to account for possible losses by cross-wind dilution, as in Liu et al [28] (see equation (6)). Then NO2 is converted to NOx with a conversion coefficient of 1.32, as suggested by Beirle et al [25] and Liu et al [28].
Finally, the NOx mass A_NOx is divided by the corresponding lifetime τ to obtain NOx emission rates (equation (4)). The most complicated part of the above process is the calculation of the total NO2 mass under calm conditions, for which the fitting model is given by equation (5):

$$g_i(x) = A_{NO_2} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(x - X)^2}{2\sigma_i^2}\right) + \varepsilon_i + \beta_i x \qquad (5)$$

where g_i(x) is a modified Gaussian function for the NO2 line densities (i.e. NO2 per cm) under calm wind, i denotes the wind direction of rotation, A_NOx and A_NO2 are the NOx amount and NO2 amount, respectively, X is the location of the inversion site, ε_i and β_i represent the assumption of a linearly varying background concentration, and σ_i is the standard deviation of g_i(x). We apply nonlinear least-squares fitting methods to calculate the lifetimes and emissions of cities. In accordance with the settings in Liu et al [28], we use 600 km as the fitting interval and 300 km as the integration interval for the lifetime fit, and we set a fitting interval of 200 km (300 km for the Pearl River Delta) and an integration interval of 40 km (60 km for the Pearl River Delta) for the total NO2 mass fit. Strict constraints are set on fit performance, as in Liu et al [28]: R ⩾ 0.9, confidence intervals of the lifetime <10 h, confidence intervals of the NO2 amount ⩽0.8 times the fitted amount, and lower bound of the confidence intervals >0. The fitting model shows a robust ability to derive the lifetimes and emissions of cities. Uncertainties resulting from the parameters used in the model are discussed in the Uncertainties section.

Discrepancies in city-level NOx emissions between MEIC and MEIC-HR
Figure 2(a) compares the city-level NOx emissions from MEIC and MEIC-HR. Although the R² value reaches 0.87, most cities deviate from the 1:1 line; 96 cities, approximately 39% of all cities, deviate by more than 25%. Since provincial NOx emissions in MEIC and MEIC-HR are the same, this result indicates that different spatial disaggregation methods lead to city-level emission discrepancies.
The differences between city-level NOx emissions in MEIC and MEIC-HR are related to the industrial PS proportions in each city: cities with high industrial PS proportions fall below the 1:1 line, and vice versa. As the industrial PS proportions increase from 10% to over 70%, the median MEIC/MEIC-HR ratio decreases from 1.26 to 0.47 (figure 2).
In the MEIC, IGDP is used as the spatial proxy to allocate industrial NOx emissions from the province level to cities and counties, which may induce biases since NOx emissions per IGDP are heterogeneous among regions. Figure 3 presents the discrepancies in city-level NOx emissions between MEIC and MEIC-HR and their relationships with NOx emissions per IGDP and industrial PS proportions in each city. Within a given range of industrial PS proportions, the average MEIC/MEIC-HR ratio decreases as NOx emissions per IGDP increase. For example, when industrial PS proportions are less than 10%, the average MEIC/MEIC-HR ratio declines from 1.38 to 1.07 as the NOx emissions per IGDP increase from 35 to over 140 t/10⁸ CNY. Compared to real-world emissions, the assumption that NOx emissions are linearly associated with IGDP tends to allocate too many emissions to cities with low emission intensities and too few to cities with high emission intensities, which results in overestimation and underestimation of NOx emissions in the MEIC, respectively. Similarly, within a given range of NOx emissions per IGDP, the average MEIC/MEIC-HR ratio decreases as industrial PS proportions increase. For example, when the NOx emissions per IGDP are between 35 and 70 t/10⁸ CNY, the average MEIC/MEIC-HR ratio decreases from 1.28 to 0.65 as industrial PS proportions increase from 10% to over 70%. As the industrial PS proportions defined in our study come from highly polluting industries such as iron and cement, lower industrial PS proportions within a given range of emission intensity indicate that a larger share of the IGDP in those cities is contributed by other, cleaner industries. Therefore, using IGDP to allocate emissions from the province level to cities tends to overestimate the industrial emissions of such cities.
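The mechanism behind these ratios can be illustrated with a toy province of two cities that have equal IGDP but different emission intensities. The sketch below is purely illustrative: the city names, IGDP values and intensities are hypothetical, and the facility-based emissions simply stand in for a point-source inventory.

```python
def allocate_by_proxy(province_total, proxy_by_city):
    """Split a provincial emission total among cities in proportion to a proxy."""
    proxy_sum = sum(proxy_by_city.values())
    return {city: province_total * value / proxy_sum
            for city, value in proxy_by_city.items()}

# Hypothetical inputs: IGDP in 10^8 CNY, intensity in t NOx per 10^8 CNY.
igdp = {"city_A": 1000.0, "city_B": 1000.0}
intensity = {"city_A": 40.0, "city_B": 120.0}

# Facility-based emissions (t), standing in for the point-source inventory.
facility_based = {city: igdp[city] * intensity[city] for city in igdp}
province_total = sum(facility_based.values())

# Proxy-based allocation of the same provincial total, using IGDP shares.
proxy_based = allocate_by_proxy(province_total, igdp)
ratio = {city: proxy_based[city] / facility_based[city] for city in igdp}
print(ratio)
```

Because both cities receive half of the provincial total, the low-intensity city is overestimated (ratio above 1) and the high-intensity city is underestimated (ratio below 1), while the provincial total is unchanged.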

City-level NOx emissions derived from satellite observations
We apply the EMG model described in section 2.3 and derive the NOx lifetimes and emissions of 70 cities in China from TROPOMI measurements (table S1 (available online at stacks.iop.org/ERL/16/084056/mmedia)). High spatial resolution and low noise give TROPOMI more effective pixels and a significantly better ability to represent spatial variability than any previous instrument [39, 40], which increases the likelihood of meeting the constraints on model performance (see the last paragraph of section 2.3 for details). As the key to deriving NOx lifetimes and emissions is to ensure strong gradients of the NO2 signal around the source, the improved estimation of NOx lifetimes and emissions from high-quality satellite measurements is directly reflected in the greater number of derived cities compared to previous work [28].
The fitted NOx lifetimes range from 1.0 h to 9.3 h in this work, which is consistent with the results reported in the literature [24-26, 28]. The site with the highest NOx emission rate is the Pearl River Delta (up to 331 mol s⁻¹), while the lowest is Wuzhou (only 2 mol s⁻¹). Figure 4   and Sichuan-Chongqing region, which are consistent with the spatial patterns of NOx emissions of the investigated cities in this study.
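To make the link between the fitted quantities and an emission rate in mol s⁻¹ concrete, the sketch below follows the verbal description of the EMG approach in the fitting section: a discrete form of the windy-condition model function, the lifetime as e-folding distance over wind speed, and the mass balance with the NOx/NO2 factor of 1.32. It is a forward-model illustration only, not a fitting routine; the function names and input values are hypothetical, and the dilution correction is represented by a placeholder argument because its functional form is not reproduced here.

```python
import math

def model_line_density(calm, dx_km, x0_km, a, b):
    """Discrete version of N(x) = a * (e (*) C)(x) + b, with (*) a convolution.

    calm: calm-wind line densities C(x) sampled every dx_km along the wind
    direction. e(x) = exp(-x / x0) for x >= 0 and 0 otherwise, so each point
    only receives contributions from itself and from upwind points.
    """
    out = []
    for i in range(len(calm)):
        conv = sum(calm[j] * math.exp(-(i - j) * dx_km / x0_km)
                   for j in range(i + 1)) * dx_km
        out.append(a * conv + b)
    return out

def lifetime_hours(x0_km, wind_m_per_s):
    """Lifetime tau = x0 / w, converted from seconds to hours."""
    return x0_km * 1000.0 / wind_m_per_s / 3600.0

def nox_emission_rate(a_no2_mol, tau_hours, nox_to_no2=1.32, scale=1.0):
    """Mass balance E = A_NOx / tau in mol s^-1, with A_NOx = 1.32 * A_NO2.

    scale is a placeholder for the cross-wind dilution correction factor
    (assumed to be 1 here).
    """
    return nox_to_no2 * scale * a_no2_mol / (tau_hours * 3600.0)

# Hypothetical inputs for illustration only.
tau = lifetime_hours(x0_km=54.0, wind_m_per_s=5.0)
rate = nox_emission_rate(a_no2_mol=1.0e6, tau_hours=tau)
profile = model_line_density([0.0, 0.0, 1.0, 0.0, 0.0], 10.0, 10.0, 1.0, 0.5)
print(f"lifetime = {tau:.1f} h, emission rate = {rate:.0f} mol/s")
```

In an actual retrieval, the parameters a, b and x_0 would be obtained by a nonlinear least-squares fit of the model profile to the observed windy-condition line densities, subject to the quality constraints listed in the fitting section.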

Evaluation through satellite-based NOx emissions
To evaluate the representations of city-level NOx emissions in MEIC and MEIC-HR against satellite-based estimates, emissions in the two emission inventories are summed over an area of 40 × 40 km around the center of each investigated site, in accordance with the integration interval used in the model function for the total NO2 mass. Figures 5(a) and (b) compare the NOx emissions of 70 cities derived from TROPOMI NO2 TVCDs with those from the two emission inventories. City-level NOx emissions in MEIC-HR (r² = 0.74) show better agreement with satellite-based estimates than those in MEIC (r² = 0.63). The improvements are related to industrial PS proportions: for cities with high industrial PS proportions (>60%), the root mean square error (RMSE) is 26.0 and 43.9 mol s⁻¹ for MEIC-HR and MEIC, respectively, and the normalized mean error (NME) is 44.6% and 65.4%, respectively.
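The spatial aggregation step can be sketched as a simple box sum over gridded emissions. This is an illustrative assumption about the bookkeeping, not the authors' code: grid cells are represented by their centre coordinates in kilometres in a local projected frame, a cell is counted when its centre lies inside the box, and the names and values are hypothetical.

```python
def sum_emissions_in_box(cells, center_x_km, center_y_km, box_km=40.0):
    """Sum gridded emissions whose cell centres fall inside a square box.

    cells: iterable of (x_km, y_km, emission) tuples in a local projected
    coordinate system; box_km is the side length of the box.
    """
    half = box_km / 2.0
    return sum(emission for x, y, emission in cells
               if abs(x - center_x_km) <= half and abs(y - center_y_km) <= half)

# Hypothetical cells around a site at the origin (emissions in mol s^-1).
cells = [(0.0, 0.0, 5.0), (10.0, -15.0, 3.0), (25.0, 0.0, 7.0), (-19.0, 19.0, 1.0)]
city_total = sum_emissions_in_box(cells, 0.0, 0.0)
print(city_total)
```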

Figures 5(c) and (d) further compare the NOx emissions in the emission inventories with satellite-based estimates as a function of industrial PS proportions. When industrial PS proportions are relatively low (<40%), the performance of MEIC-HR and MEIC is comparable because the NOx emissions of cities in MEIC-HR and MEIC are close to each other (section 3.1). As industrial PS proportions increase from 40% to over 80%, the RMSE and NME of MEIC-HR are always lower than those of MEIC in the same segments of industrial PS proportions, and the gap between them widens gradually. The results indicate that, as industrial PS proportions increase, the improvement in spatial representation achieved by the point-source-based bottom-up method over the proxy-based method becomes increasingly distinct. Because real-world emissions are more disproportionate to the IGDP in cities with high point source emission proportions, the linear assumption in the proxy-based method introduces greater deviations in emissions there. Consequently, highly resolved emission inventories represent real-world emission distributions better than proxy-based inventories, especially in regions with a large number of emitting facilities. It is crucial to raise point source emission proportions in emission inventories to reduce spatial biases caused by spatial proxies.
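The two error metrics used in this comparison can be computed as in the sketch below. The formulas are the commonly used definitions (RMSE as the root of the mean squared difference, NME as the sum of absolute differences divided by the sum of the reference values) and are assumed here, since they are not written out in the text; the sample values are hypothetical.

```python
import math

def rmse(inventory, satellite):
    """Root mean square error between inventory and satellite-derived emissions."""
    n = len(satellite)
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(inventory, satellite)) / n)

def nme(inventory, satellite):
    """Normalized mean error: sum of absolute differences over the sum of the
    satellite-derived (reference) emissions, returned as a fraction."""
    return sum(abs(m - o) for m, o in zip(inventory, satellite)) / sum(satellite)

# Hypothetical emissions for three cities, in mol s^-1.
satellite = [10.0, 20.0, 30.0]
inventory = [12.0, 18.0, 36.0]
print(f"RMSE = {rmse(inventory, satellite):.2f} mol/s, "
      f"NME = {nme(inventory, satellite):.1%}")
```

Applying the same two functions separately to each bin of industrial PS proportions would give the kind of segment-by-segment comparison described above.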

Uncertainties
There are several uncertainties in this work. Liu et al [28] systematically evaluated the uncertainties in the EMG model. Here, following Liu et al [28], we estimate the uncertainties of the NOx lifetimes and emissions from six aspects:
(a) Sensitivity tests are conducted to quantify the uncertainties from the choice of fit and integration intervals. The fitted lifetimes and emissions are found to be largely insensitive to this choice. Table S2 shows that, for the lifetime fit, every 100 km change in the fit interval f or the integration interval i causes lifetimes to vary by about 10%. For the total NO2 mass fit, 10% or 20% changes in the fit interval h or the integration interval v lead to changes of less than 20% in NOx emission rates.
(b) The 95% confidence interval and the standard error of the mean of the lifetimes under different wind directions are used to quantify the fit errors for each investigated site.
(c) Considering the height of the wind fields and the threshold for calm winds, the wind fields from ECMWF lead to an uncertainty of ∼30% for non-mountainous cities.
(d) We use a constant scaling factor of 1.32 to convert NO2 to NOx, which is a typical value under cloud-free conditions around noon (i.e. the TROPOMI overpass time), when polluted air masses support the formation of O3. Although the difference in O3 concentrations between upwind and downwind plumes might influence NOx/NO2, the effect is included in the overall uncertainty estimates because we average the fit results for different wind direction sectors, which usually represent different incoming O3 concentrations. The constant NOx/NO2 scaling factor brings an uncertainty of ∼10% to the derived emissions (but not lifetimes) based on previous studies [25, 28, 31, 33].
(e) The uncertainty of TVCDs is ∼30% according to previous studies [39, 41-43], which directly propagates to the NO2 amount and thus the estimated emissions (but not lifetimes). The uncertainty of TVCDs consists of additive and multiplicative terms. The multiplicative uncertainty comes from the tropospheric air mass factor, which depends on profile shape, cloud parameters, albedo and surface pressure [44]. The additive uncertainty, which arises from the spectral retrieval and the stratospheric correction, is eliminated by the fitted background in our study.
(f) The wind speed is an important impact factor for the fitted lifetime [26], which is assumed to be the same under calm and windy conditions in this work. The systematic difference in NO2 TVCDs between calm and windy conditions is about 20% (figure S1).
Taking the factors above into consideration, we conclude that the total uncertainties of lifetimes and emissions are within 42%-91% and 57%-99%, respectively. This uncertainty range is comparable to those of Liu et al [28] and Liu et al [31]. As illustrated in Liu et al [28] and Liu et al [31], under the assumption that all uncertain factors are independent, the total uncertainty calculated here is rather conservative.
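Under the stated independence assumption, relative uncertainties are typically combined in quadrature (root sum of squares). The sketch below shows that combination only; it is an assumed bookkeeping and does not attempt to reproduce the reported ranges, because the site-specific fit errors are not listed here. The component values for wind, NOx/NO2 ratio and TVCDs are taken from the text, while the fit error is a hypothetical placeholder.

```python
import math

def combine_in_quadrature(relative_uncertainties):
    """Total relative uncertainty of independent components (root sum of squares)."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))

# Components as fractions; fit_error is hypothetical and varies by site.
fit_error = 0.25
components = {
    "fit": fit_error,
    "wind": 0.30,
    "nox_no2_ratio": 0.10,
    "tvcd": 0.30,
}
total = combine_in_quadrature(components.values())
print(f"combined relative uncertainty: {total:.0%}")
```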

Implications
In this study, we use NOx emissions derived from TROPOMI NO2 TVCDs to evaluate the city-scale NOx emissions estimated from two bottom-up emission inventories. NOx is selected as an example here because the errors in NO2 satellite retrievals are relatively small [41, 45, 46] and the method to derive NOx emissions of cities from satellite data is well established [25, 28, 31, 34, 47]. The finding of this work, that raising point source emission proportions in an emission inventory could improve its spatial representation at the city scale, is also applicable to other air pollutant species (e.g. CO2 and SO2) from fossil fuel combustion, as they come from the same emission sources.
Recently, some studies have emphasized the importance of the spatial allocation method in the development of gridded emissions [7, 14-16, 18, 19]. For example, Zheng et al [18] compiled the MEIC-HR inventory with comprehensive industrial facility-level information and found that the point-source-based emission inventory substantially improves WRF/CMAQ model simulations at a high spatial resolution of 4 km compared to the proxy-based emissions. Our study adds new evidence from space supporting the accuracy of the spatial patterns represented by such a highly resolved emission inventory. However, due to the limitations of the emission retrieval method, only 70 cities are covered in our evaluation analysis. A more comprehensive evaluation in the future might compare modeled tropospheric vertical columns from CTMs at fine spatial resolution with the high-resolution TVCDs from TROPOMI, following the method in Geng et al [15].
As reported in previous studies [11, 14, 16-18], emissions from point sources exhibit spatial patterns that are inconsistent with those from spatial proxies (e.g. population density) at fine spatial scales, which indicates a decoupling between emission inventories constructed from large point source databases and spatial proxies. More importantly, the spatial patterns of such highly resolved inventories cannot be reproduced by any of the spatial proxies that have frequently been used before. Therefore, it is crucial to increase the share of point source emissions by including more detailed point source statistical data [11, 14, 16-18], conducting field surveys [17, 24, 48] or identifying missing point sources by means of satellite observations [29, 49, 50]. For example, three industrial datasets are combined in MEIC-HR to provide a synthesized facility-level database [18]. To develop high-resolution emission inventories, onsite surveys of industrial sources are conducted to obtain key parameters relevant to emission estimation [17, 24, 48]. Owing to their wide spatial coverage, high temporal and spatial resolution, and timely updates, satellite data can be used to detect emission sources, including those that are not captured in bottom-up emission inventories due to a lack of available data, especially in developing nations. McLinden et al [29] used SO2 measurements from OMI and found nearly 40 emission sources missing from conventional inventories, accounting for roughly 6%-12% of the global anthropogenic source. The combination of bottom-up and top-down information might lead to greater accuracy of fine-scale emission inventories in the future.

Conclusion
The proxy-based method of allocating emission totals to smaller administrative districts or grids might introduce biases, especially at fine scales. Developing highly resolved emission inventories is an effective way to support fine-scale emission characterization, and the resulting improvement in the spatial representation of emissions needs to be studied. In this work, we evaluate the effect of integrating a large number of point sources on emission allocations from the province level to cities using a proxy-based emission inventory (MEIC), a highly resolved emission inventory (MEIC-HR), and satellite observations. We first investigate the discrepancies in city-level NOx emissions between the two emission inventories and diagnose the related impact factors. City-level NOx emissions derived from TROPOMI NO2 TVCDs are then applied to evaluate the spatial representations of city-level emissions in MEIC and MEIC-HR.
We find that the discrepancies in city-level NOx emissions between MEIC and MEIC-HR are affected by the industrial PS proportions and NOx emissions per IGDP. As the industrial PS proportions increase from 10% to over 70%, the median MEIC/MEIC-HR ratio decreases from 1.26 to 0.47. When NOx emissions per IGDP are within a given range, the mean MEIC/MEIC-HR ratio decreases as industrial PS proportions increase. Similarly, within a given range of industrial PS proportions, the mean MEIC/MEIC-HR ratio declines as NOx emissions per IGDP increase. Therefore, using IGDP as a spatial proxy to allocate provincial industrial NOx emissions to cities tends to overestimate emissions in cities with lower industrial PS proportions (fewer industrial facilities or lower industrial emission intensities). NOx emissions of 70 cities are derived from TROPOMI NO2 TVCDs with the EMG model and applied to evaluate the spatial accuracy of the two emission inventories. City-level NOx emissions in MEIC-HR (r² = 0.74) show better agreement with satellite-based estimates than those in MEIC (r² = 0.63). The improvement is more pronounced for cities with high industrial PS proportions (>60%), where the RMSE is 26.0 and 43.9 mol s⁻¹ and the NME is 44.6% and 65.4% for MEIC-HR and MEIC, respectively, indicating that incorporating a comprehensive point source database will improve the spatial representation of emission inventories.
This work emphasizes the importance of integrating more point sources in emission inventories to improve the spatial accuracy. Our study provides a framework to propagate top-down information from satellite to the evaluations of bottom-up inventories, which could support the development and refinement of highly resolved emission inventories in the future. More efforts should be made to develop highly resolved emission inventories to facilitate atmospheric research and air quality management.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.