Investigating high methane emissions from urban areas detected by TROPOMI and their association with untreated wastewater

Even though methane concentrations have contributed an estimated 23% of climate forcing, part of the recent increases in the global methane background concentrations remain unexplained. Satellite remote sensing has been used extensively to constrain emission inventories, for example with the TROPOspheric Monitoring Instrument which has been measuring methane since November 2017. We have identified enhancements of methane over 61 urban areas around the world and estimate their emissions using a two-dimensional Gaussian model. We show that methane emissions from urban areas may be underestimated by a factor of 3–4 in the Emissions Database for Global Atmospheric Research (EDGAR) greenhouse gas emission inventory. Scaling our results to the 385 urban areas with more than 2 million inhabitants suggests that they could account for up to 22% of global methane emissions. The emission estimates of the 61 urban areas do not correlate with the total or sectoral EDGAR emission inventory. They do however correlate with estimated rates of untreated wastewater, varying from 33 kg person−1 year−1 for cities with zero untreated wastewater to 138 kg person−1 year−1 for the cities with the most untreated wastewater. If this relationship were confirmed by higher resolution remote sensing or in situ monitoring, we estimate that reducing discharges of untreated wastewater could reduce global methane emissions by up to 5%–10% while at the same time yielding significant ecological and human co-benefits.


Introduction
Global methane concentrations increased at the highest rate on record during 2021 [1,2], but trends have been inconsistent in the past leading to divergent hypotheses about emission variability [3]. Methane is the second most important contributor to climate change behind carbon dioxide, and yet significant uncertainties remain in the emission inventories [4][5][6].
Satellite detection of methane has been used extensively to constrain emission inventories [7]. The TROPOspheric Monitoring Instrument (TROPOMI) on board the Sentinel 5 Precursor satellite [8] provides unprecedented resolution and coverage of methane concentrations. Estimates of columnaverage dry-air mole fractions of methane (XCH4) were found to be in agreement with other satellite instruments as well as with ground-based observations from the Total Carbon Column Observing Network [9]. This provides a global record with 4 years of data at a grid resolution of up to 7 by 5.5 km 2 since August 2019.
Inverse modeling has been used extensively to improve emission estimates at the global scale [10] as well as regionally and locally. New satellites have enabled the identification of individual methane plumes and the estimation of emissions from point sources [11] and local oil and gas facilities [12]. Statistical methods have also been used to evaluate emissions, for example by using emission ratios for urban areas [13] and for oil and gas producing regions [14].
Field campaigns measuring methane concentrations in cities in the USA have found that the official emission inventories may be underestimated by a factor of two to three [15,16]. These have focused on cities in the East Coast that are prone to natural gas leaks. Comparisons with ethane concentrations suggest that 60%-90% of methane could come from natural gas leakage [17][18][19]. In Indianapolis, in the Midwest, aircraft and tower measurements suggest that around 50% of methane emissions could be related to natural gas usage [20]. Remote sensing experiments in Los Angeles found that there is a strong seasonal component to methane concentration variability that correlates with commercial and residential natural gas consumption, but not industrial use [21]. A more modest seasonal signal found in the Baltimore/Washington, DC region was also suggested to be consistent with natural gas usage [22]. Similar results have been found in Europe from an intensive measurement campaign targeting identified leaks [23], although large uncertainties remained for distributed sources.
Mobile mapping of urban leaks in 12 urban areas in the USA combined with pipeline infrastructure information from the Pipeline and Hazardous Materials Safety Administration suggest that methane emissions from gas leaks could be five times larger than the estimates in the US Environmental Protection Agency's greenhouse gas inventory [24]. Emissions from gas leaks were further found to be affected by sociodemographic indicators at the level of census tracts indicating inequities in system maintenance [25]. Nevertheless, analysis of top-down and bottomup inventories suggest that some of the discrepancies in the inventories may be due to diffuse methane plumes that could originate 'beyond-the-meter' at the level of individual buildings [26,27].
In addition to the measurements in cities with older natural gas infrastructure, there have been some measurements in cities with less leaky distribution systems. In Paris, mobile mapping of methane leaks found 56% of street-level emissions to be due to gas leaks, and 33% due to emissions from sewage lines [28]. In Bucharest, measurements of isotopes of carbon and hydrogen in methane suggest that a large share of emissions could be associated with wastewater [29]. In particular emissions rates in Bucharest were found to be four times greater than in Hamburg and six times greater than in Paris, suggesting that wastewater treatment infrastructure could be an important variable in estimating methane emissions. Similar isotopic measurements in Cincinnati, Ohio, suggest that half of sites had biogenic methane while only one tenth had thermogenic methane [30]. This highlights the importance of investigating sewers as non-point sources of methane that could account for discrepancies between measurements and inventories described above.
We focus on urban enhancements of XCH4 for large cities around the globe. TROPOMI level 2 retrievals at the original swath resolution were oversampled to 1 km grids over urban areas [31,32]. In 61 cases, there were significant hotspots over the urban center that were distinct from the regional background levels. We use a two-dimensional Gaussian model to estimate the local emissions required to cause the observed XCH4 enhancements and compare them with emissions from the Emissions Database for Global Atmospheric Research (EDGAR) inventory [33] as well as rates of untreated wastewater [34].

Satellite retrievals
The methane total column-average dry-air mole fraction (XCH4) was obtained at the original swath resolution for orbits 373 (10 November 2017) to 20 513 (28 September 2021). We use the SRON RemoTeC-S5P XCH4 product, version 17 [35]. The swath retrievals were oversampled to Lambert Conformal maps of 1 km resolution covering 300 km by 300 km rectangles around each urban area. Average concentration grids were generated using all available data.
Retrievals of XCH4 are sensitive to the land surface albedo. There are known systematic biases in the standard XCH4 products for pixels with both particularly low and high surface albedos [9]. The SRON product uses a correction function that was developed based on reference regions around the globe. This correction led to changes in retrieved XCH4 of up to 2% for low albedos. Urban areas are known to have moderate albedo values which belong to the range where retrieval errors are smaller than for areas with lower albedos [9]. Specifically, analysis over Buenos Aires found that the urban signal in the XCH4 product was robust relative to the variation of land surface properties [36].
The quality assurance (QA) flag for the XCH4 product is either 1 if all filtering thresholds have been met, or 0.4 if one of the filtering thresholds has been exceeded [35]. We tested our method using either only QA values of 1, or QA values of both 0.4 and 1.
All swaths have 215 pixels in the cross-track direction, with the highest resolution in the center of the swath (40 km 2 ) and the lowest resolution (206 km 2 ) on the outside edges of the band. We tested our method using 2 options: option 1 contained all swath pixels, option 2 contained ground pixels from 46 to 171, which corresponds to a resolution of 60 km 2 or finer.
We found that all four QA and swath options showed urban enhancements, but that the option including all QA flags of 0.4 and 1.0, and only the center of the swath gave the best resolution. Including only QA flags of 1.0 seemed to decrease the number of available data points in the city and exacerbated the contamination of the average grids from pixels that had few valid retrievals.
We screened data using the landuse flag, keeping only pixels flagged as 0 (land) or 2 (some water), and rejecting 1 (coasts) and 3 (water bodies). We found that limiting the land flag to keep only land (0) pixels removed too much data. Keeping the pixels with some water (2) on the other hand did not remove enough data. By selecting both 0 and 2, the grids contained many swath pixels that were contaminated by water either along coastlines, along rivers or due to the presence of small lakes, ponds and reservoirs. One solution was to screen the urban grids according to the number of swath pixels available in each gridded cell. By removing the 10% of gridded cells with the fewest retrievals, we avoided most of the outliers due to water contamination without having to throw out all the pixels with a land flag of 2.
To better screen for the presence of water, we used 30 m resolution global surface water maps [37]. We created 1 km resolution water mask by counting all water subpixels within each grid cell. Because the swath resolution is 3.5 by 5.5 km 2 or more, a 1 km grid cell can be contaminated by water coverage in neighboring cells. We therefore tested our screening procedure using different levels of Gaussian smoothing on the water mask. We tested our method with three options: 1. No smoothing and 20% or more water coverage; 2. One pass of smoothing and 10% or more water coverage; 3. Two passes of smoothing and 5% or more water coverage. This removed an increasing number of cells. Our results were found to be robust across these different options (appendix B).

Urban population
We screened for candidate locations by aggregating the TROPOMI data to 0.1 degree resolution maps. Based on these maps, we identified 136 urban areas with the potential for a methane hotspot. We generated 1 km resolution maps of 300 km by 300 km areas around each of these, and found that 61 had a distinct XCH4 enhancement that could be used for estimating methane emissions using the two-dimensional Gaussian algorithm described below. In 14 cases, the XCH4 enhancement included the urban area and the surrounding region.
We used a 30 s resolution map of global population [38] to estimate the population for all the urban areas in the analysis and to plot urban contours.
For the scenario analysis, we identified 385 urban areas with more than 2 million inhabitants using a list of 40 000 urban areas obtained from https://simplemaps.com/data/world-cities. These were agglomerated into 651 clusters using a k-means algorithm and a search distance of 0.5 degrees. Of these, 385 were larger than 2 million inhabitants: 219 in China and 166 in the rest of the world.

EDGAR emission inventory and wastewater database
Methane emissions were obtained from the EDGAR greenhouse gas emission inventory, version 6.0 [33], which uses the IPCC methodology for most sectors [39,40]. The data are available by emission sector on a 0.1 degree resolution grid for the year 2018. We combined energy production (ENE) and fossil fuel production (PRO = PRO_COAL + PRO_OIL + PRO_Gas) into a single fossil fuel (FFL) category. We combined enteric fermentation (ENF), agricultural soils (AGS), manure management (MNM) and agricultural waste burning (AWB) into a single merged agriculture (AGM) category.
Wastewater production, collection, treatment and reuse data were obtained on a global grid with 5 arcmin resolution [34].
For the analysis, estimates of methane emissions and wastewater flow rates were summed for all grid points within an urban area with a radius determined from the XCH4 hotspot and the 2D Gaussian fit.

Emission estimates using a two-dimensional Gaussian model
To estimate the emissions of methane corresponding to a hotspot of XCH4, we need to estimate the amount of methane above the background for the urban area along with a representative life time. The mass of methane was estimated using a two-dimensional Gaussian model [41,42], using equation (3).
where µ x and µ y are the distances of the urban hotspot from the coordinate origin; σ x , σ y determine the length scale of the Gaussian in the x and y directions; ρ is a rotation parameter for the ellipse defined by σ x , σ y ; and a is the integral of XCH4 under the twodimensional Gaussian. A non-linear optimization algorithm was used to determine the free parameters in equation (3), using regularization terms to prevent non-physical solutions: The cost function was augmented to include the square of the distance terms (µ x , µ y ) and the length scale terms (σ x , σ y ). Plotting the magnitude of the fitting parameters versus the magnitude of the model fit as a function of the regularization coefficients usually follows an L-curve. The coefficients are then chosen near the elbow in the curve: not so low that they have no impact on moderating the fitting parameters, and not so high that they prevent a realistic approximation of the XCH4 hotspot. Based on inspection of the results, we determined coefficients of 1 × 10 −6 for the distance terms and 1 × 10 −5 for the length scale terms. No regularization was applied to the background and the magnitude terms.
In some cases, the optimization algorithm drifted too far from the urban area or included an area that was larger than the urban hotspot. To prevent this, we constrained the distance terms (µ x , µ y ), the background term (bkg) and the length scale terms (σ x , σ y ). Some of these limits were adjusted based on visual inspection of the spatial maps and the Gaussian fits.
The distance terms were constrained to be smaller than 10 km in most cases, with some cities restricted to 5 km because of the presence of neighboring hotspots (e.g. Philadelphia, Baltimore, Washington DC, Seoul), while some cities had larger limits because they were part of regional hotspots (e.g. Yangon, Hanoi, Bangkok, Xian, Guayaquil). The background term (bkg) was limited to between the 5% and 50% level of the XCH4 data. The length scale terms (σ x , σ y ) were constrained to be smaller than 30 km for most cases. Some cities had tighter hotspots and hence a smaller limit was used, for example Toronto, Philadelphia, Baltimore and Washington DC. Other cities had more extensive hotspots for which the constraint was relaxed. Cities that were part of regional enhancements had the longest σ values to account for the entire regional hotspot.
A single average radius (r) was determined as the geometric mean of the major and minor axes of the ellipse defined by σ x , σ y and ρ above. The length scale was taken to be l = r √ 2π as this matches the length scale needed to estimated the area under a Gaussian in a box model. In this way, the model can be shown to be mathematically similar to a box model used with GOSAT retrievals [43] and a line integral method used for oil and gas production regions [44].
The average wind speed (ws) was calculated from the ERA-5 10 m winds included in the retrievals for only the grid points used in the analysis. As the lifetime of methane is of the order of years, the time scale required to obtain emissions from the model is effectively the lifetime due to wind transport [41]. The time scale τ was obtained by dividing the length scale by the average wind speed τ = l/ws. The average length scale of the Gaussian match varies from 21 km for Washington, D.C., to 160 km for Tokyo; the average wind speed varies from 1.4 m s −1 for Lagos to 3.9 m s −1 for Minsk (table C3). Wind speed is a function of the local climate, with the weakest winds near the equator and increasing winds in the midlatitudes.
The emission rate was estimated by dividing the mass by the life time: Emiss = Mass/τ . The mass of methane in the hotspot was obtained by scaling a, which is in ppb, to kg m −2 using the mass of the column of air above the hotspot. The column was assumed to extend from the surface to the top of the troposphere (100 hPa).

Evaluation and uncertainty of the emission estimates
The method for deriving emission estimates was applied to three regional source regions that have been characterized in the literature (appendix A). Figure A1 shows the two-dimensional Gaussian fits for the Permian Basin, Texas; the Central Valley, California; and the South Sudan Wetlands. Our method was in agreement with published estimates or the Permian Basin and for the South Sudan Wetlands. Our estimates were higher for the Central Valley by a factor of two, although there is a larger range of emission estimates in the literature in this case (appendix A). Our results were also found to be consistent with Bayesian inversion estimates for four cities [36], albeit with a scaling factor of around 1.5.
As described above, pre-processing the TRO-POMI retrievals for QA flag, landuse, swath position, and water mask was an important step in the analysis. Furthermore, for some cities, we constrained the maximum value of the length scale terms (σ x , σ y ) which introduces a further source of uncertainty. Appendix B describes the uncertainty analysis that we carried out to evaluate the robustness of the emission estimates relative to these factors. We used two levels of uncertainty analysis: For level 1 we performed 36 simulations with varying parameters for each urban area. For level 2 we performed an additional 18 simulations per city based on the optimal parameters determined in level 1. Level 1 was used to identify the 61 urban areas with coefficients of variation below approximately 0.3. Level 2 showed that the median uncertainty in our emission estimates was around 15%.

Urban methane emissions
Oversampled 1 km grids of TROPOMI XCH4 were averaged over the entire time period from 10 November 2017 to 28 September 2021. Clear enhancements were detected over many urban areas, as shown in figure 1 for 6 representative cases. Urban XCH4 enhancements were calculated as the difference between the average XCH4 concentrations based on the diameter of the 2D Gaussian fit around the urban center and the band extending to two times this diameter. The excess methane concentrations reached a maximum of 22.5 ppb in Karachi and 21.1 ppb in Dhaka (table C1).
We found 61 urban areas with a distinct XCH4 hotspot that could be used to estimate emissions using a 2D Gaussian fit out of 136 sites for which we processed high resolution XCH4 grids (table C1, figure 2). Figure C1 shows the averaged TROPOMI  XCH4 retrievals alongside the two-dimensional Gaussian fit. The study period spans 1419 days. For Dhaka, scenes containing some valid data were obtained on 887 separate days, and grid cell averages contained a maximum of 430 data points per cell. These numbers vary according to latitude, meteorology and land surface.
Fourteen urban areas were within regions of high XCH4 concentrations where it was not possible to distinguish between urban and regional emissions (table C2). In some cases this may be because the city is surrounded by rice cultivation, for example in Yangon and Bangkok. In other cases, it may be because of regional urban development. Boston is located in an urbanized region and has natural gas leaks from aging pipelines. However, efforts at fixing leaks have not led to reduced methane concentrations suggesting that other factors may be at work [27].
There was no XCH4 enhancement at 26 of the 136 sites that we screened, despite having TROPOMI XCH4 retrievals over those regions. This suggests that the emissions were below detection levels although other factors could impact the retrievals as described for the urban areas with insufficient data below. Further analysis with higher resolution satellites or in situ monitoring would be required to analyze these sites. As an example, we find no enhancement in TRO-POMI XCH4 over Paris, France, but mobile measurements found natural gas leaks accounting for 56% of emissions and emissions from sewage lines accounting for 33% of emissions [28]. Minneapolis and Milwaukee present interesting cases as they are quite similar to each other, yet Milwaukee has a clear XCH4 signal whereas Minneapolis does not. Further measurements would be needed to determine if this is due to retrieval issues or whether the cities actually have different methane emissions.
The last 35 sites did not have enough high quality data to obtain XCH4 fields, which was mostly due to: 1. persistent cloudiness; 2. the presence of water (coastal and riverine cities and cities with either large or small lakes); 3. mountainous terrain; or 4. desert cities with particularly high and variable albedos in surrounding areas. Regions near the equator had the highest chance of not having a reliable retrieval due to the cloudiness associated with the Intertropical Convergence Zone.
Total emissions for each urban area ranged from 1391 kilotonnes per year (ktpy) for Karachi to 85 ktpy for Rotterdam. The per capita emission rates varied from 342 kg person −1 year −1 for Baltimore to 13 kg person −1 year −1 for Seoul. Some of the largest emissions occurred in Asia, whereas the largest emissions per capita tended to be in North America.

Comparison with EDGAR emissions inventory
To explore the possible sources of urban methane, we compare the TROPOMI estimates with emissions in the EDGAR inventory [33]. We split the 61 urban areas into 4 groups based on a mix of geography and income level from the World Bank classification of countries. The High Income Countries Plus Europe (HICP) group contains 4 cities from Asia, 3 from Oceania, 15 from North America, 5 from Europe as well as 3 European cities from Upper Middle Income Countries. The Asia group contains 17 cities from Lower Middle Income Countries and 1 from an Upper Middle Income Country. The 7 Latin American cities and 5 African cities were combined into geographical groups irrespective of income level. Figure 3 shows that overall the TROPOMI estimates are much higher than the EDGAR estimates: for the 61 urban areas, we estimate around 25 000 ktpy of methane compared with a total of 6700 ktpy in EDGAR, a factor of 3.8 difference (table 1). The corresponding emissions per capita are 57 kg person −1 year −1 from TROPOMI and 15 kg person −1 year −1 in EDGAR. Appendix C contains more detailed plots and tables with individual data points by urban area. Figure C2 shows the scatter plot of the same data along with lines of best fit for the HICP and Asia cities. There is no statistically significant relationship between the estimates. We further show scatter plots with the sectoral EDGAR emissions (figure C3) and do not find any statistically significant correlation between the TROPOMI estimates and individual emission sectors.
We considered the spatial distribution of the sectoral EDGAR emissions and compared them with the TROPOMI XCH4 enhancements. In the case of Dhaka, shown in figure 4, the EDGAR inventory is mostly composed of agricultural emissions spread throughout the domain with the addition of 4 points sources releasing approximately 90 ktpy each, related to natural gas facilities. One of them is within the urban hotspot, one corresponds to an area with a minor enhancement in TROPOMI XCH4 but with limited data availability due to the presence of surface water, and the remaining two do not match any enhancement in the TROPOMI concentrations. These are on the edge of the detection limit for TRO-POMI (10 tonnes per hour [45,46]). In contrast, the urban enhancement of 21 ppb is well above the TRO-POMI detection limit and suggests that emissions in Dhaka are well in excess of those of the surrounding regions.
As discussed above, the temporal averages for Dhaka contain retrievals from 887 days, with more data available when dry winds from the north lead to clear skies. The urban signature is therefore spread out preferentially to the southeast as can be seen in the figure. This is more consistent with the map of population density shown in figure 4(c) than with the total EDGAR emissions. In the absence of more detailed information, EDGAR apportions solid waste, wastewater, transport and residential emissions based on population density [33], as shown in figure 4(d). A population-based distribution of sources is more likely to produce the spatial pattern of XCH4 observed by TROPOMI than the map of total EDGAR emissions which are dominated by fossil fuel sources. Nevertheless the maps suggest that urban emissions still need improved characterization.
In addition to Dhaka, other cities in the dataset also demonstrate that there is a tighter relationship between XCH4 spatial patterns and population density than with total EDGAR emissions (figure 1). For Buenos Aires in particular the XCH4 footprint matches the urban footprint, even as the peak hotspot lies over the Norte III landfill that was estimated to emit 47% of the urban methane [36]. Lahore and Delhi both have XCH4 footprints that follow the urban area and are less connected with the location of the landfills, in line with their lower proportion of estimated urban emissions [36].
Dhaka, Delhi and Lahore are between the Himalayas and the Indian Ocean and hence have more consistent meteorological patterns than many other cities: winds from the northwest bring clear skies whereas winds from the ocean bring cloudy skies and precipitation. Valid satellite retrievals are therefore predominantly from days with wind transport from the northwest which explains why the XCH4 signature extends from the center of the urban areas towards the southeast.
The data become noisier for Houston and Lagos but there is still a discernible enhancement that matches the urban area. In contrast with Buenos Aires, some of the landfills are on the edges of hotspots, but other landfills have no XCH4 signature. Madrid is one of the urban areas with lower emissions as well as with zero untreated wastewater. In this case, the 2 main hotspots match closely the location of 2 large landfills [47].

Comparison with discharge rates of untreated wastewater
Methanogenesis during microbial anaerobic digestions is an important source of methane formation during the degradation of waste in anoxic environments. In the EDGAR inventory, 9.3% of global methane emissions come from landfills and 11.8% from wastewater treatment. To explore the relationship with wastewater, we show the estimated emissions of methane per capita categorized by urban areas with rates of untreated wastewater (WWUT) for different geographical groups based on a global database of wastewater with 5 arcmin resolution [34].
While there was not a statistically significant relationship between our TROPOMI estimates and the EDGAR total or sectoral emissions, we find a strong correlation with rates of untreated wastewater. These are highly statistically significant, with p-values of 0.000 26 for HICP areas and 0.025 for Asia areas (figure C2). In figure 3 we split the HICP and Asia urban areas into two groups according to per capita rates of untreated wastewater (m 3 person −1 year −1 ): Low HICP (0-6.5), High HICP (6.5-16.5), Low Asia (0.1-8.2) and High Asia (8.2-57.8). EDGAR emissions seem to be lower for High WWUT HICP than for Low WWUT HICP areas. The apparent increase in EDGAR emissions with WWUT for Asia is not statistically significant ( figure C3). In contrast, there is a strong increase in TROPOMI estimated emissions with rates of untreated wastewater.
The lack of correlation between TROPOMI CH4 emission estimates and the EDGAR inventory and the strong correlation with rates of untreated wastewater further confirms that there are large uncertainties and discrepancies in the methane emission inventories. The methane emissions could be a result of the wastewater systems themselves. Alternatively they could be due to correlated factors for which wastewater serves as a proxy, for example aging natural gas infrastructure or inadequate garbage collection and disposal. Figure 5 displays the correlation between estimated methane emissions and untreated wastewater by categorizing the urban areas into four levels of untreated wastewater. Table 1 presents the average TROPOMI, wastewater and population metrics corresponding to figure 5. The HICP group in our dataset has 10 urban areas where all wastewater is treated. This group has the lowest average methane emissions of 33 kg person −1 year −1 . The average methane emissions steadily increases to 138 kg person −1 year −1 for the group with the highest rate of untreated wastewater, a factor of 4.2 times the zero group. For the Asia category, the group with low rates of untreated wastewater has emissions of 39 kg person −1 year −1 increasing to 88 kg person −1 year −1 for the highest group, a factor of 2.3 times the low group. Figure C2 shows the individual points within the HICP and Asia categories as a scatter plot with the line of best fit. This shows that there is a slope of around 3.6 kg CH 4 m −3 untreated wastewater for Asia, and around 6.3 kg CH 4 m −3 untreated wastewater for the HICP group.
In Africa, rates of wastewater collection are low and hence rates of untreated wastewater are also low irrespective of the treatment fraction. In contrast, Latin America has very high rates of wastewater production as well as untreated wastewater. Both have similar average emissions of 51 and 54 kg person −1 year −1 respectively, which are close to the overall average of 57 kg person −1 year −1 for our dataset.
China represents an interesting case: our algorithm estimated emission rates for some cities that were part of regional emissions, but otherwise we did not detect XCH4 hotspots specifically focused on urban areas. In the global wastewater database, 65% of wastewater produced in China was collected, and 100% of that was treated. One of the particularities of China is the extensive pretreatment of wastewater due to tight controls on pollutant discharges to municipal sewers [48]. Although the pretreatment of wastewater has been hypothesized to lead to extra methane emissions, we do not see a signature in the TROPOMI retrievals. Given the large population of Chinese urban areas, these emissions should have been detectable by TROPOMI. While there may be ecological and human impacts of uncollected wastewater, we have not found evidence in the TROPOMI signal that it leads to anoxic conditions that would produce methane.

Scenario analysis
As shown in table 1, total TROPOMI-estimated emissions are a factor of 3.8 larger than the EDGAR emissions for the 61 urban areas in the dataset. Whereas EDGAR emissions from our dataset account for only 1.8% of the EDGAR total emissions, the TROPOMI estimates would account for 6.8% of the EDGAR total emissions.
Emission reductions were estimated for different scenarios involving the 61 urban areas as well as for all urban areas with more than 2 million inhabitants (table 2). If per capita emissions of the 33 cities with medium to high untreated wastewater were to be reduced to the mean emissions of the cities in the zero to low untreated wastewater categories, this would reduce emissions by 7633 ktpy corresponding to 2% of the worldwide total. If emissions of all 61 cities were reduced to 10 kg person −1 year −1 , the level of the lowest city detected, that would reduce emissions by 20 926 ktpy which would correspond to a 5.6% reduction of the global total.
The 61 cities in the dataset comprise 442 million people. We estimate that there are 385 urban areas with more than 2 million people, for a total of 2.1 billion inhabitants. Applying per capita emissions ranging from 10 to 40 kg person −1 year −1 would lead to total estimated emissions ranging from 21 061 ktpy to 84 244 ktpy, which would correspond to 5.6%-22.4% of the worldwide EDGAR emissions (table 2). The underestimate of these emissions could account for around 10% of missing emissions in EDGAR. Conversely, reducing the emissions from urban areas around the world to the rates found in cities with complete wastewater treatment could reduce global emissions on the order of 5%-10%.

Conclusions
TROPOMI XCH4 retrievals show significant enhancements over 61 urban areas worldwide. Using a two-dimensional Gaussian model, we estimated emissions of around 25 000 ktpy from urban areas home to 440 million people. These emissions were found to be a factor of 3.8 times higher than the emissions in the EDGAR emission inventory. Furthermore, no correlation was found between the TROPOMI-based emission estimates and either the total or the sectoral EDGAR emissions. This suggests that there are significant discrepancies between actual emissions and current inventories, and that the processes responsible for these are poorly characterized. The underestimation by a factor of 2-3 has been well documented through field studies and has usually been attributed to underestimates in leakage from natural gas use and emissions due to solid waste [15,16,20,22,27]. However, isotopic methane measurements have suggested that a large share of emissions could be associated with wastewater rather than natural gas [28][29][30].
In this study, the TROPOMI-estimated emissions were shown to correlate with rates of discharge of untreated wastewater into the environment from a global database [34]. The TROPOMI estimated methane emissions for urban areas with zero untreated wastewater are around 30 kg person −1 year −1 , which are double the urban emissions in EDGAR using the IPCC methodology [40,49].
The slope of the methane emissions relative to the rates of untreated wastewater shown in figures 5 and C2, and table 1 is between 3 and 7 kg CH4 m −3 Untreated Wastewater. This is much higher than the 0.1-5 g m −3 of methane typically found in wastewater arriving at a treatment plant [50,51], which suggests that methane could be formed after the untreated wastewater is discharged. The TROPOMI estimates presented here suggest that urban areas with medium to high rates of untreated wastewater emit 50-100 kg person −1 year −1 more methane than urban areas that treat all their wastewater.
In the current IPCC methodology, methane emissions are estimated by multiplying the Biochemical Oxygen Demand (BOD) by a Methane Correction Factor which ranges from 0 for well-managed systems with sufficient oxidation and with methane capture, to 1 for systems that release wastewater into the environment where anaerobic processes convert organic material to methane. When emissions occur after discharge to the environment, the BOD is sometimes augmented by 25% in the inventories to account for industrial emissions [40]. Nevertheless, emissions after release are related to eutrophication and depend on a large number of ecological factors, resulting in emission estimates that vary by up to 3 orders of magnitude [52][53][54]. These post-release emissions are not addressed in the inventories.
If verified, our estimates of methane emissions from TROPOMI retrievals suggest that there is methane formation in the environment as a result of releases of untreated wastewater that is much larger than the estimates in current inventories. Conversely, the steep slope in methane emissions as a function of untreated wastewater suggests that some urban areas could reduce their emissions by 50% or more by fully treating all their wastewater.
Bridging the gap between inventories and measurements will depend on improving the understanding of multiple types of sources as well as on better quantifying these factors in official emission inventories. Wastewater systems are known to be a source of methane, but cities with large methane emissions from sewage may also be cities that have larger emissions from natural gas and landfills if, for example, all these factors are correlated with the age of urban infrastructure. Quantifying the relative contributions from these different factors will require more advanced measurements of methane concentrations, for example higher resolution remote sensing or more detailed in situ monitoring. Emission inventories will also need to be based on more refined characterization of these factors including the variability between urban areas.
Given that methane is estimated to account for 23% of climate forcing since 1750 [4], there is a great need to reduce emissions from a wide array of sources [55]. Our results suggest that improved wastewater treatment could lead to a significant reduction in greenhouse gas emissions from urban areas in their quest for carbon-neutrality [56]. This in turn would contribute to the goals of the Global Methane Pledge which currently has 119 participating countries [57]. Along with the climate benefits, improving wastewater management will also lead to ecological benefits [58], societal benefits [59], and human benefits associated with the Sustainable Development Goals [60,61].

Data availability statement
The data that support the findings of this study are openly available as follows: The TROPOMI data processing was carried out on the Dutch national E-infrastructure with the support of the SURF Cooperative, and the data are available at: https://ftp.sron.nl/open-access-data-2/TROPOMI/tropomi/ch4/ The wastewater data are available at: https://doi. pangaea.de/10.1594/PANGAEA.918731 The European Commission Joint Research Center provided the following 3 datasets. The EDGAR v6.0 Global Greenhouse Gas Emissions are available at: https://edgar.jrc.ec.europa.eu/dataset_ghg60 The population maps are available at: https://ghsl. jrc.ec.europa.eu/datasets.php The global surface water data are available at: https://global-surface-water.appspot.com/download Acknowledgments This project was partially supported by two grants from the United States Department of State (SMLAQM19CA2361 and SLMAQM20CA2398). The opinions, findings, and conclusions stated herein are those of the investigators and do not necessarily reflect those of the United States Department of State. We are grateful for the anonymous reviewers whose comments helped to improve the paper.

Appendix A. Comparison with other methods
Our 2D Gaussian method is equivalent to the method used to estimate emissions from the Central Valley (CA, USA), Four Corners (USA), Azerbaijan and Turkmenistan using SCIAMACHY and GOSAT retrievals [43]. The scaling terms can be shown to be analogous, using the entire column of air for the calculation of the mass of methane. The method is also similar to the one used to estimate emissions of oil and gas production regions [44]. Although this latter method uses a line integral of fluxes, it can be shown to be analogous to the box model method by using corresponding scaling terms.
The Achilles' heel of all of these methods is the estimation of the length scale and the life time used in the calculation of the emissions from an estimate of the amount of excess methane associated with a source. As can be seen from the equations above, the emissions estimate is linearly proportional to both the length scale and the wind speed. As there is an irreducibly subjective element to these estimates, we compare the emission estimates of our method for three well-studied regions: the Permian basin in Texas, the Central Valley in California and the South Sudan Wetland. The TROPOMI XCH4 retrievals and Gaussian fits are shown for each domain in figure A1. Whereas we used a single two-dimensional Gaussian for all the urban areas studied, we used multiple Gaussians for the evaluation cases as they are composed of multiple sub-regions, as can be seen in figure A1.
For the South Sudan Wetlands, we estimate emissions of 6030 ± 604 ktpy using a separate Gaussian for the eastern and for the western subregions. Our emission estimates are in agreement with 7400 ± 3200 ktpy obtained using a previous version of the XCH4 retrievals [62] and the mass balance model used for SCIAMACHY and GOSAT [43].
For the Permian Basin, we estimate emissions of 3400 ± 280 ktpy using a separate Gaussian for the Midland basin and one for the Delaware basin. There are many estimates of methane emissions from the Permian basin: 3060 ± 480 ktpy based on a flux divergence method [63], 2010 ± 10 ktpy (GOSAT) and 2680 ± 500 ktpy (TROPOMI) using a Bayesian inversion of fluxes [64]; 2700 ± 500 ktpy using GEOS-Chem inversion [65]; 3200 ± 1130 ktpy using a line integral of TROPOMI retrievals [44]; 1700 ktpy using an airborne study of the New Mexico part of the Permian Basin only [66]; 3400 ± 1000 ktpy using tracer-to-tracer relationships derived from aircraft measurements [67]. Our estimate is on the high side of these estimates but within the range reported in the literature. In the case of the Permian basin, the uncertainty is exacerbated by source intermittency and variations between sources and source types, as seen by an airborne field campaign with a resolution beyond that provided by satellite retrievals [68].
For the Central Valley, we estimate emissions of 4890 ± 700 ktpy using 4 Gaussians to better approximate the spatial distribution of the XCH4 enhancements. Our emission estimates are higher than the 2000-3000 ktpy estimated using GEOS-Chem and GOSAT retrievals [69], which was itself considerably higher than the approximately 1050-1550 ktpy estimated with a mass balance model [43]. Estimates of methane emissions based on TRO-POMI XCH4 hotspots have recently been made for Buenos Aires, Delhi, Lahore and Mumbai with a Bayesian inversion method based on the Weather Research and Forecasting (WRF) model [36]. The method for estimating the emissions is totally different, which provides an independent evaluation of the results. Our emissions are 1.48 larger for Buenos Aires, 1.41 for Delhi and 1.45 for Lahore. This consistency suggests that there is a systematic difference between our methods, which could easily arise from differences in the way that wind speed and the boundary layer are handled in each algorithm. For Mumbai, our estimates are 3.23 times larger, which is to say 2 times 1.62. Mumbai is on the coast, and we do not use XCH4 retrievals over water. Our method fits a Gaussian using the available data, which for Mumbai assumes that the dome of XCH4 enhancement extends around the whole urban area, over both land and over water. This could lead to an additional factor of 2 difference between our methods. While there is a systematic difference between our results for the four cities, the difference is consistent and within the expected uncertainty due to the input parameters of our different methods. Although further research should narrow these uncertainties, the conclusions presented here comparing TROPOMI estimated emissions with the EDGAR inventory and with the untreated wastewater inventory are robust relative to these systematic uncertainties.

Appendix B. Uncertainty analysis
We performed two levels of uncertainty estimates for our emissions. In the first level, we tested the impact of the Quality Assurance flag, the impact of the land flag, and the choice of the swath pixels included in the gridded averages. In the second level, we tested the impact of number of retrievals per grid point, the impact of the water mask and the influence of the maximum value of the length scale terms (σ).
For level 1 uncertainty analysis, we performed the tests for QA flags of both 0.4 and 1.0, or 1.0 alone; and for the entire swath or the central band of the swath. For these four conditions, we further tested three different constraints on σ: using the default value (usually 30 km), a value reduced by one third, and a value increased by one third. We also tested three different limits on the number of retrievals available for each grid point: 1. no limit, all grid points with any data used; 2. remove 10% of grid points with fewest retrievals; 3. remove 20% of grid points with fewest retrievals.
Overall we therefore had nine estimates for each of four conditions. We evaluated the coefficient of variation (standard deviation divided by the mean) to remove cities deemed to have too much uncertainty (above 0.3-0.4). We found that using pixels with all QA flags (both 0.4 and 1.0), along with the central band of the swath gave the smoothest fields of urban enhancements and the most reliable results with lower coefficients of variation.
For level 2 uncertainty analysis, we used QA flags of 0.4 and 1.0 and the central band of the swath, and we tested the uncertainty due to constraints on σ, data availability and the water mask. For σ, we used the three conditions described above. For data availability, we tested the grids with removal of 0% and 10% of the grid points. We did not test with 20% removal because we were also using the water mask to remove grid points, and this would have left too little data for the algorithm. For the water mask we tested the three options described above, keeping grid cells with: 1. no smoothing and 80% land cover; 2. one pass of smoothing and 90% land cover; 3. two passes of smoothing and 95% of land cover.
We therefore had 18 emission estimates for our level 2 sensitivity analysis (3 σ * 2 data availability * 3 water mask cases). As for level 1, we evaluated the coefficient of variation for each site. The median uncertainty was 15%, with the interquartile range being between 10% and 22%. This suggests that the analysis presented in the paper is robust with respect to the unsystematic errors in the algorithm and the specific data selection choices made. Emissions are reported in the paper as the mean of the 18 emission estimates. Figure C1 shows the oversampled 1 km averages of TROPOMI XCH4 retrievals over Dhaka along with the two-dimensional Gaussian fit used to estimate the emissions. Figure C2 compares the TROPOMI-estimated methane emissions with the total EDGAR emissions and the rates of untreated wastewater for the 61 urban areas in the dataset. The lines of best fit show that the correlation is not significant for EDGAR emissions, but the relationship with untreated wastewater is very significant: the p-value is 0.000 25 for HICP and 0.025 for Asia. The untreated wastewater plot is a different representation of the same data shown in figure 5. The line of best fit shows that there is a slope of 3.6 kg CH4 per m 3 Wastewater for Asian cities and 6.3 kg CH4 per m 3 Wastewater for HICP urban areas. Figure C3 is similar to figure C2 with comparisons by EDGAR sectoral emissions. As described in section 3.2: Fossil Fuels include energy production and fossil fuel production; agriculture includes enteric fermentation, agricultural soils, manure management and agricultural waste burning. None of the relationships are statistically significant: the lowest pvalue (shown in the legends) is 0.11 for the 'other' category.

Appendix C. Extended data
Tables C1 and C2 show individual data for each of the 61 urban areas in the study and for the 14 urban areas that are part of a regional enhancement in TRO-POMI XCH4. Each variable is summed for the area given by the diameter of the 2D Gaussian fit matching the TROPOMI XCH4 hotspot. Table C3 shows the parameters used to obtain emission estimates from the two-dimensional Gaussian fit, as described in section 2.4. Figure C1. Comparison of TROPOMI XCH4 concentrations oversampled to 1 km grid (left) with two-dimensional Gaussian fit used to estimate emissions (right) for Dhaka, Bangladesh. Outline of areas with higher population density in magenta. National borders shown in gray. Domain is 300 km by 300 km (same as in figure 4). Figure C2. Scatter plot of TROPOMI-estimated CH4 emissions per capita versus total EDGAR emissions (left) and untreated wastewater rates (right) for the 61 urban areas in the dataset along with lines of best fit for the High Income and European cities (HICP) and Asian cities. Shading shows 95% confidence interval of line of best fit, legends show the parameters of the fits along with outliers removed in the analysis. Figure C3. Scatter plots of TROPOMI-estimated CH4 emissions per capita versus sectoral EDGAR emissions. Lines of best fit shown for the HICP and Asia groups. Shading shows 95% confidence interval of line of best fit. Outliers removed are listed in the subplots along with parameters of the fits.