Diesel passenger vehicle shares influenced COVID-19 changes in urban nitrogen dioxide pollution

Diesel-powered vehicles emit several times more nitrogen oxides than comparable gasoline-powered vehicles, leading to ambient nitrogen dioxide (NO2) pollution and adverse health impacts. The COVID-19 pandemic and ensuing changes in emissions provide a natural experiment to test whether NO2 reductions have been starker in regions of Europe with larger diesel passenger vehicle shares. Here we use a semi-empirical approach that combines in-situ NO2 observations from urban areas and an atmospheric composition model within a machine learning algorithm to estimate business-as-usual NO2 during the first wave of the COVID-19 pandemic in 2020. These estimates account for the moderating influences of meteorology, chemistry, and traffic. Comparing the observed NO2 concentrations against business-as-usual estimates indicates that diesel passenger vehicle shares played a major role in the magnitude of NO2 reductions. European cities with the five largest shares of diesel passenger vehicles experienced NO2 reductions ∼2.5 times larger than cities with the five smallest diesel shares. Extending our methods to a cohort of non-European cities reveals that NO2 reductions in these cities were generally smaller than reductions in European cities, which was expected given their small diesel shares. We identify potential factors such as the deterioration of engine controls associated with older diesel vehicles to explain spread in the relationship between cities’ shares of diesel vehicles and changes in NO2 during the pandemic. Our results provide a glimpse of potential NO2 reductions that could accompany future deliberate efforts to phase out or remove passenger vehicles from cities.


Introduction
Ambient nitrogen dioxide (NO2) pollution is a global concern for public health, particularly in urban areas, and is linked with decreased lung function, cardiopulmonary and respiratory disease, and pediatric asthma, among other adverse health effects [1][2][3][4]. Traffic emissions are often the dominant source of urban NO2, followed by emissions from industrial sources and energy production and usage [5]. As such, NO2 is an effective surrogate for the broad traffic-related mix of pollutants.
Reductions in urban NO2 during the pandemic (hereafter '∆NO2') varied greatly across the world (e.g. [6][7][8][9]). Direct comparisons of ∆NO2 among cities are inherently complicated by different meteorological patterns [10], stay-at-home measures, and levels of adherence to these measures in each city.
However, even after accounting for or normalizing these important moderating factors, differences in ∆NO2 likely remain. With all else equal, one cause of these differences is vehicle fuel type. Reductions in NO2 have purportedly been larger in regions dominated by diesel vehicles [11]. While a large body of literature has documented NO2 changes during the pandemic, a smaller portion has explored reasons for intercity differences in these changes. None, to the best of our knowledge, has specifically examined the role of different vehicle fuel types in causing these intercity differences.
Diesel-powered passenger vehicles emit substantially more nitrogen oxides (NOx ≡ NO + NO2) than comparable petrol- (or gasoline-) powered vehicles [12]. For example, real-world measurements indicate that Euro 6 diesel vehicles emit ten times more NOx than Euro 6 gasoline vehicles [13]. Since the late 1990s, European nations experienced a 'diesel boom,' in which diesel passenger vehicles were intentionally promoted as an alternative to petrol-powered passenger vehicles on the premise that they emit less CO2 [14]. However, diesel and petrol vehicles have produced similar real-world CO2 emissions since the early 2000s [15]. The proportion of diesel-powered passenger vehicles to the total number of passenger vehicles (henceforth 'diesel shares') steadily increased until the Volkswagen emissions scandal was brought to light in 2015. Since then, diesel shares of new car registrations have declined in Europe [16]. Diesel NOx, including emissions in excess of certification limits, has contributed to high NO2 pollution in Europe (e.g. [17][18][19][20][21]) and adverse health impacts (e.g. [14,22,23]). In several countries outside of Europe such as the United States, Canada, and China, diesel shares are much smaller, and petrol is the primary fuel consumed by passenger vehicles (e.g. [24]).
In this study, we examine how the COVID-19 pandemic can reveal the fingerprint of diesel passenger vehicles on NO2 pollution in urban areas. The pandemic, which largely affected the transportation sector due to stay-at-home measures, provides an unprecedented natural experiment that allows us to tease out the relationship between urban vehicle fleets and ∆NO2. We also discuss how additional air quality, emissions, and traffic data would strengthen future efforts to study clean transit and air quality.

Materials
We select 22 focus cities spanning 17 European countries based on the availability of in-situ NO2 monitors (text S1, figure S1), city- or country-level traffic trends during the pandemic (text S2, figure S2), and country-level diesel shares (text S2, table S1).
Publicly available data on diesel shares at a subnational level do not exist to our knowledge, so we choose only 1-2 cities per country in our analysis (text S1). Traffic data come from Apple Mobility Trends Reports [25] and represent traffic volumes relative to baseline volumes. This dataset began in 2020, and we form synthetic traffic data for 2019 using day-of-the-week proxies (text S2 and S5). As discussed in section 1, diesel shares represent the proportion of diesel-powered passenger vehicles to the total number of passenger vehicles.
The NASA GEOS Composition Forecast Modeling System [GEOS-CF; 26] provides hourly, high-fidelity estimates of meteorology and atmospheric composition at 0.25° × 0.25° (∼25 km) horizontal resolution globally (text S3). The model's emissions inventories do not account for the impact of COVID-19 on anthropogenic emissions, thus representing a counterfactual, business-as-usual scenario for the COVID-19 period [7]. We sample the surface-level (lowest model level) meteorological fields and pollutant concentrations from GEOS-CF at grid cells collocated with each in-situ NO2 monitor. Both observed and modeled NO2 concentrations are obtained for 1 January 2019-30 June 2020.
We also leverage emissions scenarios from the Greenhouse gas-Air pollution INteractions and Synergies (GAINS) model to explore how the contribution of light-duty vehicles to total anthropogenic NOx emissions varies across cities (text S4). Figure 1 illustrates how these data sources are combined within our methodological framework.

Methods
To isolate the influence of emissions changes on NO2 reductions during COVID-19 for each city, we develop bias-corrected, business-as-usual NO2 concentrations from GEOS-CF and compare them to observed concentrations. For this comparison, we aggregate NO2 observations and collocated GEOS-CF output to city-averaged daily mean values (text S5).
We first bias correct NO2 concentrations simulated with GEOS-CF using eXtreme Gradient Boosting (XGBoost) (text S6). Briefly, XGBoost corrects the bias in GEOS-CF NO2 against observed NO2 as a time-varying function of air pollutants, meteorology, and traffic (table S2). We build and test this XGBoost algorithm during our 2019 training period and obtain substantially improved model-observation agreement (figure S3). We then apply the XGBoost bias correction algorithm to modeled NO2 concentrations in 2020 to estimate business-as-usual NO2 from 1 January to 30 June 2020. This approach accounts for differences in local meteorology, atmospheric composition, and traffic between 2019 and 2020, as these factors influenced NO2 concentrations independently of fuel type [27]. It builds on previous work to estimate business-as-usual pollutant concentrations during the pandemic [7, 28-32]. When calculating ∆NO2 in a particular focus city, we average over all dates where stay-at-home measures (text S2, figures 1 and S2) are either recommended or required through 30 June 2020 and refer to this period as 'lockdown'.
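The quantity ∆NO2 used throughout can be written compactly. The form below is an assumption consistent with the description above (a percent difference between observed and business-as-usual concentrations, averaged over the lockdown period) and is not necessarily the authors' exact definition; in particular, whether the averaging is applied before or after taking the ratio is not specified here. The overbars denote lockdown-period means of the city-averaged daily values.

```latex
\Delta\mathrm{NO_2}
  = \frac{\overline{\mathrm{NO_2}}_{\mathrm{obs}} - \overline{\mathrm{NO_2}}_{\mathrm{BAU}}}
         {\overline{\mathrm{NO_2}}_{\mathrm{BAU}}} \times 100\%
```

Under this sign convention, a negative ∆NO2 means observed NO2 was lower than the business-as-usual estimate.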

Results
GEOS-CF captures daily NO2 variability in our focus cities (figure S7), reinforcing its ability to aid in understanding lockdown-related NO2 changes. We highlight London to further illustrate GEOS-CF's capabilities and our methods (figure 2(a)). The temporal correlation (r) between modeled and observed NO2 in 2019 for London is 0.78 (r = 0.60 averaged over all cities; figure S3(b)). Despite the good correlation, there is a negative model bias relative to observations in many of our focus cities (mean fractional bias = −0.60 averaged over all cities; figure S3(a)). GEOS-CF's negative bias is well-documented, especially in Europe and North America where there are publicly available observations [26]. This bias may stem from model resolution; uncertainties in atmospheric transport, boundary layer height, vertical mixing, emissions, and chemistry; and monitor interference with other nitrogen-containing compounds [7,33,34].
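The two evaluation metrics quoted above can be sketched as follows. This is a minimal illustration, assuming the common definition of mean fractional bias as the average of 2(M − O)/(M + O) over paired model (M) and observed (O) values; the paper's own definition is given in its supplement and may differ. The function names are hypothetical.

```python
from math import sqrt


def mean_fractional_bias(model, obs):
    """Average of 2*(M - O)/(M + O) over paired values (assumed definition).

    Negative values mean the model underestimates the observations.
    """
    pairs = list(zip(model, obs))
    return sum(2.0 * (m - o) / (m + o) for m, o in pairs) / len(pairs)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

With this definition the metric is bounded between −2 and 2, so a value of −0.60 corresponds to a sizeable model underestimate.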
Correcting the bias in modeled NO2 with XGBoost leads to substantially better agreement against observations than the native GEOS-CF concentrations, and the aforementioned negative model bias is greatly reduced. Figure 2(a) illustrates the excellent agreement between business-as-usual and observed NO2 in 2019 prior to the lockdown. In this example for London, the mean fractional bias in 2019 is reduced from −0.41 with the native GEOS-CF concentrations to −0.02 with the bias-corrected concentrations, and we note similar improvements in other focus cities (figure S7).
We characterize the relative contribution of input variables in generating the business-as-usual NO2 concentrations with SHapley Additive exPlanations (SHAP) values (figure 2(b), text S6). The absolute SHAP values illustrate the global importance of input variables, and a larger SHAP value for a particular variable means that the variable has a more important impact on the bias correction. Assessing feature importance via SHAP values indicates that local atmospheric transport and species related to basic ozone (O3) chemistry are the most important variables for inferring business-as-usual NO2 concentrations for both London and all focus cities (figure 2(b)). The partial dependence plots in figure S6 show how XGBoost's bias correction is affected by individual input variables. This analysis reveals a nonlinear relationship between the input variables and the bias correction, and the predicted bias is largest for meteorological, traffic, or chemical conditions at anomalously high or low extremes.
Traffic emerges as one of the most influential variables in estimating business-as-usual concentrations (figure 2(b)). The relative contribution of traffic in London ranks lower than for the aggregation of SHAP values over all focus cities, but the distribution is right-skewed with a wide range of large SHAP values (figure 2(b)). This result indicates that intraweek traffic variations in London are among the most important variables in correcting the bias and producing business-as-usual NO2 concentrations for certain days in our measuring period and particular folds of the k-fold cross validation.
Observed NO2 concentrations begin to diverge from business-as-usual concentrations in London around mid-February 2020, slightly preceding the United Kingdom's declaration of recommended stay-at-home measures (compare figures 2(a) and S2). When averaged over the lockdown, ∆NO2 between the observed and business-as-usual concentrations is −28.5% in London. Observed NO2 concentrations exhibit departures from business-as-usual concentrations in spring 2020 in other cities as well but with varying magnitudes (figure S7). Contemporaneous studies have found NO2 reductions of similar magnitudes in London and our other focus cities using complementary methods [32, 35-37].
Our focus cities span a spectrum of pre-lockdown NO2 pollution levels and diesel shares ranging from 8.1% in Athens, Greece to 69.2% in Vilnius, Lithuania (figure 3, table S1). Mean 2019 NO2 in all 22 focus cities exceeded the recently revised World Health Organization annual mean NO2 guideline value of 10 µg m−3 (∼5.3 ppbv, assuming an ambient temperature of 298.15 K and pressure of 1013.25 hPa). Even Helsinki, which had the lowest 2019 NO2 concentration (∼8.4 ppbv) of all focus cities, exceeded this guideline value by 60% (colors in figure 3).
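The unit conversion behind the guideline comparison can be checked with the ideal gas law. The sketch below assumes a molar mass of 46.0055 g mol−1 for NO2 and the temperature and pressure stated above; the function and variable names are illustrative only.

```python
R_GAS = 8.314462618   # universal gas constant, J mol-1 K-1
M_NO2 = 46.0055       # molar mass of NO2, g mol-1 (assumed value)


def ugm3_to_ppbv(conc_ugm3, molar_mass_g=M_NO2, temp_k=298.15, pres_pa=101325.0):
    """Convert a mass concentration (ug m-3) to a mixing ratio (ppbv)."""
    molar_volume = R_GAS * temp_k / pres_pa          # m3 of air per mol
    mol_per_m3 = conc_ugm3 * 1e-6 / molar_mass_g     # mol of NO2 per m3 of air
    return mol_per_m3 * molar_volume * 1e9           # mole fraction in ppb


who_guideline_ppbv = ugm3_to_ppbv(10.0)              # roughly 5.3 ppbv
helsinki_excess = 8.4 / who_guideline_ppbv - 1.0     # roughly 0.6, i.e. about 60%
```

The result reproduces both the ∼5.3 ppbv guideline value and the approximately 60% exceedance quoted for Helsinki.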
The change in NO2 during the pandemic (∆NO2, equation (1)) averaged across cities is −23.8% (standard deviation = 16.0%), and its magnitude spans a range of approximately 60 percentage points across cities. We next compare ∆NO2 with cities' diesel shares and see a clear pattern emerge: cities with larger diesel shares tend to have larger NO2 reductions, while reductions are smaller in cities with smaller diesel shares (r = −0.50, p = 0.02; figure 3). For example, the average ∆NO2 in cities with the five largest diesel shares (∆NO2 = −38.1%) is ∼2.5 times larger than in cities with the five smallest shares (∆NO2 = −15.0%). The slope of the linear regression fit between ∆NO2 and diesel shares provides a succinct summary of our results (figure 3). This slope indicates that larger shares of diesel passenger vehicles are associated with larger NO2 reductions during the pandemic; specifically, ∆NO2 decreased by 5.3 percentage points for every 10 percentage point increase in diesel shares (figure 3).
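The regression summary above can be illustrated with an ordinary least-squares fit. The sketch below is not the authors' code, and the `shares` and `delta_no2` values are synthetic placeholders constructed to lie exactly on a line with the reported slope of −0.53 percentage points of ∆NO2 per percentage point of diesel share; the intercept of −5 is arbitrary and the values are not the study's data. The only numbers taken from the text are the slope and the two group means used for the ratio.

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept, pearson_r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)


# Synthetic placeholder values (NOT study data): diesel share in %, change in NO2 in %.
shares = [0.0, 20.0, 40.0, 60.0]
delta_no2 = [-5.0 - 0.53 * s for s in shares]

slope, intercept, r = linear_fit(shares, delta_no2)

# Ratio of the mean reductions quoted in the text for the top-five and bottom-five cities.
ratio_top_to_bottom = 38.1 / 15.0
```

Because the placeholder points are perfectly collinear, the fit returns r = −1; the study's real data scatter about the line, which is why the reported correlation is r = −0.50.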
The intercept of the linear regression in figure 3 suggests a very small change in NO2 for cities whose shares of diesel passenger vehicles are close to 0%. However, even cities with such small shares, for example those in North America with mostly gasoline-powered passenger vehicles, experienced substantial decreases in NO2. For example, Goldberg and colleagues [10] found a median NO2 decrease of ∼22% in major North American cities during spring 2020 after adjusting for seasonality and meteorology. In all cities, other sources of urban NOx beyond diesel passenger vehicles (e.g. heavy-duty vehicles, power plants, maritime activity, industry) that are not accounted for in our experimental design contributed to ∆NO2, regardless of the diesel passenger vehicle share.
We next describe sensitivity analyses that speak to the robustness of our results. Testing whether traffic volumes from Apple Mobility Trends Reports can capture weekday-weekend differences in traffic patterns affirms the ability of this dataset to serve as a proxy for the day of the week and of XGBoost to capture these intraweek variations (figure S8). The lockdown dates from the Oxford COVID-19 Government Response Tracker (OxCGRT) represent country-level dates for stay-at-home measures if at least some region of a given country has the restrictions [38]. Responsibility for COVID-related restrictions was often delegated to state or local governments; however, to the best of our knowledge, no globally consistent database with city-specific lockdown dates exists. Given uncertainties associated with these dates, we recalculate ∆NO2 for a uniform time period extending from 15 March 2020 to 15 June 2020 and find substantively similar results (compare figures 3 and S9). We also examine the extent to which ∆NO2 varied between recommended and required stay-at-home measures shown in figure S2 and the impact of restriction type on the diesel share-∆NO2 relationship. Again, we observe no substantive changes (compare figures 3 and S10).
We test whether including a cohort of additional cities outside of Europe (Mexico City, Los Angeles, Auckland, and Santiago; text S1) from the C40 Cities network leads to consistent conclusions regarding the relationship between diesel shares and ∆NO2. C40 Cities is a network of the world's megacities committed to addressing climate change, and the four additional cities included in our study provided data to C40 (see Acknowledgments) after expressing interest in learning from lockdowns to design post-COVID recovery measures that may further support air quality improvements and reductions in NO2. These additional cities specifically allow us to test whether our findings are generalizable to cities with different cultural and behavioral practices (e.g. reliance on public transit, adherence to COVID-19 containment measures) and lower diesel shares compared to the European cohort focused on elsewhere in this study.
Given the small diesel shares in these cities (cohort-averaged share = 4.0%; table S1), we expect they would experience small to modest NO2 reductions. This is indeed the case, and the cohort-averaged ∆NO2 of −14.8% is markedly smaller than the reduction in many European cities with larger diesel shares (figures S9 and S10). This cohort of C40 Cities also demonstrates some of the challenges associated with inferring business-as-usual NO2. For example, Los Angeles has one of the smallest diesel shares of all cities examined (table S1) but experienced markedly larger NO2 reductions than other cities with small diesel shares. NOx emissions related to the Ports of Los Angeles and Long Beach, which together form one of the largest port complexes in North America, might inflate ∆NO2 compared to cities without ports or other large point sources of NOx. The topic of unconsidered moderating influences is further discussed in section 4.
Despite the strong, statistically significant relationship between diesel shares and ∆NO2 (figure 3), ∆NO2 does not increase monotonically as the share of diesel passenger vehicles grows. There are several cities with similar diesel shares, yet different ∆NO2, and we next explore key factors that could explain the spread among cities' ∆NO2 given their diesel shares.
One factor to explain the spread in ∆NO2 is vehicle age. NOx emission rates are not stable over diesel passenger vehicles' lifetimes and increase linearly with age [39]. This increase may result in 'effective diesel shares' that are larger than the ones used in our study, especially for focus cities with older passenger vehicle fleets (table S1). With all else equal, we hypothesize that cities with older passenger vehicles would experience larger ∆NO2 than cities with newer vehicles.
For brevity, we discuss the role of vehicle age for a few cities: Vienna, Austria; Paris, France; and Madrid, Spain. These cities have among the largest, yet very similar, diesel shares of all focus cities in our study, but there is a spread of ∼40% in ∆NO2 among them. For these three cities, our hypothesis regarding vehicle age holds: passenger vehicles in France and Spain are 1.9 and 4.4 years older on average, respectively, than those in Austria (table S1). Vehicle age provides a plausible, evidence-based hypothesis to explain some of the intercity spread in our results, although we note it cannot explain all variability. The results of previous studies (e.g. [39,40]) imply that future policies to preferentially remove older diesel passenger vehicles from cities may have outsized impacts compared to removing newer diesel vehicles.
Another factor to explain variability in intercity ∆NO2 is the contribution of light-duty vehicles to overall NOx emissions. On average, road transportation contributes 47% of total NOx emissions in European cities but ranges from approximately 20%-70% depending on the city [5,41]. We hypothesize that cities with similar diesel shares would likely have different ∆NO2 if their light-duty vehicle sectors have different-sized contributions to total anthropogenic NOx emissions (figure 4(a)). To test this hypothesis, we leverage emissions scenarios from GAINS to find the contribution of NOx emissions from light-duty vehicles to total NOx emissions for each focus city (text S4).
Unsurprisingly, diesel shares are correlated with the contribution of light-duty NOx emissions to total NOx emissions (r = 0.57, p < 0.01; not shown), meaning that cities with a larger share of diesel passenger vehicles tend to have a larger proportion of NOx emissions from light-duty vehicles. ∆NO2 also increases as the overall contribution of light-duty NOx emissions to total NOx emissions grows in all focus cities (r = −0.70, p < 0.01; figure 4(b)).
Since our original hypothesis posits that cities with similar diesel shares might have different ∆NO2 if their light-duty vehicle sector contributes differently to total NOx emissions, we partition cities into groups with similar diesel shares and investigate how ∆NO2 varies within these groups. We find that ∆NO2 increases as the light-duty NOx emissions contribution increases among cities with similar diesel shares (figure 4(b)). For example, cities with 'medium diesel shares' (figure 4(b)) have diesel shares that range from 31.7% to 44.2%. Among these cities, those where light-duty vehicles contribute a larger proportion to total NOx indeed experienced larger ∆NO2 during the pandemic, thus affirming our original hypothesis.
The analysis in figure 4 can also shed light on cities with outlying ∆NO2 values in figure 3. In Vilnius, GAINS indicates that NOx emissions from light-duty vehicles only constitute 13.6% of total NOx emissions, one of the smallest contributions of all our focus cities (figure 4(b)). It follows that a small ∆NO2 might be expected in Vilnius even given the large diesel share. For simplicity, we have chosen tertiles to group similar diesel shares, but we have also tested a larger number of groups (e.g. quartiles, quintiles) and found similar results.

Discussion
Major strengths of our analysis include our semi-empirical approach that leverages air quality data from monitoring networks as well as our use of a machine learning algorithm, XGBoost, to establish the relationship between NO2 and local meteorology, atmospheric composition, and traffic trends. By combining XGBoost with GEOS-CF to infer business-as-usual NO2 during the COVID-19 pandemic, we have further demonstrated how this methodology can be used for emergent research questions for which relying on observations or atmospheric models alone would be challenged by moderating influences, incomplete spatial coverage, and inaccuracies.
Several factors and limitations of our data and methods may impact our results. GEOS-CF's use of 2010 anthropogenic emissions for all following years may under- or overestimate NO2, especially in areas undergoing rapid changes in emissions. More up-to-date emissions are under development and slated to be included in future versions of GEOS-CF [26]. Due to a lack of data, our framework does not consider intercity differences in the type of passenger vehicles (i.e. gasoline versus diesel) that remained parked and off the road during the pandemic. The use of national-level diesel shares (text S2) and national-level light-duty vehicle and total NOx emissions (text S4, figure 4) is a simplification when examining individual cities but an important first step to estimate how the passenger vehicle traffic fleet contributes to urban NO2. There have been efforts to provide gridded (not national-level) inventories for specific types of vehicles and vehicle fuels for regions outside the European Union [e.g. 42-44]. Future research on urban transportation and air quality will benefit from the inclusion of these inventories.
While our study incorporated changes in traffic into our machine learning approach, the pandemic impacted many forms of urban activity besides on-road traffic. NOx emissions from the aviation, rail, and maritime sectors plummeted during COVID-19 [e.g. 45]. We have not accounted for trends in these activities within XGBoost as we are challenged by a lack of city-specific time series data. While these other activities can be important contributors to urban NOx emissions, we find a strong relationship between passenger vehicle fuel type and ∆NO2, meaning that the impact of fuel type on NO2 is strong enough to detect with our methodological approach despite these other sectors. Moreover, recent studies point to on-road traffic, particularly passenger vehicles, as the primary driver of NO2 reductions during the pandemic [8,46]. An analysis of ∆NO2 against changes in traffic from the Apple Mobility Trends Reports in our 22 focus cities reveals a positive, albeit weak, relationship between ∆NO2 and changes in traffic (figure S11). Comparing traffic data from Apple's dataset against in-situ traffic counts and assessing the impact of traffic dataset choice on ∆NO2 further justifies our use of Apple's dataset in our study (text S7, figure S12).
We investigated whether the location of in-situ NO2 monitors or the stringency of mobility restrictions are correlated with diesel shares such that they would bias the observed association between diesel shares and ∆NO2 in figure 3 towards or away from the null. We did not detect a statistically significant relationship between diesel shares and these factors (figure S13), indicating they are not a major contributor to the diesel share-∆NO2 relationship.
The number and distribution of in-situ monitors vary from city to city (figure S1), and monitors may be sited in different environments (e.g. traffic, industrial, background). Monitor siting could impact our results if monitors are disproportionately sited in neighborhoods where ∆NO2 substantially differed from the true city-averaged value. For example, Berlin, Germany stands out given the small ∆NO2 during the pandemic (figure 3). Less than half of Berlin's monitors are located near traffic (figure S13(b)), and a recent study showed that the statistical significance of pandemic-related NO2 reductions varied across different environments for NO2 monitors [47]. We explored whether ∆NO2 within individual cities varied across traffic and non-traffic NO2 monitors, expecting to find a larger decrease at traffic sites. Although we did not find a significant difference for ∆NO2 calculated with traffic versus non-traffic monitors, the magnitude of the diesel share-∆NO2 relationship was nearly double when ∆NO2 was estimated using only traffic monitors (a decrease of 9.7 percentage points for every 10 percentage point increase in diesel shares using traffic NO2 monitors, compared to the decrease of 5.3 percentage points in figure 3 using all monitors), and the ∆NO2 values for different monitor types were suggestive of a difference (figure S14). While non-uniform changes in NO2 within cities are interesting and have been the subject of other studies (e.g. [48]), the primary goal of our study is to reconcile differences among cities' ∆NO2 in light of their different diesel shares.

Conclusion
Our study demonstrates that diesel shares played a major role in the magnitude of ∆NO2 experienced by cities during the COVID-19 natural experiment. The magnitude of ∆NO2 varies from approximately −3% to −61% across cities, and ∆NO2 is ∼2.5 times larger in European focus cities with the top five diesel shares compared to cities in the bottom five. The relationship between diesel shares and COVID-related NO2 reductions deduced from a sensitivity analysis that considers C40 member cities outside of Europe is in reasonable agreement with our results from Europe and suggests the generalizability of our findings.
By leveraging this unique natural experiment, we are able to observe the relationship between NO2 and diesel shares. Previous observational and modeling studies have documented the impact of diesel fuel on pollution and health, and our study is the first to investigate the impact of diesel fuel on NO2 pollution during this natural experiment. The relationship between ∆NO2 and diesel shares gives an indication of the changes in NO2 that could be expected if cities decrease their diesel shares through policy, economic forces (e.g. increased affordability of electric passenger vehicles), or social forces (e.g. diesel passenger vehicles viewed unfavorably as a result of 'Dieselgate'). Our results will also aid in understanding why ∆NO2 varied among urban areas given their different diesel shares.
Our key findings are relevant for present-day and future policies. The temporary NO2 reductions during the COVID-19 pandemic could be sustained through long-term policies that reduce the number of passenger vehicles in urban areas, for example congestion pricing or measures that promote active transportation (e.g. cycling, walking). Should these policies be implemented, our results suggest that cities with larger diesel shares would experience larger NO2 reductions. Beyond decreasing NO2 and the associated public health damages, these types of policies would also slow climate change, decrease concentrations of other harmful pollutants such as particulate matter and O3, and encourage healthier lifestyles if active forms of transportation replace passenger vehicles (e.g. [49]). Focus cities such as Paris and Berlin are poised to ban most or all diesel passenger vehicles in the near future [50]. We expect that our results will reinforce these efforts in Paris and Berlin and could catalyze other cities to implement similar policies.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).