Improving the understanding between climate variability and observed extremes of global NO2 over the past 15 years

This work addresses the relationship between major dynamical forcings and variability in NO2 column measurements. The dominating impact in Northern Southeast Asia is due to El Niño-Southern Oscillation (ENSO); in Indonesia, Northern Australia and South America is due to Indian Ocean Dipole (IOD); and in Southern China Land and Sea, Populated Northern China, Siberia, Northern and Arctic Eurasia, Central and Southern Africa, and Western US and Canada is due to North Atlantic Oscillation (NAO). That NO2 pollution in Indonesia is modulated by IOD contradicts previous work claiming that the emissions in Indonesia are a function of El Niño impacting upon Aerosol Optical Depth and Fire Radiative Power. Simultaneous impacts of concurrent and lagged forcings are derived using multi-linear regression, demonstrating ENSO impacts future NO2 variability, enhancing NO2 emissions 7–88 weeks in the future, while IOD and NAO in some cases increase the emissions from or the duration of the future burning season as measured by NO2. This impact will also extend to co-emitted aerosols and heat, which may impact the climate. In all cases, lagged forcings exhibit more impact than concurrent forcings, hinting at non-linearity coupling with soil moisture, water table, and other dynamical effects. The regression model formed demonstrates that dynamical forcings are responsible for over 45% of the NO2 emissions variability in most non-urban areas and over 30% in urban China and sub-arctic regions. These results demonstrate the significance of climate forcing on short-lived air pollutants.


Introduction
Present approaches address the one-way impact of changes in large-scale average air pollution on meteorology, or changes in large-scale meteorology on air pollution (Kaufman et al 2002, Knippertz et al 2015, Shen et al 2017, Chen et al 2019b. There are works that have used idealized perturbations of emissions with known past reanalysis or highly-constrained future meteorology and/or climatology (Cohen and Wang 2014, Dewitt et al 2019, or have looked at the impact of well-known dynamical perturbations on fixed emissions, using both past and future meteorology, based on different future climate states (Persad et al 2014, Kim et al 2017. Only a very small number of studies have looked into non-linear interactions of large-scale overall changes that one system may have on the other system (Bollasina et al 2011). One such set of studies have started to look into interactive changes on a season-to-season scale between large loadings of measured aerosols from known urbanscale pollution events and urban-scale climatology such as the wind, boundary layer height, and land surface temperature (Cohen et al 2011, Tao et al 2012, Cohen 2014, Guo et al 2019. A few other studies have looked at actual loadings of black carbon and other absorbing aerosols on various dynamical forcings at larger scale (Lau and Kim 2006, Bollasina et al 2013, Grandey et al 2018, Grandey and Wang 2019.
However, these approaches have not been capable of capturing the largest and most polluted events, such as annually occurring biomass burning in Africa, South America, and Southeast Asia that are so extreme that they lead to pollution levels many times larger than the average pollution levels observed in heavily polluted cities (Ichoku et al 2008, Cohen 2014, Cohen and Wang 2014, Aragão et al 2018, Cohen et al 2018, Lin et al 2020a. As a community of air pollution and atmospheric modeling scientists, our models still cannot reproduce or explain the observed vast differences between the clean and polluted seasons in some regions that have a high annual pollution loading such as China, India, and Southern Africa (Jin and Wang 2017, Chen et al 2019b. At present, successful prediction of extremes of pollution which are observed to have concentrations or optical loadings many times higher than levels observed to regularly occur is extremely limited, with many papers using monthly-scale results in their matches, while weekly and daily scale peaks are not reproduced (As-syakur et al 2014, Koplitz et al 2016, Pan et al 2018, He et al 2020, Xue et al 2020, Lin et al 2020b. This is especially so in regions which are otherwise normally clean and small and therefore would normally not be considered important (Wilkins 1954, Velasco andRastan 2015), but which during these extreme events contribute significantly to the overall atmospheric loading Prinn 2011, Cohen et al 2011). In addition to missing some extreme events, when models do capture extreme events, the modeled timing and duration tend to not match well with observations (Cohen and Wang 2014, He et al 2020, Lin et al 2020b.
One reason models have such gaps is that the underlying meteorological factors contributing to the driving forces behind extreme air pollution events are not yet clear (Chen et al 2019a), or in the cases where the meteorological factors are eventually predictable, the anthropogenic co-factors are not (Shawki et al 2017). It is well known that models are not capable of capturing the actual positive phase of the El Niño-Southern Oscillation (ENSO) (the positive phase is called El Niño) (Dinezio et al 2017), which is known to be one of the most important coupled atmospheric and oceanic phenomena influencing the distribution and interannual variability of heat, energy, and moisture throughout the tropics as well as in mid-latitude zones in southern South America and Australia (Fasullo et al 2018), China and the USA (Diaz and Markgraf 2000). This phenomenon has strong inter-annual variation, leading to what multiple works have characterized as a significant effect on wildfires in Indonesia (Siegert et al 2001, Tosca et al 2011, Reid et al 2012, Sun et al 2013, Cohen 2014, Field et al 2016, although to date there are no known modeling studies that have dynamically linked these events together (Leung et al 2007, Martin et al 2012, Tsigaridis et al 2014.
Two other important dynamical phenomena impacting the underlying energy and moisture systems are the Indian Ocean Dipole (IOD) and the North Atlantic Oscillation (NAO). IOD is known to be a significant source of climate variability in the Indian Ocean, with modulation to the global atmospheric temperature and precipitation also observed (Saji et al 1999). The most significant impacts are found to occur over Southeast Asia and Australia in one phase (Cai et al 2011) and Eastern Africa in the other phase (Saji and Yamagata 2003). The NAO exhibits great inter-annual variation, impacting areas stretching from the Eastern United States to Siberia, and from the Arctic to the Subtropical Atlantic (Hurrell 1995). It displays both inter-annual and inter-seasonal signals (Hurrell et al 2003), including seasonal-scale chemical changes to NO 2 over Western Europe (Pope et al 2018). Likewise, models have not been capable of capturing and reproducing the interactions between these two significant dynamical phenomena and observed large-scale wildfires extremes (Rabin et al 2017, Pan et al 2018, Kahn 2020. This work integrates measurements of the drivers of extreme events from both the chemical and dynamical perspectives, and investigates the magnitude and timing of their coupling (Reid et al 2015, Lin et al 2020b. Unlike previous work using monthly or even lower frequency measurements (As-syakur et al 2014, Koplitz et al 2016, Pan et al 2018, this work uses remotely sensed and reanalysis data products that are analyzed at weekly or even daily frequency. Furthermore, we also use direct measurements of both gaseous and aerosol air pollution which are of an order of magnitude or greater than non-extreme conditions (Cohen 2014, Cohen et al 2018, Lin et al 2020a, as compared with indirect measurements related to land use change and fire radiative power (Koplitz et al 2016, Pan et al 2018. The end result of our approach is capable to capture previously known and new relationships between the extremes in the pollution measurements and dynamics over coherent time scales from weeks to months (Neu et al 2014, Zhang et al 2019. The focus is on those regions undergoing rapid change, or those with significant inter-annual or intra-annual variability, allowing for identification of changes in the emissions/sources of air pollution, dominating those relationships related to in situ chemistry or other atmospheric processing occurring after pollution is emitted into the atmosphere (Olivier et al 1998, Cohen andPrinn 2011). These results elucidate significant changes to the atmospheric mass flux and balance that actually occur as a result of the changes in meteorology (Cohen et al 2011, Cohen andWang 2014). By making such a comparison, it may allow further analysis relating to quantifying co-benefits or accounting for co-costs of changes to emissions and/or the climate system due to changes in large-scale biomass burning and/or new urbanization.

Geography
This study explores the impacts that some dynamical drivers of the climate system have on regions of highly variable air pollution identified over the past 15 years. This approach geospatially groups the world according to the roles played by the meteorological drivers, and the timing and loading patterns of the NO 2 column measurements. The ensuing regions cover most of the globe (where data is present) in a spatially and temporally orthogonal manner. Furthermore, geographical considerations are taken to group patterns with underlying similarities.

Air pollution data 2.2.1. Ozone monitoring instrument (OMI) NO 2
To represent air pollution, a weekly average dataset of remotely sensed, cloud filtered NO 2 columns is used. This product is derived from daily NO 2 tropospheric column measurements, with a spatial resolution of 13 km by 24 km, from the version 3 Level 2 cloud screened OMI product (Levelt et al 2006, Boersma et al 2007, Lamsal et al 2011. Lowquality NO 2 data (including cloud cover >30%) are discarded. This data is analyzed using a variance maximization approach to represent those regions undergoing the most change over the time period from January 2006 through December 2015, as shown in figure 1(a) and table S1 (available online at stacks.iop.org/ERL/16/054020/mmedia).
This combination guarantees the highlighted regions are either interannually varying changes such as fires or long-term changes in urbanization (Cohen 2014, Lin et al 2020a, as observed in the time series of the NO 2 column loading over the specific pointby-point subregions of the 13 regions in this work (figures 1(b)-(f)), where the peak ranges from 1.3 to 15.3 times higher than the median and lasts from 2.8 to 22.1 weeks a year. Peak values are defined by iteratively separating the data more than the mean plus 1.5 (or 2 in some cases) standard deviations (Lin et al 2020b), as demonstrated in figure S1. The observed peaks occur annually and contain a considerable amount of inter-and intra-annual variation. Two exceptions are found in Siberia where there is a lot of missing data during the peak times, associated with high amounts of co-emitted aerosols and those pixels being reclassified as clouds (as found in He et al 2020, Wang et al 2020), and in Western US and Canada where some years have multiple peaks, likely due to a combination of different fire sources started at different times and places and advected downwind into areas sufficiently clean that the imported NO 2 was significant enough as to overwhelm local sources, and subgrid variability in the precipitation and source emissions profiles.

Multi-angle imaging spectroradiometer (MISR) aerosol optical depth (AOD)
To address any bias associated with exclusively using OMI NO 2 , this work includes 0.5 • × 0.5 • version 4 Level 3 MISR daily AOD measurements observed at 555 nm (Martonchik et al 2004, Abdou et al 2005, Kahn et al 2005. AOD is a measure of the total atmospheric optical extinction induced by aerosols, and hence is proportional to the amount of aerosols in the troposphere. The data is aggregated point-by-point and week-by-week over the same regions as NO 2 , as shown in figure S2. The AOD shows significant interand intra-annual variations. In fact, the intra-annual variation is observed to be even more than that of NO 2 , consistent with the fact that aerosols are longerlived than NO 2 , and thus also contain variations due to long-range transport as demonstrated by Lin et al (2013) and Wang et al (2020). Furthermore, AOD may be confused with clouds at very high AOD values, and hence the most extreme source events may not be well-captured (Levy et al 2013).
To investigate the co-variability of aerosols and NO 2 , thereby providing a measure of co-emissions over the respective regions, normalized time series of AOD and NO 2 are analyzed over Indonesia, Northern Southeast Asia, Southern Africa and South America, as shown in figure 2.
The peaks of AOD and NO 2 almost always start at the same time, strength, and period over Southern Africa (R = 0.62) and South America (R = 0.49), validating that the aerosols and NO 2 are co-emitted when extreme episodes take place. Nonetheless, the AOD signal tends to decay less rapidly than the coemitted NO 2 signal, effectively appearing to extend forward. On top of this, the AOD signals also display more peaks and overall variability than NO 2 during non-NO 2 peak periods, somewhat so over Northern Southeast Asia (R = 0.56) and even more so over Indonesia (R = 0.31), with there being a significant set of secondary peaks over these two regions which occurs less frequently than annually. This finding is consistent with aerosols having a longer atmospheric lifetime than NO 2 , and thus represent a mixture of emissions and long-range transport (Cohen and Prinn 2011, Cohen and Wang 2014, Wang et al 2020, Lin et al 2020b).

Climate data
ENSO is represented by weekly data from the NOAA Ocean Observation Panel for Climate (OOPC) dataset, as shown in figures 1(a) and (g). The specific sub-region used is based on the measurements of Sea Surface Temperature (SST) over the tropical regions bounded by 170 • W to 120 • W, and 5 • S to 5 • N, referred to as Niño3.4 (Trenberth and Stepaniak 2001). The strong interannual variation is observed to have two significant peaks (2009 and 2015) and one weak peak (2006) over the period studied here. The IOD is represented by the Dipole Mode Index (DMI) obtained from NOAA OOPC, also shown in figures 1(a) and (g). The DMI is calculated using the SST anomaly difference between the western and southeastern tropical Indian Ocean regions (50 • E-70 • E and 10 • S-10 • N) and (90 • E-110 • E and 10 • S-0 • ) respectively (Saji et al 1999). The variability is greatest over the tropical Indian Ocean, peaking in 2006Ocean, peaking in , 2008Ocean, peaking in , 2012Ocean, peaking in and 2015, and reaching its nadir in 2010 over the past 15 years.
The NAO is computed using the monthly NOAA Climate Prediction Center NAO index, using the leading rotated empirical orthogonal function of the 500 mb geopotential height north of 20 • N, centered over the Atlantic Ocean (Cook et al 2002). The NAO has its largest impact over Greenland and the Subtropical Atlantic Ocean, and exhibits great intraannual and intra-seasonal variability, with the largest peak and trough over this study being in 2011 and 2015 respectively.
The monthly NAO index from 2006 to 2015 is linearly interpolated to a weekly frequency to be consistent with the other datasets used in the work, as shown in figures 1(a) and (g). The spatial distribution employed is based on the July loading map, spanning the three most positive and negative regions, following the NOAA Climate Prediction Center Internet Team (2005). The July map is selected as it represents a compromise between the maximum response (January) and the minimum response (October).
We also use daily NAO index obtained from the Climate Prediction Center (www.cpc.ncep.noaa.gov/ products/precip/CWlink/pna/nao_index.html) in two ways. First, a direct comparison is made with daily NO 2 and AOD data. Second, NAO is aggregated from daily to weekly to compare with the interpolated weekly NAO. Comparisons of the day-to-day normalized NO 2 , normalized AOD, and normalized NAO indices are made over Southern Africa, where there is sufficient data to make a meaningful comparison (figure S3). Overall, the noise in the NO 2 signal is small and the signal is clear, the noise in the AOD signal is moderate and the pattern is still discernable, while the noise in the daily NAO signal is large and no pattern can be discerned, with the day-to-day results switching multiple times from positive to negative during the coherent NO 2 and partially coherent AOD signals. The weekly interpolated data for NAO does not yield such a pattern (figure S4), and hence is more consistent with the peak signal in the NO 2 time series and the partial peak signal in the AOD time series, as well as observations by Cohen et al (2018). More details can be found in the supplement.

Statistical methodology
To consider only those regions undergoing significant change in NO 2 , a normalized standard deviation maximization approach is employed (as described in Lin et  Linear correlation and multiple linear regression, computed using a standard least-squares error minimization, are considered significant only if they pass a t-test with a p-value smaller than 0.05. Furthermore, all cases where the absolute value of R (correlation coefficient) is smaller than 0.10 or where the terms do not contribute at least 10% to the normalized final best fit are also excluded.

Correlations between NO 2 and dynamical forcings
Correlation coefficients computed between the weekly NO 2 column measurements and the concurrent weekly dynamical indices of the three dynamical forcings, containing an absolute value of R greater than 0.1 are given in figure 3. The reason for choosing a value of 0.1 arises from two facts relating to the focusing on extreme events. The first is that these extreme events represent from 5.4% to 42.5% of the total data (average 22.7%), therefore requiring a much lower correlation to correctly match the peak events if there is little to no correlation with the non-peak conditions. Second, these relationships are made between the three individual dynamical forcing events, which are themselves located in different geospatial spaces, and hence the computed correlations may reinforce each other by contributing to the totality of the observed extreme NO 2 events at different times in different spatial regions. There is a correlation found between ENSO and NO 2 loadings over the tropical belt from Northern Southeast Asia to Northern Australia and Central Africa. Furthermore, there is an impact in some extratropical areas including Southern China Land, Eastern China, Japan, Southern Africa, Eastern United States, and Northern Eurasia. However, we do not find a statistically significant impact of ENSO on the NO 2 column loadings in Indonesia (the results do not pass the t-test), which is contradictory to many other previous published works showing ENSO as the major driving force of fires occurring in the Maritime Continent (Reid et al 2012, Marlier et al 2013. This work has determined that the positive phase of ENSO has an impact on the observed extremes of NO 2 over the Maritime Continent in the single year of 2015, as demonstrated in figure S5 and table S2, although the relationship fails in other years. Over all of the regions studied at the times covered in this analysis, ENSO is never the most significant forcing agent, having a contributing strength always weaker than IOD and/or NAO. The IOD is correlated with NO 2 loadings throughout tropical areas stretching from Indonesia, Northern Australia, to Africa and further onto South America. In addition, there is a significant correlation over subtropical areas including Eastern China and Japan. Furthermore, there is a belt over the Indian Ocean as observed in figures 3(a) and (b) in which both ENSO and the IOD have an aligned congruence in phase, possibly associated with long-range transport of smoke from fires, as has been observed in other parts of Asia (Sun et al 2013, Lin et al 2020b. Such an occurrence over the Southern Indian Ocean seems to only be mentioned in passing by a single study (Lin et al 2020b). Moreover, the correlation over Indonesia exceeds 0.19 and dominates over this region, as compared to ENSO. A few factors may be responsible: first, previous studies used monthly dynamical and AOD data, whereas this work uses weekly data; second, this work uses NO 2 , which has a shorter lifetime in situ than aerosols and does not flow as far from the source region as aerosols. The negative phase of the IOD is shown to do a better job at reproducing the length and start time of 3 of the years of extreme NO 2 conditions, as compared to the IOD as a whole, but cannot add any value to the other 7 years on record in Indonesia as also demonstrated in figure S5 and table S2.
NO 2 loadings are correlated in a wavelike pattern with the NAO over most parts of the world, which is not spatially correlated with either of the patterns observed from ENSO or IOD. Where there is overlap, sometimes the phase is the same, and sometimes the phase is opposite, leading to addition and subtraction respectively. In specific, over the Southern China Sea, Northern Eurasia, Arctic Eurasia, Central Africa and Southern Africa, the absolute value of R is larger than 0.25, which are new findings. Over Populated Northern China, the NAO is still dominant albeit slightly less so (R = 0.19), consistent with the impact of the global wave train on pollution in Populated Northern China (Chen et al 2019b).

Developing a multi-linear model of NO 2 concentrations from different concurrent and lagged dynamical forcings
The dynamical time series from 2004 to 2017 is used to quantify the idealized lag time with respect to NO 2 by maximizing the correlation coefficient between the perturbations, subject to a time lag with a maximum of 2 years (table 1). The differences in the best fit for each forcing term, with and without respect to lag are presented in table S3. The ability to reproduce measurements of NO 2 region-by-region is shown to be superior in all cases, when using the lag as compared to not using the lag. In specific, the increase in R ranges from a minimum of 4% (ENSO), 2% (IOD), 7% (NAO) to a maximum of 25%, 23% and 33%.
Next, a multi-linear regression approach is employed over each NO 2 subregion in a point-bypoint manner to quantify the normalized NO 2 as a weighted function of the normalized ENSO, IOD, and NAO indices, with and without the best-fit time lags respectively, yielding a relationship between the dynamical systems and their role in modulating perturbations in NO 2 (and co-emitted aerosols, gasses and heat). The resulting coefficients and R 2 values are given in table 1. Overall, the multi-linear regression fit is superior to the individual correlations over all regions, with the improvement in R 2 ranging from 0.09 in Southern China Land and Siberia to 0.28 in Central Africa.
Based on these results, the positive phase of concurrent ENSO has a strong positive impact on Northern Southeast Asia, and the negative phase has a moderate enhancement on Southern China Land. Lagged ENSO produces a strong correlation with future loadings of NO 2 over all studied regions, ranging from the same year over three of the regions, to 1-2 years ahead over ten of the regions. The negative phase of lagged ENSO leads to strong or moderate enhancement over all studied regions but one. These correlations are consistent with the known localized drying associated with the respective phase of ENSO in each case, leading to reduced soil moisture and hence being more prone to enhanced fire in future years, as well as supplying more biomass in terms of peat available to be combusted. The other effect is that in regions impacted by additional rainfall during the negative phase of ENSO, there will be additional biomass growth, and therefore more biofuel to burn during the subsequent dry season. Hence these results are completely consistent with observations in Northern Southeast Asia and Southern Africa, where the preceding wet season would be wetter under the negative ENSO phase and therefore lead to additional biomass production, and subsequently be dried by the positive ENSO phase and lead to additional NO 2 production.
Secondly, the positive phase of concurrent IOD is shown to have a moderate enhancement over South America, and a strong enhancement over Indonesia and Northern Australia, the latter two being consistent with the known dry-phase extreme at the eastern edge of the Indian Ocean. There is also a negative enhancement observed strongly over Arctic Eurasia and Central Africa, the latter of which is consistent with dry-phase extreme at the western edge of the Indian Ocean (Saji et al 1999). The lagged positive phase of the IOD occurs approximately a year before the enhanced NO 2 loadings over Indonesia and Northern Australia, consistent with the drying conditions impacting biomass burning as described above. The negative phase of the IOD is lagged a few weeks after the NO 2 enhancement over most of the remaining regions, consistent with an extension of the fire burning season as a result of the interplay between the co-emitted aerosols and heat, and extended drying induced by the dynamics. The perturbations of NO 2 over Siberia and Southern Africa have a strong, half-year forward impact on the IOD, which would argue that the large amounts of aerosols emitted over Table 1. The lag time of the dynamical forcings (in columns 3-5); and the best-fit coefficients of the constant term (in column 6), of the concurrent ENSO, IOD and NAO indices (in columns 7-9), of the lagged ENSO, IOD and NAO indices (in columns 10-12), and the associated R squared (in column 13), of the multi-linear regression model. A positive lag means that the NO2 lags behind the dynamical forcings, and vice versa a negative lag. Coefficients of the non-constant terms are marked in red when they contribute 10% or more to the total regression weighting.  this region may be impacting the dynamics of the IOD system. Recent work showing that there is a massive plume of co-emitted CO and aerosols extending from Southern Africa over most of the Indian Ocean is consistent with the measured values of AOD observed from MISR over these regions at these specific times (AOD in the range from 0.8 to 1.5) (Lelieveld et al 2001, Lin et al 2020b. These results in theory will be capable of resulting in a large radiative forcing at the surface, one that may overwhelm any purely dynamical forcing, and lead to induced changes on the SST, consistent with theory as observed in Cowan et al (2015), Stuecker et al (2017) and Lee et al (2013). These results are consistent with a possible non-linear dynamical pathway as well between emissions perturbations in Siberia, Southern Africa and the IOD (Ming et al 2011, Staten et al 2018, which has not been previously identified in the literature. Thirdly, the positive phase of the concurrent NAO enhances NO 2 loadings strongly over Southern China Sea, Northern Eurasia, Arctic Eurasia, and Central Africa, and moderately over Populated Northern China and Western US and Canada, while the negative phase has a strong enhancement over Southern Africa consistent. Considering the time lag, the positive phase of the NAO is found to precede the NO 2 perturbations in Northern Australia by about half a year, while the negative phase of the NAO is found to precede the NO 2 perturbations in Siberia and the Western US and Canada both by less than 1 year. This implies that localized dry enhancement from the NAO likely leads to enhanced biomass burning in all of these areas. Furthermore, the NAO is found to lag more than half a year behind the NO 2 peak measurements over ten of the studied areas, with the three regions of Northern Southeast Asia and Africa having a lagged impact more than 1 year in the future and the other eight having an impact less than 1 year in the future on the NAO. This may indicate additional non-linearities in the dynamical pathways over these regions, which have not yet been found in the literature, via the future lag with respect to the extreme emissions changes in NO 2 . If this is the case, then these observed massive changes in the NO 2 extremes (and the aerosols co-emitted with the NO 2 ) may pose a feedback onto the energy balance, and even the dynamics underlying these forcings, as explained by Zhang et al (2013) and Persad et al (2018).

Region
In addition to elucidating connections between NO 2 perturbations and dynamical indices, the multilinear regression model is found to capably reproduce the normalized standard deviation of the measured NO 2 column loadings over the regions studied, as displayed with respect to time in figure 4 and with respect to space in figure 5.
Those regions in which the multi-linear regression model represents seven or more of the 10 years of observed extreme events in NO 2 (Central Africa, 9 years; Southern Africa and Northern Eurasia, 8 years; South America, Indonesia, Northern Southeast Asia and Arctic Eurasia 7 years) all have an overall total contribution due to the sum of the concurrent and lagged dynamical forcing terms greater than 49%, consistent with previous findings (Ichoku et al 2008, Sun et al 2013, Aragão et al 2018, Lin et al 2020a. Even the remaining regions, which have a total concurrent and lagged dynamical forcing contribution from 32% to 46%, are all capable of reproducing 6 of the 10 years of observed extreme events in NO 2 . One reason for this occurrence is that in general there is a significant amount of non-correlation between the dynamical forcing variables and NO 2 during the non-polluted periods of time, consistent with the fact that the climate system makes a relatively smaller contribution to emissions in highly urbanized regions or under intense agricultural management. A second reason stems from the idea that due to missing data, there are fewer observed peaks than at the other sites (there are 7 years of observed peaks in Southern China Land and 9 years of observed peaks in Siberia), implying that although the R 2 is lower overall, that over these regions, there may actually be an improved ability to represent the remaining fraction of extreme events.
It is also important to draw attention to the fact that the multi-linear regression model does not fully capture the entire peak of the extreme events for two reasons. First, linear models with a time step larger than zero tend to underestimate instantaneous jump functions, such as spikes in fresh emissions (Cohen and Wang 2014). Second, there are anthropogenic forcings occurring simultaneously with the dynamical forcings. As above, this actually signifies an improved ability to represent the impact, as it demonstrates the multi-linear regression approach is not over fitted.
To test the sensitivity of the model design itself, two additional multi-linear regression models are also constructed. One is constructed only using the constant term and the concurrent terms (table S4), and the other using only the constant term and the lagged terms (table S5). Overall, the model with the lagged terms outperforms the model with the concurrent terms as demonstrated in figures S6-S9. In all but four of the regions, the concurrent only model does nearly as well as the model using least squares. In specific, in Southern Africa and Central Africa, the difference between the lagged only and the least-squares model leads to one additional peak matched, and better fits with three peak events. In the case of Northern Southeast Asia and Northern Eurasia, the difference between the lagged only and the least-squares model leads to a better fit with two peak events. Overall, this adds further support to the ideas already explained that the actual lagged forcings are what contribute the most to the extremes in the NO 2 , but that in these four regions, the modulation over both lagged and concurrent time scales is required to capture the actual inter-annual and intra-annual variability.

Discussion and conclusions
Strong relationships are found between some extreme climate events and highly variable perturbations in measured NO 2 columns over the 14 years from 2004 to 2017. ENSO is found to significantly contribute to extremes in NO 2 over Northern Australia, Northern Southeast Asia, Southern China Land, Northern Eurasia, Central and Southern Africa, but always less so than the IOD and/or the NAO. Furthermore, multi-regression results demonstrate that ENSO's lagged impact always maximizes future perturbations of NO 2 , with its most significant impact (about 48%) occurring in Northern Southeast Asia 7 weeks ahead, extending the length of the known local biomass burning period. Conversely, ENSO is shown to not have a significant correlation on the extreme NO 2 events over source regions in Indonesia, even though previous studies have stated that it is the major driving force in this region with respect to individual fire burning points or large-scale aerosol plumes. This combination of findings warrants Figure 5. Map of reconstructed normalized variability using the respective best fit multi-linear regression model from each region. Note that this fraction represents the amount of the total variability of the NO2 column measurements reproduced by the climate variables.
further study to understand more deeply the driving forces between these highly variable and significant phenomena, especially so in the Maritime Continent, where it seems that different phases of the forcings may be able to help represent different years to different extents.
The IOD is found to dominate NO 2 variability in many places in the world, including Indonesia, Northern Australia and South America. Multiregression studies demonstrate that the positive phase of the IOD exhibits a strong enhancement in real time and a further strong enhancement approximately a year later over Indonesia and Northern Australia, consistent with its known dry-phase forcing along the Eastern edge of the Indian Ocean. The largest influence of the IOD is observed over Indonesia (about 20%), when a lag of 50 weeks is applied, the impact being on the next year's biomass burning season. These findings are consistent with reduction of surface moisture, warranting further investigation into cause and effect.
The NAO influences NO 2 column perturbations across the globe in an overall wavelike manner, playing a dominant role over most studied regions, except for Indonesia, Northern Australia and South America. Multi-regression modeling reveals the impact of the NAO is found to occur after the subsequent NO 2 perturbations over almost all studied regions, except for Northern Australia, Siberia and Western US and Canada, where the impact on the sources is more consistent with findings from ENSO and the IOD. Hence, over the other ten regions, coemitted heat and aerosols may pose a feedback on the atmospheric and ocean surface energy budgets that ultimately impact the NAO itself. It is most significantly influenced over Southern Africa (found to contribute 44%) when a lag of 100 weeks is applied.
Over all of the non-heavily populated regions except for Siberia and Western US and Canada, this approach observes climate variability is responsible for 45% to 63% of the total variability in measured NO 2 columns. The contribution in Siberia and Western US and Canada is still over 30%, consistent with the fact these regions either are missing a peak or have multiple peaks in certain years. Overall, the multiregression model employed here is capable of reproducing from six through nine of the observed 7-10 years' worth of peak events in NO 2 over all of the regions around the world. This improvement is especially important in that it requires both concurrent and lagged dynamical forcing terms to be universally consistent, and not merely reproduce a single or small subset of the years' inter-annual and intra-annual variability. Additional or higher quality information reflecting large-scale and local dynamical and chemical forcings, as well as improved data of anthropogenic factors related to NO 2 would hopefully help improve the skill of the multi-linear regression model. Based on these results, it is suggested that the community look deeper into the roles of multiple dynamical forcings simultaneously acting on emissions contributing to known pollution extremes, as well as the possibility that extremes in pollution may have an impact on modulating these global-scale dynamical systems.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.6084/m9.figshare.c.4734440.