Association between NO2 concentrations and spatial configuration: a study of the impacts of COVID-19 lockdowns in 54 US cities

The massive lockdown of cities worldwide during the COVID-19 pandemic has substantially improved the atmospheric environment. For the first time, urban mobility was reduced to nearly zero, which makes it possible to establish a baseline for air quality. By comparing these values with pre-COVID-19 data, it is possible to infer the likely effect of urban mobility and spatial configuration on air quality. In the present study, a time-series prediction model is enhanced to estimate nationwide NO2 concentrations before and during the lockdown measures in the United States, and 54 cities are included in the study. When the lockdown is not considered, the prediction differs notably from the observations, and the changes in urban mobility can explain the difference. It is found that the changes in urban mobility associated with various road textures have a significant impact on NO2 dispersion in different types of climates.


Introduction
In December 2019, an outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in Wuhan, China. By 1 July 2020, more than 10 million cases had been confirmed and 500 000 deaths had been recorded, and the pandemic had spread to over 250 countries in the world. Despite the progressive daily increase in the number of confirmed cases, the situation became relatively stable in some countries after massive lockdowns and strict travel control measures were adopted from March 2020 (Kucharski et al 2020). Owing to the global lockdown, the coronavirus disease 2019 (COVID-19) has undermined economies worldwide and triggered a global crisis with great socioeconomic losses, as described by the United Nations (UNnews 2020). It was predicted that global trade would decline by 13%-32% and that the annual global gross product would decline by 24% (CRS 2020). Additionally, a strict lockdown in China reduced global GDP by 3.5% and Chinese GDP by 21% (Guan et al 2020). While the socioeconomic damage caused by the disease is chronic and devastating, the environment has responded to the pandemic almost instantly (Fan et al 2020). The level of nitrogen dioxide (NO2) has been substantially reduced in the wake of COVID-19 (Mahato et al 2019). In Central China, NO2 emission decreased by 30% on a year-after-year basis during the Lunar New Year compared with the period 2005-2019 (Earth Observatory 2020). The reduction of air pollution, including NO2, was first identified in Wuhan; it was then found in more and more cities and countries and eventually became a worldwide phenomenon (Venter et al 2020).
Among 31 major cities in the world, a significant decline in NO2 was observed in almost two-thirds of the study area after the lockdown, including New York, Rome and Paris, implying that transportation and anthropogenic activities in these cities were massively reduced (Shrestha et al 2020). Epidemiological evidence suggests that a high incidence of respiratory disease and many deaths can be attributed to deteriorated air quality (Brauer et al 2010). For example, approximately 4.6 million individuals die annually from diseases related to poor air quality, as reported by the World Trade Organisation (Cohen et al 2017), and lung tissue can be damaged by prolonged exposure to NO2, which is one of the contributors to the morbidity of asthma and lung cancer (Greenberg et al 2016, Khaniabadi et al 2017). Thus, properly grasping the evolutionary patterns of NO2 and its causal factors is of great significance for improving the atmospheric environment (He et al 2020).
Road traffic can be regarded as the major contributor to NO2 (Palmgren et al 2007, Keuken et al 2009, Mohegh et al 2020). The rise in NO2 concentration might be caused by a certain type of diesel particulate produced by heavy vehicles such as buses, whereas the nitrogen produced by gasoline-fuelled passenger cars is particularly sensitive to long-range transport decomposition (Carslaw et al 2005, Heeb et al 2008). An observation of vehicle emission factors found that the decreasing trend of NO2 emission was the smallest among the pollutants studied, and statistically insignificant, throughout the 12-year study period in Switzerland (Hueglin et al 2006), which suggests that the share of road traffic emission in NO2 increased during this period. A similar outcome indicates that NO2 from traffic emission is the key factor in ambient NO2 concentration (Kurtenbach et al 2012). As such, NO2 concentration will not be reduced substantially if only the NOx exhaust system is improved without a significant reduction of traffic-related NO2 emission. NO2 was found to vary diurnally and seasonally as a function of traffic volume alongside a major arterial (Kendrick et al 2015), and traffic signals were related to roadside air quality in Tokyo (Minoura et al 2010). Some studies also found that the change in NO2 concentration is highly dependent on traffic capacity and fuel decomposition, as well as on vehicle speed and fleet composition (Tang et al 2019). Nevertheless, it has also been argued that the improved air quality is mainly a result of favourable meteorological conditions rather than reduced anthropogenic activities, because pollutant diffusivity is an important factor closely related to meteorology (Wang et al 2020).
Another factor that contributes to NO2 emission is traffic characteristics. By investigating the correlation between NO2 and its emission factor, researchers speculated that the deviation in the factors contributing to NO2 emission resulted from the variability of local road conditions, traffic patterns and fleet composition (Chan et al 2004, Liu et al 2012). Roads 300 km or above in length were characterized by higher NO2 concentration in a study conducted in the Netherlands (Velders et al 2009), and this result is consistent with another finding that higher NO2 exposure was associated with road length variability in Shanghai (Meng et al 2015). Moreover, the spatial distribution and local variability of NO2 concentration have been explored analytically (Wang et al 2020), and the results show that the local variation was mostly driven by regional differences between the ten most urbanized areas in the United States, which is consistent with the basic conditions of these urban regions, that is, comparatively denser traffic and frequent anthropogenic emission. Therefore, spatio-temporal location and traffic characteristics have a significant influence on the level of NO2, even though some of the emission features result from specific industrial activities. In US metropolitan cities, local NO2 concentrations can be improved by reducing road traffic, even though the estimation error for NO2 varies. However, no effective prediction model has yet been developed for national NO2 concentration (Lee et al 2019).
As a reliable and accurate method for mathematical modelling, the auto-regressive integrated moving average (ARIMA) model has been widely applied to short-term time-series trend analysis and is frequently used for prediction from multiple perspectives. For instance, it was adopted to forecast wind speed accurately at wind power plants and to ensure the overall stability and security of the power system (Singh et al 2019). Likewise, time-series trends have been explored to predict the capacity of electricity generation (Haiges et al 2017). In other disciplines, the statistical tool was adopted to detect financial crisis events that act as precursors of the market regime (Faranda et al 2015). With geographic information systems (GIS) as a modelling platform for spatial clustering and statistical analysis, the spatial distribution of, and the associations between, infectious diseases and NO2, and between traffic flow and the trajectories of people's movement, have also been identified (Yongjian et al 2020).
The assessment of traffic-related GIS parameters, such as the lengths of major road segments and the traffic intensity, can help reveal the long-term trend of NO2 and its emission ratio related to road traffic activities, and the impact on NO2 can be visualized using the GIS platform (Shon et al 2011). With predictive data based on ARIMA, the spatial transmission of the pandemic disease can be forecast to inform the spatial allocation of medical resources and the implementation of mitigating control measures (Lakhani 2020, Singer et al 2020).
This paper is organized as follows. Section 2 enhances a time-series prediction model to predict NO2 concentrations accurately and proposes four indices to describe the characteristics of road networks. Section 3 introduces the data collection, presents the changes of NO2 concentration after lockdown, and explores the impacts of road networks and regional climates on NO2 concentration. Finally, section 4 presents the discussion and conclusion.

NO2 difference between observation and prediction
The NO2 concentration records before lockdown are used to learn historical patterns that contain seasonal and cyclic time-series information, and the SARIMAX model is used to predict the NO2 concentration. A prediction that follows historical patterns therefore does not incorporate the new evolution after the lockdown measures. To investigate this evolution, two definitions are proposed:
• NO2 difference, denoted by ∆d, is the difference between the observed and predicted NO2 concentration at a station after lockdown; and
• NO2 change, denoted by ∆c, is the difference between the mean NO2 concentration before lockdown and that after lockdown, based on either observation or prediction.
In particular, ∆d can be significant as a result of the changes in urban mobility during the implementation of lockdown measures. Thus, we can investigate the relationship between ∆d and the changes in urban mobility (∆m) relative to a baseline before lockdown.
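The two definitions can be illustrated with a minimal Python sketch. The function names are hypothetical, and two details are assumptions rather than statements from the text: ∆d is taken as predicted minus observed, averaged over the lockdown period (so that a positive value means the observation fell below the prediction), and ∆c is taken as the mean after lockdown minus the mean before lockdown (so that a reduction is negative). The relative form divides ∆c by the pre-lockdown mean.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of daily concentrations."""
    values = list(values)
    if not values:
        raise ValueError("empty series")
    return sum(values) / len(values)


def no2_difference(observed_after, predicted_after):
    """Delta d at one station: mean predicted minus mean observed NO2
    after lockdown (sign convention is an assumption)."""
    return mean(predicted_after) - mean(observed_after)


def no2_change(before, after):
    """Delta c at one station: mean after lockdown minus mean before.

    Returns (delta_c, delta_c_relative), where the relative change is
    delta_c divided by the pre-lockdown mean.
    """
    baseline = mean(before)
    delta_c = mean(after) - baseline
    return delta_c, delta_c / baseline
```

For example, `no2_change([10.0, 10.0], [7.0, 8.0])` gives a change of -2.5 and a relative change of -0.25, i.e. a 25% reduction.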

Seasonal autoregressive integrated moving average with eXogenous factors (SARIMAX)
The general ARIMA model consists of three parts, namely autoregression (AR), differencing (I) and moving average (MA), and is written as ARIMA(p, d, q), where p denotes the order of the auto-regressive model, d represents the order of differencing, and q is the order of the moving average model. However, an ordinary ARIMA model cannot capture seasonal patterns. To address this problem, the seasonal auto-regressive integrated moving average (SARIMA) model adds seasonal parameters for the auto-regressive, differencing and moving average terms, together with the time step for modelling the periodic time-series pattern. It is written as SARIMA(p, d, q)(P, D, Q)t, where P denotes the order of the seasonal auto-regressive model, D denotes the order of seasonal differencing, Q is the order of the seasonal moving average model, and t refers to the time step parameter for seasonality. Furthermore, the seasonal autoregressive integrated moving average with eXogenous factors (SARIMAX) model can be used to analyse time-series data with additional variables in the regression operation (Valipour 2015).
In this study, the time-series NO2 measurements of 56 stations in the US were modelled using the SARIMAX model with exogenous weather variables including air temperature, air pressure, relative humidity, wind speed and percentage of cloud coverage. Since there might be more than one seasonal pattern in the time-series NO2 data, six Fourier series with different periods were also included as exogenous variables (number of days = {15, 30, 90, 180, 365}), and 7 days (one week) was taken as the seasonal time step t in the SARIMAX model. Before the SARIMAX modelling was performed, randomness and normality were tested with the Ljung-Box test and the Jarque-Bera test. The Augmented Dickey-Fuller test and the Osborn, Chui, Smith, and Birchenhall test were performed to determine the differencing orders d and D of the time series, respectively.
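For reference, a SARIMAX model with the orders defined above is commonly written in backshift-operator form as a regression with SARIMA errors. The notation below is the standard textbook form and is an assumption about presentation, not necessarily the authors' exact equations: B is the backshift operator, y_k is the NO2 concentration on day k, x_{j,k} is the j-th exogenous variable, β_j is its coefficient, and ε_k is white noise.

```latex
\phi_p(B)\,\Phi_P(B^{t})\,(1-B)^{d}\,(1-B^{t})^{D}
\Bigl(y_k - \sum_{j}\beta_j\,x_{j,k}\Bigr)
= \theta_q(B)\,\Theta_Q(B^{t})\,\varepsilon_k,
\qquad
\phi_p(B) = 1 - \sum_{i=1}^{p}\phi_i B^{i},
\quad
\theta_q(B) = 1 + \sum_{i=1}^{q}\theta_i B^{i}
```

The seasonal polynomials Φ_P and Θ_Q are defined analogously in powers of B^t. Setting P = D = Q = 0 and dropping the exogenous terms recovers the ordinary ARIMA(p, d, q) model.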
The Canova-Hansen test was conducted to determine the time step parameter (t) for seasonality. Then, a stepwise approach (Hyndman et al 2008) was adopted to optimize the other parameters (i.e. p, q, P, and Q) of the SARIMA model by minimizing the Akaike information criterion (AIC). The optimized parameters were then used for the NO2 prediction in this study. Python 3.7 and the statsmodels package (State Space Methods 2020) were used for training and prediction.
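The modelling setup described above can be sketched in Python with statsmodels. This is a minimal illustration under stated assumptions, not the authors' code: the helper names are hypothetical, each Fourier series is represented by one sine and one cosine column per period, and the default orders (1, 0, 1) and (1, 0, 1, 7) are placeholders standing in for the AIC-optimized values, which the text does not list.

```python
import math


def fourier_terms(n_days, periods=(15, 30, 90, 180, 365), start_day=0):
    """Build one row per day with a sine and a cosine column per period.

    Representing each Fourier series by a single sin/cos pair is an
    assumption; the text does not give the number of harmonics.
    """
    rows = []
    for day in range(start_day, start_day + n_days):
        row = []
        for period in periods:
            angle = 2.0 * math.pi * day / period
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


def fit_and_forecast(no2, exog, future_exog,
                     order=(1, 0, 1), seasonal_order=(1, 0, 1, 7)):
    """Fit SARIMAX on pre-lockdown data and forecast len(future_exog) days.

    `order` and `seasonal_order` are placeholder values, not the
    optimized parameters of the study.
    """
    # Imported lazily so that fourier_terms can be used without statsmodels.
    import numpy as np
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(np.asarray(no2, dtype=float),
                    exog=np.asarray(exog, dtype=float),
                    order=order, seasonal_order=seasonal_order)
    result = model.fit(disp=False)
    return result.forecast(steps=len(future_exog),
                           exog=np.asarray(future_exog, dtype=float))
```

In practice the weather variables would be appended as further columns to the rows returned by `fourier_terms`, and the forecast period would use `start_day` equal to the length of the training series so that the Fourier phases continue without a jump.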

Impacts of road networks on vehicle emission
As traffic is one of the major sources of NO2, four indices are adopted in the present study to describe the characteristics of road networks in a 3 km-radius circular area centred at each air quality monitoring station (figure 1(a)); these indices are used for correlation analysis with the NO2 difference after lockdown. The four indices are as follows: (1) n_nd denotes the number of road intersections, (2) c_nd denotes the number of road segments connecting at all the intersections, (3) s_rd stands for the total length of the road networks, and (4) n_rd is the number of road segments. The index n_nd is important because vehicles often stop at intersections for traffic lights, which leads to the accumulation of a considerable amount of emissions during idling (Minoura et al 2010). In comparison, c_nd should be more representative than n_nd because it counts the number of roads connecting at each intersection, which may indicate temporary stopping more reliably. For example, vehicles are more likely to stop at an intersection of two roads than on a single road with the same traffic volume. The index s_rd also plays a major part since it fundamentally affects traffic flows. The four indices are thus organized as i = {n_nd, c_nd, s_rd, n_rd}. In the study, G denotes a topological graph of roads in which the edges E = {e} connect with each other through the nodes O = {o}, so that G = {E, O}. Notably, n_nd differs from the number of nodes num(O): a node that connects only two edges is an artefact of the data format, because the two edges are essentially the same road. Thus, n_nd and c_nd are counted only at nodes associated with at least three edges.
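The four unweighted indices can be computed from an edge list as in the following sketch. The function name and the input format (tuples of two node identifiers and a length in kilometres) are hypothetical, the threshold of three edges per intersection follows the reasoning above, and counting raw edges for n_rd (without merging edges that meet at two-edge nodes) is a simplifying assumption.

```python
from collections import Counter


def road_indices(edges):
    """Compute n_nd, c_nd, s_rd and n_rd for the roads around one station.

    edges: iterable of (node_u, node_v, length_km) tuples describing the
    road graph inside the buffer (hypothetical input format).
    """
    edges = list(edges)

    # Count how many edges touch each node.
    degree = Counter()
    for u, v, _ in edges:
        degree[u] += 1
        degree[v] += 1

    # A node joining only two edges is treated as a data-format artefact,
    # so only nodes with at least three edges count as intersections.
    intersections = [node for node, deg in degree.items() if deg >= 3]

    return {
        "n_nd": len(intersections),
        "c_nd": sum(degree[node] for node in intersections),
        "s_rd": sum(length for _, _, length in edges),
        "n_rd": len(edges),  # raw edge count (simplifying assumption)
    }
```

For a small network with three roads meeting at node "a" and a fourth segment continuing from "d", the function reports one intersection with three connecting segments.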
However, the four indices treat the vehicle emissions dispersed from road networks as having homogeneous impacts on the stations' air quality, without taking the spatial distance between them into consideration. To make a better estimation, the present study assumes that the dispersion of each index follows a Gaussian distribution from each element of the index to the station (figure 1(b), equation (4)). In the function, y_0 denotes the offset and x_c the centre, both set to 0 to give a normalized Gaussian; A is the amplitude, which denotes the magnitude of the element; w is the width, which controls the speed of the dispersion; and x stands for the distance from the element to the station (figure 1(c)). When computing n_rd and s_rd enriched with the Gaussian function, x is approximated by the distance from the midpoint of the road segment to the station. The total impact of all the elements of an index at a station can then be accumulated (equation (5)). A set of abbreviations and their definitions is summarized in table 2 for clear presentation of this study.
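The distance weighting can be sketched as follows. The exact form of equation (4) is an assumption here: the amplitude form y = y_0 + A·exp(−(x − x_c)² / (2w²)) is used, and the authors' normalization may differ. The function names are hypothetical. The accumulation simply sums the weighted contribution of every element, as described for equation (5).

```python
import math


def gaussian_weight(x, amplitude=1.0, width=3.0, y0=0.0, xc=0.0):
    """Gaussian dispersion of one element at distance x (km) from a station.

    Assumed amplitude form; the normalization of equation (4) may differ.
    """
    return y0 + amplitude * math.exp(-((x - xc) ** 2) / (2.0 * width ** 2))


def weighted_index(elements, width):
    """Accumulate the Gaussian-weighted impact of all elements at a station.

    elements: iterable of (amplitude, distance_km) pairs, for example
    (segment_length, midpoint_distance) for s_rd or (1, distance) for n_rd.
    """
    return sum(gaussian_weight(distance, amplitude=amplitude, width=width)
               for amplitude, distance in elements)
```

With this form, an element located at the station contributes its full amplitude, and an element at a distance equal to w contributes exp(−0.5), about 61% of its amplitude.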

NO2 change after lockdown
The SARIMAX model is then used to predict daily NO2 at each station from the start to the end of lockdown. Two predictions are made: one prediction period starts 60 days (n = 60) before the lockdown and ends a number of days after the lockdown (until 6 May 2020), and the other starts 30 days (n = 30) before the lockdown and ends a number of days after the lockdown. Figure 3 visualizes the prediction results at two selected sites: the red curves (observation) present a seasonal and periodical pattern and are lower than the blue curves (prediction). The difference between the blue and red curves (i.e. the NO2 difference) confirms our hypothesis that the prediction has not incorporated the disruption of lockdown, and it can be explained by the reduction of mobility. Figure 4(a) provides the distribution of the mean relative errors (e) of the prediction at all the stations. Evidently, the errors before lockdown (b) are smaller than those after lockdown (a), with medians of e = 0.13 for n = 60 and e = 0.12 for n = 30, respectively. This suggests that satisfactory prediction accuracy is achieved prior to the lockdown. After the lockdown was activated, the medians become larger, at e = 0.22 for n = 60 and e = 0.20 for n = 30, which indicates that the disruptive lockdown measures make accurate prediction challenging.
Besides, both errors for n = 30 are slightly smaller than those for n = 60, probably because n = 60 has a longer prediction horizon, while n = 30 allows an additional 30 days of training that better incorporates the most recent seasonal variation into the prediction.

Table 2 (excerpt). Abbreviations and their definitions.
c_nd: the number of the road segments connecting at all the intersections
s_rd: the total length of the road networks
n_rd: the number of the road segments
∆m_rd: ∆m_rd = s_rd · ∆m_w
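The mean relative error e reported above can be computed per station as in the sketch below. The text does not spell out the formula, so the definition used here (the mean of |predicted − observed| / |observed| over the days of the evaluation period) is an assumption, and the function name is hypothetical.

```python
def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed| over paired daily values.

    Days with a zero observation are skipped to avoid division by zero
    (a choice made for this sketch, not stated in the text).
    """
    ratios = [abs(p - o) / abs(o)
              for o, p in zip(observed, predicted) if o != 0]
    if not ratios:
        raise ValueError("no usable observation/prediction pairs")
    return sum(ratios) / len(ratios)
```

The medians quoted in the text would then be taken over the per-station values, for example with `statistics.median` applied to the list of station errors.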
To probe into the impact of the lockdown measures, the NO2 changes before and after lockdown at the 56 stations are calculated based on prediction (p) and observation (o), respectively. Many stations have undergone a substantial reduction after lockdown in terms of absolute values (∆c) (figure 4(b)), and the reduction becomes more pronounced when converted into percentages, where ∆c_p equals ∆c divided by the mean concentration before lockdown (figure 4(c)). Specifically, the observation suggests that half of the NO2 measurements decreased by at least 1.45 mg m−3 for n = 30 and 2.41 mg m−3 for n = 60 (figure 4(b)), which is equivalent to reductions of 23.08% and 30.38% of the NO2 concentration before lockdown. Meanwhile, the predicted number is slightly larger than the observation.
According to the COVID-19 Community Mobility Reports from Google, which provide relative urban mobility with respect to a baseline before lockdown in each city, all the cities involved show a large decrease in workplace mobility, with |∆m_w| > 20%; Seattle, Ashburn, Burlington, and Minneapolis witnessed the largest reductions, with |∆m_w| equalling 41%, 38%, 36%, and 33%, respectively (figure 5(a)). Inspecting each station, the observation suggests that a considerable reduction of NO2 after lockdown is realized in most cities, while a tiny increase is found, quite unexpectedly, in four cities (figure 5(b)). Notably, |∆m_w| shows no clear trend as ∆c decreases, which motivates us to explore the influential factors.

Association between urban mobility and NO2 difference
Considering that vehicle emission is one of the largest sources of NO2, the significant reduction of urban mobility during the pandemic is expected to affect the changes of NO2 in cities. In contrast with other studies at regional and national scales, this analysis focuses on a detailed investigation at the city level. Based on the COVID-19 Community Mobility Reports, a Pearson correlation analysis between the NO2 difference (∆d) and the averaged mobility changes (∆m) is performed for n = 30 (figure 6(a)) and n = 60 (figure 6(b)). The mobility for workplace decreases significantly in all cities (−40% ⩽ ∆m_w ⩽ −15%) and shows a moderate negative correlation with ∆d in both groups (Pearson correlation coefficient R = −0.47, p < 0.05). Similarly, the mobility for recreation shows a considerable decrease in the vast majority of cities (−30% ⩽ ∆m_c ⩽ 1%), with a weak negative correlation with ∆d (R = −0.37 for n = 30 and R = −0.40 for n = 60). In comparison, the greatest decrease is observed in transit mobility in almost all the cities (−50% ⩽ ∆m_t ⩽ 5%), which yields a weak negative correlation (R = −0.32 for n = 30 and R = −0.34 for n = 60) (figure 6).
As an exception, small increases of ∆m_c = 0.94% and ∆m_t = 5.95% are found in the recreation mobility for the Cedar Bluff State Park in Kansas and the transit mobility for Murphy Ridge in Wyoming, respectively. This is reasonable since some people may deliberately avoid crowded locations and move to rural areas, such as parks, as temporary residences. Three implications can be drawn from the results. Firstly, the lockdown policy greatly reduces urban mobility for recreation, transit, and workplace during the pandemic, leading to a considerable reduction of NO2. Secondly, the correlations between ∆m and ∆d, even though not robustly significant, support the hypothesis that the NO2 difference results from the changes in urban mobility. Last but not least, other factors such as road characteristics or local climates may also affect the changes of NO2, since the current correlations are not strong.
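The Pearson correlation coefficient R used in this analysis can be computed across stations as in the following sketch, which takes one mobility change and one NO2 difference per station. The function name is hypothetical and the p-value is not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences,
    e.g. the mobility changes and the NO2 differences of all stations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A negative R, as reported for all three mobility categories, means that stations with a larger drop in mobility (a more negative ∆m) tend to have a larger ∆d.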

Impacts of road networks on NO2 concentration
As urban mobility is constrained by road networks, the present study proposes that road characteristics, such as the density of road networks, may have notable impacts on emission. Therefore, three static indices are put forth to represent road characteristics, i = {n_nd, c_nd, s_rd}: (1) the number of road intersections n_nd, (2) the total number of road segments connecting at all the intersections c_nd, and (3) the total length of the road networks s_rd. Figure 7 shows the correlations between i and ∆d for the two groups n = 30 and n = 60. Both groups obtain moderately strong positive correlations, with coefficients R = {0.54, 0.54, 0.60} for n = 30, a better performance than that of (∆m, ∆d) in figure 6. A possible explanation is that a larger indicator in i has a more significant effect on the changes of NO2 after lockdown, suggesting that the characteristics of road networks can greatly affect ∆d. However, the three indicators treat intersections and roads as having homogeneous impacts on the air quality stations, regardless of the spatial distance between the roads and the stations.
To make the indices more representative, we used a Gaussian function y = ∑ g(A, x, w) to estimate the dispersion of NO2 from vehicle emission, in which A represents the magnitude of the road property, x denotes the distance from the road to the NO2 station, and w indicates the width of the function. In addition, another road index, n_rd, is introduced: the dispersion associated with each road follows the Gaussian distribution but ignores the road length, which allows a comparison with the performance of s_rd. Figure 8 shows the coefficient curves between i = {n_nd, c_nd, s_rd, n_rd} and ∆d for n = 30 and n = 60. Overall, all the curves grow steadily and gradually approach their upper bounds as w increases. In particular, c_nd and n_nd share the same growing trend with the increase of w, with R_max < 0.55 (figures 8(a) and (b)). Also, c_nd yields only a marginally larger R than n_nd for the same w and the same group (n = 30 or n = 60). Comparatively, s_rd and n_rd perform better than c_nd and n_nd in terms of the correlation coefficient R. Meanwhile, s_rd notably outperforms n_rd for both n = 30 and n = 60, and s_rd shows the most prominent correlation, with R_max ≈ 0.60 when w = 3 km (figures 8(c) and (d)). Two phenomena can be observed from the figure.
On the one hand, s_rd has the strongest association with ∆d, which is convincing since s_rd incorporates both the NO2 source from roads and the NO2 dispersion following the Gaussian distribution into the correlation. For instance, a long road is likely to produce more NO2, while a short distance to the station is related to a greater amount of NO2. On the other hand, w = 2.25 km is an empirical distance beyond which road networks start to have a rather weak impact on ∆d.

Impacts of regional climates on NO2 concentration
The present study considers that regional climates can also affect the dispersion of NO2. This is based on the evidence that air temperatures have been used to describe urban heat islands and that climate largely determines the magnitude of urban heat islands (Zhao et al 2014, Manoli et al 2019). As both air temperature and NO2 are essentially properties of the air, NO2 concentrations may also be influenced by climate. To explore the combined impacts of urban mobility and road networks under different climates, the correlation analysis is performed separately for arid and humid climates, hot and cold climates, and coastal and inland climates (figure 9(a)). The correlation analysis is performed between ∆d and ∆m_rd = s_rd · ∆m_w, where s_rd follows the Gaussian function with w = 3 km, which obtains the largest R, and ∆m_w is the change of workplace mobility discussed in figure 6. Generally, the results obtained from n = 30 (figures 9(b)-(d)) and n = 60 (figures 9(e)-(g)) are almost the same: ∆d and ∆m_rd are strongly correlated. In particular, the correlations are more prominent in the arid, hot, and coastal climates, with R = {−0.84, −0.65, −0.77}, than in the humid, cold, and inland climates, with R = {−0.60, −0.64, −0.64} (figures 9(b)-(d)). The correlations based on n = 60 are slightly weaker than those based on n = 30. That is to say, the arid, hot, and coastal climates tend to facilitate the dispersion of NO2 for a given urban mobility and road network. It is also found that ∆m_rd shows the most prominent correlation with ∆d, and the strength of the correlations decreases in the order of ∆m_rd with the Gaussian distribution, s_rd with the Gaussian distribution (figure 8(d)), s_rd (figure 7(c)), and ∆m_w (figure 6) under the same conditions. This suggests that dynamic mobility constrained by static road networks significantly affects the changes of NO2 in a large geographical space.
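The composite index and the climate partition can be sketched as follows. The record layout (dictionary keys `s_rd`, `dm_w`, `dd` and `climate`) and the function name are hypothetical; `s_rd` is assumed to be the Gaussian-weighted road length already computed for each station. The sketch only forms ∆m_rd = s_rd · ∆m_w and groups the (∆m_rd, ∆d) pairs by climate label, so that each group can be passed to any Pearson correlation routine.

```python
from collections import defaultdict


def pairs_by_climate(stations):
    """Group (delta_m_rd, delta_d) pairs by climate label.

    stations: iterable of dicts with hypothetical keys
      's_rd'    Gaussian-weighted total road length at the station
      'dm_w'    change of workplace mobility (negative for a decrease)
      'dd'      NO2 difference between prediction and observation
      'climate' label such as 'arid' or 'humid'
    """
    groups = defaultdict(list)
    for station in stations:
        # Composite index: mobility change scaled by the road network.
        dm_rd = station["s_rd"] * station["dm_w"]
        groups[station["climate"]].append((dm_rd, station["dd"]))
    return dict(groups)
```

Because each station belongs to one class in each of the three partitions (arid or humid, hot or cold, coastal or inland), the function would be called once per partition with the corresponding label.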

Discussion and conclusion
An accurate estimation of the NO2 time series is essential for predicting and investigating the impact of community mobility on the urban microenvironment. The study refines an established NO2 prediction model and estimates the NO2 concentrations that would have occurred without the disruption of COVID-19, by incorporating the seasonal and cyclic variations in years of historical data before 2020. The correlation analysis in this study is based on the hypothesis that the predicted level of NO2 after lockdown is larger than the observed level: the lockdown policy leads to less frequent use of vehicles and thus less NO2 emission, while the prediction still follows historical patterns that do not incorporate the dramatic decrease of NO2 after lockdown. Therefore, the changes of urban mobility have shown a causal relation with the difference of NO2 concentrations between the prediction and the observation. The study also suggests that the proposed prediction and analysis method can be used to evaluate environmental impacts during the COVID-19 pandemic and other public health events.
The results suggest that part of the difference between the predicted and observed values is the result of the disruptive lockdown measures. During the lockdown period, there are strong negative correlations between ∆d and ∆m_rd within groups of different climates, because ∆m_rd considers the changes of urban mobility, the total length of the roads, and the dispersion following a Gaussian distribution. Two major findings can be summarized as follows. Firstly, a great reduction of urban mobility associated with recreation, transit, and workplace may result in a considerable decrease in NO2 over a large geographical area. Secondly, the local climate is also one of the vital factors that have distinct impacts on the dispersion of NO2. Specifically, the impacts are more prominent for stations in areas where arid, hot, and coastal climates prevail, since the correlations for these three climates are considerably stronger. This is probably because arid and hot climates cause uneven air temperatures, which promote wind ventilation and reduce the NO2 density; the same holds for coastal cities, where there are wind cycles between the land and the sea. Some features of local weather, in terms of daily wind directions and strengths, can also mitigate NO2 concentrations.
The SARIMAX model accounts for meteorological conditions through exogenous weather variables, such as wind speed and air pressure, and is optimized by minimizing the AIC to achieve accurate prediction. Since this study aims to investigate the impacts of mobility on NO2 concentration, we do not analyse the meteorological influence in detail. Instead, we have categorized the analysis into six climate types, which are used as the background climate associated with meteorological conditions, to obtain generic phenomena over a large geographical extent. The predictions starting 30 and 60 days before lockdown suggest that the most recent seasonal variation influences prediction accuracy, while its effect is insignificant when mobility indicators are used to explain NO2 concentration.
In conclusion, the proposed NO2 difference between prediction and observation is an effective indicator to explain the improvement of the air quality after lockdown. The proposed ∆m_rd can explain the NO2 concentration comprehensively by considering the source of dynamic urban mobility, the spatial constraints of road networks, and the physical dispersion process. The proposed analysis method can be used to investigate other air quality indicators and other disruptive infectious diseases.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.