1 Introduction

In recent decades, air pollution has been one of the most important environmental issues in China (Wang et al. 2017; Feng et al. 2019). Mining air quality observation data is useful for air quality control and management. Several components of air quality time series can be explored to better understand the characteristics of pollution processes. These components include the trend (Zhao et al. 2019), the seasonal cycle (Miao et al. 2015) and the daily cycle. Another component, which is easily overlooked, is the weekly cycle of air quality. The weekly cycle usually goes unnoticed because its variation is very weak relative to the other components. However, it deserves to be explored, as the weekly cycle is wholly the result of human influence and no natural process can produce it. Thus, exploring the weekly cycle of air quality can show what happens when pollution emissions change in a regular way.

The weekend effect of air pollution, i.e., the difference in air quality between weekends and weekdays, has been explored since the 1970s by many studies worldwide (Elkus and Wilson 1977; Blanchard and Tanenbaum 2003; Parra and Franco 2016; Atkinson-Palombo et al. 2006; Altshuler et al. 1995; Gough and Anderson 2022; Jones et al. 2008; Marr and Harley 2002). Most of these studies focus on the weekly changes of O\(_3\). Specifically, the level of O\(_3\) is higher on weekends than on weekdays, and this phenomenon is attributed to the lower emission of NO\(_{\text{x}}\) (i.e., NO + NO\(_2\)) on weekends (Elkus and Wilson 1977). The relationship between O\(_3\) and NO\(_{\text{x}}\) is complex and depends on the VOC/NO\(_{\text{x}}\) ratio. If the VOC/NO\(_{\text{x}}\) ratio is low (the so-called VOC-limited regime), the accumulation of NO\(_{\text{x}}\) limits the formation of O\(_3\) (Parra and Franco 2016). This means that, in this regime, reducing NO\(_{\text{x}}\) emissions increases O\(_3\) concentrations, a phenomenon supported by previous studies (Dan et al. 2020; Sicard et al. 2020). For China, previous studies show that most cities in east China are VOC-limited (Wang et al. 2017, 2010). As a result, the weekend effect is also observed in many Chinese cities. For example, Wang et al. (2014) have shown that there is a similar weekend effect in the Beijing–Tianjin–Hebei region, with a higher O\(_3\) concentration and a lower NO\(_{\text{x}}\) concentration on weekends. Furthermore, a lower fine aerosol concentration is also observed on weekends, which leads to higher ultraviolet radiation and might be one of the reasons for the high O\(_3\) concentration on weekends (Wang et al. 2014). However, Li et al. (2015) show that the patterns of the weekend effect in different Chinese cities are not consistent, which is one of the motivations of this study.

It should be noted that, in China, short-term events such as the 2008 Beijing Olympics (Wang et al. 2010) and the COVID-19 lockdown (Liu et al. 2021; Wang et al. 2020) can also provide insight into the effect of changes in the emission scenario. However, the disadvantage of exploring these events is that they are influenced by both meteorological factors and emission conditions, which makes it difficult to separate the two effects. The advantage of exploring the weekly cycle is that the air quality corresponding to every emission pattern (i.e., each day of the week) has been observed many times. Thus, averaging over multiple weeks can remove the influence of meteorological factors, and the weekly cycle can be treated as a natural experiment for examining how pollutants change under different emission patterns (Altshuler et al. 1995; Elkus and Wilson 1977).

Although the concept of the weekend effect has been widely accepted and discussed, we must note that there is no guarantee that emissions are lower on weekends in Chinese cities. However, this assumption seems to have been taken for granted in previous studies in China (Wang et al. 2014). Considering this issue, the commonly used measure of the weekend effect, i.e., the percentage (weekend – weekday)/weekday, might be misleading and needs to be re-examined. In this study, we focus not on the weekend effect but on the weekly cycle effect. To this end, we define a new metric, the weekly cycle anomaly (WCA), which describes the possible weekly cycle. The key point of WCA is that, for a given hour of the hourly time series, we calculate the departure of the mean value of the 24-hour window from the mean value of the 168-hour window (the length of a week) to capture the contribution of the possible weekly cycle.

Since 2014, the air pollution monitoring network in China has monitored six pollutants, i.e., PM\(_{2.5}\), PM\(_{10}\), CO, SO\(_2\), NO\(_2\), and O\(_3\), and the results are released hourly on the official website of the Ministry of Ecology and Environment. These data provide an opportunity to explore the weekly cycle of air quality in many cities. Data from 367 cities are explored in this study, and spatial interpolation is used to explore the spatial pattern of the weekly cycle.

This study is the first to explore the weekly cycle effect of air pollution based on rigorous statistical methods such as time series filtering, non-parametric statistics, and tools for multiple comparison tests. The aims of this study can be summarized as follows. First, through the analysis of the weekly cycle of different pollutants, we can observe the weekly characteristics of human activity. Second, from the analysis of the weekly cycle, the O\(_3\) concentration under different emission scenarios can be obtained and used as a reference for pollution control strategies. Third, this study aims to provide a framework for analyzing the weekly cycle of atmospheric variables on a more solid basis, using time series analysis and multiple significance testing.

2 Data and methods

2.1 Dataset

Hourly concentration records of six pollutants (PM\(_{2.5}\), PM\(_{10}\), CO, SO\(_2\), NO\(_2\), and O\(_3\)) for 2015–2020 are explored in this study. The data were downloaded from the real-time report of the Ministry of Ecology and Environment of China. Data from 367 cities are used because their records are relatively complete. For each city, the data of all stations are averaged to obtain the city mean.

The positions of the cities investigated in this study are shown in Fig. 1. To give a better understanding of the geographic distribution of these cities, the Hu Huanyong Line (Hu’s line) is also shown in the figure. Hu’s line is a well-known line that divides China into two subregions with different climate characteristics. Since ancient times, the economy on the southeast side of Hu’s line has been based on farming, and this region now has a high population density and large anthropogenic influence. Figure 1 shows that most cities are located on the southeast side of Hu’s line. As we will show in this paper, higher pollutant concentrations also occur on the southeast side of Hu’s line.

Fig. 1 The spatial distribution of the 367 cities explored in this study. Some cities that will be explored in the following sections are labeled in the figure. The Hu Huanyong Line (Hu’s line), proposed in 1934 by the Chinese geographer Hu Huanyong, is also shown in the figure

2.2 WCA and WCAP

For a given time in the hourly time series, a frequently used method to assess the weekly cycle effect is to measure the departure of the air pollutant concentration from the mean value of that calendar week. However, a more appropriate approach is to measure the departure from the mean value of the week-long window centered at that time (hour). Formally, for the hourly air pollution time series \(X_t\) (such as PM\(_{2.5}\)), we define the weekly cycle anomaly (WCA) and the weekly cycle anomaly percentage (WCAP) at time t as follows:

$$\begin{aligned} \begin{aligned} \text{WCA}_t&= Y_t - Z_t \\ \text{WCAP}_t&= \frac{\text{WCA}_t}{Z_t} \cdot 100\% \end{aligned} \end{aligned}$$
(1)

where

$$\begin{aligned} \begin{aligned} Y_t&= \frac{1}{24} \sum _{i=-11}^{12} X_{t+i} \\ Z_t&= \frac{1}{168} \sum _{i=-83}^{84} X_{t+i} \end{aligned} \end{aligned}$$
(2)

In short, \(Y_t\) is the mean value of the day window (width 24 h) centered at time t, and \(Z_t\) is the mean value of the week window (width 168 h, i.e., 7 days) centered at time t. WCA\(_t\) and WCAP\(_t\) are the anomaly and the percentage anomaly of the 24 h window mean relative to the 168 h window mean. Any weekly cycle present in the data is therefore retained in the WCA and WCAP series. Figure 2 shows the time series in Eqs. (1)–(2), i.e., \(X_t\), \(Y_t\), \(Z_t\) and the final WCA\(_t\) and WCAP\(_t\). The result in Fig. 2 is based on PM\(_{2.5}\) of Beijing from Jan 01 to Dec 31 in 2016.
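The calculation in Eqs. (1)–(2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in this study: the function names `centered_mean` and `wca_wcap` are hypothetical, the series is assumed to be a gap-free list of hourly values, and hours whose windows do not fit inside the series are simply returned as `None`.

```python
from itertools import accumulate

def centered_mean(x, before, after):
    """Mean of x[t-before], ..., x[t+after] for each t; None where the window does not fit."""
    prefix = [0.0] + list(accumulate(x))      # prefix[k] = sum of x[:k]
    width = before + after + 1
    out = [None] * len(x)
    for t in range(before, len(x) - after):
        out[t] = (prefix[t + after + 1] - prefix[t - before]) / width
    return out

def wca_wcap(x):
    """Return (WCA, WCAP) following Eqs. (1)-(2); entries are None near the series edges."""
    y = centered_mean(x, 11, 12)              # 24 h window, i = -11..12
    z = centered_mean(x, 83, 84)              # 168 h window, i = -83..84
    wca, wcap = [], []
    for y_t, z_t in zip(y, z):
        if y_t is None or z_t is None:
            wca.append(None)
            wcap.append(None)
        else:
            wca.append(y_t - z_t)
            # WCAP is undefined when the weekly mean is zero (assumed handling).
            wcap.append(100.0 * (y_t - z_t) / z_t if z_t != 0 else None)
    return wca, wcap
```

For a series that changes linearly in time, both window means coincide, so the sketch returns a WCA of zero at every interior hour, which is a convenient sanity check.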

Fig. 2 Results of each step in calculating the weekly cycle anomaly (WCA) and weekly cycle anomaly percentage (WCAP), taking PM\(_{2.5}\) of Beijing from Jan 01 to Jun 30 in 2016 as an example. a The original hourly time series X; b the time series Y (i.e., the moving average series based on the day window) and Z (i.e., the moving average series based on the week window); c the WCA series; d the WCAP series

The procedure for calculating WCA can also be understood through a decomposition of the time series. Suppose that the hourly time series can be decomposed in the additive form:

$$\begin{aligned} X_t = G_t + C_t^a + C_t^w + C_t^d + e_t \end{aligned}$$
(3)

where \(G_t\) models the non-periodic trend changes (for example, a linear trend), \(C_t^a\) is the annual cycle, \(C_t^w\) is the weekly cycle, \(C_t^d\) is the daily cycle, and \(e_t\) is the random term, which is mainly driven by meteorological conditions. We assume that all cycles have zero mean: if \(C_t^a\), \(C_t^w\) or \(C_t^d\) has a non-zero mean, we can subtract its mean value and add it to \(G_t\). As the daily cycle \(C_t^d\) in \(X_t\) is fully removed by a moving average filter with a 24 h window, \(Y_t\) can be expressed as:

$$\begin{aligned} Y_t = \frac{1}{24}\sum _{i=-11}^{12} G_{t+i} + \frac{1}{24}\sum _{i=-11}^{12} C_{t+i}^a + \frac{1}{24}\sum _{i=-11}^{12} C_{t+i}^w + \frac{1}{24}\sum _{i=-11}^{12} e_{t+i} \end{aligned}$$
(4)

The series \({Z_t}\) is the time series smoothed by the moving average filter with the window of one week. That means both the weekly cycle \(C_t^w\) and the daily cycle \(C_t^d\) in \({X_t}\) have been removed, and the final \(Z_t\) is:

$$\begin{aligned} Z_t = \frac{1}{168}\sum _{i=-83}^{84} G_{t+i} + \frac{1}{168}\sum _{i=-83}^{84} C_{t+i}^a + \frac{1}{168}\sum _{i=-83}^{84} e_{t+i} \end{aligned}$$
(5)

The moving average with a 24-hour or 168-hour window is a low-pass filter and has almost no influence on low-frequency changes such as \(C_t^a\) and \(G_t\). This means that \(\frac{1}{168}\sum _{i=-83}^{84} C_{t+i}^a \approx \frac{1}{24}\sum _{i=-11}^{12} C_{t+i}^a\) and \(\frac{1}{168}\sum _{i=-83}^{84} G_{t+i} \approx \frac{1}{24}\sum _{i=-11}^{12} G_{t+i}\). Thus, the difference between \(Y_t\) and \(Z_t\), i.e., WCA\(_t\), can be expressed as:

$$\begin{aligned} \text{WCA}_t \approx \frac{1}{24}\sum _{i=-11}^{12} C_{t+i}^w + \frac{1}{24}\sum _{i=-11}^{12} e_{t+i} - \frac{1}{168}\sum _{i=-83}^{84} e_{t+i} \end{aligned}$$
(6)

The first term on the right side of (6) is the weekly cycle component smoothed by a 24 h moving average filter. The second and third terms show that WCA is still influenced by other factors such as meteorological conditions. Note that, by using WCA, we avoid interference from the daily cycle and the annual cycle and can focus on the weekly cycle. This is very important when we want to use a significance test to compare values at different hours of the week. When the original values are used for the comparison, the difficulty is that, for a given hour of the week, the values are not independent and identically distributed but form a time series with trend and seasonality, which violates the precondition of the significance test. Using WCA avoids this problem.
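The decomposition argument in Eqs. (3)–(6) can be checked numerically on a synthetic series. The sketch below is illustrative only: the amplitudes, the linear trend and the helper `window_mean` are hypothetical choices, and the annual cycle and the random term are omitted so that WCA should equal the 24 h smoothed weekly cycle up to rounding error.

```python
import math

def window_mean(x, t, before, after):
    """Mean of x[t-before], ..., x[t+after] (window limits as in Eq. (2))."""
    return sum(x[t - before:t + after + 1]) / (before + after + 1)

hours = range(24 * 7 * 8)                                        # eight synthetic weeks
trend = [50.0 - 0.01 * t for t in hours]                         # G_t: slow linear change
daily = [5.0 * math.sin(2 * math.pi * t / 24) for t in hours]    # C^d_t, zero mean over 24 h
weekly = [2.0 * math.cos(2 * math.pi * t / 168) for t in hours]  # C^w_t, zero mean over 168 h
x = [g + d + w for g, d, w in zip(trend, daily, weekly)]

t = 500                                                          # any hour away from the edges
wca_t = window_mean(x, t, 11, 12) - window_mean(x, t, 83, 84)    # Eq. (1)
smoothed_weekly_t = window_mean(weekly, t, 11, 12)               # first term of Eq. (6)

# Largest mismatch between WCA and the smoothed weekly cycle over all interior hours.
max_gap = max(
    abs(window_mean(x, s, 11, 12) - window_mean(x, s, 83, 84)
        - window_mean(weekly, s, 11, 12))
    for s in range(83, len(x) - 84)
)
```

With a purely linear trend the cancellation between the two windows is exact; for a real trend or annual cycle it holds only approximately, as stated in the text.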

Note that one advantage of WCAP over WCA is that it is dimensionless, which makes it convenient for comparisons among different pollutants. In the following text, most of the analysis is based on WCA. However, in Fig. 7, we use WCAP for better comparison between different pollutants.

2.3 Averaged WCA and WCAP

Once the WCA and WCAP series have been obtained, the averaged weekly process based on WCA or WCAP (denoted as WCA\(_{\text{Ave}}\) and WCAP\(_{\text{Ave}}\)) can be calculated for each hour of the week (i.e., 0–167 h):

$$\begin{aligned} \begin{aligned} \mathrm{WCA_{Ave}}(j)&= \frac{1}{N}\sum _{i=1}^{N} \text{WCA}_{168(i-1) + j} \\ \mathrm{WCAP_{Ave}}(j)&= \frac{1}{N}\sum _{i=1}^{N} \text{WCAP}_{168(i-1) + j} \end{aligned} \end{aligned}$$
(7)

where i is the index of the week, N is the number of weeks, and j is the hour of the week (0–167). This procedure is illustrated in Fig. 3a, which shows the WCA curves of all weeks and the corresponding WCA\(_{\text{Ave}}\) process for PM\(_{2.5}\) of Beijing. The WCA\(_{\text{Ave}}\) curve can be used to visualize the evolution of the weekly cycle in a given city.
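Eq. (7) amounts to grouping the hourly WCA (or WCAP) values by hour of the week and averaging each group. A minimal sketch follows, assuming that the series starts at hour 0 of a week and that undefined values are stored as `None` and skipped; both the function name `hour_of_week_average` and this treatment of missing values are assumptions for illustration.

```python
def hour_of_week_average(values, n_hours=168):
    """Average values[168*(i-1)+j] over weeks i for each hour of the week j (Eq. 7).

    None entries are skipped, so the number of weeks may differ slightly between hours.
    """
    sums = [0.0] * n_hours
    counts = [0] * n_hours
    for t, v in enumerate(values):
        if v is None:
            continue
        j = t % n_hours            # hour of the week, 0..167
        sums[j] += v
        counts[j] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Passing the WCA series gives WCA\(_{\text{Ave}}\), and passing the WCAP series gives WCAP\(_{\text{Ave}}\).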

From Fig. 3, it is apparent that the proportion of variance explained by WCA\(_{\text{Ave}}\) is very low, i.e., the weekly cycle explains only a small amount of the variance in the original time series. Specifically, WCA ranges from \(-\)151.33 to 307.72 μg/m\(^3\), while WCA\(_{\text{Ave}}\) ranges only from \(-\)3.04 to 7.12 μg/m\(^3\). This indicates that, at short time scales, air pollution is mainly controlled by meteorological factors, which cause the high variation in WCA. As a consequence, the calculated WCA\(_{\text{Ave}}\) has a high variance, i.e., a slight change in the samples can lead to a large change in the final result.

Fig. 3 a Weekly cycle anomaly (WCA) from different weeks and averaged WCA (WCA\(_{\text{Ave}}\)), which are calculated based on PM\(_{2.5}\) of Beijing; b The WCA\(_{\text{Ave}}\) curves calculated based on 10 bootstrap samples

The bootstrap, i.e., sampling with replacement to obtain a series of sample sets, can be used to assess the stability of the results (Davison and Hinkley 1997). To explore this issue further, Fig. 3b shows the WCA\(_{\text{Ave}}\) curves calculated from different bootstrap samples (specifically, weeks are sampled with replacement to form each bootstrap sample set). It can be seen that, although all 10 bootstrap versions of the WCA\(_{\text{Ave}}\) process show relatively high values on Saturday, some secondary patterns are not consistent, especially those from Monday to Friday. Thus, one should be cautious about the pattern shown by the WCA\(_{\text{Ave}}\) curve of a single city, although the curve is indeed a useful tool for visualizing the weekly cycle. Considering this uncertainty for a single city, in this study, data from many cities are analyzed and shown on maps, which helps us obtain an overall picture of the weekly cycle characteristics in China. Furthermore, significance tests are applied to all cities to explore the distribution of p values, as described in the following section.
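The week-level bootstrap described above can be sketched as follows. This is an illustrative sketch rather than the code behind Fig. 3b: the function name `bootstrap_week_average`, the fixed random seed and the input layout (one list of hourly WCA values per complete week) are assumptions.

```python
import random

def bootstrap_week_average(weeks, n_boot=10, seed=0):
    """Return n_boot averaged weekly curves, each computed from weeks resampled with replacement.

    weeks: list of per-week WCA curves, each a list with one value per hour of the week.
    """
    rng = random.Random(seed)
    n_weeks = len(weeks)
    n_hours = len(weeks[0])
    curves = []
    for _ in range(n_boot):
        # Draw as many weeks as in the original sample, with replacement.
        sample = [weeks[rng.randrange(n_weeks)] for _ in range(n_weeks)]
        curves.append([sum(w[j] for w in sample) / n_weeks for j in range(n_hours)])
    return curves
```

Resampling whole weeks, rather than individual hours, keeps the within-week structure of each curve intact, so the spread between the resulting curves reflects week-to-week variability only.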

One advantage of the WCA\(_{\text{Ave}}\) process for further analysis is that the meteorological influence has been removed by averaging. Thus, the WCA\(_{\text{Ave}}\) values at different hours of the week reflect the mean air conditions under different emission scenarios. To see why this matters, consider the following problem. If we explore the relationship between the concentrations of pollutants such as NO\(_2\) and O\(_3\) at some temporal scale (such as the hourly scale), some meteorological factors might control both of them, making it difficult to infer a causal relationship from the observations. Using WCA\(_{\text{Ave}}\) to make inferences avoids this difficulty.

2.4 Significance tests and multiple testing

As stated above, the concept of the weekend effect might not be appropriate, as there is no guarantee that the emission of O\(_3\) precursors is lower on weekends than on weekdays in China. Thus, the analysis in this study is not based on a comparison between the weekend and weekday mean values of the O\(_3\) concentration, but on a comparison between the times of peak and valley emission in the week. The peak and valley times of the NO\(_{\text{x}}\) emission (i.e., the hours of the week with the highest and lowest NO\(_{\text{x}}\) emissions) are inferred from the WCA\(_{\text{Ave}}\) of NO\(_2\): we use the peak and valley hours of the WCA\(_{\text{Ave}}\) of NO\(_2\) as the times of peak and valley emission in the week.

For a given hour of the week, the number of observations of the WCA of O\(_3\) equals the number of weeks. For example, in each week, the WCA of the 36th hour (i.e., 12:00 on Tuesday) can be calculated, and all of these WCA values form a sample set corresponding to the 36th hour of the week. We treat the WCA of O\(_3\) at the peak hour and at the valley hour of the emission as two sample sets corresponding to different emission scenarios. As the assumption of normality might not hold, the Wilcoxon–Mann–Whitney test is used to test whether the WCA values of O\(_3\) under the two emission scenarios come from the same distribution. For a given city, we obtain a p value that represents the strength of evidence for a difference. Details of the Wilcoxon–Mann–Whitney test can be found in Wilks (2011). In this study, the test is carried out with the R function wilcox.test.
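To make the test concrete, the sketch below computes a one-sided Wilcoxon–Mann–Whitney p value in pure Python. It is not the R function wilcox.test used in this study: it relies on the large-sample normal approximation without continuity or tie correction, and the function names `average_ranks` and `mann_whitney_greater` are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1   # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks

def mann_whitney_greater(a, b):
    """One-sided p value for the alternative 'a tends to be larger than b'.

    Uses the normal approximation of the U statistic (assumed adequate for
    samples of a few hundred weeks); no continuity or tie correction is applied.
    """
    n_a, n_b = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2          # U statistic of sample a
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return 0.5 * math.erfc(z / math.sqrt(2))              # P(Z >= z)
```

In the setting of this section, `a` would hold the WCA values of O\(_3\) at the valley hour of emission and `b` those at the peak hour, one value per week.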

Simply using a p value cutoff (such as calling \(p<0.05\) significant) makes it difficult to control the false discovery rate (FDR), i.e., the proportion of false positives among the tests called significant. Thus, we calculate the p values for all cities and explore their distribution to reach a final conclusion. The analysis method follows Storey and Tibshirani (2003) and is summarized as follows.

Suppose that m tests have been done and all p values have been obtained, that \(m_0\) of these tests are truly null, and define the ratio of truly null tests as \(\pi _0 = m_0 / m\). If the histogram of p values is plotted (Fig. 4), the p values of truly null tests follow the uniform distribution, i.e., they lie under the horizontal level \(f = \pi _0\) in the figure. An \(\alpha\) value is then selected to indicate significance, i.e., a test with \(p<\alpha\) is called significant. Thus, the histogram of the p values can be divided into four subsets, i.e., true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Based on the above notation, the false discovery rate (FDR) can be expressed as:

$$\begin{aligned} {\text{FDR}} = \frac{\text {FP}}{\text {TP}+ \text {FP}} \end{aligned}$$
(8)

That is, FDR is the proportion of false positives among the tests that we call significant. To estimate FDR, \(\pi _0\) should be estimated first. The key point is that p values from truly null tests follow the uniform distribution, while p values from truly significant tests are concentrated near 0. Based on this fact, the flat part of the p value histogram on the right side of the graph can be used to estimate \(\pi _0\). Given the estimated \(\pi _0\), if we choose a threshold \(\alpha\) and call tests with \(p<\alpha\) significant, the FDR can be estimated as follows:

$$\begin{aligned} \widehat{\text{FDR}}_\alpha = \frac{\hat{\pi _0} m \alpha }{\#\{p_i \le \alpha \}} \end{aligned}$$
(9)

The q value is defined as, for a given test, the minimum FDR we can get when we call this test significant:

$$\begin{aligned} \hat{q}(p_i) = \min _{\alpha \ge p_i}\widehat{\text{FDR}}_\alpha \end{aligned}$$
(10)

For a given test with \(p = p_i\), if we call it significant, we must select a threshold no smaller than \(p_i\), i.e., \(\alpha \ge p_i\). The q value of this test is the minimum FDR we can obtain if we call this test significant. Thus, for a given test (i.e., a city in this study), we can obtain both the p value and the q value. Just as the p value is used for controlling the false positive rate (\(\mathrm{FP}/(\mathrm{FP}+\mathrm{TN})\)), the q value is used for controlling the FDR.
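Eqs. (9)–(10) can be sketched as below. This is a simplified illustration, not the procedure of Storey and Tibshirani (2003) in full: \(\pi _0\) is estimated from the p values above a single fixed cutoff `lam` (a hypothetical tuning parameter set to 0.5 here) instead of being smoothed over a range of cutoffs, and the function names are hypothetical.

```python
def estimate_pi0(p_values, lam=0.5):
    """Share of truly null tests, estimated from the flat right-hand part of the histogram.

    Null p values are uniform, so about pi0 * m * (1 - lam) of them exceed lam.
    """
    m = len(p_values)
    return min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))

def q_values(p_values, lam=0.5):
    """q value of each test (Eqs. 9-10), evaluated at thresholds equal to the observed p values."""
    m = len(p_values)
    pi0 = estimate_pi0(p_values, lam)
    order = sorted(range(m), key=lambda k: p_values[k])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value downwards so that the minimum over
    # alpha >= p_i in Eq. (10) accumulates in running_min.
    for rank in range(m, 0, -1):
        k = order[rank - 1]
        fdr = pi0 * m * p_values[k] / rank      # Eq. (9) with alpha = p_(rank)
        running_min = min(running_min, fdr)
        q[k] = running_min
    return q
```

Because the estimated FDR in Eq. (9) only needs to be evaluated at the observed p values, the q values can be obtained in a single pass over the sorted p values.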

Fig. 4 Explanation of the distribution of p values and the estimate of \(\pi _0\)

2.5 Spatial interpolation

Spatial interpolation is used in this study to generate gridded values from the city observations, as gridded data are easier to display on maps. Kriging is a frequently used method in environmental science for producing spatial maps from scattered points (Li and Heap 2014; Wong et al. 2004). The principle of kriging is to predict the value at an interpolated point as a weighted average of the values of neighboring samples.

The kriging interpolation is carried out with the R package gstat. Note that the spatial interpolation is used only to display the results on maps. As only city data are used to generate the maps, they reflect only the data of urban stations. The results are not intended as spatial predictions for rural locations.

3 Results and discussion

3.1 Spatial and temporal trend

Before exploring the weekly cycle of air pollution, we investigate the spatial and temporal trends of the six pollutants. Figure 5 shows the concentrations of all pollutants for all cities. PM\(_{2.5}\) and PM\(_{10}\) show similar spatial patterns, and both have two peak regions. The first is located in North China, including Beijing–Tianjin–Hebei; the other is located in Xinjiang. CO and SO\(_2\) show similar spatial patterns and reach higher values in North China. For NO\(_2\), the figure shows three high-value regions, i.e., Beijing–Tianjin–Hebei, the Yangtze River Delta, and the Pearl River Delta. This is consistent with the NO\(_{\text{x}}\) emission data shown in Wang et al. (2017).

Fig. 5 The air pollutant concentrations of all cities explored in this study. The unit of CO is \(\mathrm{mg/m^3}\), and the unit of all other pollutants is \(\mathrm{\mu g/m^3}\)

To explore the trend of pollutant concentrations from 2015 to 2020, Fig. 6 shows the time series of the mean values of all air pollutants. All pollutants except O\(_3\) show apparent decreasing trends, while the concentration of O\(_3\) increased before 2018. The linear trends of the pollutants are \(-\)3.14 \(\mathrm{\mu g/(m^3 \cdot yr)}\) (PM\(_{2.5}\)), \(-\)5.26 \(\mathrm{\mu g/(m^3 \cdot yr)}\) (PM\(_{10}\)), \(-\)0.071 \(\mathrm{mg/(m^3 \cdot yr)}\) (CO), \(-\)3.05 \(\mathrm{\mu g/(m^3 \cdot yr)}\) (SO\(_2\)), \(-\)0.79 \(\mathrm{\mu g/(m^3 \cdot yr)}\) (NO\(_2\)), and 1.18 \(\mathrm{\mu g/(m^3 \cdot yr)}\) (O\(_3\)). These results reflect the success of the management strategy and are consistent with previous studies (Li et al. 2019; Feng et al. 2019). The upward trend of O\(_3\) is consistent with the result of Feng et al. (2019), who found a dramatically increasing trend of O\(_3\) from 2013 to 2017. Furthermore, the increasing trend of O\(_3\) is also consistent with the results of Sicard et al. (2020), which indicate that the O\(_3\) concentration in cities worldwide increased from 2005 to 2014.

Fig. 6 Annual mean concentrations of air pollutants of China. The values are calculated by averaging concentrations of all cities

3.2 Weekly cycle characteristics

In this section, we analyze the results of the averaged WCAP (denoted as WCAP\(_{\text{Ave}}\)), which is defined in Sect. 2.3. WCAP\(_{\text{Ave}}\) reflects the evolution of the concentration over the week, and is more convenient than WCA\(_{\text{Ave}}\) for comparison between different pollutants, as the values of the latter have different scales for different pollutants.

Figure 7 shows the WCAP\(_{\text{Ave}}\) for 12 cities in different regions of China. The cities in the first row are located in the Beijing–Tianjin–Hebei region (known as Jing-Jin-Ji in China); the cities in the second row are located in the Yangtze River Delta; the cities in the third row are located in the Pearl River Basin; and the cities in the last row are located in the urban agglomeration in the middle reaches of the Yangtze River. The positions of most of these cities are shown in Fig. 1. It can be seen that the WCAP\(_{\text{Ave}}\) curves of different cities show very diverse patterns. Even among cities in the same region, such as Beijing and Tianjin, the difference can be quite large. Several reasons might explain this. First, as shown in Fig. 3, the values of WCAP\(_{\text{Ave}}\) are estimated from samples with high variance, which leads to uncertainty in the pattern obtained. Second, it reflects that different cities have different characteristics of human activity. An important fact shown by many cities is that the WCAP\(_{\text{Ave}}\) of pollutants such as NO\(_2\) is still positive on weekends, especially on Saturday. This indicates that using the weekend value (i.e., the averaged value of Saturday and Sunday) to represent the low-emission scenario is misleading in many Chinese cities.

One important feature shown in Fig. 7 is that the weekly cycle of O\(_3\) is normally in the opposite phase to that of NO\(_2\), and this is true for almost all cities shown in Fig. 7. Take Beijing as an example. From Wednesday to Saturday, the WCAP\(_{\text{Ave}}\) of NO\(_2\) is positive, while that of O\(_3\) is negative. A rigorous significance test is provided in Sect. 3.3.

Fig. 7 The averaged WCAP processes of 6 pollutants for 12 cities in China

Exploring the spatial distribution of the weekly cycle is useful, as it provides an overall view of the weekly cycle pattern. Figure 8 shows the times (expressed as the hour of the week) corresponding to the peak and the valley of the WCA\(_{\text{Ave}}\) process, respectively. The result again reflects the diversity of the weekly cycle pattern. The peak and valley times of the WCA\(_{\text{Ave}}\) show a strong spatial autocorrelation, i.e., nearby cities have similar peak or valley times. The cause of this spatial pattern cannot be fully determined, as traffic emission data are not available to the authors. A key fact to note is that, in some regions, the weekend does not have the lowest concentrations. In east China, the NO\(_2\) peak occurs on Saturday. For PM\(_{2.5}\), the region with the peak on weekends is larger. From Fig. 8b, it can be seen that most regions reach their valley on weekdays. All of these facts indicate that the concept of the “weekend effect” is not appropriate for China.

Fig. 8 The time of occurrence of the peak and valley values in the week, expressed as the hour of the week. (The contour line is at 120 h, which is the boundary between weekdays and the weekend)

3.3 O\(_3\) response to different emission scenarios

In this section, we focus on the response of the O\(_3\) concentration to different emission scenarios. Previous studies on the weekend effect are based on the assumption that the weekend has lower emissions than weekdays. As the NO\(_2\) concentration reflects the strength of the NO\(_{\text{x}}\) emission, we define the hour of the week with the highest WCA\(_{\text{Ave}}\) of NO\(_2\) as the peak emission scenario (this hour is denoted as \(\mathrm{h_{peak}}\) in the following text), while the hour with the lowest WCA\(_{\text{Ave}}\) of NO\(_2\) is treated as the valley emission scenario (this hour is denoted as \(\mathrm{h_{valley}}\)). As a higher emission of NO\(_{\text{x}}\) leads to a lower concentration of O\(_3\) in the VOC-limited regime, the Wilcoxon test based on the WCA of O\(_3\) is used here, with the alternative hypothesis that the WCA of O\(_3\) at \(\mathrm{h_{valley}}\) is larger than that at \(\mathrm{h_{peak}}\).
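Locating \(\mathrm{h_{peak}}\) and \(\mathrm{h_{valley}}\) is a simple search over the 168 values of WCA\(_{\text{Ave}}\) of NO\(_2\). The sketch below is illustrative only; the function names are hypothetical, and hour 0 is assumed to be Monday 00:00, which matches the examples in the text (hour 36 is 12:00 on Tuesday and hour 120 starts the weekend).

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def peak_and_valley_hours(wca_ave):
    """Return (h_peak, h_valley): hours of the week with the highest and lowest averaged WCA."""
    h_peak = max(range(len(wca_ave)), key=lambda j: wca_ave[j])
    h_valley = min(range(len(wca_ave)), key=lambda j: wca_ave[j])
    return h_peak, h_valley

def describe_hour(hour_of_week):
    """Convert an hour of the week (0-167, hour 0 assumed to be Monday 00:00) to a label."""
    return f"{DAYS[hour_of_week // 24]} {hour_of_week % 24:02d}:00"
```

The WCA values of O\(_3\) at these two hours, one per week, then form the two sample sets compared by the test.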

Figure 9a illustrates the distribution of the WCA of O\(_3\) at \(\mathrm{h_{peak}}\) and \(\mathrm{h_{valley}}\), taking Beijing as an example. The probability density curves of the WCA of O\(_3\) for the two groups are calculated by kernel density estimation (Bishop 2006). Although the data of Beijing show slightly higher values of O\(_3\) at \(\mathrm{h_{valley}}\) than at \(\mathrm{h_{peak}}\), the difference is very weak (Fig. 9a). The p value corresponding to Fig. 9a is 0.011, indicating that the difference is not significant at the level of \(\alpha = 0.01\). Figure 9b shows the distribution of the p values of the Wilcoxon test for all cities. The distribution shows a nearly ideal pattern, i.e., it falls off approximately exponentially as the p value increases. When the p value is larger than 0.7, the density is very low and flat. This feature leads to a very low value of \(\hat{\pi }_0 = 0.054\) (note that \(\pi _0\) is the overall proportion of truly null tests, i.e., of cities that truly have no difference in O\(_3\) between \(\mathrm{h_{peak}}\) and \(\mathrm{h_{valley}}\)). Figure 9c shows the distribution of the q values. Note that only two cities have q values larger than 0.05. Thus, if we require the q value to be no larger than 0.05, 365 cities are called significant (\(q \le 0.05\)). This means that, among these 365 cities, fewer than \(365 \times 0.05=18.25\) are expected to be false positives.

Fig. 9 a The density curve of O\(_3\) of Beijing at the NO\(_2\) peak time and the NO\(_2\) valley time. The p value of the Wilcoxon test is 0.011. b The histogram of p values of all cities. All p values are calculated based on the Wilcoxon test with the alternative hypothesis “O\(_3\) at the peak time of NO\(_2\) is less than at the valley time”. The value of \(\pi _0\) is 0.054. c The histogram of q values of all cities. d The relationship between the q value and the p value

Figure 9d shows the relationship between the q value and the p value. It can be seen that even a very large p value corresponds to a q value less than 0.05. That means that, if we simply treat a city with \(p > 0.05\) as insignificant, we will seriously underestimate the number of cities with a weekly cycle effect. The advantage of exploring the distribution of the p values is that we can estimate the value of \(\pi _0\). The very low value of \(\pi _0\) obtained above indicates the widespread existence of a weekly cycle of O\(_3\) in Chinese cities.

For assessing the strength of the weekly cycle for each city, we define two metrics as follows:

$$\begin{aligned} \begin{aligned} D_{\mathrm{NO_2}} = \mathrm{WCA_{Ave}} ~ \text{of} ~ \text{NO}_2 ~ \text{at} ~ \mathrm{h_{peak}} - \mathrm{WCA_{Ave}} ~ \text{of} ~ \text{NO}_2 ~ \text{at} ~ \mathrm{h_{valley}} \\ D_{\mathrm{O_3}} = \mathrm{WCA_{Ave}} ~ \text{of} ~ \text{O}_3 ~ \text{at} ~ \mathrm{h_{peak}} - \mathrm{WCA_{Ave}} ~ \text{of} ~ \text{O}_3 ~ \text{at} ~ \mathrm{h_{valley}} \end{aligned} \end{aligned}$$
(11)

It must be noted that, as \(\mathrm{h_{peak}}\) and \(\mathrm{h_{valley}}\) are determined from the WCA\(_{\text{Ave}}\) of NO\(_2\), \(D_{\mathrm{NO_2}}\) is the amplitude of the WCA\(_{\text{Ave}}\) of NO\(_2\). However, \(D_{\mathrm{O_3}}\) is not the difference between the peak and valley values of O\(_3\) itself, but the difference in the WCA\(_{\text{Ave}}\) of O\(_3\) between the peak and valley hours of NO\(_2\). Figure 10 shows the scatter plot of \(D_{\mathrm{NO_2}}\) versus \(D_{\mathrm{O_3}}\). There is a clear negative correlation between \(D_{\mathrm{NO_2}}\) and \(D_{\mathrm{O_3}}\), i.e., the stronger the weekly cycle of NO\(_2\), the stronger the weekly cycle of O\(_3\) in the opposite phase. The Pearson correlation coefficient between them is \(-\)0.19, and the corresponding p value is less than 0.0005. As shown in Fig. 10, 81% of the cities have negative values of \(D_{\mathrm{O_3}}\), which is the weekly pattern we expected. However, how should we view the other 19% of the cities (70 cities) with \(D_{\mathrm{O_3}}>0\)? Do these cities have the reverse weekly pattern? The key point is that, even for cities with \(D_{\mathrm{O_3}} \ge 0\), the Pearson correlation coefficient between \(D_{\mathrm{NO_2}}\) and \(D_{\mathrm{O_3}}\) is \(-\)0.22 (p value 0.073). That means that the weekly cycle effect of O\(_3\) in these cities is consistent with that in the other cities, although it is too weak to be observed when only a single city is explored. Exploring all cities together provides much more solid evidence.
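The two metrics in Eq. (11) and the correlation across cities can be sketched as follows. The sketch is illustrative only; the function names `cycle_differences` and `pearson_r` are hypothetical, and each input curve is assumed to hold the WCA\(_{\text{Ave}}\) of one pollutant for one city, indexed by hour of the week.

```python
import math

def cycle_differences(no2_ave, o3_ave):
    """Return (D_NO2, D_O3) of Eq. (11) for one city.

    Both differences are taken between the hours h_peak and h_valley of NO2.
    """
    h_peak = max(range(len(no2_ave)), key=lambda j: no2_ave[j])
    h_valley = min(range(len(no2_ave)), key=lambda j: no2_ave[j])
    d_no2 = no2_ave[h_peak] - no2_ave[h_valley]
    d_o3 = o3_ave[h_peak] - o3_ave[h_valley]
    return d_no2, d_o3

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)
```

Applying `cycle_differences` to every city and passing the two resulting lists to `pearson_r` gives the kind of across-city correlation discussed above. By construction \(D_{\mathrm{NO_2}}\) is never negative, whereas \(D_{\mathrm{O_3}}\) can take either sign.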

Fig. 10

The scatter plot of \(D_{\mathrm{O_3}}\) versus \(D_\mathrm{NO_2}\). \(D_{\mathrm{O_3}}\) and \(D_{\mathrm{NO_2}}\) are defined in Equation (11). Both are calculated as WCA\(_{\text{Ave}}\) at \(\mathrm h_{peak}\) minus WCA\(_{\text{Ave}}\) at \(\mathrm h_{valley}\)

Note that almost all cities with \(D_{\mathrm{NO_2}} \ge 3\) show negative \(D_{\mathrm{O_3}}\). One might wonder which regions have the strongest weekly cycle of NO\(_2\), i.e., the highest values of \(D_{\mathrm{NO_2}}\). Figure 11 shows the cities with \(D_\mathrm{NO_2}\) values larger than 2 and larger than 3. Four regions have the highest values of \(D_{\mathrm{NO_2}}\): the Beijing–Tianjin–Hebei region, the Shandong Peninsula, the Yangtze River Delta, and the Pearl River Delta. This spatial pattern is very consistent with the distribution of NO\(_2\) shown in Fig. 5, indicating that cities with high NO\(_2\) concentrations tend to have stronger weekly cycles.

Fig. 11

The cities with strong NO\(_2\) cycles

The above results suggest that in most regions of China, O\(_3\) formation is VOC-limited, so that a decrease in NO\(_{\text{x}}\) emissions will increase the concentration of O\(_3\). This result is consistent with Wu et al. (2018), who indicate that O\(_3\) formation is VOC-limited in most regions of China, except at some rural stations. Thus, to control the concentration of O\(_3\), priority should be given to controlling VOC emissions in the short term.

4 Summary and conclusions

One contribution of this study is a framework for analyzing the weekly cycle of air pollutant time series. The technique is based on the weekly cycle anomaly (WCA) and the weekly cycle anomaly percentage (WCAP) defined in this paper. The key point is that, to explore the influence of the weekly cycle, one should remove the influences of other components of the hourly time series, such as the trend, seasonality and the daily cycle. Based on WCA (WCAP), the WCA\(_{\text{Ave}}\) (WCAP\(_{\text{Ave}}\)) process (from 0–167 h of the week) can be calculated by averaging WCA (WCAP) over all weeks. Consequently, the WCA\(_{\text{Ave}}\) (WCAP\(_{\text{Ave}}\)) process reflects only the response of concentration to different emission scenarios, because the influence of meteorological factors is removed by averaging. Thus, exploring the pattern of WCA\(_{\text{Ave}}\) (WCAP\(_{\text{Ave}}\)) can provide deep insight into how O\(_3\) concentration responds to different emission scenarios.
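The averaging step described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the hourly WCA (or WCAP) series is assumed to have been computed already, missing hours are assumed to be stored as NaN, and the function name `weekly_average_process` and its parameter `start_hour_of_week` are hypothetical.

```python
import numpy as np

def weekly_average_process(wca, start_hour_of_week=0):
    """Average an hourly WCA series over all weeks, per hour of the week.

    wca                : 1-D hourly series (NaN marks missing hours).
    start_hour_of_week : hour-of-week index (0-167) of the first sample.
    Returns a 168-element array, NaN where an hour has no valid data.
    """
    wca = np.asarray(wca, dtype=float)
    hour_of_week = (np.arange(wca.size) + start_hour_of_week) % 168
    wca_ave = np.full(168, np.nan)
    for h in range(168):
        values = wca[hour_of_week == h]
        values = values[~np.isnan(values)]
        if values.size:
            wca_ave[h] = values.mean()
    return wca_ave

# Synthetic example: a weak weekly signal buried in week-to-week offsets
# that cancel out when the four weeks are averaged.
signal = 0.1 * np.sin(2 * np.pi * np.arange(168) / 168)
offsets = np.repeat([1.0, -1.0, 1.0, -1.0], 168)
wca_ave = weekly_average_process(np.tile(signal, 4) + offsets)
```

In this toy case the week-to-week offsets play the role of the meteorological variability, and the averaged process recovers the weak weekly signal.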

The conclusions of this study can be summarized as follows:

  • The results of this study indicate that the patterns of the weekly cycle are very complex across cities in China. Although the concept of the weekend effect, i.e., lower emissions on weekends leading to a higher concentration of O\(_3\), has been explored since the 1970s, we have shown that this framework is not appropriate, at least for cities in China. The reason is that pollutant concentrations (such as NO\(_2\) and PM\(_{2.5}\)) on weekends, especially Saturday, are not lower than on weekdays in many cities. For NO\(_2\), a large region in east China, including Hebei, Shandong and Anhui, reaches its peak value on weekends. For PM\(_{2.5}\), the corresponding region is much larger. Thus, if the weekend-effect framework is used, the conclusions from different cities are inconsistent, i.e., some sites show a positive weekend effect while others show a negative one. It is therefore better to analyze the whole weekly cycle.

  • Exploring the weekly cycle pattern requires identifying a weak effect in very noisy data: the ratio of variance explained by the averaged process is less than 1% in most cities for all pollutants. Thus, the evolutionary patterns shown in the WCA\(_{\text{Ave}}\) or WCAP\(_{\text{Ave}}\) process should be interpreted with care, and the result for any given city is very uncertain. For many cities, the p value alone will not lead to a significant conclusion, as the effect is very weak. Examining the distribution of p values across all cities provides a global view of the weekly effect and allows the proportion of cities that truly have a weekly effect to be estimated. Our results suggest that almost all cities in China have a weekly cycle in which the O\(_3\) concentration is higher under the lower NO\(_{\text{x}}\) emission scenario, and vice versa. However, in many cities the lower NO\(_{\text{x}}\) emission does not occur on weekends.

  • This study, which is based on data from 2015 to 2020, shows that ozone formation remains in the VOC-limited regime in most regions of China. That means that reducing NO\(_{\text{x}}\) emissions will enhance O\(_3\) formation. This indicates that selecting a proper control strategy is important for future air pollution control.

  • Four regions have the strongest weekly cycle of NO\(_2\): the Beijing–Tianjin–Hebei region, the Shandong Peninsula, the Yangtze River Delta, and the Pearl River Delta. This pattern is very consistent with the spatial distribution of NO\(_2\), i.e., the strength of the weekly cycle follows the level of the concentration.
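The second conclusion above mentions estimating, from the distribution of p values over all cities, the proportion of cities that truly have a weekly effect. One common way to do this is sketched below, purely as an illustration. It uses a Storey-type estimator, which may differ from the estimator used in this study. The assumption is that p values from cities without any effect are uniformly distributed on [0, 1], so the density of p values above a threshold `lam` reveals the share of such cities. The function name `fraction_with_effect`, the threshold `lam = 0.5` and the p values are all hypothetical.

```python
import numpy as np

def fraction_with_effect(p_values, lam=0.5):
    """Estimate the fraction of tests that have a true effect.

    Under the null hypothesis p values are uniform on [0, 1], so about
    (1 - lam) of the null p values are expected to exceed lam.
    """
    p = np.asarray(p_values, dtype=float)
    # Estimated share of cities with no effect, capped at 1.
    pi0 = min(np.mean(p > lam) / (1.0 - lam), 1.0)
    return 1.0 - pi0

# Synthetic example: 80 cities with a clear effect (tiny p values) and
# 20 cities without one (p values spread evenly over the unit interval).
p_effect = np.full(80, 0.001)
p_null = np.linspace(0.025, 0.975, 20)
estimated = fraction_with_effect(np.concatenate([p_effect, p_null]))
```

The estimate rests on the uniformity assumption and on the choice of `lam`, so it is best read as a rough indication rather than an exact proportion.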