Chinese blue days: a novel index and spatio-temporal variations

As part of the Blue-Sky Protection Campaign, we develop the Chinese Blue Days Index based on meteorology data from 385 stations in China during 1980–2014. This index is defined as the days with no rain, low cloud cover ≤75th percentile, and visibility ≥15 km at 2 pm. The spatio-temporal variations and possible driving factors of Chinese Blue Days (CBD) are further investigated, revealing a steadily rising rate of 1.6 day (d)/10 year (y) for the nationally averaged CBD during 1980–2014. At regional scales, the CBD exhibit an increasing trend >4 d/10 y in western China and a decreasing trend <−2 d/10 y in southeastern China, northwestern Xinjiang, and Qinghai. The minimum/maximum trends (−7.5/9.5 d/10 y) appear in Yangtze–Huai River Valley (YHRV)/southwestern China (SWC). The interannual variations in CBD are highly related to wind speed and windless days in YHRV but are closely associated with wind speed, rainless days and relative humidity in SWC, suggesting that the two regions are governed by different meteorological factors. Moreover, a dynamic adjustment method called partial least squares is used to remove the atmospheric circulation-related CBD trend. The residual CBD contributions for the total trend in summer and winter are 43.62% and 35.84% in YHRV and are 14.25% and 60.38% in SWC. The result indicates that considerable parts of the CBD trend are due to the change of atmospheric circulation in the two regions.


Introduction
As the largest developing country in the world, China has been troubled by serious air pollution amid accelerated industrialization (Zhang et al 2012, Han et al 2014). Haze pollution has been a fatal problem, affecting people's daily lives and causing serious economic losses (Ramanathan and Ramana 2005, Gultepe et al 2007). As such, the variations in haze days have been studied intensively (Ding and Liu 2014, Chen and Wang 2015, Su et al 2015, Zhang et al 2015, Han et al 2016, Cai et al 2018).
When discussing blue skies, climatologists only consider cloud cover, while environmentalists mainly focus on air quality. A blue day, which means a day with blue sky and clean air, conceptually combines the effects of the two factors. Although it occurs frequently, it is easily ignored by researchers. It was not until 2015, when the phrase 'Beijing blue' was listed as one of the ten clean air keywords published by the Beijing Environmental Protection Publicity Center, that it attracted people's attention. On March 5th, 2016, the 18th CPC National Congress put forward a brand-new idea of building a beautiful China. The Chinese government rolled out a three-year plan called the Blue-Sky Protection Campaign in 2018 (State Council of China 2018) to eliminate haze and avoid air pollution, and finally construct a blue China. To examine the effectiveness of this blue-China plan, a direct indicator needs to be developed from studying the blue days based on a long-term record.
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Up until now, researchers have only focused on typical blue events, such as 'Olympic blue' (Streets et al 2007, Wang et al [...]). Effects of emission controls and meteorological conditions on the occurrence of blue sky have been hotly disputed. However, there is neither a clear definition for Chinese blue days, nor a quantitative attribution of its long-term trend. Thus, this study is innovative in focusing on a phenomenon neglected by both climatologists and environmentalists.
Unlike research on air quality, this study excludes cloudy days with clean air in order to focus on sunny days. Understanding the spatio-temporal distributions of Chinese blue days could provide a reference for policy-making and for scheduling large-scale events, such as the Olympic Games and Asian Games. Additionally, it favors studies on the relationship between mortality rates (disease) and weather (Pope et al 2002, Pope and Dockery 2006, Lim et al 2012). This would help in determining liveable cities and national energy strategy and in formulating a better energy-saving emission reduction inventory. Meanwhile, the change of blue days greatly affects the living arrangements, mood, and health of residents. Overall, studying the natural and anthropogenic contributions to blue day changes helps to examine the implementation of sustainable development more directly. In this study, we develop a novel Chinese Blue Days Index (CBDI) to represent the variations of Chinese Blue Days (CBD), investigate their spatio-temporal distributions, and then analyze possible driving factors during 1980-2014.

Data and method
2.1. Data
Daily average meteorological observations, including precipitation, wind speed at 10 m above the ground, temperature, relative humidity, cloud cover, low cloud cover and visibility (observed four times per day at 2:00, 8:00, 14:00, 20:00 BJT) during 1980-2014 from 385 ground stations are used to identify the blue days in China. The dataset was obtained from the National Meteorological Information Center (http://data.cma.cn/) and has been under strict quality control.

2.2. Method
2.2.1. CBDI index definition and validation
There are two criteria to determine CBDI. First, the selected day should have a blue sky with no rain, which represents a sunny day. Second, it should have clean air, which represents good air quality. The details of the definition are as follows:
(1) First, we identify a sunny day as a day with daily precipitation ≤0.1 mm and low cloud cover ≤75th percentile. The criteria are selected based on a statistical analysis of the daily weather phenomenon record from 2011-2014 (see supplementary, section 2) in 32 provinces. After selecting all sunny days (recorded as sunny and no rain) in this dataset and comparing their meteorological factors on those days, the results shown in figure 1 indicate that the spread of the total cloud cover is too large for it to be used in the definition, and the outliers of low cloud cover, which belong to frequently cloudy and rainy areas (such as Beihai and Guangzhou), show that a percentile index can represent changes across all stations better than an absolute index. In all, the AX30 (low cloud cover ≤30%), TX70 (low cloud cover ≤70th percentile), and TX75 (low cloud cover ≤75th percentile) account for 93%, 95.5%, and 96.5% of low cloud cover on recorded sunny days, respectively. According to the visibility ranks for distance in table 1 and the median visibility in figure 1, we choose 'great' visibility (≥20 km) as part of our definition. Taking cloudy and rainy areas into consideration, we also choose the lower-quartile visibility (≥15 km) in figure 1 as a comparison. Then, we obtain 6 alternative CBDI indexes (shown in table 2).
(2) Early studies find that relative humidity (RH) affects visibility, indicating an error related to human observation [...], where RH is in percent and VIS is the observed visibility (km).
[...] (2007). The accuracy of prediction is calculated using formulas (2)-(4): the rate of missing reports (PO) and the empty reports (FAR) are: [...]
The results (table 4) show that all 6 indexes fit into the category of good air quality in summer and autumn. After using the percentile index, the TS improves greatly in Guangzhou and Kunming, almost to 75% (not shown). Furthermore, among all 6 indexes, the low75_2 index has the highest PC (65.18%) and lowest PO (22.79%). Additionally, its FAR is 19.29%. We must point out that frequent dust activities and consistently sufficient moisture lead to [...]
Although there are some subjective parts of our index due to the lack of long-term observation records for sunny days and AQI, this is the first attempt to combine pollution data with meteorology data to define a reasonable blue day, and the result demonstrates good agreement with reality. Therefore, we finally adopt the following CBDI definition: no rain (daily precipitation ≤0.1 mm), low cloud cover ≤TX75, and observed visibility at 14:00 ≥15 km.
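The final definition and the verification scores named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample values, and the percentile interpolation rule are hypothetical, and the score formulas are the common contingency-table definitions (TS, PO, FAR, PC), which may differ in detail from formulas (2)-(4) of the original text.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def is_blue_day(precip_mm, low_cloud, vis_14_km, low_cloud_tx75,
                max_precip_mm=0.1, min_vis_km=15.0):
    """Apply the three CBDI criteria to one station-day."""
    no_rain = precip_mm <= max_precip_mm
    few_low_clouds = low_cloud <= low_cloud_tx75
    clear_air = vis_14_km >= min_vis_km
    return no_rain and few_low_clouds and clear_air


def verification_scores(predicted, observed):
    """Contingency-table scores (assumed standard definitions, as fractions)."""
    pairs = list(zip(predicted, observed))
    hits = sum(1 for p, o in pairs if p and o)
    false_alarms = sum(1 for p, o in pairs if p and not o)
    misses = sum(1 for p, o in pairs if o and not p)
    correct_neg = sum(1 for p, o in pairs if not p and not o)
    return {
        "TS": hits / (hits + false_alarms + misses),   # threat score
        "PO": misses / (hits + misses),                # rate of missing reports
        "FAR": false_alarms / (hits + false_alarms),   # rate of empty reports
        "PC": (hits + correct_neg) / len(pairs),       # fraction correct
    }


# Hypothetical station record of low cloud cover (percent) used for TX75.
low_cloud_history = [0, 10, 20, 30, 40, 50, 60, 70, 80]
tx75 = percentile(low_cloud_history, 75)

# Hypothetical station-days: (precipitation mm, low cloud cover %, visibility km at 14:00).
days = [(0.0, 20, 25.0), (0.0, 20, 10.0), (2.5, 20, 30.0), (0.0, 90, 30.0)]
predicted = [is_blue_day(p, c, v, tx75) for p, c, v in days]

# Hypothetical reference labels (for example, recorded sunny days with good air quality).
observed = [True, True, False, False]
scores = verification_scores(predicted, observed)
```

Keeping the thresholds as keyword arguments makes it easy to evaluate the alternative indexes (for example, 20 km instead of 15 km visibility) with the same function.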

Statistical methods
Student's t-test is used to examine the significance of linear regression at the 95% confidence level. Following Yue and Wang (2004), a revised non-parametric Mann-Kendall test is used to assess the abrupt change. The details of the method can be found in supplementary section 3. The partial least squares (PLS) method (Abdi 2010) is applied to remove the components of CBD related to atmospheric circulation and the residual parts are also discussed.
[...] 2(d)). The result shows that there is a significant interdecadal shift in CBD near the mid-1990s.

Climatology and long-term trends of Chinese blue days in China
To analyze the regional characteristics of CBD variations, two typical subregions, which share the most significant variations, are selected based on topographic distribution and results of EOF (figure S2). One subregion is the YHRV (30°N-35°N, 115°E-122°E), including 35 stations, and the other is southwestern China (SWC; 26°N-34°N, 98°E-107°E), including 40 stations. Their scopes are shown in figure 2(c) in black.
The time series of average CBD during 1980-2014 in the two areas are shown in figure 3(a). In YHRV, the averaged value of CBD changes from 148.44 d/y in 1980-1996 to 134.41 d/y in 1997-2014, suggesting that CBD in this region has likely experienced a notable interdecadal change around 1996. In SWC, the annual CBD increases gradually from 1980 to 2014 at the rate of 9.5 d/10 y, which is likely related to the severe and sustained droughts in SWC during these years (Niu et al 2014). With more rainless days and lower relative humidity, the low visibility events caused by meteorological conditions decrease markedly, which may increase CBD. Figure 3(b) shows the month-year evolutions of CBD in YHRV. The highest occurrence of CBD happens in autumn, with an occurrence rate of 44.92%, and it shows an increasing trend in the 1980s (8.41 d/y) and then turns to a decreasing trend after that (−5.81 d/y). CBD in winter has the lowest occurrence (36.12%), and it increases from 24.85% to 41.33% in the 1990s and then reduces to 20.70% by 2014; the average trend is −3.04 d/y. CBD in summer slightly increases in the 1980s and then maintains a decreasing trend of −3.74 d/y. In spring, CBD in YHRV is stable before the 2000s; it starts to increase at the rate of 9.29 d/y and then decreases rapidly after 2010 (−28.8 d/y). Overall, the annual average CBD in YHRV exhibits a decreasing trend of −7.5 d/10 y.
For SWC (figure 3(c)), where it is perennially wet, a steadily increasing trend is observed. CBD mainly occurs in winter and autumn, with average occurrences of 51.02% and 37.98%, respectively. In winter it increases at 10.69 d/y in the 1980s, 7.795 d/y in the 1990s, and 3.62 d/y from 2000 to 2014; the annual average trend is 3.23 d/y. In autumn and spring, CBD grows steadily at 2.82 d/y and 4.29 d/y, respectively, and it increases rapidly at a rate of more than 20 d/y after 2000. However, in summer, it exhibits a decadal variation, that is, it increases in the 1980s (6.85 d/y) and after the 2000s (5.10 d/y) but decreases in the 1990s (−6.06 d/y).
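The linear trends quoted in this section, together with the Student's t-test stated in the statistical methods, can be sketched as an ordinary least-squares fit. The sketch below is a minimal illustration, not the authors' code: the synthetic series, the function name, and the tabulated critical value (two-sided 5% level, 33 degrees of freedom for 35 annual values) are assumptions introduced here.

```python
import math


def linear_trend(years, values):
    """Ordinary least-squares slope, intercept and t statistic of the slope."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.copysign(math.inf, slope)
    return slope, intercept, t_stat


# Tabulated two-sided critical t value at the 5% level for 33 degrees of freedom.
T_CRIT_95_DF33 = 2.0345

# Synthetic annual blue-day counts: 0.16 d/y rise plus alternating +/-1 d noise.
years = list(range(1980, 2015))
cbd = [140.0 + 0.16 * (y - 1980) + (1.0 if (y - 1980) % 2 == 0 else -1.0)
       for y in years]

slope, intercept, t_stat = linear_trend(years, cbd)
trend_per_decade = 10.0 * slope          # expressed in d/10 y
significant = abs(t_stat) > T_CRIT_95_DF33
```

Multiplying the slope by 10 converts a per-year slope into the d/10 y unit used for the annual trends above.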

Impact of meteorological conditions on CBD
Four climate factors, namely RH, wind speed, windless days, and gale days, are chosen to analyze the correlations with CBD. The summary of the Pearson correlation results is shown in table 5. In the annual mean, correlation coefficients with all these factors pass the 95% confidence level at the national scale, showing a close relationship with CBD in China. The correlations of wind speed and gale days with CBD are −0.43 and −0.41, respectively. The negative correlation is somewhat counterintuitive, as high wind speed is generally considered to be favorable for blue sky (Fu and Dan 2014). This means that under adverse wind conditions, some other factors play a role, such as wind direction in a specific region. RH is negatively correlated with CBD at the national scale, with a correlation coefficient of −0.40. Reducing RH may suppress the growth of hygroscopic condensation nuclei (CNN), which is key to the formation of aerosol particles, thereby increasing the possibility of sunny days. In summer and winter, CBD is weakly correlated with wind conditions but is clearly negatively correlated with RH.
In YHRV, CBD is significantly and positively correlated with wind speed and the number of gale days, and negatively correlated with windless days all year round, with correlation coefficients of 0.5, 0.39, and −0.52, respectively. This means that weakened wind conditions do lead to decreasing CBD. However, RH is not associated with the change of CBD annually. In summer, CBD in YHRV is strongly correlated with all factors, whereas in winter it is only negatively correlated with RH, with a correlation coefficient of −0.59.
In SWC, RH drops both annually and seasonally (summer and winter), and it is strongly negatively correlated with CBD, with correlation coefficients of −0.78, −0.81, and −0.87, respectively. Thus, decreasing RH leads to increasing CBD. Wind conditions show a completely inverse relationship between the two regions at the annual scale, which agrees with the results of Cheng et al (2013). They found an 'east plus west minus' distribution of the relationship between visibility and wind speed in China, and visibility is positively correlated with CBD. In summer and winter, wind speed shows almost no correlation with CBD, and windless days and gale days show weak correlation coefficients. Considering the special basin topography and mountainous terrain, CBD in SWC should be studied further.
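As a rough check on what "passing the 95% confidence level" means for the correlations above, the following sketch computes a Pearson coefficient and the smallest |r| that is significant for a given sample size. It assumes 35 annual values (1980-2014), a two-sided t-test on the correlation, and a tabulated critical t value for 33 degrees of freedom; the function names are hypothetical and the source does not state which test was applied to table 5.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(n, t_crit):
    """Smallest |r| that is significant, from t = r * sqrt((n - 2) / (1 - r^2))."""
    return t_crit / math.sqrt(n - 2 + t_crit ** 2)


# Tabulated two-sided critical t value at the 5% level for 33 degrees of freedom.
T_CRIT_95_DF33 = 2.0345
r_crit = critical_r(35, T_CRIT_95_DF33)   # roughly 0.33 for 35 annual values

# Annual YHRV coefficients as reported in the text.
reported_yhrv = {"wind speed": 0.50, "gale days": 0.39, "windless days": -0.52}
passes_95 = {name: abs(r) > r_crit for name, r in reported_yhrv.items()}
```

Under these assumptions all three reported YHRV coefficients exceed the threshold, which is consistent with the text describing them as significant.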

The reasons for CBD trends
Previous studies (Chen and Wang 2015, Jian et al 2016, Cai et al 2017) show that the change of haze days in China is significantly modulated by atmospheric circulation. Does atmospheric circulation contribute to the change of blue days in China? To address the question, we apply a dynamic adjustment method developed by Wallace et al (2012) to isolate the role of atmospheric circulation on the trend in CBD according to the following procedure.
Following Hu et al (2018), SLP is seen as an indicator of atmospheric circulation. First, a PLS regression is performed by correlating the time series of seasonally averaged CBD T(n) at each station with standardized SLP in the domain of East Asia (20°N-60°N, 70°E-150°E) to obtain a one-station regression map, and the regression map is used as an SLP predictor for the CBD trend. Then, we project the standardized SLP field onto the correlation pattern, weighting each station by the cosine of the latitude, and obtain a score index S(n), which shows the relative strength with which the predictor is expressed. Next, we use the least squares method to compute the residual T1(n) and the residual SLP by removing the linear component associated with S(n). The above steps are repeated for the residual T(n) and SLP until the variance in residual T(n) explained by the third predictor is small enough to be ignored (not shown). Thus, we only use two SLP trend predictors to dynamically adjust the CBD trend at each station, and the residual CBD T2(n) is considered to be the adjusted CBD trend. The procedure is the same as that in Wallace et al (2012) and Hu et al (2018). We also apply 500 hPa geopotential height as a predictor and find similar results, which are not discussed here. Finally, the raw CBD is analyzed for comparison. The results are shown in figure 4.
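The steps above can be sketched for a single station and a tiny SLP grid in pure Python. This is a minimal sketch of the procedure as described in the text, not the authors' implementation: the function and variable names, the synthetic data, and the choice to standardize the SLP field only once before the first pass are assumptions made here.

```python
import math


def _mean(v):
    return sum(v) / len(v)


def _standardize(v):
    m = _mean(v)
    sd = math.sqrt(sum((x - m) ** 2 for x in v) / len(v))
    return [(x - m) / sd for x in v] if sd > 0 else [0.0] * len(v)


def _corr(a, b):
    za, zb = _standardize(a), _standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)


def _remove(series, score):
    """Subtract the least-squares fit of `series` on the zero-mean `score`."""
    denom = sum(s * s for s in score)
    if denom == 0:
        return list(series)
    m = _mean(series)
    beta = sum((x - m) * s for x, s in zip(series, score)) / denom
    return [x - beta * s for x, s in zip(series, score)]


def dynamic_adjust(cbd, slp, lats_deg, n_predictors=2):
    """cbd: N seasonal values; slp: N rows of G grid values; lats_deg: G latitudes."""
    n_grid = len(lats_deg)
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    # Standardize each SLP grid-point series over time (column-wise).
    columns = [_standardize([row[g] for row in slp]) for g in range(n_grid)]
    residual, scores = list(cbd), []
    for _ in range(n_predictors):
        # Correlation map between the current CBD residual and the SLP field.
        pattern = [_corr(residual, col) for col in columns]
        # Project the SLP field onto the latitude-weighted pattern to get S(n).
        raw = [sum(pattern[g] * weights[g] * columns[g][n] for g in range(n_grid))
               for n in range(len(cbd))]
        score = _standardize(raw)
        # Remove the part linearly related to S(n) from CBD and from SLP.
        residual = _remove(residual, score)
        columns = [_remove(col, score) for col in columns]
        scores.append(score)
    return residual, scores


# Synthetic example: 12 seasons, 3 grid points (all values are made up).
steps = range(12)
cbd = [120 + 0.5 * n + 3 * math.sin(n) for n in steps]
slp = [[1010 + 2 * math.sin(n) + 0.5 * math.cos(2 * n),
        1015 - 1.5 * math.sin(n) + math.cos(3 * n),
        1020 + math.cos(2 * n) + 0.3 * math.sin(5 * n)] for n in steps]
adjusted, scores = dynamic_adjust(cbd, slp, [30.0, 40.0, 50.0])
```

By construction the adjusted series is uncorrelated with both score indices, so any trend left in `adjusted` is the part not linearly explained by the two SLP predictors.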
The spatial patterns of the raw CBD trend in both summer (figure 4(a)) and winter (figure 4(b)) resemble the annual CBD trend pattern (figure 1(b)). In both winter and summer, most stations in West China show a positive trend, while a considerable number of stations in East China show notable negative trends. After dynamic adjustment, the residual CBD trend is weaker than the raw one nationwide in summer (figure 4(c)). Some stations in northern Yunnan and the Pearl River Delta even reveal an opposing trend. The national average trend is 0.39 d/35 y and 0.28 d/35 y for raw and adjusted CBD, respectively. In winter (figure 4(d)), the residual CBD trend also becomes weaker at most stations after dynamic adjustment. The results suggest that the observed CBD trends in summer and winter are partly caused by atmospheric circulation. Figure 5 provides an overview of the historical change of the raw and adjusted CBD in SWC and YHRV in summer and winter during 1980-2014. In SWC, the average contributions of residual parts to the total CBD trend in summer (figure 5(a)) and winter (figure 5(c)) are 14.25% and 60.38%, indicating the dominant role of atmospheric circulation in summer. As shown in figure 5(b), the raw summer CBD in YHRV features a notable decreasing trend during 1980-2014. After adjustment, the summer residual CBD in YHRV still decreases, although its magnitude is weaker than the raw trend. Additionally, it is clear that the residual CBD in YHRV (figures 5(b) and (d)) still exhibits notable long-term changes, suggesting that other factors such as air pollution may contribute to the change of CBD. Therefore, in figure 6, we depict the contemporaneous variations in anthropogenic emissions of SO2, NOx, and VOC in YHRV, which are the main precursors of fine particles.
We find that the downtrend in the residual CBD mainly appears after 1990, which coincides with the rapid increase in emissions of SO2, NOx, and VOC after 1990 (figure 6(a)), suggesting that air pollution may also contribute to the decrease in summer CBD in YHRV. Similar results are obtained in winter (figures 5(d) and 6(b)). Overall, the average contributions of residual CBD to the total CBD trend in summer and winter are 43.62% and 35.84%, respectively, indicating that atmospheric circulation contributes the larger part of CBD variations in YHRV. Generally, PLS is effective in reducing atmospheric circulation-induced variability and revealing the role of anthropogenic emissions in YHRV.
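One way to read the contribution percentages, assuming they are ratios of linear trends (the text does not state the formula, so the symbols below are introduced here for illustration only), is:

```latex
C_{\mathrm{res}} = \frac{b_{\mathrm{res}}}{b_{\mathrm{raw}}} \times 100\%,
\qquad
C_{\mathrm{circ}} = 100\% - C_{\mathrm{res}}
```

Here b_raw and b_res denote the linear trends of the raw and dynamically adjusted CBD, respectively. Under this reading, a residual contribution of 43.62% in YHRV in summer leaves 56.38% attributable to atmospheric circulation, and the 14.25% residual contribution in SWC in summer leaves 85.75%.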

Summary and discussions
In this paper, we define a CBD index to analyze the spatio-temporal variations of CBD. The results show that the averaged CBD in China increases at the rate of 1.6 d/10 y during 1980-2014. However, the trends vary among regions. A clear increasing trend is observed at 42% of stations in China, and most stations west of 107°E reveal an increasing trend exceeding 4 d/10 y, while stations in YHRV, SNCP, northwestern Xinjiang and Qinghai Provinces in southern China show an opposite trend below −2 d/10 y.
Overall, the average CBD in China is strongly associated with wind speed, rainless days and RH, and their correlation coefficients are −0.43, 0.35, and −0.40, respectively. For SWC, CBD mainly occurs in winter and maintains an increasing trend of 9.5 d/10 y annually. We find that the drop in RH and surge in rainless days do contribute to its upward trend, and their correlation coefficients with CBD are −0.51 and −0.78. Meanwhile, the highly negative relationship between CBD and wind speed needs further study. CBD in YHRV, which has the most obvious decreasing rate (−7.5 d/10 y), occurs most often in autumn, with an occurrence rate of 44.92%. Meanwhile, the annual variations of CBD are closely related to wind speed and windless days, with correlation coefficients of 0.50 and −0.52, but are unrelated to RH.
Using a dynamic adjustment method called PLS, we find that the trend of CBD in many stations is related to the change of atmospheric circulation. After removing atmospheric circulation-induced variations of CBD trends, the residual trends in most stations are weaker than the raw one both in summer and winter. Specifically, the residual CBD contributions for the total trend in summer and winter are 43.62% and 35.84% in YHRV and are 14.25% and 60.38% in SWC.
The results indicate that the change in atmospheric circulation plays an important role in CBD change in China. Understanding the change of atmospheric circulation may help us to make projections of the change of CBD in China, which deserves further study in the future.
In addition to atmospheric circulation, we found that the changes in the emissions of SO2, NOx, and VOC also coincide with the trend of CBD to some degree. This suggests that reducing the emissions of air pollutants will help increase CBD in China.