Correlation Analysis between Meteorological Factors and Pollutants Based on Copula Theory

In recent years, the continuing acceleration of industrial and agricultural production has worsened environmental degradation and seriously affected public health. Gaseous pollutant indices have become important indicators of air quality. Taking a region of China as the study area, this paper first collects five years of air pollutant monitoring data together with meteorological data (rainfall, temperature and wind speed) from meteorological observation stations over the same period. The autocorrelation function (ACF) is then used to analyze the autocorrelation of the pollutants, and on this basis the correlation between meteorological factors and pollutants is analyzed using Copula theory. The results show that the ACF values of the pollutants are all higher than 0.6 for lag times within 2 hours, which means that the autocorrelation of pollutants within 2 hours is relatively significant. PM2.5 has a significant negative correlation with wind speed in all four seasons, but a low correlation with rainfall. SO2 is negatively correlated with the three meteorological factors in most cases. In addition, the study finds that the impact of meteorological factors on pollutants differs significantly across time scales. This paper proposes a correlation analysis method between meteorological factors and pollutants, which can support the formulation of China’s economic and environmental protection policies.


Introduction
Since the beginning of the 21st century, the world economy has developed rapidly and the industrial structure has been constantly adjusted. At the same time, owing to imperfect control measures and other reasons, the uncontrolled discharge of various pollutants has aggravated environmental pollution, especially air pollution [1]. In the course of rapid social development in recent years, many countries have had to balance economic development against environmental protection when formulating economic plans and environmental policies [2][3]. In this paper, air pollutants are taken as the research object. Because a variety of air pollutants exist at the same time in high concentrations, complex interactions occur, resulting in complex atmospheric pollution, haze and other phenomena, and the pollutants may show a certain degree of autocorrelation [4]. The environmental interactions of the earth system also link pollutants closely to meteorological factors, so the concentration of air pollutants in a given time period shows a certain correlation with meteorological factors. With the continuous improvement of modern environmental monitoring systems, real-time monitoring of pollutants has been realized in most key regions, which provides a large amount of valuable data for this study [5][6]. This paper combines these data with meteorological factor data for the same period, obtained from a meteorological monitoring data website, and then carries out correlation analysis. The aggravation of environmental pollution has attracted extensive attention, and many scholars have conducted in-depth research on the relationship between air pollution and meteorological factors. For example, some studies analyse the characteristics of air pollutants at 14 locations in Egypt and correlate air pollutants with meteorology [7][8].
To further characterize the correlation between meteorological factors and pollutants [9], one study uses an observation database of daily meteorological elements and pollutants covering the past 10 years [10]. Through detailed comparison, it shows that their long-term cross-correlation behavior is more obvious in rural areas than in urban areas. The influence of meteorological factors on the multifractal characteristics of pollutants has also been studied [11]. Some scholars have studied the relationship between ozone concentration and meteorological factors and find that there is an inverse correlation between the atmospheric pressure tide and ozone concentration, and that long-distance transport under favorable meteorological conditions (i.e., low relative humidity, high temperature and solar radiation, zero rainfall) increases the ozone concentration [12][13]. Zhou et al. [14] presented the characteristics of PM2.5 and O3 atmospheric composition monitoring data and meteorological data, and used the empirical orthogonal function method to classify the types of PM2.5 and O3 pollution in winter and summer in China from 2015 to 2019. The results show that there is a significant positive correlation between the distribution of extreme pollution weather and the distribution of the meteorological field [15].
To sum up, most studies focus on the correlation between meteorological factors and pollutants, but neither study the autocorrelation characteristics of the pollutants nor explore in depth the correlation within a given time period. First, this paper uses the pollutant data of monitoring stations, takes gaseous pollutants (PM2.5 and SO2) as the research object, and combines them with the meteorological factor data of the same period after data filtering. Second, the autocorrelation of the air pollutants is analyzed to capture their own variation characteristics. Finally, a correlation analysis method between air pollutants and meteorological factors based on Copula theory is proposed and verified by simulation.

Autocorrelation Analysis Method Based on Pearson Theory
Based on previous studies, the autocorrelation function (ACF) is selected in this paper to analyze the autocorrelation of variables [16]. Its formula is as follows:

$$\rho(\lambda) = \frac{E\left[(X_t - \mu)(X_{t-\lambda} - \mu)\right]}{\sigma^{2}} \qquad (1)$$

where ρ represents the autocorrelation coefficient; X_t represents the sample data of variable X at time t; X_{t-λ} represents the sample data lagging behind X_t by a lag time λ; μ and σ represent the mean and standard deviation of the sample sequence, respectively.
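As an illustration of how the autocorrelation coefficient at a given lag might be computed from a sample sequence, the following is a minimal sketch in plain Python. It assumes the common biased sample estimator (dividing by the full sequence length) and an evenly spaced series with no missing values; the function name `acf` is hypothetical and is not taken from the paper.

```python
from statistics import fmean


def acf(series, lag):
    """Sample autocorrelation of an evenly spaced series at an integer lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mu = fmean(series)
    # Variance of the whole sequence (sigma squared in the definition).
    var = sum((x - mu) ** 2 for x in series) / n
    if var == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Mean product of deviations between X_t and X_{t-lag} (biased estimator).
    cov = sum((series[t] - mu) * (series[t - lag] - mu) for t in range(lag, n)) / n
    return cov / var


# Toy values only; these are not monitoring data from the study.
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
print(acf(toy, 0), acf(toy, 1))
```

Evaluating `acf` over a range of lags gives the kind of curve that can be inspected for slow decay with increasing lag time.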

Correlation Analysis Method Based on Copula Theory
With the advancement of Copula theory, the correlation between variables can now be described more accurately. Compared with the traditional Pearson linear correlation coefficient, Copula-based measures describe nonlinear relationships between variables better. The Copula function can construct the joint distribution function without requiring the specific form of the marginal distribution of each study variable [17]:

$$F(x_1, x_2, x_3) = C\left(F_1(x_1), F_2(x_2), F_3(x_3)\right)$$

where F is the joint distribution function, F_1, F_2 and F_3 are the marginal distribution functions, and C is the Copula function.

For the random variables of meteorological factors and pollutants in this paper, when X ~ F(x) and Y ~ G(y) and the corresponding Copula function C(u, v) exists, let u = F(x) and v = G(y), with u, v ∈ [0, 1]. The Kendall rank correlation coefficient τ can then be expressed as:

$$\tau = 4\int_{0}^{1}\int_{0}^{1} C(u, v)\,\mathrm{d}C(u, v) - 1$$

The Kendall rank correlation coefficient τ is thus uniquely determined by the Copula function, from which the correlation between meteorological factor X and pollutant Y is calculated. However, there are many kinds of Copula functions, and different Copula functions yield different rank correlation coefficients. Therefore, the accuracy of the correlation measure depends on the goodness of fit of the selected Copula function.
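To make concrete how τ follows from the choice of Copula function, the sketch below uses the well-known closed-form relations for three common families. The paper does not state which Copula families it fits, so the Clayton, Gumbel and Gaussian families here are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math


def tau_clayton(theta):
    """Kendall's tau of a Clayton copula with parameter theta > 0."""
    if theta <= 0:
        raise ValueError("this sketch covers theta > 0 only")
    return theta / (theta + 2.0)


def tau_gumbel(theta):
    """Kendall's tau of a Gumbel copula with parameter theta >= 1."""
    if theta < 1:
        raise ValueError("Gumbel copula requires theta >= 1")
    return 1.0 - 1.0 / theta


def tau_gaussian(rho):
    """Kendall's tau of a Gaussian copula with correlation parameter rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return 2.0 / math.pi * math.asin(rho)


def theta_clayton_from_tau(tau):
    """Invert tau = theta / (theta + 2) to recover the Clayton parameter."""
    if not 0.0 < tau < 1.0:
        raise ValueError("inversion is defined here for 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)


# The same parameter value gives the same tau here by coincidence of the formulas.
print(tau_clayton(2.0), tau_gumbel(2.0), tau_gaussian(0.5))
```

In these standard parameterizations the Clayton (θ > 0) and Gumbel families describe only positive dependence, so a negatively correlated pair would call for a family such as the Gaussian copula that admits negative τ, or for a rotated copula.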
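The upper and lower tail correlation coefficients can be expressed by the Copula function:

$$\lambda_U = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u}, \qquad \lambda_L = \lim_{u \to 0^+} \frac{C(u, u)}{u}$$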
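The tail correlation coefficients are usually defined as limits of the Copula function on its diagonal: the lower coefficient is the limit of C(u, u)/u as u approaches 0, and the upper coefficient is the limit of (1 - 2u + C(u, u))/(1 - u) as u approaches 1. The sketch below approximates both limits numerically for two example families. The Clayton and Gumbel families, the parameter value and the evaluation points are assumptions for illustration and are not taken from the paper.

```python
import math


def clayton(u, v, theta):
    """Clayton copula C(u, v) for theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel(u, v, theta):
    """Gumbel copula C(u, v) for theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def lower_tail(copula, u=1e-6):
    """Approximate lim_{u -> 0+} C(u, u) / u by evaluating near 0."""
    return copula(u, u) / u


def upper_tail(copula, u=1.0 - 1e-6):
    """Approximate lim_{u -> 1-} (1 - 2u + C(u, u)) / (1 - u) near 1."""
    return (1.0 - 2.0 * u + copula(u, u)) / (1.0 - u)


theta = 2.0  # hypothetical parameter value
print(lower_tail(lambda u, v: clayton(u, v, theta)))  # close to 2 ** (-1 / theta)
print(upper_tail(lambda u, v: gumbel(u, v, theta)))   # close to 2 - 2 ** (1 / theta)
```

The numerical values can be compared with the known closed forms for these families: 2^(-1/θ) for the lower tail of the Clayton copula and 2 - 2^(1/θ) for the upper tail of the Gumbel copula.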

Statistical Variation Law of Meteorological Factors and Pollutants Data
In this paper, a county in Jining City, Shandong Province, China is selected as the research area, and a total of 5 environmental monitoring stations are selected. Monitoring is performed once an hour, and the continuous 24-hour monitoring ran from January 1, 2016 to December 31, 2020. The air pollutants studied are SO2 and fine particulate matter (PM2.5), and the mean value is used to represent the monthly concentration.
The monitoring data of meteorological factors come from the website of the China Meteorological Administration. Their time interval and resolution match those of the environmental monitoring data, and they mainly include rainfall, temperature and wind speed.
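The step of representing each month by the mean of its hourly readings can be sketched as follows. This is an illustration only: the record layout (timestamp, value), the handling of missing readings as `None`, and the function name `monthly_means` are assumptions, since the paper does not describe its data format or filtering rules.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def monthly_means(records):
    """Average hourly readings per (year, month), skipping missing values.

    `records` is an iterable of (datetime, value) pairs; a value of None
    marks a missing reading (an assumed convention).
    """
    buckets = defaultdict(list)
    for timestamp, value in records:
        if value is None:
            continue
        buckets[(timestamp.year, timestamp.month)].append(value)
    return {key: fmean(values) for key, values in sorted(buckets.items())}


# Toy readings only; these are not monitoring data from the study.
toy = [
    (datetime(2016, 1, 1, 0), 10.0),
    (datetime(2016, 1, 1, 1), 20.0),
    (datetime(2016, 1, 1, 2), None),
    (datetime(2016, 2, 1, 0), 5.0),
]
print(monthly_means(toy))
```

The same function can be applied separately to each pollutant and each meteorological factor so that all series share the same monthly keys.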

Statistical Variation Law of Meteorological Factors
Meteorological factors reflect the state of the atmospheric system and, to a certain extent, the trend of climate change and the scale of pollutant impact in the region. In this section, based on the monitoring data, the curves of the average values of the meteorological factors in this area over five years are obtained, as shown in figures 1 and 2. As can be seen from figures 1 and 2, over the past five years the monthly average temperature presents a single-peak distribution, with the peak appearing in July and August and reaching a maximum of 25℃, and the lowest values appearing in January and December. The wind speed, however, varies little from season to season. The annual change shows a certain time series trend.

Statistical Variation Law of Pollutants
Over five consecutive years of monitoring at the monitoring sites, the measured mass concentrations of PM2.5 and SO2 show certain statistical regularities from year to year, as shown in figures 3 and 4. The annual variation of the average concentration differs between pollutants. Both PM2.5 and SO2 show a single-peak distribution within the year, with the peak mostly appearing in January and February and the trough appearing from June to August.

Autocorrelation Analysis
The autocorrelation analysis of PM2.5 and SO2 is carried out using equation (1). The ACF values of the two pollutants at different lag times are shown in figures 5 and 6. The ACF values of both pollutants decrease slowly as the lag time grows, showing the typical tailing behavior of the ACF, so the concentrations of PM2.5 and SO2 are clearly autocorrelated. Moreover, when the lag time is less than 2 hours, the minimum ACF value is still greater than 0.6, and within the first half hour the ACF value is greater than 0.7, indicating strong autocorrelation of the air pollutants between periods within the first hour.

Figure 6. ACF curve of SO2.

Correlation Analysis
The preceding analysis shows that atmospheric pollutants in the research area are clearly autocorrelated and that pollutant concentration is closely related to seasonal change. Although the concentration also depends to some extent on the intensity of industrial and agricultural production, the influence of meteorological factors on air quality is more significant, and pollution sources change little within the same season. Therefore, when studying the correlation between pollutants and meteorological factors, this section uses the season as the time scale to study the cross-correlation further.
As can be seen from table 1, the correlation between PM2.5 concentration and meteorological factors differs significantly across the four seasons. PM2.5 has a positive but not obvious correlation with air temperature in spring and summer, and the correlation is weak in autumn and winter. The correlation between PM2.5 concentration and rainfall is weak, with a correlation coefficient generally lower than 0.15. PM2.5 and wind speed show a strong negative correlation throughout the year. From the perspective of the weather system, when the wind speed is higher there is less cloud cover and the air flow accelerates, which is conducive to the diffusion of PM2.5.
As can be seen from table 2, the monthly average concentration of SO2 is negatively correlated with air temperature, rainfall and wind speed in most cases, and the negative correlation is more obvious in summer and autumn. This is because, when the area is controlled by low pressure, the temperature is higher, the wind is faster and there is more rain, so pollutants diffuse easily under such a pressure field, which quickly reduces the concentration of pollutants and improves the air quality.
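The seasonal analysis described above can be sketched as grouping paired observations by season and computing a rank correlation within each group. The example below uses the empirical Kendall rank correlation (tau-a, without tie correction) rather than a fitted Copula, and assumes meteorological seasons (spring: March to May, summer: June to August, autumn: September to November, winter: December to February). The paper states neither its season boundaries nor its estimator, so both are assumptions, and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed month-to-season mapping (meteorological seasons).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def kendall_tau(x, y):
    """Empirical Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return 2.0 * score / (n * (n - 1))


def seasonal_tau(months, factor, pollutant):
    """Kendall tau between a meteorological factor and a pollutant per season."""
    groups = defaultdict(lambda: ([], []))
    for month, f, p in zip(months, factor, pollutant):
        xs, ys = groups[SEASON_BY_MONTH[month]]
        xs.append(f)
        ys.append(p)
    return {season: kendall_tau(xs, ys) for season, (xs, ys) in groups.items()}


# Toy values only; these are not results from the study.
months = [12, 1, 2, 6, 7, 8]
wind = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
pm25 = [30.0, 20.0, 10.0, 5.0, 6.0, 7.0]
print(seasonal_tau(months, wind, pm25))
```

The pairwise loop is quadratic in the number of observations, which is acceptable for monthly or seasonal samples but would be slow for long hourly series.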

Conclusion
Based on the concentrations of PM2.5 and SO2 and the meteorological factors (temperature, rainfall and wind speed) recorded at environmental monitoring stations, this paper analyzes the autocorrelation of the pollutants using the Pearson coefficient principle, uses Copula theory to analyze their correlation with the meteorological factors of the same period, and draws the following conclusions:
(1) By calculating the ACF values, it is found that the ACF curves of PM2.5 and SO2 have obvious tailing characteristics, and the values are generally greater than 0.6, highlighting the strong autocorrelation within the time period.
(2) The correlation coefficients between the two pollutants and the three meteorological factors in different seasons are calculated and analyzed. PM2.5 has a significant negative correlation with wind speed throughout the year, a low correlation with precipitation, and a correlation with temperature that differs significantly between seasons. SO2 is negatively correlated with the three meteorological factors in most cases, and the correlation is strong.