Air Quality Index Prediction Based on Improved PSO-BP

To address the air quality index (AQI) prediction problem, this paper proposes an optimization method based on an improved PSO-BP algorithm. The method uses particle swarm optimization (PSO) to optimize the weights and thresholds of a BP neural network, and applies an adaptive adjustment strategy to the weight used when updating each particle's position and velocity, so as to balance global and local optimization ability. Experiments verify that the improved PSO-BP neural network model achieves higher prediction accuracy than PSO-BP, GA-BP and BP.


Particle swarm algorithm
PSO was first proposed by Kennedy and Eberhart [5] in 1995. When birds are foraging, the easiest way to find food is to search the area around the bird closest to the food. PSO initializes a group of particles in the feasible solution space, and each particle represents a potential optimal solution of the extremum optimization problem. Each particle is characterized by three indexes: position, velocity and fitness. The fitness value is computed from the fitness function, and its value indicates the quality of the particle [6]. The update formula [7] gives the new position and velocity of each particle, where $k$, $i$ and $n$ are the iteration number, particle index and dimension, respectively; $v_{i,n}^{k}$ is the velocity of the $i$-th particle in the $n$-th dimension after the $k$-th iteration; $p_{i,n}^{k}$ is the best position of the $i$-th particle in the $n$-th dimension after the $k$-th iteration; $g_{n}^{k}$ is the global best position in the $n$-th dimension after the $k$-th iteration; and $w$ is the inertia weight coefficient, usually in the range [0.4, 0.9], whose value adjusts the convergence speed of the particle swarm. To prevent the particles from searching blindly, their positions and velocities are usually limited to a certain range. PSO has the following characteristics: it remembers the best position found so far and passes it to the other particles; it relies on particle velocity to complete the search; in the iterative process only the best particle's information is passed to the other particles, so the search is fast; and it has few adjustable parameters and a simple structure.
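The update described above can be sketched in Python as follows. This is a minimal illustration of the standard PSO velocity and position update, not the MATLAB code used in this paper. The acceleration coefficients `c1` and `c2`, the random factors `r1` and `r2`, the bounds `v_max`, `x_min` and `x_max`, and the function names `pso_step` and `minimize` are assumptions introduced for the example.

```python
import random


def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5,
             v_max=1.0, x_min=-5.0, x_max=5.0, rng=random):
    """Apply one standard PSO update to every particle, in place."""
    for i in range(len(positions)):
        for n in range(len(positions[i])):
            r1, r2 = rng.random(), rng.random()
            # inertia term + pull toward personal best + pull toward global best
            v = (w * velocities[i][n]
                 + c1 * r1 * (pbest[i][n] - positions[i][n])
                 + c2 * r2 * (gbest[n] - positions[i][n]))
            # limit velocity and position to a fixed range (assumed bounds)
            v = max(-v_max, min(v_max, v))
            x = positions[i][n] + v
            velocities[i][n] = v
            positions[i][n] = max(x_min, min(x_max, x))


def minimize(fitness, dim, n_particles=20, iters=100, seed=0):
    """Minimize `fitness` with a basic PSO loop; returns (best position, best value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        pso_step(pos, vel, pbest, gbest, rng=rng)
        for i, p in enumerate(pos):
            f = fitness(p)
            if f < pbest_f[i]:          # remember each particle's best position
                pbest[i], pbest_f[i] = p[:], f
                if f < gbest_f:         # share the best position with the swarm
                    gbest, gbest_f = p[:], f
    return gbest, gbest_f


if __name__ == "__main__":
    best, best_f = minimize(lambda x: sum(v * v for v in x), dim=2)
    print(best, best_f)
```

The sphere function in the usage line is only a stand-in objective; in a PSO-BP setting the fitness would be the prediction error of the network.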

BP neural network
The basic principle of the BP neural network [8] is as follows. The inputs from the input-layer neurons are weighted and passed to the hidden layer. Each hidden-layer neuron sums all of its inputs, subtracts the threshold, and produces a response output through the transfer function. This output is weighted by the connection weights of the next layer and passed to the output layer, where each output-layer neuron sums all of its inputs to produce a response output. The output is compared with the expected value, the error is propagated back, and the weights and thresholds of the network are adjusted according to the prediction error, so that the predicted output of the BP network approaches the expected output. The topological structure of the BP network is shown in Figure 1. The input values and predicted values of the BP network are the independent and dependent variables, respectively, of a non-linear function. The BP network must be trained before prediction; through training, the network acquires associative memory and prediction capabilities.
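The forward pass and error feedback described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the network used in this paper: it uses a single hidden layer, a sigmoid transfer function in the hidden layer, a linear output layer and a squared-error loss, and the names `forward` and `train_step` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, th1, W2, th2):
    """Hidden neurons: weighted sum minus threshold, then sigmoid.
    Output neurons: weighted sum of hidden outputs minus threshold (linear)."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) - t)
         for row, t in zip(W1, th1)]
    y = [sum(w * hj for w, hj in zip(row, h)) - t
         for row, t in zip(W2, th2)]
    return h, y


def train_step(x, target, W1, th1, W2, th2, lr=0.1):
    """One gradient-descent step on E = 0.5 * sum((y - target)^2), in place.
    Returns the loss measured before the update."""
    h, y = forward(x, W1, th1, W2, th2)
    err = [yk - tk for yk, tk in zip(y, target)]
    # error fed back to the hidden layer (computed before W2 changes)
    delta_h = [hj * (1.0 - hj) * sum(err[k] * W2[k][j] for k in range(len(W2)))
               for j, hj in enumerate(h)]
    for k in range(len(W2)):
        for j in range(len(h)):
            W2[k][j] -= lr * err[k] * h[j]
        th2[k] += lr * err[k]        # threshold enters with a minus sign
    for j in range(len(W1)):
        for i in range(len(x)):
            W1[j][i] -= lr * delta_h[j] * x[i]
        th1[j] += lr * delta_h[j]
    return 0.5 * sum(e * e for e in err)


if __name__ == "__main__":
    W1, th1 = [[0.1, -0.2], [0.3, 0.4]], [0.0, 0.0]
    W2, th2 = [[0.5, -0.5]], [0.0]
    for _ in range(200):
        loss = train_step([1.0, 0.5], [0.8], W1, th1, W2, th2)
    print(loss)
```

Training on a single sample, as in the usage lines, only shows that the error shrinks as weights and thresholds are adjusted; a real model would iterate over the whole training set.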

PSO-BP
The BP network suffers from low convergence efficiency and is easily trapped in local minima. Compared with the genetic algorithm, the particle swarm algorithm has no "crossover" and "mutation" operations, so it is simpler, and it offers global search capability, high precision and fast convergence. To enhance prediction accuracy, this paper combines the respective advantages of PSO and BP [9] to establish and optimize an AQI prediction model. The PSO-BP model uses PSO to optimize the weights and thresholds of the BP network, and each particle represents one set of BP weights and thresholds. The optimal initial weights and thresholds of the network are found by particle optimization, and the values obtained by the PSO algorithm are assigned to the neural network for training and prediction, which gives the PSO-BP model [10].
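To make the particle encoding concrete, the sketch below shows one way a particle can represent the weights and thresholds of a BP network and how its fitness can be computed. It is an illustration only: the flat layout used by `decode`, the single hidden layer, the sigmoid hidden units with a linear output, and the use of mean absolute error as the fitness are all assumptions, since the text does not specify them.

```python
import math


def particle_dim(n_in, n_hidden, n_out):
    """Number of values a particle needs: all weights plus all thresholds."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def decode(particle, n_in, n_hidden, n_out):
    """Split a flat particle into (W1, th1, W2, th2) in an assumed fixed order."""
    it = iter(particle)
    W1 = [[next(it) for _ in range(n_in)] for _ in range(n_hidden)]
    th1 = [next(it) for _ in range(n_hidden)]
    W2 = [[next(it) for _ in range(n_hidden)] for _ in range(n_out)]
    th2 = [next(it) for _ in range(n_out)]
    return W1, th1, W2, th2


def predict(particle, x, n_in, n_hidden, n_out):
    """Forward pass of the network encoded by `particle`."""
    W1, th1, W2, th2 = decode(particle, n_in, n_hidden, n_out)
    h = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) - t)))
         for row, t in zip(W1, th1)]
    return [sum(w * hj for w, hj in zip(row, h)) - t for row, t in zip(W2, th2)]


def fitness(particle, samples, n_in, n_hidden, n_out):
    """Mean absolute error over (input, target) pairs; lower is better."""
    total, count = 0.0, 0
    for x, target in samples:
        y = predict(particle, x, n_in, n_hidden, n_out)
        total += sum(abs(yk - tk) for yk, tk in zip(y, target))
        count += len(target)
    return total / count


if __name__ == "__main__":
    dim = particle_dim(2, 3, 1)
    samples = [([1.0, 1.0], [0.5])]      # hypothetical data
    print(dim, fitness([0.0] * dim, samples, 2, 3, 1))
```

The particle with the lowest fitness found by the swarm would then supply the initial weights and thresholds from which ordinary BP training starts.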

IPSO-BP
When the inertia weight $w$ is large, the global optimization ability is strong but the accuracy is low, while the local optimization ability is weak and the convergence speed is fast. When $w$ is small, the local search ability is strong, but the global search ability is weak and the convergence speed is slow. Therefore, an IPSO-BP based on adaptive weight [11] is proposed. An adaptive weight $w$ is adopted in the position and velocity update of each particle to balance global and local optimization. The steps of this method are as follows: (1) calculate the average fitness value $f_{avg}$ of the particles in the population; (2) when the fitness value $f$ is greater than the average fitness $f_{avg}$, the weight $w$ changes linearly with $f_{avg}$ and $f_{min}$ between the maximum and minimum weights. The update formula [12] based on the adaptive weight $w$ is given in Formula 3.
Here, $f$, $f_{avg}$ and $f_{min}$ are the current, average and minimum fitness values, respectively, and $w$, $w_{max}$ and $w_{min}$ are the current, maximum and minimum weights, respectively. The AQI prediction model [13] was built on the IPSO-BP neural network. The specific flow chart is shown in Figure 2. In setting the parameters of the prediction model, to achieve rapid convergence and prevent overfitting, the learning rate of the network is set to 0.1, the error goal to 0.00001, and the number of training iterations to 100. Like PSO-BP, the IPSO-BP network consists of three parts: input, hidden and output layers. The hidden part uses two hidden layers, with the number of nodes set to [31,1]. Compared with a single hidden layer, the IPSO-BP neural network has stronger generalization ability and higher prediction accuracy. The mean absolute error (MAE) [14] is used to evaluate the prediction. All algorithms in this paper are implemented in the MATLAB R2019b environment.
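The idea of a fitness-dependent weight can be sketched as follows. Formula 3 is not reproduced in this text, so the code uses one widely used adaptive inertia weight variant for a minimization problem, which is an assumption: particles whose fitness is at or below the average receive a weight interpolated linearly between `w_min` and `w_max`, and all other particles receive `w_max`. The branch condition and the exact expression may differ from the paper's formula, and the function name `adaptive_weights` is hypothetical.

```python
def adaptive_weights(fitness_values, w_min=0.4, w_max=0.9):
    """Return one inertia weight per particle from the population's fitness.

    Assumed rule (minimization): better-than-average particles get a smaller
    weight so they search locally; the rest keep w_max and search globally.
    """
    f_avg = sum(fitness_values) / len(fitness_values)
    f_min = min(fitness_values)
    weights = []
    for f in fitness_values:
        if f <= f_avg and f_avg > f_min:
            # linear interpolation: f == f_min -> w_min, f == f_avg -> w_max
            w = w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)
        else:
            # worse than average, or all particles have identical fitness
            w = w_max
        weights.append(w)
    return weights


if __name__ == "__main__":
    print(adaptive_weights([1.0, 2.0, 3.0, 6.0]))
```

Each returned weight would replace the fixed $w$ in that particle's velocity update for the current iteration.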

Data collection
The data used in this paper all come from the Air Quality Data Enquiry Network. Daily data for PM2.5, PM10, SO2 and NO2 (numerical unit :) in Tianjin over a total of 334 days, from January 1 to November 29, 2020, are used. The data are genuine and reliable and contain no invalid records. To obtain reasonable prediction results, the concentrations of PM2.5, PM10, SO2 and NO2 were studied and predicted separately, and the combined error was then considered, without taking the influence of other factors into account. The original data are shown in Table 1.

Data preprocessing
To improve the model's performance, the original monitoring data are normalized. The normalization formula [15] is given in Formula 5. The normalized data lie in the interval [0.2, 0.95]; $y'$ is the processed data, $y$ is the original data, $y_{min}$ is the minimum value in the sample data, and $y_{max}$ is the maximum value in the sample data. The normalized data are shown in Table 2. Of the 334 groups of data in Table 2, the first 317 groups are used as training samples and the last 17 groups as test samples.
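The preprocessing step can be sketched as follows. Formula 5 is not reproduced in this text, so the code assumes a linear min-max mapping onto the stated interval [0.2, 0.95]; the paper's exact expression may differ. The function names `normalize`, `denormalize` and `split` are hypothetical.

```python
def normalize(values, lo=0.2, hi=0.95):
    """Linearly map values so that min -> lo and max -> hi (assumed form)."""
    y_min, y_max = min(values), max(values)
    span = y_max - y_min
    if span == 0:
        return [lo for _ in values]      # constant series: no spread to scale
    return [lo + (hi - lo) * (y - y_min) / span for y in values]


def denormalize(scaled, y_min, y_max, lo=0.2, hi=0.95):
    """Inverse mapping, to bring predictions back to the original unit."""
    return [y_min + (s - lo) * (y_max - y_min) / (hi - lo) for s in scaled]


def split(rows, n_train=317):
    """First n_train rows for training, the remainder for testing."""
    return rows[:n_train], rows[n_train:]


if __name__ == "__main__":
    series = [10.0, 20.0, 30.0]          # hypothetical readings
    scaled = normalize(series)
    print(scaled, denormalize(scaled, min(series), max(series)))
    train, test = split(list(range(334)))
    print(len(train), len(test))
```

Splitting by position rather than at random keeps the last 17 days as the test period, which matches the chronological split described above.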

Experimental results and analysis
In this paper, 317 of the 334 groups of data were used as training samples and the remaining 17 groups as test samples. Using the IPSO-BP prediction model, the MAE for each of PM2.5, PM10, SO2 and NO2 and the overall MAE across these four groups were obtained, and the prediction results of BP, GA-BP, PSO-BP and IPSO-BP were compared. The results are given in Table 3. The trends of the measured and predicted values of the air quality indicators (PM2.5, PM10, SO2 and NO2) in Tianjin from January 1 to November 29 on the IPSO-BP training set and test set are shown in Figure 3 and Figure 4. The experimental results show that IPSO-BP has a smaller mean error, which reflects the actual prediction error, and further verify the superiority of IPSO-BP.
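For reference, the MAE used in this comparison can be computed as in the sketch below. The per-pollutant MAE follows the usual definition, the mean of the absolute differences between measured and predicted values. How the overall MAE over the four pollutants is aggregated is not stated in the text, so pooling all samples in `overall_mae` is an assumption, and the numbers in the usage lines are hypothetical.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def overall_mae(pairs):
    """Pool the samples of every pollutant, then take one MAE (assumed rule).

    `pairs` maps a pollutant name to an (actual, predicted) tuple.
    """
    all_actual, all_predicted = [], []
    for actual, predicted in pairs.values():
        all_actual.extend(actual)
        all_predicted.extend(predicted)
    return mae(all_actual, all_predicted)


if __name__ == "__main__":
    results = {                                  # hypothetical values
        "PM2.5": ([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]),
        "NO2": ([4.0, 4.0, 4.0], [4.0, 5.0, 6.0]),
    }
    for name, (actual, predicted) in results.items():
        print(name, mae(actual, predicted))
    print("overall", overall_mae(results))
```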

Conclusions
People are paying increasing attention to environmental quality and to monitoring the AQI, so predicting the AQI effectively is important. The prediction model based on BP [3] has achieved good results, but it has some defects and needs to be improved. In this paper, particles search for the optimal initial weights and thresholds of BP, exploiting the global search advantage of particle swarm, to build an IPSO-BP forecasting model, which is applied to AQI prediction for Tianjin. The experimental data show that this approach yields a smaller MAE and enhances prediction accuracy.