Ambient Air Quality Classification by Grey Wolf Optimizer Based Support Vector Machine

With the development of society and an escalating population, concerns regarding public health have grown. Air quality has become a primary concern because of the constant increase in the number of vehicles and in industrial development. In response, several indices have been proposed to indicate pollutant concentrations. In this paper, we present a mathematical framework to formulate a Cumulative Index (CI) on the basis of the individual concentrations of four major pollutants (SO2, NO2, PM2.5, and PM10). Further, a classifier based on a supervised learning algorithm is proposed. This classifier employs a support vector machine (SVM) to classify air quality into two types, that is, good or harmful. The inputs for this classifier are the calculated values of CI. The efficacy of the classifier is tested on real data from three locations: Kolkata, Delhi, and Bhopal. It is observed that the classifier performs well in classifying the quality of air.


Introduction
Air pollution is a critical issue that influences the health of the urban population. The problem becomes more prominent with an exponential increase in population and continuous industrial development. Many problems, namely, deforestation, waste management, solid waste disposal, and the release of toxic materials, contribute to and influence the quality of the air around us. With this concern, air quality assessment has become a potential area of research. The Air Quality Index is a numerical indicator used by agencies to assess the concentration of various pollutants in the air [1]. Table 1 shows the effect of various pollutants on human health.
In recent years, various methods have been developed by researchers to assess air quality. A few of them are based on the calculation of indices. A rich survey of these indices was presented in [1]. The main objective of index formulation is to transform the concentrations of the major air pollutants into a single numerical value which can be used as a representation of the air quality. Along with this, the index should be easily understandable and based on the National Ambient Air Quality Standards (NAAQS).
A principal component based regression technique was proposed for the prediction of air quality in [2]. The authors developed a principal component based Multiple Linear Regression (MLR) model to forecast the air quality in Delhi [2]. A Monte Carlo dispersion model was proposed in [3]. The work addressed low wind speed meandering conditions, where Gaussian prediction models lose relevance. Artificial Neural Network (ANN) based prediction was performed by Elangasinghe et al. [4]. They employed an ANN to predict the NO2 concentration near Auckland Highway. Meteorological variables, namely, wind speed, wind direction, solar radiation, temperature, and humidity, as well as time frames, were also considered by the authors to generate accurate predictions. Forecasting and prediction of tropospheric ozone episodes were recently reported through MLR for Delhi [5]. A comparative study of different neural network topologies, namely, Layer Recurrent Neural Network (LRNN), Feed Forward Backpropagation (FFBP), and many more, has been performed for the prediction of ground level ozone concentration in several studies. Different meteorological factors, namely, relative humidity, temperature, wind speed, and wind direction, were employed to forecast the ozone concentrations.
It is observed that supervised learning based approaches are more relevant and accurate for forecasting pollutant concentrations because of their ability to process data in parallel with high accuracy and fast response and their capability to model dynamic, nonlinear, and noisy data. It is interesting to observe that the approaches employed in this direction are judged on the accuracy of the forecasting engine. It is also worth mentioning that the accuracy of the forecasting engine reduces as the span of the forecast increases, whereas solutions to environmental problems should be based on long term forecasting of air quality. However, less work has been reported on the classification of air quality. The classification rules are based on the numerical values of the concentrations of pollutants. In [2] the concentrations of SO2, NO2, PM2.5, and PM10 were employed to derive a classification rule which can classify the quality of air and further forecast the quality. It is easy to classify air quality on the basis of individual pollutants. However, there is an acute need for an index which can cumulatively denote the quality of the air. In this paper, numerical values of the Air Quality Index (AQI) of SO2, NO2, PM2.5, and PM10 are computed and a mathematical framework is presented to form a Cumulative Index (CI). The linear discriminant functions for these mathematical combinations are decided through SVM.

Table 1: Effect of various pollutants on human health.

SO2
The presence of high levels of SO2 has an adverse effect on human health. Exposure to high levels of SO2 may lead to bronchitis, heart issues, respiratory illness, and asthma.
(1) This gas is an outcome of the oxidation of sulphur.
(2) The major sources of this gas are petroleum products, coal, and volcanic eruptions.

NO2
The presence of high levels of NO2 in the air leads to acid rain. This corrodes metal structures such as bridges, damages buildings, and harms aquatic life due to acid formation.
(1) This gas is an outcome of the oxidation of nitrogen monoxide.
(2) The major sources of NO2 are agricultural processes, burning of fossil fuels, biomass burning, industries, human sewage, and atmospheric deposition.

SPM (PM2.5) (suspended particulate matter)
The presence of high levels of SPM leads to cardiovascular diseases and respiratory issues, namely, bronchitis, asthma, and lung cancer.
(1) Particulate matter is a term used for solid particles and liquid droplets found in the air.
(2) The major sources of SPM are fuel combustion, power plants, and emissions from diesel buses and trucks.

RSPM (PM10) (respirable suspended particulate matter)
The presence of high levels of RSPM leads to cardiovascular diseases and respiratory issues, namely, bronchitis, asthma, and lung cancer [6,7].
(1) RSPM is composed of dust particles, industrial waste, and combustion products.
(2) PM10 is a term used to describe tiny particles in the air, made up of a complex mixture of soot and organic and inorganic materials, having a particle size less than or equal to 10 microns in diameter.
On the basis of critical literature review, the following are the research objectives for this study:
(1) To present a mathematical framework in the formulation of CI by employing the numerical values of Air Quality Indices of SO2, NO2, PM2.5, and PM10
(2) To present a comparative analysis of the proposed CI with existing air quality indices
In the next section, the details of CI are incorporated. This section also discusses AQI. Section 3 describes the design of support vector machine along with the technical details of GWO. Section 4 exhibits the numerical results along with the classification efficiency of different classifiers. Section 5 presents the major highlights of this work in a conclusive form.

CI and Mathematical Framework
In the past, several indices have been proposed to indicate the quality of air through numerical values. A pollution index has been proposed by Cannistraro and Ponterio [10]. The index is based on the concentrations of two critical pollutants and computes the mean value of these concentrations. The problem with this index is that it does not include the concentrations of all pollutants in the computation. The Air Quality Depreciation Index (AQDI) [11] has been proposed to determine the depreciation in the quality of air as compared to standard values. The value of this index is not easy to calculate due to its dependence on weights. An Integral Air Pollutant Index (IAPI) has been proposed by Bezuglaya et al. [8]. The calculation of IAPI is based on subindex values of pollutants; the function used in this index is ambiguous in nature and can place the index in the hazardous category, which may be a false alarm. Table 2 shows the calculation of this index. It has been observed that, due to its dependence on the subindex values of pollutants, the index is less sensitive to pollutant concentrations. The Aggregate Air Quality Index (AAQI) has been proposed by Kyrkilis et al. [9]. Although this index considers the effect of five pollutants at a time, its heavy computational burden on processing engines calls its feasibility into question. A Fuzzy Air Quality Index (FAQI) has been proposed by Mandal et al. [12]. This index is based on the relationship between AQI values and output parameters, but it inherits the pitfalls associated with fuzzy logic.
Table 2 lists a few samples of pollutant concentrations from different cities of India [13] and presents a comparative analysis of the different pollution indices. It has been observed that the values of IAPI are the same for a few samples although the concentrations of pollutants differ from each other. This indicates that this index is less sensitive towards pollutant concentrations and, hence, is not able to define a crisp classification boundary for air quality. In the calculation of AAQI we have observed that, as the parameter of this index varies from 2 to 50, the values of the index fall in a narrow range. However, no clear methodology has been presented for the choice of this parameter. We have also experimented with fractional values of the parameter, and it has been observed that the values of the index are then higher than those obtained with integer values. With these pitfalls in the existing air quality indexing systems, a new index is required to draw a crisp classification boundary for ambient air quality, considering all pollutant concentrations equally and at the same time.
We propose a new Cumulative Index, which has the following attributes:
(i) It is easy to understand and follows the NAAQS (as it is based on the AQI of the pollutants).
(ii) It does not suffer from eclipsing and ambiguity. This index can be used as an alert system, as it is based on valid air quality data monitored at various air quality measurement stations located in densely populated cities of India.
(iii) This index is computationally efficient and puts less burden on computational engines.
In the work, five classes (1, 2, 3, 4, and 5) are proposed on the basis of the concentrations of pollutants [2]. The AQI for any pollutant is given by the following equation:

$$\mathrm{AQI}_{\text{pollutant}} = \frac{I_{hi} - I_{lo}}{\text{Pollutant}_{hi} - \text{Pollutant}_{lo}}\left(\text{pollutant} - \text{Pollutant}_{lo}\right) + I_{lo},$$

where pollutant is the actual concentration of any pollutant, $\text{Pollutant}_{hi}$ and $\text{Pollutant}_{lo}$ are the high and low concentration values corresponding to the break point, $I_{hi}$ is the subindex value corresponding to $\text{Pollutant}_{hi}$, and $I_{lo}$ is the subindex value corresponding to $\text{Pollutant}_{lo}$ of the given pollutant. The range values with superscript "upper" denote the upper limit of the concentration of the pollutant. Similarly, the range values with superscript "lower" denote the lower limit. The classification rule is given below:
AQI of the pollutant ∈ class 1 ⇒ the concentration is healthy.
AQI of the pollutant ∈ class 4 ⇒ the concentration is highly unhealthy.
AQI of the pollutant ∈ class 5 ⇒ the concentration is dangerous.
Now, we calculate CI for different cases where the AQI of the pollutants belongs to different classes. In this paper we employ four major pollutant concentrations (SO2, NO2, PM2.5, and PM10). The value of CI should be such that it increases with the increase in each individual pollutant and undergoes sharp growth when one or more pollutant concentrations lie in the harmful range. We propose a mathematical function which possesses both qualities and can be used as an efficient indicator of air quality.
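The break-point interpolation used for the AQI of a single pollutant can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the authors' implementation: the function name `sub_index`, the table `EXAMPLE_BREAKPOINTS`, its numerical values, and the clamping behaviour above the last break point are all hypothetical and are chosen only to make the arithmetic visible.

```python
# Hypothetical break-point table: (conc_lo, conc_hi, index_lo, index_hi).
# The numbers are illustrative placeholders, not official NAAQS values.
EXAMPLE_BREAKPOINTS = [
    (0.0, 50.0, 0.0, 50.0),
    (50.0, 100.0, 50.0, 100.0),
    (100.0, 250.0, 100.0, 200.0),
    (250.0, 350.0, 200.0, 300.0),
    (350.0, 430.0, 300.0, 400.0),
]


def sub_index(concentration, breakpoints=EXAMPLE_BREAKPOINTS):
    """Linearly interpolate the sub-index inside the interval holding the concentration."""
    if concentration < breakpoints[0][0]:
        raise ValueError("concentration below the first break point")
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= concentration <= c_hi:
            slope = (i_hi - i_lo) / (c_hi - c_lo)
            return slope * (concentration - c_lo) + i_lo
    # Assumption: concentrations above the last break point are clamped
    # to the top of the index scale.
    return breakpoints[-1][3]


if __name__ == "__main__":
    for value in (25.0, 175.0, 500.0):
        print(value, sub_index(value))
```

With this table, a concentration of 175 falls in the third interval, so the sub-index is 100 plus half of the 100-point span of that interval.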

Design of Supervised Learning Model
In recent years the application of SVMs to classification problems has increased due to their capability to separate datasets by the best hyperplane. SVMs have been applied to multidimensional data classification [14], classification of microarrays [15], wind speed prediction [16], voltage stability monitoring [17], classification of power quality events [18], and contingency ranking [19]. The main reason behind the popularity of the SVM as a classifier is that it can handle a large feature space: the efficacy of the classification is not hindered by a high-dimensional input feature space, whereas the efficiency of other classifiers depends on the dimension of the input feature space [17]. This distinct feature is a primary reason for the employment of SVMs in large classification problems. Kernel functions are employed to map the input data into a high-dimensional feature space. A sparse prediction function is generated by choosing a selected number of points, and these points are the support vectors (SVs). Two noteworthy features of SVMs are structural risk minimization and a fair tradeoff between empirical error and model complexity.
Let the $n$-dimensional inputs $x_i$ ($i = 1, 2, 3, \ldots, M$), where $M$ is the number of samples, belong to class 1 and class 2, with associated labels $y_i = 1$ and $y_i = -1$, respectively. A linear hyperplane which separates this data can be written as $f(x) = 0$, which can be determined by the following equation:

$$f(x) = w^{T}x + b = \sum_{j=1}^{n} w_j x_j + b,$$

where $w$ is an $n$-dimensional vector and $b$ is a scalar. These two parameters determine the location of the hyperplane. The constraints are $f(x_i) \geq 1$ if $y_i = 1$ and $f(x_i) \leq -1$ if $y_i = -1$. The separating hyperplane which generates the maximum distance between itself and the nearest data points is called the optimal separating hyperplane. The geometric margin $\|w\|^{-2}$ and the insensitive loss function are the most important parameters in SVM design. With the errors between predicted results and targets taken into account, the convex optimization problem becomes the maximization of the geometric margin (equivalently, the minimization of $\|w\|^{2}$) subject to a bound on the error between predicted and simulated output. Since the problems of air quality classification are data specific and depend upon the location, it is necessary to add a nonnegative slack variable to the constraints, which gives problem (7). To find the optimal values, problem (7) can be rearranged by the Lagrangian saddle point method, and the resulting optimization problem is solved with respect to the primal variables. In general, RBF is the most common choice of Mercer kernel function because of its Gaussian form.
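For orientation, the widely used textbook soft-margin form of such a problem, its Lagrangian, and a Gaussian RBF kernel are written out below. This is a sketch under stated assumptions rather than a reproduction of the equations displayed in the original derivation, which may instead be stated in terms of an insensitive loss. The symbols $C$ (regularization parameter), $\xi_i$ (slack variables), $\alpha_i$ and $\mu_i$ (Lagrange multipliers), and $\gamma$ (kernel parameter) are notational assumptions.

```latex
% Textbook soft-margin classification problem (assumed notation)
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\,\|w\|^{2} + C\sum_{i=1}^{M}\xi_i \\
\text{subject to}\quad & y_i\left(w^{T}x_i + b\right) \ge 1 - \xi_i,
\qquad \xi_i \ge 0, \qquad i = 1,\dots,M .
\end{aligned}

% Lagrangian with multipliers \alpha_i \ge 0 and \mu_i \ge 0
L(w, b, \xi, \alpha, \mu) = \tfrac{1}{2}\,\|w\|^{2} + C\sum_{i=1}^{M}\xi_i
 - \sum_{i=1}^{M}\alpha_i\left[\,y_i\left(w^{T}x_i + b\right) - 1 + \xi_i\right]
 - \sum_{i=1}^{M}\mu_i\,\xi_i .

% Gaussian RBF kernel with width parameter \gamma > 0
K(x_i, x_j) = \exp\left(-\gamma\,\|x_i - x_j\|^{2}\right).
```

In this form, a larger $C$ penalizes slack more heavily, so the tradeoff between empirical error and model complexity mentioned above is controlled by a single scalar.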
On the basis of the ambient air quality of India reported in [13], along with the CI proposed in the previous section, we design a supervised learning based model for the classification of ambient air quality. For this, an example dataset has been created by varying the concentrations of pollutants under a normal distribution. The ranges of the concentrations are tabulated in Table 3.
First, this dataset is employed to calculate CI, and then these values are employed as the input of the supervised learning model. The target data are the numerical values (1, −1) for good and harmful air quality, respectively. A total of 1000 data points are considered for this model; 70% of these data points are used for training, and the remaining 30% are divided equally between testing and validation (15% each). Figures 1-4 show the AQI variations of SO2, NO2, PM2.5, and PM10 for the example dataset. It is observed that the values taken in this set are a close replica of the concentrations in the Indian scenario. Table 4 shows the calculated values of CI for the example dataset. All the concentrations given in figures and tables are in μg/m³.
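The 70/15/15 partition of the 1000 data points can be sketched as follows. This is only an illustration: the function name `split_dataset`, the shuffling step, and the fixed seed are assumptions, since the text does not state how the samples were assigned to the three subsets.

```python
import random


def split_dataset(samples, train=0.70, test=0.15, seed=0):
    """Shuffle the samples and split them into training, testing, and validation subsets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train * len(shuffled))
    n_test = round(test * len(shuffled))
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return training, testing, validation


if __name__ == "__main__":
    training, testing, validation = split_dataset(range(1000))
    print(len(training), len(testing), len(validation))  # 700 150 150
```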
It is observed that the samples exhibited in Table 4 are a close replica of actual air quality in industrial and residential areas. The following points emerge from these samples.
(1) Out of 1000 samples, 16 extreme cases are exhibited in this table. As per the air quality data of Nizamuddin, Delhi, these values are realistic, as the range for PM2.5 is 14-300 and that for PM10 is 18-890 [13]. This vast range motivates us to employ a normal distribution for these two particular pollutants. The concentrations of SO2 and NO2 fall in the safer range.
With the help of this dataset, we propose a classifier based on SVM. As the original SVM is a two-class separator, we employ this model to segregate air quality into two types: good and harmful. The classification rule is that air quality is harmful if at least one of the four pollutant concentrations lies in range 3 or higher, or if the concentrations of at least two major pollutants lie in range 2 or higher. The kernel parameter and bias are calculated through the Grey Wolf Optimizer (GWO) algorithm. An optimization routine (10) is established in order to calculate the kernel parameters. The aim of this optimization routine is to maximize the classification efficacy of the classifier, and its decision variables are the regularization parameter and the kernel parameter.
The following section presents the details of GWO.
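The two-class labeling rule described above can be written compactly. The sketch below is an interpretation under stated assumptions: the function name `label_air_quality` and the dictionary input are hypothetical, each pollutant is assumed to have already been assigned a range number from 1 to 5, and the rule is read as defining the harmful class (label −1).

```python
def label_air_quality(pollutant_ranges):
    """Return 1 (good) or -1 (harmful) from per-pollutant range numbers (1 to 5)."""
    ranges = list(pollutant_ranges.values())
    in_range_3_or_higher = sum(1 for r in ranges if r >= 3)
    in_range_2_or_higher = sum(1 for r in ranges if r >= 2)
    # Harmful if any pollutant is in range 3 or higher,
    # or if at least two pollutants are in range 2 or higher.
    if in_range_3_or_higher >= 1 or in_range_2_or_higher >= 2:
        return -1
    return 1


if __name__ == "__main__":
    sample = {"SO2": 1, "NO2": 1, "PM2.5": 2, "PM10": 2}
    print(label_air_quality(sample))  # -1
```

Labels produced this way would serve as the targets (1, −1) for training the two-class SVM.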

Grey Wolf Optimizer
A recent population based swarm intelligence technique, called the Grey Wolf Optimizer (GWO) and inspired by the behavior of grey wolves in nature, is discussed here. This technique was proposed by Mirjalili et al. [20] in 2014. GWO mimics the leadership hierarchy and the hunting behavior of grey wolves. It is designed to reduce the risk of stagnation in local optima and offers greater exploration and sharing of information about the search space. Grey wolves are categorized into four groups, namely, alpha, beta, delta, and omega, for the simulation of the leadership hierarchy. The three important steps of hunting, namely, searching for prey, encircling the prey, and attacking the prey, are employed to carry out the optimization. Alphas are the leaders of the pack. They make decisions regarding hunting, sleeping place, time to wake up, and so forth, and these decisions are followed by the pack. Hence, the alpha wolf is also known as the dominant wolf. The alpha is not necessarily the strongest member but is the best at organizing and disciplining the pack. Betas come at the second level of the hierarchy of grey wolves. They help the alpha wolves in decision making and in the other activities of the pack. Betas are the best candidates for the position of alpha in case alpha wolves pass away or become very old. The beta reinforces the alpha's commands throughout the pack.
Omega wolves have the lowest ranking in the pack. They always have to submit to all the other dominant wolves. The omega is not an important member, but the loss of an omega wolf causes internal problems in the pack.
If a wolf does not belong to any of the levels specified above, it is called a delta wolf. Delta wolves have to submit to alphas and betas, but they dominate the omegas. Scouts, elders, hunters, sentinels, and caretakers belong to this group. According to Muro et al. [21] the main stages of grey wolf hunting are as follows:
(i) Tracking, chasing, and approaching the prey
(ii) Pursuing, encircling, and harassing the prey
(iii) Attacking the prey
In the mathematical modeling of the social hierarchy of wolves in GWO, alpha ($\alpha$) is considered the fittest solution, and beta ($\beta$) and delta ($\delta$) are the second and third fittest solutions, respectively. The rest of the candidate solutions are considered omega ($\omega$). The hunting is guided by $\alpha$, $\beta$, and $\delta$. The $\omega$ wolves follow the $\alpha$, $\beta$, and $\delta$ wolves. Recently, the GWO algorithm has been applied to Automatic Generation Control [22], Multilayer Perceptron Training [23], and many more problems.
(a) Encircling the Prey. Grey wolves encircle the prey during the hunt. The encircling behavior is modeled as

$$\vec{D} = \left|\vec{C}\cdot\vec{X}_p(t) - \vec{X}(t)\right|, \qquad \vec{X}(t+1) = \vec{X}_p(t) - \vec{A}\cdot\vec{D},$$

where $t$ represents the current iteration, $\vec{A}$ and $\vec{C}$ are coefficient vectors, $\vec{X}_p$ is the position vector of the prey, and $\vec{X}$ is the position vector of a grey wolf.

Journal of Environmental and Public Health
The vectors $\vec{A}$ and $\vec{C}$ can be calculated as follows:

$$\vec{A} = 2\vec{a}\cdot\vec{r}_1 - \vec{a}, \qquad \vec{C} = 2\cdot\vec{r}_2.$$

The components of $\vec{a}$ are decreased linearly from 2 to 0 over the course of iterations, and $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0, 1].
(b) Hunting for the Prey. During hunting, the first three best solutions ($\alpha$, $\beta$, and $\delta$) obtained are saved, and the other search agents (including the omegas) are obliged to update their positions according to the positions of the best search agents. The following are the proposed formulae:

$$\vec{D}_\alpha = \left|\vec{C}_1\cdot\vec{X}_\alpha - \vec{X}\right|, \qquad \vec{D}_\beta = \left|\vec{C}_2\cdot\vec{X}_\beta - \vec{X}\right|, \qquad \vec{D}_\delta = \left|\vec{C}_3\cdot\vec{X}_\delta - \vec{X}\right|,$$

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1\cdot\vec{D}_\alpha, \qquad \vec{X}_2 = \vec{X}_\beta - \vec{A}_2\cdot\vec{D}_\beta, \qquad \vec{X}_3 = \vec{X}_\delta - \vec{A}_3\cdot\vec{D}_\delta,$$

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}. \tag{13}$$

Figure 5 shows how a search agent updates its position according to the alpha, beta, and delta. It can be observed that the alpha, beta, and delta estimate the position of the prey, and the other wolves update their positions stochastically around the prey; the final position is random within the circle.
(c) Attacking the Prey. When the prey stops moving, the grey wolves finish the hunt by attacking. Mathematically, approaching the prey is modeled by decreasing the value of $\vec{a}$. Hence the fluctuation range of $\vec{A}$ is also decreased by $\vec{a}$.
(d) Searching for the Prey. The search of the grey wolves depends on the positions of the alpha, beta, and delta. To search, they diverge from each other. Mathematically, $\vec{A}$ takes random values greater than 1 or less than −1 to oblige the search agent to diverge from the prey. This emphasizes exploration and allows the GWO algorithm to search globally. If $|\vec{A}| > 1$, the grey wolves diverge from the prey to find a fitter prey.
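The encircling, hunting, attacking, and searching steps above can be combined into a compact loop. The following is a minimal sketch of a basic GWO for minimization, written under stated assumptions: the function name `grey_wolf_optimizer`, the box-bound clamping, the fixed random seed, and the choice to keep the three best solutions found so far as leaders are illustrative decisions and are not taken from the text. The sphere function used in the usage lines is only a stand-in objective.

```python
import random


def grey_wolf_optimizer(fitness, bounds, n_wolves=30, n_iter=200, seed=0):
    """Minimize `fitness` over the box `bounds` with a basic GWO loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    leaders = []  # alpha, beta, delta (best three solutions found so far)
    for it in range(n_iter):
        # Rank the current pack together with the previous leaders
        ranked = sorted(leaders + wolves, key=fitness)
        leaders = [list(w) for w in ranked[:3]]
        a = 2.0 - 2.0 * it / n_iter  # decreases linearly from 2 towards 0
        new_wolves = []
        for wolf in wolves:
            position = []
            for d in range(dim):
                estimates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a   # |A| > 1 favours exploration
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - wolf[d])  # distance to this leader
                    estimates.append(leader[d] - A * D)
                x = sum(estimates) / 3.0              # average of the three estimates
                lo, hi = bounds[d]
                position.append(min(max(x, lo), hi))  # assumption: clamp to the bounds
            new_wolves.append(position)
        wolves = new_wolves
    best = min(leaders + wolves, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    def sphere(x):
        return sum(v * v for v in x)

    best, value = grey_wolf_optimizer(sphere, [(-10.0, 10.0)] * 2)
    print(best, value)
```

Because `a` shrinks over the iterations, early iterations allow wolves to move away from the leaders, while late iterations pull them towards the average of the three leader-based estimates.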
4.1. Methodology. As described in (10), the SVM design is carried out by GWO with the aim of maximizing classification accuracy. The major parameters of this optimization process are the choice of kernel, the kernel parameter, and the bias parameter. These parameters decide the orientation and placement of the hyperplane defined by the support vectors. GWO searches the parameter space for an optimal solution by choosing a proper kernel; in this work we have considered four kernels, namely, Radial Basis Function (RBF), linear, quadratic, and polynomial kernels. The optimization process has a stopping criterion of 1000 iterations along with an error tolerance of $10^{-3}$. The two parameter values obtained for a population size of 30 are 10 and 0.3, with RBF as the choice of kernel. Figure 6 shows the leadership hierarchy of grey wolves. To see how GWO solves the optimization problem, a few points are described below:
(a) The search process starts by creating a random population of grey wolves. Over the course of iterations, the alpha, beta, and delta wolves search for candidate solutions; in this work the solutions are expressed in terms of the choice of kernel, the bias parameter, and the kernel parameters.
(b) GWO is based on the philosophy "follow the leader." The hierarchy shown in Figure 6 assists GWO in saving the best solutions obtained, in terms of bias, kernel parameters, and choice of kernel, over the course of iterations.
(c) The encircling mechanism of GWO defines a circle-shaped neighborhood around the solutions, which can be extended to higher dimensions.
(d) Exploration and exploitation are guaranteed by the adaptive values of $\vec{a}$ and $\vec{A}$.
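One way to let a continuous optimizer such as GWO search over a kernel choice and two numerical parameters is to decode each wolf position into an SVM design. The sketch below is hypothetical: the names `decode_position` and `make_fitness`, the encoding of a position as three numbers in [0, 1], and the logarithmic parameter ranges are assumptions made for illustration, and `evaluate_accuracy` stands for any user-supplied routine that trains a classifier and returns its accuracy.

```python
KERNELS = ("rbf", "linear", "quadratic", "polynomial")


def decode_position(position):
    """Map a position in [0, 1]^3 to (kernel, regularization, kernel parameter)."""
    k, c, g = position
    index = min(int(k * len(KERNELS)), len(KERNELS) - 1)
    kernel = KERNELS[index]
    regularization = 10.0 ** (-2.0 + 4.0 * c)  # assumed range: 1e-2 to 1e2
    kernel_param = 10.0 ** (-3.0 + 4.0 * g)    # assumed range: 1e-3 to 1e1
    return kernel, regularization, kernel_param


def make_fitness(evaluate_accuracy):
    """Wrap an accuracy callback as a cost, so that minimizing it maximizes accuracy."""
    def fitness(position):
        kernel, regularization, kernel_param = decode_position(position)
        return 1.0 - evaluate_accuracy(kernel, regularization, kernel_param)
    return fitness


if __name__ == "__main__":
    print(decode_position([0.0, 0.5, 0.75]))  # ('rbf', 1.0, 1.0)
```

A cost built with `make_fitness` could then be passed to any minimizer that works on the unit cube, which keeps the optimizer independent of the classifier being tuned.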

Simulation Results
This section presents the air quality classification results obtained by the proposed supervised learning model. The efficacy of the proposed model is tested on real data from the states of Madhya Pradesh (Bhopal) and West Bengal (Kolkata) and from Delhi. The historical data of ambient air quality are taken from [13]. It is observed that the proposed classifier is able to classify the quality of air. The modeling of the system and the simulation studies are performed on an Intel Core i7 2.9 GHz processor unit with 4.00 GB RAM. Figure 7 shows the comparative analysis of different pollutant concentrations along with the example dataset. The Central Pollution Control Board (CPCB), India, is executing a nationwide programme of ambient air quality monitoring known as the National Air Quality Monitoring Programme (NAMP). The network consists of three hundred and forty-two (342) operating stations covering one hundred and twenty-seven (127) cities/towns in twenty-six (26) states and four (4) Union Territories of the country [24].
To determine the concentration of NO2, a chemiluminescence technique is used, which measures total oxides of nitrogen (NOx) by passing the sample over a heated catalyst to reduce all oxides of nitrogen to NO. An ultraviolet (UV) fluorescence technique is used to determine SO2 concentrations. This technique is based on the fact that SO2 molecules absorb UV light at one wavelength and become excited [25]. The intensity of the UV light emitted by the excited molecules is proportional to the SO2 concentration. For the calculation of the concentrations of SPM and RSPM, the Beta-Ray Absorption Light Scattering technique is used. The beta-ray absorption method is the most popular method of SPM measurement. The measurements of the beta-rays are performed with the help of a beta-ray analyzer [25]. The absorption rate of beta-rays is proportional to the concentration of SPM. The mass concentration reading is obtained from the increase in beta-ray absorption caused by the particles collected on the filter paper. With the advancement of microprocessor technology, modern gas analyzers have become sophisticated data collection nodes. These analyzers are automatic and self-monitoring, offer programmable calibration routines and alarm circuits, and have large memory for data collection [25].
Tables 5, 6, and 7 show the classification results for Madhya Pradesh, West Bengal, and New Delhi. Table 5 shows the classification results for the air quality of Govindpura, industrial area Akun, Bhopal (latitude 23°.15 N, longitude 77°.27 E), for the year 2015. This industrial area houses manufacturing industries, chemical industries, Grahudyog, electroplating industries, fabrication industries, and many more. Only a few samples are exhibited due to limitation of space. It is observed that, as per the results of the supervised learning engine, the results with negative sign indicate poor air quality. The corresponding values of CI are high: 1722 and 1568.93. It is also observed that the driving factor in this analysis is the concentration of PM10. Tables 5 and 6 show the results for Nizamuddin, Delhi (residential area) (latitude 28°35 N, longitude 77°14 E), Shahzada Bagh, Delhi (industrial area) (latitude 28°.40 N, longitude 77°.16 E), and Moulali, Kolkata (latitude 22°.33 N, longitude 88°.21 E) for the year 2015.
The high concentrations of PM10 and PM2.5 are the primary reason for the higher values of CI in Table 6. For clarity, both an industrial area and a residential area of Delhi are chosen. The Shahzada Bagh industrial area houses plastic industries, clothing industries, manufacturing industries, footwear industries, and many more. Similarly, the Nizamuddin area suffers from vehicular pollution and is also overcrowded. From Table 7 it can be observed that, according to the values of CI, the samples with negative sign are critical, which is also verified by the concentration values of the pollutants. It is again observed from Figures 8 and 9 that the AQI of PM10 follows CI. A total of 139 samples have been chosen for testing the supervised engine in the case of Delhi. It can be observed from Figure 9 that the pollutant concentrations are quite hazardous. It is observed that although the concentration of PM10 is a major driver in this calculation, the concentrations of other pollutants which are not within the safe limit also contribute to the values of CI. Hence, CI is representative of the overall concentrations of the pollutants. Moreover, policy makers can take a collective decision on the basis of the values of CI in different areas. For prompt actions the proposed ...

Conclusion
... are harmful for respiratory system, has become a major motivation for this paper. A supervised learning model based on the example dataset has been prepared and tested over three different meteorological stations. The following are the conclusions:
(a) A mathematical framework for CI is proposed that employs the AQIs of different pollutants. This index is computationally efficient and understandable.
(b) A supervised learning model based on SVM has been developed with the help of the example dataset and the calculated values of CI. A two-class SVM has been designed. To design the SVM module, GWO has been employed for parameter estimation with the aim of maximizing classification accuracy.
(c) The proposed architecture has been tested on the real data of Delhi, Bhopal, and Kolkata. It has been observed that the values of CI and the classification results obtained from the supervised learning model are aligned.
Developing a forecast engine that predicts pollutant concentrations on the basis of CI values lies within the scope of future work.

Conflicts of Interest
The authors declare that they have no conflicts of interest.