A Fuzzy Soft Model for Haze Pollution Management in Northern Thailand

In this article, we propose fuzzy soft models for decision making in haze pollution management. The main aims of this research are (i) to provide a haze warning system based on real-time atmospheric data and (ii) to identify the most hazardous location of the study area. PM10 is used as the severity index of the problem. The efficiency of the model is justified by the prediction accuracy ratio based on the real data from 1 January 2016 to 31 May 2016. The fuzzy soft theory is modified in order to make the models more suitable for the problems. The results show that our fuzzy models improve the prediction accuracy ratio compared to the prediction based on PM10 density only. This work illustrates a fuzzy analysis that has the capability to simulate the unknown relations between a set of atmospheric and environmental parameters. The study area covers eight provinces in the northern region of Thailand, where the problem occurs severely every year during the dry season. Seven principal parameters are considered in the model: PM10 density, air pressure, relative humidity, wind speed, rainfall, temperature, and topography.


Introduction
Pollution problems are inevitably a global concern of the 21st century. Over the past decade, polluted haze has become a major problem in the northern region of Thailand and surrounding countries. In March 2019, the problem reached a crisis when the daily average PM2.5 and PM10 (particulate matter of 2.5 microns and 10 microns in diameter or smaller) density rates were well beyond the national standard of 25 μg/m³ [3]. This situation has occurred every year in the dry season, from January to May, and has generally reached its peak in March. During this period, a large amount of particulate matter is released into the atmosphere, including carbon monoxide, carbon dioxide, volatile organic compounds, and carcinogenic polycyclic aromatic hydrocarbons [4]. The main emission source is open biomass burning, such as forest fires, solid waste burning, and agricultural residue field burning [5,6].
This problem has a significant effect on human health, the local tourism industry, and the economy as a whole, especially in Chiang Mai province, a popular tourist destination. The public health ministry of Thailand has reported an increase in bronchial asthma and respiratory diseases in people living in these areas. In addition, these fine particles contain carcinogenic polycyclic aromatic hydrocarbons that can induce lung cancer [7]. The smoke haze episodes also reduce visibility and cause a variety of environmental effects, which eventually lead to a decline in economic sectors such as tourism, transportation, and agriculture. The Thai government has launched various policies to get the smoke haze problem under control. However, the problem continues to grow, even with the enforcement of the outdoor burning ban issued by the Thai government for the period from February to April.
Atmospheric conditions and topography play key roles in the problem. The air pollutants are trapped near ground level due to the meteorological conditions (e.g., stagnant air), and the basin-like topography surrounded by high mountain ranges restricts pollution dispersion. Moreover, low rainfall in the dry season adds to the severity of the haze problem, because little smoke or dust is washed out of the air [6]. These conditions make it difficult for the air pollutants to flow out, so the particles cannot easily escape from the area. Some technologies can mitigate the pollution problem. However, the devices are expensive.
An efficient warning system would be a major help in managing the haze problem. Such a system would improve public safety and mitigate the damage caused. The Goddard Earth Observing System Model Version 5 (GEOS-5), developed by NASA's research team, is currently one of the most widely used pollution prediction models.
In this article, the potential use of fuzzy soft set theory in real-time haze warning is investigated. The main aims of this research are (i) to provide a haze warning system based on real-time atmospheric data and (ii) to identify the most hazardous location of the study area.
The benefits are to create awareness among people in the affected area and to suggest locations for installing pollution mitigation devices.
Molodtsov [8][9][10] initiated the concept of soft set theory as a new mathematical tool for dealing with uncertainties. Soft set theory has rich potential for applications in several directions, a few of which were shown by Molodtsov [8].
The idea of applying fuzzy soft set theory in atmospheric models has already been considered in applications to air pollution management [11][12][13][14] and water management [15][16][17][18][19][20]. However, it is believed that air pollution models may differ from region to region due to several factors [21,22]. Therefore, existing models still need to be restudied. To our knowledge, there are only a few prediction models for the region of study, since the main concerns lie on the side of environmental science. The prediction results from the GEOS-5 model are a popular choice of benchmark for environmental scientists. The regionally developed models include a logistic regression model [23] and a Geographic Information System (GIS) based model [24]. Therefore, our model would offer an alternative prediction model for the haze pollution problem.
The study location covers eight provinces in the northern part of Thailand where the haze problem has occurred severely: Mae Hong Son, Chiang Mai, Lamphun, Chiang Rai, Phayao, Lampang, Phrae, and Nan. The density of PM10 is used as the severity index of the haze pollution level. Additionally, seven principal parameters are considered in the model: six atmospheric parameters (PM10, air pressure, relative humidity, wind speed, rainfall, and temperature) and one topographic parameter. All atmospheric data were obtained from the Pollution Control Department [1]. The data cover the period from 1 January 2016 to 31 May 2016.
The rest of this article is organized as follows. In Section 2, we explain the methodology and present some examples. In Section 3, we describe the setup of the model, which includes the study location, the data, and the parameters. Then, we present our decision-making results and discussion in Section 4. Finally, the conclusion is given in Section 5.

Fuzzy Soft Theory.
In this section, we provide useful notations of soft sets and fuzzy soft sets. Let $U = \{L_1, L_2, \ldots, L_m\}$ be an initial universal set and let $E = \{P_1, P_2, \ldots, P_n\}$ be a set of parameters.
Definition 1 (see [8]). Let $P(U)$ denote the power set of $U$ and $A \subset E$. A pair $(F, A)$ is called a soft set over $U$, where $F$ is a mapping given by $F : A \to P(U)$.
Example 1. Let the initial universe $U = \{L_1, L_2, \ldots, L_8\}$ be the eight selected provinces in the northern region of Thailand: Mae Hong Son, Chiang Mai, Lamphun, Chiang Rai, Phayao, Lampang, Phrae, and Nan. Moreover, let $E = \{P_1, P_2, P_3, P_4\}$ be the atmospheric parameters PM10 density, air pressure, relative humidity, and wind speed, respectively. Then, an example of a possible soft set is given in (1). Note that each approximation has two parts, a predicate $p$ and an approximate value set. For example, for $F(P_1)$ the predicate is PM10 density and the approximate value set is $\{L_1, L_2, L_3\}$. Additionally, the summary information of this soft set is represented in Table 1.
Definition 2 (see [25]). Let $\Psi(U)$ denote the set of all fuzzy sets of $U$ and let $A_i \subset E$. A pair $(F_i, A_i)$ is called a fuzzy soft set over $U$, where $F_i$ is a mapping given by $F_i : A_i \to \Psi(U)$.
Example 2. We consider the same setup as in Example 1. An example of a fuzzy soft set is given in (2). Table 2 provides the summary information of this fuzzy soft set.
Definition 3. For a given fuzzy soft set with a universal set $U$ and parameter set $E$, we denote by $l_{ij}$ the membership value of $L_i$ in $F(P_j)$.

Advances in Fuzzy Systems
Definition 4 (see [25]). For a given fuzzy soft set, the choice value of $L_i$ is defined by
$$c_i = \sum_{j=1}^{n} l_{ij}.$$
Definition 5 (see [25]). For a given fuzzy soft set, the comparison table is the $m \times m$ table in which the entry $e_{ij}$ is the number of parameters for which the membership value of $L_i$ exceeds or equals the membership value of $L_j$. Both the rows and the columns of the table are labelled by the elements of the universal set.
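Definitions 4 and 5 can be illustrated with a short Python sketch. The membership values used here are hypothetical (three locations and four parameters rather than the eight provinces of the study), and the function names are chosen only for this illustration.

```python
from typing import List

def choice_values(l: List[List[float]]) -> List[float]:
    """Choice value of each object: the sum of its membership values."""
    return [sum(row) for row in l]

def comparison_table(l: List[List[float]]) -> List[List[int]]:
    """Entry (i, j) counts the parameters for which l[i][k] >= l[j][k]."""
    m = len(l)
    return [
        [sum(1 for a, b in zip(l[i], l[j]) if a >= b) for j in range(m)]
        for i in range(m)
    ]

# Hypothetical membership values l[i][j]: rows are locations, columns are parameters.
membership = [
    [0.9, 0.2, 0.5, 0.4],
    [0.6, 0.7, 0.5, 0.1],
    [0.3, 0.4, 0.8, 0.4],
]

c = choice_values(membership)
table = comparison_table(membership)
print(table)  # [[4, 3, 2], [2, 4, 2], [3, 2, 4]]
```

Each diagonal entry equals the number of parameters (4 here), since every membership value is at least as large as itself.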

Remark 1
(1) Each main diagonal element of a comparison table is always equal to $n$.
(2) $0 \le e_{ij} \le n$ for all $i, j$.
Definition 6 (see [25])
(i) The impact indicator of $L_i$ is the sum of all values in row $i$ of the comparison table. This can be calculated by the following formula:
$$r_i = \sum_{j=1}^{m} e_{ij}.$$
(ii) The divider indicator of $L_j$ is the sum of all values in column $j$ of the comparison table. This can be calculated by the following formula:
$$t_j = \sum_{i=1}^{m} e_{ij}.$$
(iii) The score value of $L_i$ is defined as
$$s_i = r_i - t_i.$$
Both choice values and score values can be used as evaluations in decision making. However, according to Kong et al. [26], these values may lead to different decision results. Therefore, they introduced the grey relational grade, a new evaluation indicator that combines the information of score values and choice values, to make the decision making more robust. The calculation algorithm of the grey relational grade is briefly presented.
(1) Input the choice value sequence $(c_1, c_2, \ldots, c_m)$ and the score sequence $(s_1, s_2, \ldots, s_m)$, where $c_i$ and $s_i$ are the choice value and the score value of $L_i$, respectively.
(2) Calculate the grey relational generating values.
(3) Calculate the grey difference information.
(4) Calculate the grey relative coefficients.
(5) Calculate the grey relational grade.
There may be more than one optimal choice if more than one element attains the maximum.
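The score values of Definition 6 and the grey relational grade steps can be sketched in Python as follows. The formulas for steps (2) to (5) are not reproduced in this text, so the sketch assumes a common formulation of grey relational analysis: min-max normalisation as the generating step, distance to the best normalised value as the difference information, a distinguishing coefficient `rho = 0.5`, and equal weights for the two coefficients. These choices, the function names, and the input numbers are all assumptions made for illustration.

```python
from typing import List, Sequence

def score_values(table: Sequence[Sequence[float]]) -> List[float]:
    """Score s_i = (row sum of row i) - (column sum of column i)."""
    m = len(table)
    row_sums = [sum(table[i]) for i in range(m)]
    col_sums = [sum(table[i][j] for i in range(m)) for j in range(m)]
    return [row_sums[i] - col_sums[i] for i in range(m)]

def _min_max(xs: Sequence[float]) -> List[float]:
    """Assumed generating step: rescale a sequence to [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:  # all values equal: treat every element as the best one
        return [1.0] * len(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def grey_relational_grade(c, s, rho=0.5, w_c=0.5, w_s=0.5) -> List[float]:
    """Combine choice values c and score values s into one grade per object."""
    c_n, s_n = _min_max(c), _min_max(s)
    d_c = [max(c_n) - x for x in c_n]  # difference to the best choice value
    d_s = [max(s_n) - x for x in s_n]  # difference to the best score value
    d_max, d_min = max(d_c + d_s), min(d_c + d_s)
    if d_max == 0:  # no object differs from the best one
        return [w_c + w_s] * len(c)
    coef = lambda d: (d_min + rho * d_max) / (d + rho * d_max)
    return [w_c * coef(a) + w_s * coef(b) for a, b in zip(d_c, d_s)]

# Hypothetical comparison table and choice values for three locations.
table = [[4, 3, 2], [2, 4, 2], [3, 2, 4]]
s = score_values(table)           # [0, -1, 1]
grades = grey_relational_grade([2.0, 1.9, 1.9], s)
best = max(range(len(grades)), key=lambda i: grades[i])
```

In this toy input the first location has the best choice value and the third has the best score value; under the assumed settings the combined grade selects the first location. The score values always sum to zero, because every table entry is counted once in a row sum and once in a column sum.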
Decision making based on score values and choice values relies on the assumption that the parameters are equally important. However, in some decision-making problems, some parameters can be more important than others. Therefore, we propose new definitions of choice values and score values based on weight information. Note that the idea of a weighted score value is briefly discussed in Maji et al. [9].
Define $w := (w_1, w_2, \ldots, w_n)$ as the weight sequence of the parameters, where $w_i$ is the weight associated with the parameter $P_i$.

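Definition 7. For a given fuzzy soft set and a weight $w$, the weighted choice value of $L_i$ is defined by
$$c_i^{w} = \sum_{j=1}^{n} w_j \, l_{ij}.$$
Definition 8. For a given fuzzy soft set and a weight $w$, the weighted comparison table is the $m \times m$ table in which the entry $e_{ij}$ is calculated by the following formula:
$$e_{ij} = \sum_{k=1}^{n} w_k \, \mathbf{1}(l_{ik} \ge l_{jk}),$$
where $\mathbf{1}(\cdot)$ is an indicator function defined by
$$\mathbf{1}(S) = \begin{cases} 1, & \text{if } S \text{ is true}, \\ 0, & \text{otherwise}. \end{cases}$$
In other words, $e_{ij}$ is the sum of the weights of the parameters for which the membership value of $L_i$ exceeds or equals the membership value of $L_j$. Both the rows and the columns of the table are labelled by the elements of the universal set.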

Remark 2
(1) Each main diagonal element of a weighted comparison table is always equal to the sum $\sum_{k} w_k$.
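A minimal Python sketch of Definitions 7 and 8 follows. The membership values and the weight sequence are hypothetical, and the function names are introduced only for this example.

```python
from typing import List, Sequence

def weighted_choice_values(l: Sequence[Sequence[float]],
                           w: Sequence[float]) -> List[float]:
    """Weighted choice value: sum over parameters of w_j * l[i][j]."""
    return [sum(wj * lij for wj, lij in zip(w, row)) for row in l]

def weighted_comparison_table(l: Sequence[Sequence[float]],
                              w: Sequence[float]) -> List[List[float]]:
    """Entry (i, j) adds the weights w_k of parameters with l[i][k] >= l[j][k]."""
    m = len(l)
    return [
        [sum(wk for wk, a, b in zip(w, l[i], l[j]) if a >= b) for j in range(m)]
        for i in range(m)
    ]

# Hypothetical data: three locations, four parameters, integer weights.
membership = [
    [0.9, 0.2, 0.5, 0.4],
    [0.6, 0.7, 0.5, 0.1],
    [0.3, 0.4, 0.8, 0.4],
]
weights = [3, 1, 0, 2]

cw = weighted_choice_values(membership, weights)
wtable = weighted_comparison_table(membership, weights)
print(wtable)  # [[6, 5, 5], [1, 6, 4], [3, 2, 6]]
```

The diagonal entries equal the sum of the weights (6 here), in line with Remark 2, and setting every weight to 1 recovers the unweighted comparison table.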

Particle Swarm Optimization.
The particle swarm optimization (PSO) algorithm is a metaheuristic algorithm based on the concept of swarm intelligence. The algorithm was proposed in 1995 by Kennedy and Eberhart [27]. PSO is a metaheuristic in that it makes few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. PSO also does not use the gradient of the problem being optimized, so it does not require the optimization problem to be differentiable, as is required by classic optimization methods such as gradient descent and quasi-Newton methods. It is also capable of solving complex mathematical problems arising in engineering [28].
This method is now available in software packages such as Matlab and R.
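To make the idea concrete, the following is a minimal global-best PSO written in plain Python. It is a generic textbook-style sketch, not the Matlab routine used in this study; the inertia and acceleration settings, the swarm size, and the test function are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

def pso_minimize(f: Callable[[List[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_particles: int = 30, n_iter: int = 200,
                 inertia: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Minimise f over a box using a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to own best, pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # stay in box
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy objective: the sphere function, whose minimum is 0 at the origin.
best_x, best_val = pso_minimize(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)
```

Only function values are used, never gradients, which is why the method also applies to non-differentiable objectives such as an accuracy ratio. One possible way to handle integer weights, offered here as an assumption rather than a description of the author's setup, is to round each coordinate inside the objective and to minimise the negative of the accuracy ratio.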

The Study Area.
Our study area is the northern region of Thailand, the area affected by haze pollution. The region, approximately 94,000 km² in size with a population of six million, consists of nine provinces: Mae Hong Son, Chiang Mai, Lamphun, Chiang Rai, Phayao, Lampang, Phrae, Nan, and Uttaradit. For this case study, Uttaradit was excluded since its haze problem was not severe. The study area is geographically characterised by several mountain ranges, which continue from the Shan Hills in bordering Myanmar to Laos, and by the river valleys which cut through them. The basins of the rivers Ping, Wang, Yom, and Nan run from north to south. The basins cut across the mountains of two great ranges, the Thanon Thong Chai Range in the west and the Phi Pan Nam in the east. All studied provinces lie between these basins. The elevations are generally moderate, a little above 2,000 metres (6,600 ft) for the highest summit. Table 6 provides the geographic information summary of each province. The latitudes and longitudes shown are the locations of the meteorology stations where the atmospheric data are collected. The basin sizes are divided into five categories (no basin, wide, normal, moderate, and narrow), and we set the airflow difficulty level of each category to be 0, 1, 2, 3, and 4, respectively. A narrower basin implies that the flow of air is more difficult. The location map of the study area is shown in Figure 1.

The Data.
The hourly atmospheric data of PM10 density (μg/m³, at 3 m from ground), air pressure (mmHg, at 2 m), relative humidity (%RH, at 2 m), wind speed (m/s, at 30 m), rainfall (mm, at 3 m), and temperature (°C, at 2 m) from 1 January 2016 to 31 May 2016 were obtained with authorization from the Pollution Control Department [1]. About 3% of the data were missing from the record. Each missing value was replaced by the value at the preceding time. Figure 2 shows the daily fluctuation of PM10 density at the eight selected locations during the study period. Table 7 gives the summary statistics of PM10 density at the eight selected locations.
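The replacement of missing values by the preceding observation can be sketched as follows. Representing a missing hourly reading as `None`, the function name, and the sample numbers are assumptions made for this example.

```python
from typing import List, Optional

def fill_with_previous(series: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) by the most recent observed value."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in series:
        if value is None:
            filled.append(last)   # stays None if nothing has been observed yet
        else:
            filled.append(value)
            last = value
    return filled

# Hypothetical hourly PM10 readings with gaps.
hourly_pm10 = [85.0, None, None, 92.0, 101.0, None]
print(fill_with_previous(hourly_pm10))
# [85.0, 85.0, 85.0, 92.0, 101.0, 101.0]
```

A gap at the very start of the record has no preceding value, so this sketch leaves it missing; how such a case was treated in the study is not stated.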

3.3. The Parameters.
Based on environmental research studies [30][31][32][33], the climate and the topography of the study area play significant roles in the pollution problem. Therefore, the parameter set in this application consists of seven parameters: PM10 density, air pressure, relative humidity, wind speed, rainfall, temperature, and airflow difficulty level. The first six are atmospheric parameters, while the last is a topographic parameter. Additionally, the effect of each atmospheric component on the PM10 density, the severity index, can be categorized into two types: positive and negative. A positive atmospheric component is one whose increase leads to an increase in the PM10 density, while a negative atmospheric component is one whose increase leads to a decrease in the PM10 density. The parameter information is summarised in Table 8.

Haze Warning System.
The first aim of this research is to create a warning system based on real-time atmospheric data. The system predicts whether or not the PM10 density will exceed the crisis level in the following 4 hours. Note that the length of the warning period can be adjusted. In this article, we choose a period of 4 hours since it is long enough to take safety measures such as buying protective masks, completing necessary outdoor activities, or evacuating to publicly designated safe zones. The warnings are set to be announced at 12 a.m., 4 a.m., 8 a.m., 12 p.m., 4 p.m., and 8 p.m. each day. The PM10 crisis level is set at 120 μg/m³ based on the Thailand national ambient air quality standard [34].

Warning System Based on PM10 Density.
The trivial warning system relies on the information of the PM10 density only. That is, a warning is signaled when the PM10 density at the current time exceeds a certain threshold value. The warning system is generated by Algorithm 2.
Algorithm 2. Haze warning system based on PM10 density.
(1) At the warning time, input the PM10 density data of each location. The input data are the averages of the hourly data over the preceding 4 hours.
(2) A warning is signaled if the PM10 density of the location exceeds the threshold value $\alpha$.
The efficiency of the algorithm is evaluated by the accuracy ratio compared to the real data. The prediction is counted as accurate if the warning is signaled and the PM10 density in the next 4 hours exceeds 120 μg/m³, or if the warning is not signaled and the PM10 density in the next 4 hours does not exceed 120 μg/m³. We test the algorithm with $\alpha$ = 84, 96, 108, 114, and 118, which are, respectively, 70%, 80%, 90%, 95%, and 98% of the crisis level. The accuracy ratio for each threshold value is shown in Table 9. The plot of the average accuracy ratio over all eight locations against the threshold value is shown in Figure 3. It can be seen that the best threshold value for these data is 118 (98% of the crisis level), with a 90.99% accuracy ratio.
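Algorithm 2 and its accuracy ratio can be sketched for a single location as follows. The sketch assumes that the hourly series is split into consecutive 4-hour blocks aligned with the warning times and that exceedance in the next 4 hours is judged on the block mean; the text does not spell out either detail. The data and the function names are hypothetical.

```python
from typing import List, Sequence

CRISIS_LEVEL = 120.0  # ug/m^3, the PM10 crisis level stated in the text

def block_means(hourly: Sequence[float], window: int = 4) -> List[float]:
    """Average the hourly series over consecutive, non-overlapping blocks."""
    return [
        sum(hourly[i:i + window]) / window
        for i in range(0, len(hourly) - window + 1, window)
    ]

def accuracy_ratio(hourly: Sequence[float], alpha: float, window: int = 4) -> float:
    """Share of warning times at which the warning decision matches reality."""
    means = block_means(hourly, window)
    if len(means) < 2:
        raise ValueError("need at least two blocks to evaluate a prediction")
    hits = 0
    for previous, following in zip(means, means[1:]):
        warned = previous > alpha            # step (2) of Algorithm 2
        crisis = following > CRISIS_LEVEL    # what actually happened next
        hits += warned == crisis
    return hits / (len(means) - 1)

# Hypothetical 20 hours of PM10 readings for one location.
pm10 = [100, 105, 110, 115, 119, 121, 123, 125, 130, 128,
        126, 124, 110, 100, 95, 90, 80, 85, 90, 95]
print(accuracy_ratio(pm10, alpha=100.0))  # 0.75
print(accuracy_ratio(pm10, alpha=118.0))  # 0.5
```

Sweeping `alpha` over a grid and averaging the ratio over all locations would give a curve of the kind reported in Figure 3.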

Warning System Based on Fuzzy Soft Set with Weighted Information.
To improve the efficiency of the warning system, the fuzzy soft set with weighted information can be employed. Note that the fuzzy soft set without weights is not suitable for this model.
This is due to the fact that the parameters are not equally important. For instance, PM10 density is more important than the other parameters, because no haze problem will occur if the PM10 density is low. It should be noted that the membership values of the atmospheric parameters change at every warning based on the real-time data, while the topographic parameter remains the same throughout the time period. Additionally, when the weight information is $w = (1, 0, \ldots, 0)$, this warning system reduces to the warning system based on PM10 density defined in Section 4.1.1.
The choice values are used in decision making. For this system, a warning is signaled when the weighted choice value at the current time exceeds a certain threshold value.
Our proposed decision making for the warning system with weighted information is as follows.
(1) At the warning time, input the atmospheric data of each location: PM10 density, air pressure, relative humidity, wind speed, rain, and temperature. The input data are the averages of the hourly data over the preceding 4 hours. Additionally, input the weight information $w = (w_1, w_2, \ldots, w_7)$.
Since the aim of this problem is to find the weight information and the threshold value that give the best accuracy ratio, this problem coincides with an optimization problem. By employing the particle swarm optimization method in Matlab, the optimum average accuracy ratio is 92.12% with the optimum weight $w = (17, 0, 0, 2, 3, 0, 0)$ and the optimum threshold $\alpha = 0.98$.
This optimum result is shown in Table 11.

Identification of the Most Hazardous Location.
The second aim of this research is to identify the location with the most serious haze pollution problem based on real-time atmospheric data. The location is identified at the same time as the warning. An effective prediction will benefit the community in the affected area and assist the authorities in providing safety aids and preparing devices such as mobile air purifiers.

Identification of the Most Hazardous Location Based on PM10 Density.
The trivial decision making is to choose a location based on the information of PM10 density only. That is, the location with the highest value of PM10 density at the current time is chosen as the most hazardous location in the following 4 hours. The algorithm of the decision making is as follows.
Algorithm 4. Identification of the most hazardous location based on PM10 density.
(1) At the warning time, input the PM10 density data of each location. The input data are the averages of the hourly data over the preceding 4 hours.
(2) The decision is $L_k$, the location with the maximum value of PM10 density at the current time. There may be more than one optimal choice if more than one element attains the maximum.
The efficiency of the algorithm is evaluated by the accuracy ratio compared to the real data. The prediction is counted as accurate if the most severe location in the next 4 hours is correctly identified. By making decisions based on Algorithm 4, the average accuracy ratio from eight locations is 51.15% and Cohen's kappa index of agreement is 0.4312.
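Algorithm 4 and its accuracy ratio can be sketched as follows. Each period is a mapping from location to the 4-hour mean PM10 density; the province names are taken from the study, but the numbers are hypothetical. When several locations tie for the maximum, the sketch counts a prediction as accurate if the predicted and actual sets share a location, which is an assumption about tie handling.

```python
from typing import Dict, List, Set

def most_hazardous(pm10: Dict[str, float]) -> Set[str]:
    """Return every location attaining the maximum PM10 density."""
    top = max(pm10.values())
    return {name for name, value in pm10.items() if value == top}

def identification_accuracy(periods: List[Dict[str, float]]) -> float:
    """Share of periods whose worst location is also worst in the next period."""
    if len(periods) < 2:
        raise ValueError("need at least two periods to evaluate a prediction")
    hits = 0
    for current, following in zip(periods, periods[1:]):
        if most_hazardous(current) & most_hazardous(following):
            hits += 1
    return hits / (len(periods) - 1)

# Hypothetical 4-hour mean PM10 densities (ug/m^3) for three provinces.
periods = [
    {"Chiang Mai": 110.0, "Lampang": 130.0, "Nan": 90.0},
    {"Chiang Mai": 115.0, "Lampang": 125.0, "Nan": 95.0},
    {"Chiang Mai": 140.0, "Lampang": 120.0, "Nan": 100.0},
]
print(most_hazardous(periods[0]))        # {'Lampang'}
print(identification_accuracy(periods))  # 0.5
```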

Identification of the Most Hazardous Location Based on Fuzzy Soft Set with Weighted Information.
The fuzzy soft set with weighted information can be employed in order to improve the efficiency of the decision making. For a reason similar to that in Section 4.1.2, the fuzzy soft set with weights is more suitable. Note that the membership values of the atmospheric parameters change at every decision based on the real-time data, while the topographic parameter remains the same throughout the time period. Additionally, when the weight information is $w = (1, 0, \ldots, 0)$, this decision making reduces to the decision making based on PM10 density defined in Section 4.2.1. It should be emphasized that the membership calculation of the PM10 density parameter is different from that in Algorithm 3. This is because the locations need to be compared with one another.
Finally, the evaluation used in the decision making must be chosen. Note that it can be based on choice values, score values, or the grey relational grade. In our results, we use all three evaluations in order to determine which one gives the best result.
Our proposed algorithm for decision making of the most hazardous location based on weighted choice values is as follows.
(1) At the warning time, input the atmospheric data of each location: PM10 density, air pressure, relative humidity, wind speed, rain, and temperature. [...] where $x$, $m$, and $M$ are defined in (i). The flowchart of Algorithm 5 is given in Figure 5.
Since our aim in this problem is to find the weight information that gives the best accuracy ratio, this is similar to the optimization problem:
$$\begin{aligned} \max \quad & \text{average accuracy ratio} \\ \text{subject to} \quad & w_1, w_2, \ldots, w_7 \text{ are integers}, \\ & 0 \le w_1 \le 30, \\ & 0 \le w_2, w_3, \ldots, w_7 \le 10. \end{aligned}$$
By employing the particle swarm optimization method in Matlab, the optimum average accuracy ratios based on weighted choice values, weighted score values, and grey relational grades are 56.58%, 57.13%, and 57.02%, respectively. The summary of the optimum result is shown in Table 13. Cohen's kappa values for the decision making based on weighted choice values, weighted score values, and grey relational grades are 0.4457, 0.4521, and 0.4489, respectively.
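For readers unfamiliar with Cohen's kappa, the following sketch computes it from a sequence of predicted and actual labels using the standard definition $\kappa = (p_o - p_e)/(1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. How the predictions were tabulated in this study is not described, so the label sequences here are purely hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence

def cohens_kappa(predicted: Sequence[Hashable], actual: Sequence[Hashable]) -> float:
    """Cohen's kappa between two equally long label sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(predicted)
    # Observed agreement: share of positions where the labels coincide.
    p_o = sum(p == a for p, a in zip(predicted, actual)) / n
    # Chance agreement: product of the marginal label frequencies, summed.
    pred_counts, act_counts = Counter(predicted), Counter(actual)
    labels = set(pred_counts) | set(act_counts)
    p_e = sum(pred_counts[k] * act_counts[k] for k in labels) / (n * n)
    return (p_o - p_e) / (1 - p_e)  # undefined when p_e == 1

# Hypothetical predicted and actual most hazardous locations.
predicted = ["A", "A", "B", "B", "C", "C"]
actual = ["A", "B", "B", "B", "C", "A"]
kappa = cohens_kappa(predicted, actual)  # observed 4/6, chance 1/3, kappa 0.5
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, so it complements the raw accuracy ratio when some locations are the most hazardous far more often than others.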

Haze Warning System.
By introducing the fuzzy soft model with weighted information, the prediction accuracy ratio of the warning system improves slightly, from 90.99% to 92.12%, compared to the simple warning system that considers only the PM10 density. Moreover, the fuzzy soft models with weighted information provide better predictions than the original (equal weight) fuzzy soft model. Table 14 shows the parameters' weights that provide the best accuracy ratio. Note that the principal parameters are PM10 density, rainfall, and wind speed, respectively, while the other parameters have no weight. This suggests that a simple judgment on the warning can be made by observing only PM10 density, wind speed, and rainfall. The problem is expected to be severe if the PM10 density is high, with no wind and no rain. This agrees with the principal studies in environmental science research.

Identification of the Most Hazardous Location.
By selecting the most severe location based on the information from PM10 density only, the accuracy ratio is 51.12%. However, this ratio improves to 57.13% when the locations are chosen by the weighted fuzzy soft model, with the decision based on weighted score values. Table 15 shows the parameters' weights that provide the best accuracy ratio.
The optimal parameters' weights imply the following:
(1) PM10 density is clearly the main factor in the decision making.
(2) Topography plays a role in the haze pollution problem for this region of study.
(3) Temperature, wind speed, and rainfall are factors in the model. Unfortunately, these atmospheric parameters are uncontrollable.
(4) Air pressure and relative humidity have little or no impact on the prediction model.
This analysis agrees with the principal studies in environmental science research. It should be emphasized that the only parameter that can be controlled is PM10 density. The activities that contribute to PM10, such as outdoor burning or car emissions, should be discouraged.

Other Discussion.
From the results of Sections 4.1 and 4.2, it should be pointed out that a simple warning system and location identification based on PM10 density alone perform reasonably well. Adding the other parameters improves the efficiency of the model only slightly.
This emphasizes the fact that environmental modeling is complicated. However, since the calculation in our algorithms is not expensive, Algorithms 3 and 5 should still be used to improve the decision making.
For further work, we suggest adding the following parameters:
Atmospheric parameters: PM2.5 density, SO₂, ozone, and wind direction.
Topographic parameters: height above sea level of the location, and the location and height of the surrounding mountains.
Other parameters: population.

Conclusions
In this article, we propose a fuzzy soft model to support haze pollution management in northern Thailand. The main aims of this research are to provide a haze warning system based on real-time atmospheric data and to identify the most hazardous location of the study area. The study area covers eight provinces in northern Thailand, where the problem occurs severely every year. The parameters of the fuzzy soft set include both atmospheric parameters and a topographic parameter.
The membership values of the atmospheric parameters are calculated from the real-time data. The efficiency of the model is tested with the real data from 1 January 2016 to 31 May 2016. The results show that our fuzzy models improve the prediction accuracy ratio compared to the prediction based on PM10 density only. The optimum results and optimum weights are chosen by particle swarm optimization. The interpretation of the optimum weights also agrees with the principal studies in environmental science research. Another benefit of our model is that the topographic parameter, which is normally disregarded in many models, is included. Moreover, our model offers an alternative prediction model for the haze pollution problem in northern Thailand. The fuzzy soft set approach to haze pollution management offers promising prospects. We believe that the efficiency of the model can be improved when appropriate parameters are added. The calculation formulas for the membership values and the severity index can also be adjusted. An efficient model would improve health safety and raise the quality of life of those affected.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The author declares that there are no conflicts of interest.