Multisensor Data Fusion for Water Quality Evaluation Using Dempster-Shafer Evidence Theory

A multisensor data fusion approach for water quality evaluation using Dempster-Shafer evidence theory is presented. To evaluate water quality, each sensor measurement is considered as a piece of evidence. Based on the water quality parameters measured by a sensor node, the mass function of each water quality class is calculated. Evidence from each sensor is given a reliability discounting and then combined with the others by the D-S rule. The class of water quality is then determined by a decision rule that uses the fusion mass function values. Finally, experiments are given to demonstrate that the proposed approach can evaluate water quality from uncertain sensor data and improve evaluation performance.


Introduction
Water quality evaluation is important in providing a reliable supply of potable water. Empirical evidence shows that water quality parameters, such as dissolved oxygen (DO), NH3-N, total phosphorus (TP), and total nitrogen (TN), are sensitive indicators of contaminants. In 2005, Hall and Szabo [1] demonstrated that changes in water quality parameters, which potentially indicate contamination, can be detected using sensors. Since then, wireless sensor networks (WSNs) have been extensively applied in monitoring and evaluating water quality [2][3][4].
Multisensor data fusion is a technology that combines information from several sources in order to form a unified picture [5]. It is an important tool for improving the performance of a monitoring system when various sensors are available. Multi-sensor data fusion seeks to combine data from multiple sensors to perform inferences that are more efficient and potentially more accurate than those achieved by means of a single sensor. In recent years, the multi-sensor data fusion technique has received much attention, but mostly in applications to target identification, signal and image processing, and biomedical engineering [6][7][8][9]. In this paper, we apply multi-sensor data fusion technology to water quality evaluation.
The main problem is that data obtained from sensors have different degrees of uncertainty [10]. This uncertainty may arise for a number of reasons. For instance, the sensing error increases with the age of the sensor, and the sensor may be disturbed by its environment. The quality of the wireless links is another major limiting factor. Furthermore, this uncertainty may lead to conflicting conclusions. Since data obtained from the sensors are inherently incomplete, uncertain, and imprecise, it is imperative that a fusion mechanism be devised so as to minimize such imprecision and uncertainty.
Dempster-Shafer evidence theory (D-S evidence theory) and Bayesian methods are commonly used to handle uncertainty. The basic strategy of Bayesian methods is that if the prior probabilities and conditional probabilities are determined in advance, then the posterior probabilities can be estimated using Bayes' formula. Examples of applying Bayesian methods for multi-sensor data fusion can be found in [11][12][13]. Nonetheless, effective fusion performance can be achieved only if adequate and appropriate a priori and conditional probabilities are available. In some situations, assumptions can be made with respect to a priori and conditional probabilities, but these assumptions can turn out to be unreasonable in many other situations. D-S evidence theory can be regarded as an extension of classical probabilistic reasoning, which makes inferences from incomplete and uncertain data provided by different independent sources. Applications of D-S evidence theory in multi-sensor data fusion can be found in [14][15][16][17]. A key advantage of D-S evidence theory is its ability to deal with uncertain data without adequate prior probabilities.
International Journal of Distributed Sensor Networks
This paper introduces a novel multi-sensor data fusion approach for water quality evaluation using D-S evidence theory. We view each sensor measurement as a piece of evidence that reveals some uncertain information about the water quality, and we obtain the water quality evaluation by fusing the uncertain data from each sensor. The rest of this paper is organized as follows. In Section 2, we introduce some preliminary concepts of the evidence theory and present our multi-sensor data fusion approach for water quality evaluation in terms of a mass function based on water quality parameters, a reliability discounting, and a decision rule using the fusion mass function values. Section 3 describes experiments in which we demonstrate the effectiveness of the proposed approach. Section 4 provides some concluding remarks.

Water Quality Evaluation Using D-S Evidence Theory
2.1. D-S Evidence Theory. D-S evidence theory, which originated from Dempster's work [18] and was further extended by Shafer [19], is a generalisation of traditional probability theory that allows us to better quantify uncertainty. The theory is based on a number of key propositions, which are summarized as follows.
(1) Frame of discernment: let Θ be a finite set of elements; an element can be a hypothesis, an object, or in our case a water quality evaluation. We refer to Θ as the frame of discernment; the set consisting of all the subsets of Θ is called the power set of Θ and denoted by Ω(Θ).
(2) Mass function: the mass function is also called a basic probability assignment function. It is defined as a mapping of the power set Ω(Θ) to a number between 0 and 1 as follows:
$$m: \Omega(\Theta) \to [0, 1], \quad m(\emptyset) = 0, \quad \sum_{A \subseteq \Theta} m(A) = 1,$$
where $m(A)$ is an expression of the level of confidence exactly in $A$. It does not include the confidence in any particular subset of $A$. In water quality evaluation, $m(A)$ can be considered as the degree of belief held in a certain class of water quality. If $m(A) > 0$, the subset $A$ of Θ is called a focal element. When a mass value is committed to a subset that has more than one element, it explicitly states that there is not enough information to distribute this belief more precisely to each individual element in the subset. In particular, when there is no evidence about Θ at all, the total belief is assigned to the whole frame of discernment, that is, $m(\Theta) = 1$; $m(\Theta)$ is the uncertain function.
(3) Dempster's rule of combination (D-S rule): suppose $m_1$ and $m_2$ are two mass functions formed from data obtained from two different and independent sources in the same frame of discernment Θ. We can get a new mass function $m$ according to the D-S rule as follows:
$$m(A) = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \quad A \neq \emptyset, \qquad K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C),$$
where $K \in (0, 1)$ is a normalization constant and can be viewed as a measure of conflict among the sources of evidence. The higher $K$ is, the more conflicting the sources are. Dempster's rule of combination can blend measures of evidence from different sources.
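The combination rule in item (3) can be illustrated with a minimal Python sketch. The function name `combine`, the dictionary-of-frozensets representation, and the two-class example masses are assumptions made for illustration only; they are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset (a subset of the
    frame of discernment) to its mass; masses are assumed to sum to 1.
    Returns the fused mass function and the conflict K.
    """
    joint = {}
    conflict = 0.0  # K: total mass that falls on empty intersections
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c
        if a:
            joint[a] = joint.get(a, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    fused = {a: v / (1.0 - conflict) for a, v in joint.items()}
    return fused, conflict


# Hypothetical two-class frame with some mass left on the whole frame.
THETA = frozenset({"I", "II"})
m1 = {frozenset({"I"}): 0.6, frozenset({"II"}): 0.2, THETA: 0.2}
m2 = {frozenset({"I"}): 0.5, frozenset({"II"}): 0.3, THETA: 0.2}
fused, K = combine(m1, m2)
```

In this hypothetical example the conflict is K = 0.6 × 0.3 + 0.2 × 0.5 = 0.28, and the fused masses are renormalized by 1 − K. Because Dempster's rule is associative, more than two sources can be fused by applying `combine` repeatedly.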

Water Quality Evaluation.
This section focuses on applications of D-S evidence theory in a multi-sensor environment and presents our implementation for water quality evaluation.
Consider a wireless sensor network with planktonic sensor nodes as shown in Figure 1, used to monitor and evaluate the water quality in a measurement area, for example, a lake or a pool. Each sensor node has several subsensors used to measure the water quality parameters. Let $X = \{x_1, x_2, \ldots, x_n\}$ denote the water quality parameters measured by each sensor node; depending on the application, $x_k$ may represent a quality parameter such as DO or NH3-N; $n$ is the number of water quality parameters. Through the water quality parameters, we can determine the class of water quality.
Let $\Theta = \{A_1, A_2, \ldots, A_M\}$ be the frame of discernment for water quality evaluation; $A_j$ means that the water quality is classified to the $j$th class; $M$ is the number of water quality classes.
Assume $N$ sensor nodes, for the sake of simplicity, and suppose that all sensor nodes are independent. Each sensor node $S_i$ measures the water quality parameters ($X_i$) of the measurement area. When applying D-S evidence theory to multi-sensor data fusion, the data obtained from each sensor node are the theory's evidence. Figure 2 shows the block diagram of our multi-sensor data fusion approach for water quality evaluation using D-S evidence theory.
According to D-S evidence theory, for each sensor node, the possibility of each water quality class can be described by mass function values.
$m_i(A_1), m_i(A_2), \ldots, m_i(A_M)$ are the mass function values obtained from sensor $S_i$. $m_i(A_j)$ means the confidence assigned to the $j$th class of water quality by sensor $S_i$. Multi-sensor data fusion amounts to combining several lines of evidence to form new, comprehensive evidence. To account for uncertainty, each sensor node is given a reliability discounting ($\alpha_i$) before combination. By using the fusion mass function values obtained by the D-S rule, the class of water quality can be determined.

Mass Function Based on Water Quality Parameters.
The derivation of the mass function is the most crucial step in D-S evidence theory, because it determines the reliability of conclusions. In our approach, the calculation of the mass function is based on the water quality parameters provided by the sensor node.
Let $X_i$ represent the measurement of water quality parameters obtained from sensor $S_i$:
$$X_i = \{x_{i1}, x_{i2}, \ldots, x_{in}\}, \tag{4}$$
where $x_{ik}$ is the $k$th element of the water quality parameters. Let Θ represent the features of all water quality classes:
$$\Theta = \{A_1, A_2, \ldots, A_M\}, \quad A_j = \{a_{j1}, a_{j2}, \ldots, a_{jn}\}, \tag{5}$$
where $A_j$ describes the feature quality parameters of the $j$th class.
Intuitively, the more similar $X_i$ is to $A_j$, the more probable it is that $A_j$ is the class of water quality, as far as sensor $S_i$ is concerned. Conversely, the more dissimilar $X_i$ is to $A_j$, the less probable it is that $A_j$ is the class of water quality, again as far as sensor $S_i$ is concerned.
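There are many measures for quantifying the distance between the measured parameters and the features of a water quality class. We propose to use the Minkowski distance measure, as follows:
$$d_{ij} = \left( \sum_{k=1}^{n} \left| \frac{x_{ik} - a_{jk}}{\mathrm{max}_k - \mathrm{min}_k} \right|^{p} \right)^{1/p}, \tag{6}$$
where $d_{ij}$ is the distance between the measurement $X_i$ of sensor $S_i$ and the features $A_j$ of the $j$th class, with $x_{ik}$ and $a_{jk}$ their $k$th elements; $p$ is a constant; division by $\mathrm{max}_k - \mathrm{min}_k$ is for normalizing. The distances between the measurement of sensor $S_i$ and the features of all $M$ water quality classes can be captured as follows:
$$D_i = \{d_{i1}, d_{i2}, \ldots, d_{iM}\}. \tag{7}$$
The smaller the distance is, the more probable the class of water quality is. An intermediate quantity is defined from these distances (8), and from it we have the mass function of each water quality class from sensor $S_i$ (9), where $m_i(A_j)$ is the mass function assigned by sensor $S_i$ to the $j$th class of water quality.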
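As an illustration of this step, the following Python sketch computes a normalized Minkowski distance from one measurement to each class feature and converts the distances into masses. The conversion used here, masses proportional to the inverse distance, is an assumption: the text above only states that a smaller distance means a more probable class. The function names and the feature values are likewise hypothetical and are not the values of Table 1.

```python
def minkowski_distance(x, feature, span, p=2):
    """Minkowski distance of order p, each term normalized by its span."""
    total = sum(abs((xk - fk) / sk) ** p for xk, fk, sk in zip(x, feature, span))
    return total ** (1.0 / p)


def mass_from_measurement(x, features, p=2):
    """Turn one sensor measurement into masses over the classes.

    ASSUMPTION: masses are proportional to inverse distance; this mapping
    is chosen for illustration and is not confirmed by the source.
    """
    # Span per parameter: largest class feature value, with the minimum taken as 0.
    span = [max(column) for column in zip(*features.values())]
    dist = {cls: minkowski_distance(x, f, span, p) for cls, f in features.items()}
    exact = [cls for cls, d in dist.items() if d == 0.0]
    if exact:  # the measurement coincides with a class feature
        return {cls: (1.0 / len(exact) if cls in exact else 0.0) for cls in dist}
    inverse = {cls: 1.0 / d for cls, d in dist.items()}
    norm = sum(inverse.values())
    return {cls: v / norm for cls, v in inverse.items()}


# Hypothetical feature values for three classes and two parameters.
features = {"I": (8.0, 0.2), "II": (6.0, 0.5), "III": (4.0, 1.0)}
masses = mass_from_measurement((6.0, 0.6), features)
```

With these made-up values the measurement lies closest to the features of class "II", so that class receives the largest mass, while classes "I" and "III" are equally distant and receive equal masses.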

Reliability Discounting.
Some sensor nodes are more vulnerable to misreading or malfunctioning due to many factors, such as their age, their type, and their location. The impact of evidence is discounted to reflect the sensor node's reliability, in terms of a reliability discounting $\alpha_i$ ($0 < \alpha_i < 1$).
To account for uncertainty, evidence from each sensor node $S_i$ is given a reliability discounting $\alpha_i$ (10), computed from $T_i$, the total number of water quality evaluations from sensor $S_i$, and $C_i$, the number of correct water quality evaluations from sensor $S_i$; $\alpha_0$ and $T_{\min}$ are fixed values. If the number of water quality evaluations is less than $T_{\min}$, we use the fixed value $\alpha_0$ as the reliability discounting. $\alpha_0$ and $T_{\min}$ are usually selected as 0.9 and 10, respectively.
If the reliability discounting ($\alpha_i$) of a sensor is high, its evidence is given more weight and has a greater effect in the modified combinatorial rule. The mass function in (9) is then updated to $m'_i$:
$$m'_i(A_j) = \alpha_i\, m_i(A_j), \quad m'_i(\Theta) = 1 - \sum_{j} m'_i(A_j) = 1 - \alpha_i, \tag{11}$$
where $m'_i(A_j)$ is the modified mass function assigned by sensor $S_i$ to the $j$th class of water quality; $m'_i(\Theta)$ is the uncertain function (unidentified mass function) from sensor $S_i$.
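The discounting step can be sketched in Python as follows. Two assumptions are made here and are not confirmed by the source: the reliability is taken as the fraction of correct evaluations once enough evaluations are available, and the discounted remainder is moved onto the whole frame in the usual Shafer manner. The function names and the example class masses are hypothetical; the evaluation counts (15 evaluations with 9, 12, and 14 correct) are those used in the first experiment.

```python
def reliability(total, correct, alpha0=0.9, t_min=10):
    """Reliability discounting factor of one sensor.

    ASSUMPTION: the factor is the fraction of correct evaluations once at
    least t_min evaluations exist; otherwise the fixed value alpha0 is used.
    """
    if total < t_min:
        return alpha0
    return correct / total


def discount(masses, alpha, frame="Theta"):
    """Scale class masses by alpha and move the remainder onto the frame.

    The key given by `frame` stands for the whole frame of discernment and
    must not collide with a class name.
    """
    discounted = {cls: alpha * m for cls, m in masses.items()}
    discounted[frame] = 1.0 - alpha * sum(masses.values())
    return discounted


# Evaluation counts from the first experiment; class masses are made up.
alphas = [reliability(15, correct) for correct in (9, 12, 14)]
example = discount({"I": 0.2, "II": 0.5, "III": 0.3}, alphas[0])
```

Under these assumptions the three sensors receive factors of 0.6, 0.8, and about 0.933, so the least reliable sensor leaves the largest share of its mass on the whole frame.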

Decision Rule Using the Fusion Mass Function Values.
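As stated earlier, all sensors are assumed to be independent. With $m'_i$ denoting the modified (discounted) mass function of sensor $S_i$ and $N$ the number of sensor nodes, the multi-evidence combinatorial rule becomes
$$m(A) = \frac{1}{1 - K} \sum_{B_1 \cap \cdots \cap B_N = A} \; \prod_{i=1}^{N} m'_i(B_i), \quad A \neq \emptyset, \tag{12}$$
$$K = \sum_{B_1 \cap \cdots \cap B_N = \emptyset} \; \prod_{i=1}^{N} m'_i(B_i). \tag{13}$$
Using the combinatorial rule, a fusion mass function ($m(A_j)$) converts the confidence in each class of water quality arising from different evidence sources (sensor nodes) into a fraction in (0, 1). By using the fusion mass function values, water quality can be evaluated with the following decision rule.
(1) The current determined class of water quality should have the maximal mass function value, and this value should be greater than a certain threshold; the threshold should be at least $1/M$, where $M$ stands for the number of water quality classes.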
(2) The difference of the mass function values between the current determined class of water quality and other classes should be greater than a certain gate limit value, and here it is 0.2.
(3) The uncertain function value should be less than a certain gate limit value, and here it is 0.1.
If the three rules above are not satisfied simultaneously, the current class of water quality is uncertain.
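The three conditions above can be sketched as a small Python function. The function name `decide`, its parameter names, and the example masses are hypothetical; the default thresholds 0.2 and 0.1 are the gate limit values stated above, and the minimum mass defaults to one over the number of classes.

```python
def decide(class_masses, uncertain_mass, min_gap=0.2, max_uncertain=0.1, min_mass=None):
    """Return the chosen class, or None when the result is uncertain.

    class_masses maps each class label to its fused mass; uncertain_mass
    is the fused mass left on the whole frame of discernment.
    """
    if min_mass is None:
        min_mass = 1.0 / len(class_masses)  # rule (1): at least 1 / number of classes
    ranked = sorted(class_masses.items(), key=lambda item: item[1], reverse=True)
    best_class, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if (best > min_mass                           # rule (1)
            and best - runner_up > min_gap        # rule (2)
            and uncertain_mass < max_uncertain):  # rule (3)
        return best_class
    return None


# Made-up fused masses over five classes, for illustration only.
fused = {"I": 0.15, "II": 0.55, "III": 0.20, "IV": 0.05, "V": 0.02}
accepted = decide(fused, uncertain_mass=0.03)
rejected = decide(fused, uncertain_mass=0.30)
```

Here `accepted` is "II" because all three conditions hold, whereas `rejected` is None because too much mass remains on the whole frame. Comparing the best class only against the runner-up is sufficient for rule (2), since every other class has an even smaller mass.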

Experiments and Results
In this section, we give two experiments to validate the performance of the proposed approach. In our experiments, we use sensor nodes developed by ourselves to measure water quality parameters, as shown in Figure 3. The measured parameters include DO, NH3-N, TP, and TN; thus $X$ = {DO, NH3-N, TP, TN}. According to "Environmental Quality Standards for Surface Water of China GB 3838-2002", water quality is categorized into 5 classes, and the features of all water quality classes are shown in Table 1. Thus, in our experiments, the frame of discernment for water quality evaluation is $\Theta = \{A_1 = \text{"I"}, A_2 = \text{"II"}, A_3 = \text{"III"}, A_4 = \text{"IV"}, A_5 = \text{"V"}\}$, where "I" means that the water quality is classified to the first class. From Table 1, (5) can be expressed in terms of the feature values of these five classes.
In the first experiment, we use three sensor nodes ($S_1$, $S_2$, and $S_3$) to monitor the water quality of a pool. The objective is to determine the water quality class of the pool based on the three sensor nodes. Table 2 is assumed to list the water quality parameters measured by sensors $S_1$, $S_2$, and $S_3$. From (6), when $p = 2$, $\mathrm{max}_k = \max_j(a_{jk})$, and $\mathrm{min}_k = 0$, we can calculate the distances for sensors $S_1$, $S_2$, and $S_3$; the results are shown in Table 3.
Suppose that the numbers of water quality evaluations from sensors $S_1$, $S_2$, and $S_3$ are all 15, and the numbers of correct evaluations are 9, 12, and 14, respectively. From (10) and (11), we can get the modified mass functions and the uncertain functions from $S_1$, $S_2$, and $S_3$, as shown in Table 4. We can see from Table 2 that the water quality parameters measured by $S_1$ are obviously different from those measured by $S_2$ and $S_3$, and $S_1$ gives a different conclusion. Because of uncertainty, the data provided by $S_1$ may be incorrect. Considering only the data of $S_1$ and $S_2$ from Table 4, the two pieces of evidence are in conflict. According to the multi-evidence combinatorial rule, which is realized by (12) and (13), we can combine the evidence provided by $S_1$ and $S_2$, as shown in the first row of Table 5. The second class of water quality has the maximal mass function value, which shows that our approach is effective when there is conflict of evidence. However, according to our decision rule, the uncertain function value is too large and we cannot determine the class of the current water quality.
Next, consider the data of $S_3$; the fusion mass function and uncertain function combining $S_1$, $S_2$, and $S_3$ are also shown in Table 5. We reach a water quality evaluation result in favor of "II" with a degree of belief of 0.483, and the uncertain function value is reduced to 0.032. According to the decision rule, we can determine that the water quality of the pool is the second class. This result is the combination of the three sensor nodes and is considered to be more believable than that of a single sensor node.
Comparing the fusion mass function values with the single-sensor mass function values in Table 5, we can see clearly that the mass function value of the current determined water quality class is enlarged, the difference of the mass function values between the current determined class and the other classes is enlarged, and the uncertain function value is reduced.
In the second experiment, we use different numbers of sensor nodes to monitor the water quality of a small lake at different times and different locations and use the approach of this paper to evaluate the water quality. Here we already know that the water quality of this lake is the fourth class. The objective is to demonstrate that our approach can improve the accuracy of water quality evaluation. Based on simulation, Table 6 compares the correct rate and the uncertain rate for different numbers of sensor nodes used in multi-sensor data fusion for water quality evaluation.
From Table 6, it can be seen that our approach increases the correct rate of water quality evaluation compared to the approach using a single sensor node. It can also be seen that the fusion reduces the uncertain rate of water quality evaluation. In summary, our experiments indicate that the proposed multi-sensor data fusion approach for water quality evaluation using D-S evidence theory improves water quality evaluation performance.

Conclusions
We have applied multi-sensor data fusion technology to water quality evaluation in this paper. We have introduced a novel multi-sensor data fusion approach for water quality evaluation using D-S evidence theory. Within our work, we have proposed a method of calculating the mass function based on water quality parameters and proposed a reliability discounting to reflect the sensor node's reliability. Furthermore, we have proposed a decision rule to determine the class of water quality by using the fusion mass function values. Our experiments have indicated that, compared to approaches using individual sensors, the proposed approach can evaluate water quality from uncertain sensor data, increase the correct rate, and reduce the uncertain rate of water quality evaluation.