Exploiting Optimal Threshold for Decision Fusion in Wireless Sensor Networks

Decision fusion has been adopted in a number of sensor systems to deal with sensing uncertainty and to enable sensors to collaborate with each other. It distributes the computation workload and significantly reduces the communication overhead. However, decision rules such as Voting, the Bayes criterion, and Neyman-Pearson require a priori knowledge of the probability of target presence, whose estimation is still an open issue in detection theory. In this paper, we propose a binary decision fusion scheme that reaches a global decision by integrating the local decisions made by fusion members. The optimal local thresholds and the global threshold are derived by a minimax-criterion-based analysis that satisfies a false alarm rate constraint without a pre-estimated target appearance probability. Simulation results show that our scheme improves the system performance under given false alarm rate constraints, and the results can guide threshold selection when implementing WSN systems for mission-critical applications.


Introduction
Wireless sensor networks (WSNs) for mission-critical applications such as security surveillance [1], environmental monitoring [2], and target detection/tracking [3] often face the challenge of meeting stringent performance requirements imposed by the applications. However, the actual sensing quality of sensors is difficult to predict due to the uncertainty in physical environments. For instance, the measurements of sensors are often contaminated by noise, which degrades the detection performance of the systems. Real-world experiments using MICA2 motes showed that the false alarm rate of a network can be as high as 60% when sensors make independent decisions [1].
As an effective technique to improve the performance of distributed WSNs, data fusion [4] jointly considers the measurements of multiple sensors in an uncertain environment. For example, in the DARPA SensIT project [5], advanced data fusion techniques have been employed in algorithms and protocols designed for target detection [3,6], localization [7], and classification [8,9]. There is also a vast body of work on stochastic signal detection based on multisensor data fusion. Early work [4,10] focused on small-scale sensor networks. Recent studies on data fusion have considered the specific properties of WSNs such as sensors' spatial distribution [8,9,11] and limited sensing/communication capability [6]. However, as one of the most fundamental issues in WSNs, the design of a data fusion scheme that maximizes the system performance of a given network remains challenging.
In general, data fusion strategies can be categorized into value fusion and decision fusion [12]. In value fusion, all members transmit their raw measurements to the fusion center, which is responsible for fusing them and making the final decision. However, centralized data collection and processing at the fusion center lead to an unbalanced system workload and resource allocation. Unlike value fusion, decision fusion transmits only the local binary decisions made by fusion members to the head, which reduces the communication overhead and distributes the computing workload. However, many existing decision fusion methodologies are derived from variants of decision rules such as Voting, the Bayes criterion, and Neyman-Pearson [12][13][14][15]; these are hard to apply in practice because they require a priori knowledge of the probability of target presence, whose estimation may be inaccurate and is still an open issue in detection theory [4,16].

International Journal of Distributed Sensor Networks
In this paper, we propose a decision fusion scheme that balances the workload of distributed sensors and keeps communication concise. Specifically, a binary decision fusion scheme reaches a global decision by integrating the local decisions made by fusion members. We derive a method to calculate the proper local thresholds on the fusion members and the optimal global threshold on the fusion head so as to minimize the system detection cost under a given system false alarm rate constraint. Unlike the Bayes criterion or Neyman-Pearson rules, the thresholds in our scheme can be obtained by a simple training procedure on the system configuration, without requiring a priori knowledge of the probability of event occurrence, which is hard to obtain. To verify our approach, we conduct extensive simulations based on different scenarios; the simulations show that, compared with the state of the art [17], adopting the analytical threshold proposed in this paper improves the system performance significantly. The results are particularly useful in guiding practical implementations, in which the proper threshold for decision fusion in wireless sensor networks must be sought under a false alarm rate constraint.
The rest of the paper is organized as follows. System models and assumptions are presented in Section 2. Section 3 formulates the problem based on the fusion model. The solutions and technical approaches to derive the proper thresholds are discussed in Section 4. Numerical and simulation results are given in Sections 5 and 6, respectively. We conclude our work in Section 7.

System Model
2.1. Target and Sensing Model.
The distributed wireless sensor network is composed of $N$ uniformly distributed sensors, which are interconnected by wireless links. Sensors perform detection by measuring the energy of the signal emitted by the targets. However, the energy of most physical signals (e.g., acoustic and electromagnetic signals) attenuates with the distance from the signal source. Suppose that sensor $i$ is $d_i$ meters away from a target that emits a signal of energy $S_0$. The attenuated signal energy $s_i$ at the position of sensor $i$ is given by (1), where $S_0$ is the original energy emitted by the target, $k$ is a decaying factor which is typically from 2 to 5, and the remaining parameter is a constant determined by the size of the target and the sensor. This signal attenuation model is widely adopted in the literature [4,7,8,18].

The signal strength measurements of a sensor are corrupted by noise. Denote the noise strength measured by sensor $i$ by $n_i$, which follows a zero-mean normal distribution with variance $\sigma^2$, that is, $n_i \sim N(0, \sigma^2)$. In practice, target detection is a hypothesis test. Suppose that $H_0$ represents the absence of a target and $H_1$ represents the presence of a target, and that the detections by different sensors are independent. Let $y_i$ be the sampling of sensor $i$; then the measurement of sensor $i$ is

$$y_i = \begin{cases} n_i, & H_0, \\ s_i + n_i, & H_1. \end{cases} \tag{2}$$

In practice, the parameters of the target and noise models are often estimated by using a training dataset before deployment. The measurements of sensors are obtained by averaging multiple samplings (≥30). Assuming that the noises at all sensors are independent, the samplings of sensor $i$ follow the distributions

$$y_i \mid H_0 \sim N(\mu_i, \sigma_i^2), \qquad y_i \mid H_1 \sim N(s_i + \mu_i, \sigma_i^2), \tag{3}$$

where $\mu_i$ and $\sigma_i^2$ are estimated from $M$ samplings $y_{i,1}, \ldots, y_{i,M}$ taken when targets are absent:

$$\mu_i = \frac{1}{M}\sum_{j=1}^{M} y_{i,j}, \tag{4}$$

$$\sigma_i^2 = \frac{1}{M-1}\sum_{j=1}^{M} \left(y_{i,j} - \mu_i\right)^2. \tag{5}$$
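The sensing model and the noise training step can be illustrated with a short sketch. It is not the authors' implementation: the function names, the reference distance `d0`, and the specific power-law form of the attenuation are assumptions made only for illustration, and the noise variance is estimated with the unbiased sample variance.

```python
import random
import statistics

def attenuated_energy(S0, d, k=2.0, d0=1.0):
    # ASSUMPTION: power-law decay with a hypothetical reference distance d0;
    # the source's exact attenuation formula is not reproduced here.
    return S0 if d <= d0 else S0 / (d / d0) ** k

def measure(s_i, sigma, target_present, rng):
    # y_i = n_i under H0 and y_i = s_i + n_i under H1, with n_i ~ N(0, sigma^2).
    noise = rng.gauss(0.0, sigma)
    return s_i + noise if target_present else noise

def train_noise(sigma, M, rng):
    # Estimate the noise mean and (unbiased) variance from M target-free samples.
    ys = [measure(0.0, sigma, False, rng) for _ in range(M)]
    return statistics.fmean(ys), statistics.variance(ys)

rng = random.Random(1)
mu_hat, var_hat = train_noise(sigma=1.0, M=500, rng=rng)
s_10m = attenuated_energy(S0=100.0, d=10.0)  # energy seen 10 m from the target
```

With `S0 = 100` and `k = 2`, the assumed model leaves an energy of 1 at a distance of 10 m, which is comparable to unit-variance noise and shows why a single sensor's decision is unreliable at that range.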

Local Decision Model.
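Sensor $i$ compares the sampling $y_i$ with the local detection threshold $t_i$ to make a binary local decision $I_i$ and transmits the decision result, 0 or 1, to the fusion head for further processing:

$$I_i = \begin{cases} 1, & y_i \ge t_i, \\ 0, & y_i < t_i. \end{cases} \tag{6}$$

However, the decision made by local sensor $i$ remains inaccurate because the noise is random. Let $P_{0i}$ denote the probability of a positive decision when there is no target in the sensing area of sensor $i$, and let $P_{1i}$ denote the probability of a positive decision when a target appears; these are known as the local false alarm rate and the local detection probability, respectively. With $\mu_i$ and $\sigma_i$ the noise mean and standard deviation at sensor $i$, $P_{0i}$ and $P_{1i}$ can be calculated as

$$P_{0i} = P(y_i \ge t_i \mid H_0) = Q\!\left(\frac{t_i - \mu_i}{\sigma_i}\right), \tag{7}$$

$$P_{1i} = P(y_i \ge t_i \mid H_1) = Q\!\left(\frac{t_i - \mu_i - s_i}{\sigma_i}\right), \tag{8}$$

where $Q(\cdot)$ is the complementary cumulative distribution function of the standard normal distribution.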
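As an illustration of the local decision model, the sketch below evaluates the local false alarm rate and local detection probability of one sensor under Gaussian noise. The function names and the example values (threshold 1, signal energy 2, unit noise) are hypothetical and chosen only to make the numbers easy to check.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def Q(x):
    # Complementary CDF of the standard normal distribution.
    return 1.0 - STD_NORMAL.cdf(x)

def local_rates(t, s, mu=0.0, sigma=1.0):
    # Local false alarm rate and detection probability for threshold t,
    # received signal energy s, and noise N(mu, sigma^2).
    p0 = Q((t - mu) / sigma)
    p1 = Q((t - mu - s) / sigma)
    return p0, p1

def local_decision(y, t):
    # Binary local decision: 1 if the sample reaches the threshold, else 0.
    return 1 if y >= t else 0

p0, p1 = local_rates(t=1.0, s=2.0)
```

Raising the threshold lowers both rates at once, which is the trade-off that the threshold selection in the rest of the paper has to resolve.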

Multisensor Fusion Model.
Data fusion [6,19,20] is widely employed in wireless sensor networks as an efficient technique to improve the detection performance of the system. In a fusion-based WSN, sensors are typically clustered in groups around a group head (fusion center) and transmit their sampling information to the head, which then makes the final decision. This paper employs a simple decision fusion strategy, in which sensors transmit their local decisions to the fusion center; the fusion center then compares the number of received positive decisions $\Lambda = \sum_{i=1}^{N} I_i$ with the system threshold $T$ and makes the final decision on whether a target is present.
From the system perspective, the performance of the final decision is evaluated by the global false alarm rate $P_F$, which is the probability of a positive decision at the fusion center when there is no target, and the global detection probability $P_D$, which is the probability of a positive final decision when a target appears. Let $H_1$ and $H_0$ denote the presence and absence of targets, respectively; then $P_F$ and $P_D$ can be expressed as

$$P_F = P(\Lambda \ge T \mid H_0), \qquad P_D = P(\Lambda \ge T \mid H_1). \tag{9}$$

Besides $P_F$ and $P_D$, two other probabilities serve a similar purpose in evaluating the system detection performance: the event missing rate $P_M$, which is the probability of a negative decision when a target appears, and the negative correct probability $P_N$, which is the probability of a correct negative decision when there is no target. According to the definitions of those metrics, they can be calculated as

$$P_M = 1 - P_D, \qquad P_N = 1 - P_F. \tag{10}$$
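For small clusters the four system metrics can be computed exactly rather than approximated. The sketch below does this with a dynamic program over the number of positive local decisions; the helper name `count_tail` and the example probabilities are hypothetical, and independence of the local decisions is assumed as in the model above.

```python
def count_tail(probs, T):
    # P(at least T positives) for independent local decisions whose
    # per-sensor positive probabilities are given in probs.
    dist = [1.0]  # dist[c] = probability of exactly c positives so far
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for c, w in enumerate(dist):
            nxt[c] += w * (1.0 - p)   # this sensor reports 0
            nxt[c + 1] += w * p       # this sensor reports 1
        dist = nxt
    return sum(dist[T:])

p0_list = [0.05] * 30                 # hypothetical local false alarm rates
p1_list = [0.3, 0.5, 0.7] * 10        # hypothetical local detection probabilities
T = 5
p_f = count_tail(p0_list, T)          # global false alarm rate
p_d = count_tail(p1_list, T)          # global detection probability
p_m = 1.0 - p_d                       # event missing rate
p_n = 1.0 - p_f                       # negative correct probability
```

The same routine handles unequal per-sensor probabilities, which is the situation under target presence where sensors sit at different distances from the target.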

Bayesian Model and Minimax Criterion.
The Bayesian criterion [4,16] is widely adopted in fusion-based detection systems. Its objective is to minimize the expected system cost, or risk, in making decisions, which is denoted by $C(D)$ and formally given by

$$C(D) = \sum_{i=0}^{1}\sum_{j=0}^{1} C_{ij}\, P(D_i \mid H_j)\, P(H_j), \tag{11}$$

where $C_{ij}$ is the cost of deciding $D_i$ when the ground truth is $H_j$ and $P(H_j)$ is the prior probability of the ground truth $H_j$. Note that the costs $C_{ij}$, $i, j \in \{0, 1\}$, are constants specified by the user. For instance, by letting $C_{00} = C_{11} = 0$ and $C_{01} = C_{10} = 1$, $C(D)$ equals the expected probability that the detector makes wrong decisions over all possible measurements [16], that is, the average error rate.
According to (11), Bayesian detection presupposes a known prior probability $P(H_j)$. However, the optimal solution for detecting targets with variable prior probabilities is still an open issue in detection theory [4,16]. In this section, we employ a suboptimal detection criterion called the minimax criterion [16], which is widely adopted to handle unknown and changeable prior probabilities.
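Suppose that $Z_0$ is the set of measurement values that lead to decision $D_0$ and $Z_1$ is the set that leads to decision $D_1$; then the cost of the system decision is

$$C(D) = C_{00}P(H_0)\!\int_{Z_0}\! p(y \mid H_0)\,dy + C_{10}P(H_0)\!\int_{Z_1}\! p(y \mid H_0)\,dy + C_{01}P(H_1)\!\int_{Z_0}\! p(y \mid H_1)\,dy + C_{11}P(H_1)\!\int_{Z_1}\! p(y \mid H_1)\,dy. \tag{12}$$

Since $P(H_1) = 1 - P(H_0)$ and

$$\int_{Z_1} p(y \mid H_0)\,dy = P_F, \quad \int_{Z_0} p(y \mid H_0)\,dy = 1 - P_F, \quad \int_{Z_0} p(y \mid H_1)\,dy = P_M, \quad \int_{Z_1} p(y \mid H_1)\,dy = 1 - P_M, \tag{13}$$

substituting them into (12) gives

$$C(D) = C_{00} + (C_{10} - C_{00})P_F + P(H_1)\left[(C_{11} - C_{00}) + (C_{01} - C_{11})P_M - (C_{10} - C_{00})P_F\right].$$

The minimax criterion makes the cost independent of the unknown prior $P(H_1)$, which requires the bracketed term to vanish. This yields the reduced condition on the decision cost:

$$(C_{11} - C_{00}) + (C_{01} - C_{11})P_M - (C_{10} - C_{00})P_F = 0. \tag{14}$$

By letting $C_{00} = C_{11} = 0$ and $C_{01} = C_{10} = 1$ as discussed above, we get the condition of minimum system detection cost:

$$P_M = P_F. \tag{15}$$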
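A small numerical check can make the role of the minimax condition concrete. The sketch assumes unit error costs and zero cost for correct decisions, as in the example above, and shows that the expected cost stops depending on the prior probability of target presence once the false alarm rate and the missing rate are equal. The function name and the probability values are hypothetical.

```python
def expected_cost(p_h1, p_f, p_m, c00=0.0, c10=1.0, c01=1.0, c11=0.0):
    # Expected decision cost for prior P(H1) = p_h1, false alarm rate p_f,
    # and missing rate p_m; c_ij is the cost of deciding D_i when H_j holds.
    p_h0 = 1.0 - p_h1
    cost_h0 = c00 * (1.0 - p_f) + c10 * p_f
    cost_h1 = c01 * p_m + c11 * (1.0 - p_m)
    return p_h0 * cost_h0 + p_h1 * cost_h1

priors = (0.0, 0.3, 0.9)
balanced = [expected_cost(p, p_f=0.10, p_m=0.10) for p in priors]
unbalanced = [expected_cost(p, p_f=0.05, p_m=0.30) for p in priors]
```

In the balanced case every prior gives the same cost, whereas in the unbalanced case the cost grows with the prior and the worst case exceeds the balanced value even though the false alarm rate is lower.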

Problem Statement.
According to the assumptions and decision models, the problem of this paper can be formulated as follows. Suppose that a surveillance WSN is composed of $N$ wirelessly networked sensors, that the measurements and decisions of the sensors are mutually independent, and that $\alpha$ is the predefined upper bound of the global false alarm rate, that is, $P_F \le \alpha$. The objectives are to find (1) the local decision threshold $t_i$ for sensor $i$, $i = 1, \ldots, N$, and (2) the system global decision threshold $T_{\mathrm{opt}}$, such that (15) is satisfied, that is, the system detection cost $C(D)$ is minimized under the system false alarm constraint $P_F \le \alpha$.

Analysis and Solution
Based on the decision fusion model discussed above, when targets are absent, each sensor makes a positive local decision independently with the common local false alarm rate $P_0$, so the number of positive local decisions $\Lambda \mid H_0$ follows the binomial distribution $B(N, P_0)$, with $P_0$ as the success probability in $N$ trials. As the global decision is made by comparing the sum of all local decisions sent by the fusion members with the threshold $T$, the global false alarm rate $P_F$ at the fusion center is

$$P_F = \sum_{j=T}^{N} \binom{N}{j} P_0^{\,j} (1 - P_0)^{N-j}. \tag{16}$$

According to the De Moivre-Laplace theorem [21], when $N$ is large enough, $P_F$ can be approximated as

$$P_F \approx Q\!\left(\frac{T - N\bar{P}_0}{\sqrt{N\bar{P}_0(1 - \bar{P}_0)}}\right), \tag{17}$$

where $Q(\cdot)$ is the complementary cumulative distribution function (CCDF) of the standard normal distribution, that is, $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$, $T$ is the system threshold of the fusion center, and $\bar{P}_0$ is the average of $P_0$ over the fusion members.
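To get a feel for how good the De Moivre-Laplace approximation is at realistic cluster sizes, the exact binomial tail can be compared with the normal approximation. The sketch below is illustrative only; the function names, the local false alarm rate of 0.2, and the choice of threshold as 30% of the cluster size are assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

def pf_exact(N, T, p0):
    # Exact probability that at least T of N sensors raise a false alarm.
    return sum(comb(N, j) * p0**j * (1.0 - p0)**(N - j) for j in range(T, N + 1))

def pf_normal(N, T, p0):
    # Normal (De Moivre-Laplace) approximation of the same tail probability.
    z = (T - N * p0) / sqrt(N * p0 * (1.0 - p0))
    return 1.0 - NormalDist().cdf(z)

for N in (10, 30, 100):
    T = round(0.3 * N)
    print(N, T, pf_exact(N, T, 0.2), pf_normal(N, T, 0.2))
```

No continuity correction is applied here, so the approximation can differ noticeably from the exact value for small clusters, which is consistent with the later remark that accuracy depends on the number of fusion members.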
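The upper bound of the global false alarm rate is $\alpha$, and $P_F$ increases monotonically with $P_0$ [21]. By letting $P_F = \alpha$, we can solve for $P_0$ according to (9). Suppose that the solution is $P_0 = f(\alpha, N, T)$; then the solution of problem 1 according to (7) is

$$t_i = \mu_i + \sigma_i\,\Phi^{-1}\!\left(1 - f(\alpha, N, T)\right), \tag{18}$$

where $\mu_i$ and $\sigma_i$ are the noise mean and standard deviation at sensor $i$ and $\Phi^{-1}(\cdot)$ is the inverse of the CDF of the standard normal distribution.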
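One way to carry out this step numerically is bisection, since the approximate global false alarm rate grows monotonically with the local false alarm rate when the threshold lies strictly between 0 and the cluster size. The sketch below is a minimal illustration, not the authors' procedure; the function names, the tolerance, and the example values (bound 0.1, 30 sensors, threshold 10) are assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def pf_approx(N, T, p0):
    # Normal approximation of the global false alarm rate.
    z = (T - N * p0) / sqrt(N * p0 * (1.0 - p0))
    return 1.0 - STD_NORMAL.cdf(z)

def solve_p0(alpha, N, T, tol=1e-10):
    # Bisection for the local false alarm rate p0 with pf_approx(N, T, p0) = alpha.
    # Assumes 0 < T < N so that pf_approx is increasing in p0.
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if pf_approx(N, T, mid) > alpha:
            hi = mid
        else:
            lo = mid
    return lo

def local_threshold(p0, sigma, mu=0.0):
    # Threshold whose exceedance probability under N(mu, sigma^2) noise is p0.
    return mu + sigma * STD_NORMAL.inv_cdf(1.0 - p0)

p0 = solve_p0(alpha=0.1, N=30, T=10)
t_local = local_threshold(p0, sigma=1.0)
```

Because each sensor only needs its own noise statistics and the common `p0`, the threshold computation can run locally once the fusion head has distributed `p0`.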
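Similarly, the detection probability of sensor $i$ can be calculated as $P_{1i} = Q\!\left(Q^{-1}(P_0) - s_i/\sigma_i\right)$ according to (8). However, unlike the local false alarm rate, the actual local detection probability changes with landforms and hardware deviations. As a consequence, the number of positive local decisions under $H_1$ does not follow the binomial distribution. However, according to Lyapunov's central limit theorem [22], $\Lambda \mid H_1$ approximately follows a normal distribution when $N$ is large, where $\Lambda$ is the sum of the decisions made by the fusion members, that is, $\Lambda = \sum_{i=1}^{N} I_i$. Its mean and variance are

$$\mu_\Lambda = \sum_{i=1}^{N} P_{1i} = N\bar{P}_1, \qquad \sigma_\Lambda^2 = \sum_{i=1}^{N} P_{1i}\left(1 - P_{1i}\right), \tag{19}$$

so the system detection probability $P_D$ is

$$P_D \approx Q\!\left(\frac{T - \mu_\Lambda}{\sigma_\Lambda}\right), \tag{20}$$

and the event missing rate is

$$P_M = 1 - P_D \approx \Phi\!\left(\frac{T - \mu_\Lambda}{\sigma_\Lambda}\right), \tag{21}$$

where $\Phi(\cdot)$ is the cumulative distribution function (CDF) of the standard normal distribution. According to (15), we have

$$Q\!\left(\frac{T - N\bar{P}_0}{\sqrt{N\bar{P}_0(1 - \bar{P}_0)}}\right) = \Phi\!\left(\frac{T - \mu_\Lambda}{\sigma_\Lambda}\right). \tag{22}$$

Since $Q(\cdot)$ is the complementary cumulative distribution function of the standard normal distribution, we have $Q(-x) = 1 - Q(x)$ and $Q(x) = 1 - \Phi(x)$, so the right-hand side equals $Q\!\left((\mu_\Lambda - T)/\sigma_\Lambda\right)$ and the arguments of $Q$ on both sides must be equal. Combining the above three equations and writing $\sigma_0 = \sqrt{N\bar{P}_0(1 - \bar{P}_0)}$, we can solve for the optimal system detection threshold at the fusion center as

$$T_{\mathrm{opt}} = \frac{N\left(\bar{P}_0\,\sigma_\Lambda + \bar{P}_1\,\sigma_0\right)}{\sigma_0 + \sigma_\Lambda}, \tag{23}$$

where $\bar{P}_1$ is the average local detection probability of all fusion members. Equation (23) is the solution of problem 2.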
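The balance between the approximate false alarm rate and the approximate missing rate can be solved in closed form once the local probabilities are fixed. The sketch below assumes that the minimax condition with unit error costs reduces to equal false alarm and missing rates, and that both counts are approximated by normal distributions; the function names and the input probabilities are hypothetical, and the resulting threshold is real-valued, so rounding it to an integer count is left to the implementer.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def balanced_threshold(p0, p1_list):
    # Threshold at which the normal approximations of the global false alarm
    # rate and the missing rate coincide (assumed minimax condition).
    N = len(p1_list)
    mu1 = sum(p1_list)                                # mean count under H1
    sd1 = sqrt(sum(p * (1.0 - p) for p in p1_list))   # std of count under H1
    sd0 = sqrt(N * p0 * (1.0 - p0))                   # std of count under H0
    return (N * p0 * sd1 + mu1 * sd0) / (sd0 + sd1)

def rates_at(T, p0, p1_list):
    # Approximate global false alarm rate and missing rate at threshold T.
    N = len(p1_list)
    mu1 = sum(p1_list)
    sd1 = sqrt(sum(p * (1.0 - p) for p in p1_list))
    sd0 = sqrt(N * p0 * (1.0 - p0))
    p_f = 1.0 - STD_NORMAL.cdf((T - N * p0) / sd0)
    p_m = STD_NORMAL.cdf((T - mu1) / sd1)
    return p_f, p_m

p1_list = [0.6, 0.7, 0.8] * 10   # hypothetical local detection probabilities
T_opt = balanced_threshold(0.1, p1_list)
p_f, p_m = rates_at(T_opt, 0.1, p1_list)
```

In the symmetric case where every sensor has a false alarm rate of 0.2 and a detection probability of 0.8, the two spreads coincide and the threshold falls at half the cluster size.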
From the above analysis we can see that $T_{\mathrm{opt}}$ cannot be obtained in a single step, since several intermediate quantities must be computed first. When implementing the decision fusion, we can use numerical methods to calculate the results. The pseudocode of the procedure is shown in Algorithm 1.
The detection procedure can be divided into two phases: the training phase and the detection phase. In the training phase, the system repeatedly samples the signal of interest while there is no target. Each sensor calculates the mean and variance of the noise at its position using (4) and (5), respectively. The main purpose of the training procedure is to measure the noise level of the environment and minimize the false alarm rate. To check that the training results remain accurate in the subsequent detection phase, we conduct a case study using data traces collected in a real vehicle detection experiment [23]. In the experiment, 75 WINS NG 2.0 nodes were deployed to detect military vehicles driving through the surveillance region. Figure 1 depicts the occurrence percentage of noise energy collected by sampling 1000 times while there is no vehicle. The noise fits a normal distribution with zero mean, and the noise mean and variance become stable after 500 samplings, which guides the training procedure in the subsequent simulations.
After the noise features are observed, local sensors use (18) to solve for the local decision threshold $t_i$ and then compute the local detection probability $P_{1i}$. After all sensors have obtained the local detection probability $P_{1i}$ and the local false alarm rate $P_0$, they send these two parameters to the fusion center, which calculates the mean and variance of $P_{1i}$; the local false alarm rate $P_0$ is the same for all sensors. Finally, the system detection threshold $T_{\mathrm{opt}}$ is computed by using (23).
In the detection phase, all sensors in the fusion group sample the ambient signal $y_i$, compare $y_i$ with the local decision threshold $t_i$, and then send a binary (0 or 1) local decision to the fusion head. The fusion head makes the final decision by comparing the sum of the local decisions with the system threshold $T_{\mathrm{opt}}$.
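The detection phase amounts to one comparison per sensor and one comparison at the fusion head. A minimal sketch of a single detection round follows; the function names, the sample values, and the thresholds are hypothetical.

```python
def local_decisions(samples, thresholds):
    # Each sensor compares its own sample with its own local threshold
    # and reports a single bit.
    return [1 if y >= t else 0 for y, t in zip(samples, thresholds)]

def fusion_head(decisions, T):
    # The head counts the positive reports and compares the count with T.
    return 1 if sum(decisions) >= T else 0

samples = [0.2, 1.7, 2.4, 0.9, 3.1]   # hypothetical measurements of five sensors
thresholds = [1.0] * 5                # hypothetical local thresholds
bits = local_decisions(samples, thresholds)
final = fusion_head(bits, T=3)
```

Only the list of bits crosses the wireless link, which is the source of the communication savings compared with sending raw samples.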
Unlike value fusion, in which the fusion members only sample and send masses of raw data to the fusion center, the low-end sensors in Procedure 1 are responsible not only for sampling but also for the local comparison. The transmitted packet contains only the local decision result, 0 or 1. As a consequence, this scheme can fairly distribute the computing workload.

Require: The surveillance field, the fusion cluster, and the upper bound $\alpha$ of the system false alarm rate.
Ensure: Optimal local threshold set $t_i$, $i = [1, \ldots, N]$, and global threshold $T_{\mathrm{opt}}$
(1) Training phase
(2) for each node in the cluster do
(3) Sample $M$ times
(4) Average the samplings to obtain the mean and variance of the noise by (4) and (5)

Numerical Results
In this section, we conduct numerical experiments to evaluate the performance of the optimal threshold decision fusion scheme proposed in Section 4. The parameters of signal decay are set as $S_0 = 100$ and $k = 2$, and the number of fusion members is set to $N = 30$. Table 1 shows the optimal threshold $T_{\mathrm{opt}}$ computed by (23) and the corresponding detection probability $P_D^{\mathrm{opt}}$ under different settings of the upper bound $\alpha$ of the system false alarm rate. The baseline is the threshold selection rule proposed in [17], in which the optimal fusion threshold bounds are derived by using Chebyshev's inequality to ensure a higher hit rate and a lower false alarm rate, without the a priori probability of target presence. $T_{\mathrm{base}}$ is a threshold randomly selected within the bound interval calculated according to the baseline, and $P_D^{\mathrm{base}}$ is the corresponding detection probability.
From Table 1 we can see that although the system detection probability drops as $\alpha$ decreases, $P_D^{\mathrm{opt}}$ is greater than the baseline in all numerical settings.
The system false alarm rate $P_F$ and detection probability $P_D$ are calculated by (9) and (10), which are derived from the De Moivre-Laplace central limit theorem. However, the theorem assumes that the number of fusion members is very large, and this assumption influences the accuracy of the computed system performance. To evaluate this influence, we calculate the detection probability by (20) with the optimal global threshold $T_{\mathrm{opt}}$ obtained by our approach. The baseline is the decision fusion scheme proposed in [15], which uses the Neyman-Pearson criterion to guarantee the maximum system detection probability while satisfying a predefined false alarm rate bound, under the assumption of a known a priori probability of target presence. $P_D^{\max}$ is the optimal detection probability of the baseline, and $P_D^{\mathrm{num}}$ is the detection probability of our scheme. Subtracting the two $P_D$ values gives the column Difference. The results show that the differences between our scheme and the baseline are small and become negligible when the number of fusion members is greater than 50. Note that our scheme does not require a predefined a priori probability of target presence, which is hard to estimate in practice [4,16].

Simulation Results
To evaluate the performance of the scheme proposed in Section 4, we conduct extensive simulation experiments. The simulation scenario is a 100 m × 100 m square area in which 30 sensors are randomly placed. The parameters of the decay model are set to $S_0 = 100$ and $k = 2$; all simulation code is written in C++. The detection probability $P_D$ is obtained by running detection 1000 times, with decisions made from measurements contaminated by randomly generated noise, and the false alarm rate bound $\alpha$ is varied from 0.05 to 0.15.
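A Monte Carlo experiment of this kind can be sketched in a few lines. The version below is written in Python for brevity rather than in the C++ used for the reported results, and it is not the authors' simulator: the attenuation form with reference distance `d0`, the unit noise level, the local threshold, the fusion threshold, and the random target placement are all assumptions made for illustration.

```python
import math
import random

def attenuated_energy(S0, d, k, d0=1.0):
    # ASSUMPTION: power-law decay with a hypothetical reference distance d0.
    return S0 if d <= d0 else S0 / (d / d0) ** k

def simulate(N=30, side=100.0, S0=100.0, k=2.0, sigma=1.0,
             t_local=2.0, T=2, trials=1000, seed=0):
    # Returns empirical (detection probability, false alarm rate).
    rng = random.Random(seed)
    sensors = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(N)]
    detections = false_alarms = 0
    for _ in range(trials):
        tx, ty = rng.uniform(0, side), rng.uniform(0, side)  # random target position
        votes_h1 = votes_h0 = 0
        for sx, sy in sensors:
            s = attenuated_energy(S0, math.hypot(sx - tx, sy - ty), k)
            votes_h1 += int(s + rng.gauss(0.0, sigma) >= t_local)  # target present
            votes_h0 += int(rng.gauss(0.0, sigma) >= t_local)      # noise only
        detections += int(votes_h1 >= T)
        false_alarms += int(votes_h0 >= T)
    return detections / trials, false_alarms / trials

p_d, p_f = simulate()
```

Sweeping `T` or `t_local` and plotting the resulting pairs would trace an empirical ROC curve for the chosen deployment.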
Figure 2 depicts the receiver operating characteristic (ROC) curve of $P_D$ under different settings of the upper bound $\alpha$ of the system false alarm rate. The baseline is again taken from [17]. The $T_{\min}$ ROC curve is obtained by detection using the lower bound of the detection threshold interval given in [17], while the $T_{\max}$ ROC curve uses the upper bound. The $T_{\mathrm{ran}}$ ROC curve uses a threshold chosen randomly within the interval, and the $T_{\mathrm{opt}}$ ROC curve uses our optimal threshold. The figure shows that the detection probabilities increase with the system false alarm rate, which is consistent with the monotonic relationship between them, and that the detection probabilities of our scheme are higher than those of the baselines.
Besides the false alarm rate, noise is another main factor that influences the performance of the detection scheme. To evaluate the influence of different noise levels on decision performance, we simulate detection under different settings of the signal-to-noise ratio (SNR) and plot the results in Figure 3. The SNR is varied from 0.1 to 4, the $P_D$ curves are obtained by running detection 1000 times with each of the four thresholds mentioned above under different noise levels, and the false alarm rate is set to 0.1. The results show that when the noise is high (the SNR is small), all approaches perform poorly, that performance rises with the signal strength, and that the scheme proposed in Section 4 outperforms the baselines.

Conclusion
This paper explores the use of decision fusion to address the performance degradation caused by noise-contaminated surveillance in wireless sensor networks. In our approach, we adopt a distributed decision fusion scheme that balances the workload on each sensor. Meanwhile, less data is transmitted in our scheme than in traditional value fusion. To improve the system performance, we have derived the optimal decision threshold according to the minimax criterion. The effectiveness of our approach is validated by numerical results and extensive simulations.

Table 1: Detection probability with decision thresholds.

Table 2: Performance versus fusion number.
Table 2 shows the numerical differences in detection probability. The number of sensors is varied from 10 to 100.