Distributed Detection and Fusion in a Large Wireless Sensor Network of Random Size

For a wireless sensor network (WSN) with a random number of sensors, we propose a decision fusion rule that uses the total number of detections reported by local sensors as a statistic for hypothesis testing. We assume that the signal power attenuates as a function of the distance from the target, the number of sensors follows a Poisson distribution, and the locations of sensors follow a uniform distribution within the region of interest (ROI). Both analytical and simulation results for system-level detection performance are provided. This fusion rule achieves very good system-level detection performance even at very low signal-to-noise ratio (SNR), as long as the average number of sensors is sufficiently large. For all the different system parameters we have explored, the proposed fusion rule is equivalent to the optimal fusion rule, which requires much more prior information. The problem of designing an optimum local sensor-level threshold is investigated. For various system parameters, the optimal thresholds are found numerically by maximizing the deflection coefficient. Guidelines on selecting the optimal local sensor-level threshold are also provided.


INTRODUCTION
Recently, wireless sensor networks (WSNs) have attracted much attention and have become a very active research area. Due to their high flexibility, enhanced surveillance coverage, robustness, mobility, and cost effectiveness, WSNs have wide applications and high potential in military surveillance, security, and traffic and environmental monitoring. Usually, a WSN consists of a large number of low-cost and low-power sensors, which are deployed in the environment to collect and preprocess observations. Each sensor node has limited communication capability that allows it to communicate with other sensor nodes via a wireless channel. Normally, there is a fusion center that processes data from the sensors and forms a global situational assessment.
In a typical WSN, sensor nodes are powered by batteries and hence have a very frugal energy budget, which makes energy-efficient processing and communication essential. There is an extensive literature on the distributed detection problem. In [6, 7], optimum fusion rules have been obtained under the conditional independence assumption. Decision fusion with correlated observations has been investigated in [8, 9, 10, 11]. There are also many papers on the problem of distributed detection with constrained system resources [12, 13, 14, 15, 16, 17, 18]. More specifically, these papers have proposed solutions to optimal bit allocation under a communication constraint.
However, most of these results are based on the assumption that the local sensors' detection performances, namely either their signal-to-noise ratios (SNRs) or their probabilities of detection and false alarm, are known to the fusion center. For a dynamic target and passive sensors, it is very difficult to estimate the local sensors' performances via experiments, because these performances are time varying as the target moves through the wireless sensor field. Even if the local sensors could somehow estimate their detection performances in real time, it would be very expensive to transmit them to the fusion center, especially in a WSN with very limited system resources. Usually, a WSN consists of a large number of low-cost and low-power sensors, which are densely deployed in the surveillance area. Taking advantage of these unique characteristics of WSNs, in our previous paper [19], we proposed a fusion rule that uses the total number of detections ("1"s) transmitted from local sensors as the statistic.
In [19], we assumed that the total number of sensors in the region of interest (ROI) is known to the WSN. However, in many applications, the sensors are deployed randomly in and around the ROI, and oftentimes some of them are out of the communication range of the fusion center, malfunctioning, or out of battery. Therefore, at a particular time, the total number of sensors that work properly in the ROI is a random variable (RV). For example, in a battlefield or a hostile region, many microsensors can be deployed from an airplane to form a WSN. Data are transmitted from sensors to an access point, which could be an airplane that flies over the sensor field and collects data from the sensors. The total number of sensors within the network and the total number of sensors that can communicate with the access point (the flying airplane) at a particular time are RVs. In this paper, the results presented in [19] are extended to this more general situation. The performance of the fusion rule proposed in [19] will be analyzed with this extra uncertainty about the total number of sensors.
In Section 2, basic assumptions regarding the WSN are made, the signal attenuation model is provided, and the fusion rule based on the total number of detections from local sensors is introduced. In addition, it is shown that the proposed fusion rule can be adapted well to a large network with multiple-layer hierarchical structure. Analytical methods to determine the system-level detection performance are presented in Section 3. There, asymptotic detection performance is studied. In addition, the proposed fusion rule is compared to the likelihood-ratio (LR) based optimal fusion rule, which requires much more prior information. Simulation results are also provided to confirm our analyses. In Section 4, the problem of designing an optimum local sensor-level threshold is investigated, and the optimum thresholds for various system parameters are found numerically. Conclusions and discussion are provided in Section 5.

PROBLEM FORMULATION
As shown in Figure 1, a total of N sensors are randomly deployed in the ROI, which is a square with area b^2. N is an RV that follows a Poisson distribution:

Pr{N = k} = (λ^k / k!) e^{−λ}, k = 0, 1, 2, . . . , (1)

where λ is the average number of sensors in the ROI. The locations of the sensors are unknown to the WSN, but it is assumed that they are independent and identically distributed (i.i.d.) and follow a uniform distribution in the ROI:

p(x_i, y_i) = 1/b^2, (x_i, y_i) ∈ ROI, (2)

for i = 1, . . . , N, where (x_i, y_i) are the coordinates of sensor i. Noises at local sensors are i.i.d. and follow the standard Gaussian distribution with zero mean and unit variance:

n_i ∼ N(0, 1). (3)

For a local sensor i, the binary hypothesis testing problem is

H_1: s_i = a_i + n_i,
H_0: s_i = n_i, (4)

where s_i is the received signal at sensor i, and a_i is the amplitude of the signal that is emitted by the target and received at sensor i. We adopt the same isotropic signal power attenuation model as that presented in [19]:

a_i^2 = P_0 / (1 + α d_i^n), (5)

where P_0 is the signal power emitted by the target at distance zero, and d_i is the distance between the target and local sensor i:

d_i = sqrt((x_i − x_t)^2 + (y_i − y_t)^2), (6)

where (x_t, y_t) are the coordinates of the target. We further assume that the location of the target also follows a uniform distribution within the ROI. n is the signal decay exponent, which takes values between 2 and 3, and α is an adjustable constant; a larger α implies faster signal power decay. Note that the signal attenuation model can easily be extended to three-dimensional problems. Our attenuation model is similar to that used in [20]; the difference is that in the denominator of (5), instead of d_i^n, we use 1 + α d_i^n. By doing so, our model remains valid even if the distance d_i is close or equal to 0. When d_i is large (α d_i^n ≫ 1), the difference between these two models is negligible.
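As a concrete illustration of this setup, the following sketch draws a Poisson-sized field of uniformly placed sensors and computes the received amplitude at each one. All parameter values (b, λ, P_0, α, n) are illustrative toy choices, not values from the paper, and for large λ the Poisson draw is approximated by a Gaussian.

```python
import math
import random

def draw_field(b=100.0, lam=1000, P0=200.0, alpha=200.0, n=2, rng=random):
    """Draw N ~ Poisson(lam) sensors uniformly in a b-by-b ROI and return
    the received amplitude a_i = sqrt(P0 / (1 + alpha * d_i**n)) at each
    sensor, with the target placed at the center of the ROI."""
    # For large lam, Poisson(lam) is well approximated by N(lam, lam);
    # this avoids the underflow of a naive inversion-based Poisson draw.
    N = max(0, round(rng.gauss(lam, math.sqrt(lam))))
    xt, yt = b / 2.0, b / 2.0
    amps = []
    for _ in range(N):
        x, y = rng.uniform(0.0, b), rng.uniform(0.0, b)
        d = math.hypot(x - xt, y - yt)
        amps.append(math.sqrt(P0 / (1.0 + alpha * d ** n)))
    return amps
```

A sensor then declares a detection when its amplitude plus noise exceeds the local threshold τ, as formalized next.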
In this paper, we do not specify the type of the passive sensors, and the power decay model adopted here is quite general. For example, in a radar or wireless communication system, for an isotropically radiated electromagnetic wave that is propagating in free space, the power is inversely proportional to the square of the distance from the transmitter [21,22]. Similarly, when spherical acoustic waves radiated by a simple source are propagating through the air, the intensity of the waves will decay at a rate inversely proportional to the square of the distance [23].
Because the noise has unit variance, it is evident that the SNR at local sensor i is

SNR_i = 10 log_10 a_i^2 = 10 log_10 (P_0 / (1 + α d_i^n)). (7)

We define the SNR at distance zero as

SNR_0 = 10 log_10 P_0. (8)
Assuming that all the local sensors use the same threshold τ to make a decision, and with the Gaussian noise assumption, the local sensor-level false alarm rate and probability of detection are

p_fa = Pr{s_i > τ | H_0} = Q(τ), (9)
p_di = Pr{s_i > τ | H_1} = Q(τ − a_i), (10)

where Q(·) is the complementary distribution function of the standard Gaussian, that is,

Q(x) = ∫_x^∞ (1/sqrt(2π)) e^{−t^2/2} dt. (11)

We assume that the ROI is very large and the signal power decays very fast. Hence, only within a very small fraction of the ROI, namely the area surrounding the target, is the received signal power significantly larger than zero. Ignoring the border effect of the ROI, we assume without loss of generality that the target is located at the center of the ROI. As a result, at a particular time, only a small subset of sensors can detect the target. To save communication and energy, a local sensor transmits data ("1") to the fusion center only when its signal exceeds the threshold τ.
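Under the unit-variance Gaussian noise model, the local probabilities in (9) and (10) reduce to one-line expressions. A minimal sketch (function names and default parameter values are my own, purely illustrative):

```python
import math

def q(x):
    """Complementary CDF of the standard Gaussian, Q(x) = 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def local_pfa(tau):
    """(9): false alarm rate of a sensor thresholding s_i = n_i at tau."""
    return q(tau)

def local_pd(tau, d, P0=100.0, alpha=1.0, n=2):
    """(10): detection probability of a sensor at distance d, with the
    amplitude a = sqrt(P0 / (1 + alpha*d**n)) from the model (5)."""
    a = math.sqrt(P0 / (1.0 + alpha * d ** n))
    return q(tau - a)
```

Since a_i > 0 everywhere, p_di always exceeds p_fa, but it decays toward p_fa as the sensor-target distance grows.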

Decision fusion rule
We denote the binary decision from local sensor i as I_i, i = 1, . . . , N. I_i takes the value 1 when there is a detection; otherwise, it takes the value 0.
We know that the optimal decision fusion rule is the Chair-Varshney fusion rule [6], which is a threshold test of the following statistic:

Λ_0 = Σ_{i=1}^{N} [ I_i log(p_di / p_fai) + (1 − I_i) log((1 − p_di) / (1 − p_fai)) ]. (12)

This fusion statistic is equivalent to a weighted summation of all the detections ("1"s) that a fusion center receives. The decision from a sensor with a better detection performance, namely higher p_di and lower p_fai, gets a greater weight, which is given by log(p_di(1 − p_fai)/(p_fai(1 − p_di))). As long as the threshold τ is known, the probability of false alarm at each sensor is known (p_fai = p_fa) from (9). However, it is very difficult to calculate p_di, since according to (10), p_di is determined by each sensor's distance to the target and the amplitude of the target's signal. To make matters worse, we do not even know the total number of sensors N, because the fusion center only receives data from those sensors whose received signals exceed the threshold τ, as assumed in Section 2.1. An alternative scheme would be for each sensor to transmit its raw data s_i to the fusion center, which would then make a decision based on these raw measurements. However, the transmission of raw data is very expensive, especially for a typical WSN with very limited energy and bandwidth; it is desirable to transmit only binary data to the fusion center. Without knowledge of the p_di's, the fusion center is forced to treat detections from every sensor equally. An intuitive choice is to use the total number of "1"s as a statistic, since the information about which sensor reports a "1" is of little use to the fusion center. As proposed in [19], the system-level decision is made by first counting the number of detections made by local sensors and then comparing it with a threshold T:

Λ = Σ_{i=1}^{N} I_i ≥ T: decide H_1; otherwise, decide H_0, (13)

where I_i ∈ {0, 1} is the local decision made by sensor i. We also call this fusion rule the "counting rule."
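The contrast between the two rules can be made concrete in a few lines. The sketch below implements the counting rule of (13) and, for comparison, the Chair-Varshney weighted sum over the reported "1"s that it replaces when the p_di's are unknown (function names are mine; the weights follow the standard Chair-Varshney form):

```python
import math

def counting_rule(decisions, T):
    """Counting rule (13): declare H1 iff the number of reported '1's
    is at least the system-level threshold T."""
    return 1 if sum(decisions) >= T else 0

def chair_varshney_stat(decisions, pd_list, pfa):
    """Chair-Varshney statistic restricted to the reported '1's: each
    detection is weighted by log(p_di*(1-pfa) / (pfa*(1-p_di))), so a
    sensor with better individual performance counts for more.  This
    requires every p_di, which our fusion center does not have."""
    return sum(math.log(pd * (1.0 - pfa) / (pfa * (1.0 - pd)))
               for dec, pd in zip(decisions, pd_list) if dec == 1)
```

The counting rule is exactly the Chair-Varshney rule with all weights forced to be equal, which is the best the fusion center can do without the p_di's.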

Hierarchical network structure
In this paper, we focus on the application aspect of the WSN. Routing protocols and network structures are beyond the scope of this paper. In Sections 2.1 and 2.2, a very simple network structure is implied. That is, all the sensors in the ROI report directly to the fusion center. However, our analysis results, which are based on this simple assumption and will be presented later, are quite general and can be applied to various scenarios and network structures. In this section, we give an example to show how the proposed approach can be adapted to complicated and practical applications.
Suppose that the sensor field is quite vast and the signal decays very fast as the distance from the target increases. As a result, only a tiny fraction of the sensors can detect the signals from the target, as illustrated in Figure 2. Most sensors' measurements are just pure noise. Since the local decisions from these sensors do not convey much information about the target, it is neither very useful nor energy efficient to transmit them to the fusion center. When the sensor network is very large, there is also the issue of scalability. One reasonable solution is to use a three-layered hierarchical network structure, as shown in Figure 3. Sensors that are close to each other form a cluster, and each cluster has its own cluster head or cluster master, which serves as the local fusion center and is supposed to have more powerful computation and communication capabilities. Each cluster is in charge of the surveillance of a subregion of the whole ROI, as shown in Figure 2. Instead of transmitting data to a faraway central fusion center, sensors send data to their corresponding cluster head. Based on the data transmitted from sensors located within a specific cluster/subregion, the corresponding cluster head makes a decision about target presence/absence within that subregion. The decisions from the cluster heads are further transmitted to the fusion center to inform it whether there is a target or event in specific subregions. The theoretical analysis provided later in this paper can be used to evaluate the detection performance at the cluster-head level, as long as the assumptions made in Section 2.1 are still valid within each cluster/subregion.

PERFORMANCE ANALYSIS
In this section, the system-level detection performance, namely the probability of false alarm P fa and probability of detection P d at the fusion center, will be derived, and the analytical results will be compared to simulation results.

System-level false alarm rate
At the fusion center, the probability of false alarm P_fa is

P_fa = Pr{Λ ≥ T | H_0} = Σ_{N=0}^{∞} Pr{N} Pr{Λ ≥ T | N, H_0}. (14)

Obviously, for a given N, under hypothesis H_0, Λ follows a binomial(N, p_fa) distribution. When N is large enough, Pr{Λ ≥ T | N, H_0} can be calculated using the Laplace-De Moivre approximation [24]:

Pr{Λ ≥ T | N, H_0} ≈ Q((T − N p_fa) / sqrt(N p_fa (1 − p_fa))). (15)

It is well known that the kurtosis of a Poisson distribution is 3 + 1/λ. As λ increases, the kurtosis approaches that of a Gaussian distribution, and the distribution has a light tail. This can also be explained by a characteristic of the Poisson distribution: a Poisson RV with mean λ can be viewed as the sum of λ i.i.d. Poisson RVs with unit mean, so as λ increases, its distribution approaches a Gaussian distribution according to the central limit theorem (CLT). As a result, when λ is large, the probability mass of N concentrates around the average value λ. This phenomenon is illustrated in Figure 4, where the probability mass function of N is plotted for λ = 1000 and λ = 10 000. Due to this characteristic of the Poisson distribution, and using the fact that both the mean and the variance of a Poisson RV are λ, we have the following approximation when λ is large:

Pr{N_1 ≤ N ≤ N_3} ≈ 1, (16)

where N_1 = ⌈λ − 6 sqrt(λ)⌉ and N_3 = ⌊λ + 6 sqrt(λ)⌋. Hence, for a large λ, a "typical" N is also a large number, and the probability that N takes a small value is negligible. For example, when λ = 1000, Pr{N < 810} = 2.4 × 10^−10; when λ = 10 000, Pr{N < 9400} = 6.6 × 10^−10. Therefore, when λ is large enough, we have

P_fa ≈ Σ_{N=N_1}^{N_3} Pr{N} Pr{Λ ≥ T | N, H_0} (17)
     ≈ Σ_{N=N_2}^{N_3} Pr{N} Q((T − µ_0) / σ_0), (18)

where N_2 = max(T, N_1), µ_0 ≜ N p_fa, and σ_0 ≜ sqrt(N p_fa (1 − p_fa)). Note that for a large N, the Laplace-De Moivre approximation in (15) is valid, and this fact has been used in the derivation of (18). The significance of (17) also lies in the fact that the computational load in calculating P_fa or P_d (see (18) and (25)) is reduced significantly, since a summation of at most 12 sqrt(λ) terms suffices, rather than a summation of an infinite number of terms.
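The truncated sum in (18) is straightforward to evaluate numerically. A sketch (function names are mine; the parameter values in the checks are illustrative):

```python
import math

def q(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def poisson_pmf(k, lam):
    # Evaluate in log space so large lam does not overflow k! or lam**k.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def system_pfa(T, lam, pfa):
    """Approximate system-level P_fa as in (18): truncate the Poisson sum
    to [lam - 6*sqrt(lam), lam + 6*sqrt(lam)] and use the Laplace-De
    Moivre (Gaussian) approximation of the binomial tail for each N."""
    n1 = max(1, math.ceil(lam - 6.0 * math.sqrt(lam)))
    n3 = math.floor(lam + 6.0 * math.sqrt(lam))
    n2 = max(int(T), n1)
    total = 0.0
    for N in range(n2, n3 + 1):
        mu0 = N * pfa
        sigma0 = math.sqrt(N * pfa * (1.0 - pfa))
        total += poisson_pmf(N, lam) * q((T - mu0) / sigma0)
    return total
```

For λ = 1000 and p_fa = 0.1, setting T at the H_0 mean λp_fa = 100 gives P_fa near one half, and P_fa falls off quickly as T moves above it.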

System-level probability of detection
Because of the nature of this problem, different local sensors have different p_di, which is a function of d_i as shown in (10). Therefore, under hypothesis H_1, the total number of detections Λ no longer follows a binomial distribution, and it is very difficult to derive an analytical expression for its distribution. Instead, we obtain P_d either through approximation or through simulation. In [19], through a CLT-based approximation, we derived the system-level P_d for a large number of sensors N:

Pr{Λ ≥ T | N, H_1} ≈ Q((T − N p̄_d) / sqrt(N σ̄^2)), (19)

where p̄_d is the local probability of detection averaged over the random sensor location, and σ̄^2 is the corresponding variance of a single local decision. Note that the γ used in this paper is slightly different from that used in [19] and gives a more accurate approximation; when the ROI is very large, meaning that b is large, the difference is negligible. Interested readers can find the detailed derivations in [19]. Taking the average of (19) with respect to N, and proceeding as in the derivation of (18), we obtain the system-level P_d:

P_d ≈ Σ_{N=N_1}^{N_3} Pr{N} Q((T − µ_1) / σ_1), (25)

where µ_1 ≜ N p̄_d and σ_1 ≜ sqrt(N σ̄^2). Again, we use the fact that for a large λ, a typical N is large; therefore, the CLT-based Gaussian approximation in (19) remains valid.
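The one quantity in (25) that has no elementary closed form is p̄_d, the local detection probability averaged over a uniformly distributed sensor position. It can be estimated by Monte Carlo integration over the ROI; the sketch below does this with the target at the ROI center (toy parameter values, and the function name is mine):

```python
import math
import random

def q(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def avg_local_pd(tau, b=100.0, P0=100.0, alpha=1.0, n=2, m=20000, seed=1):
    """Monte Carlo estimate of pd_bar = (1/b**2) * integral over the ROI
    of Q(tau - a(d)), i.e. the local detection probability averaged over
    a uniform sensor position, with the target at the ROI center."""
    rng = random.Random(seed)
    xt = yt = b / 2.0
    acc = 0.0
    for _ in range(m):
        x, y = rng.uniform(0.0, b), rng.uniform(0.0, b)
        d = math.hypot(x - xt, y - yt)
        a = math.sqrt(P0 / (1.0 + alpha * d ** n))
        acc += q(tau - a)
    return acc / m
```

P_d then follows by re-running the truncated Poisson sum of (18) with (µ_0, σ_0) replaced by µ_1 = N p̄_d and σ_1 = sqrt(N σ̄^2).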

Simulation results
The system-level P_d and P_fa can also be estimated by simulations. In Figures 5, 6, 7, and 8, the receiver operating characteristic (ROC) curves obtained by using the approximations in (18) and (25) are compared with those obtained by Monte Carlo simulations. From these figures, it is clear that the results obtained by the approximations are very close to those obtained by simulations, even when the system-level P_fa is very low (Figures 6 and 8).
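The counting rule can also be checked end-to-end by direct simulation: draw a random field, threshold each sensor's noisy observation, and count. The sketch below (toy parameters and far fewer trials than a publication-grade study) estimates the operating point (P_fa, P_d) for one pair (T, τ):

```python
import math
import random

def mc_rates(T, tau, lam=500, b=100.0, P0=100.0, alpha=1.0, n=2,
             trials=400, seed=7):
    """Monte Carlo estimate of the counting rule's system-level
    (P_fa, P_d): each trial draws a fresh Poisson-sized sensor field,
    forms the local decisions s_i > tau, and compares the count with T."""
    rng = random.Random(seed)
    hits = {0: 0, 1: 0}  # system-level detections under H0 and H1
    for h in (0, 1):
        for _ in range(trials):
            # Gaussian approximation of the Poisson sensor count.
            N = max(0, round(rng.gauss(lam, math.sqrt(lam))))
            count = 0
            for _ in range(N):
                x, y = rng.uniform(0.0, b), rng.uniform(0.0, b)
                d = math.hypot(x - b / 2.0, y - b / 2.0)
                a = math.sqrt(P0 / (1.0 + alpha * d ** n)) if h == 1 else 0.0
                count += rng.gauss(a, 1.0) > tau
            hits[h] += count >= T
    return hits[0] / trials, hits[1] / trials
```

Sweeping T for a fixed τ traces out an empirical ROC curve of the kind shown in Figures 5 to 8.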

Asymptotic analysis
It is useful to analyze the system performance when the average number of sensors λ is very large. From (18), as λ → ∞, the Poisson distribution of N concentrates around its mean, so that effectively N → λ, provided T ≤ λ + 6 sqrt(λ). Assuming that the system-level threshold takes the form T = βλ, we have

lim_{λ→∞} P_fa = lim_{λ→∞} Q(sqrt(λ)(β − p_fa) / sqrt(p_fa(1 − p_fa))) = 1 if β < p_fa, and 0 if β > p_fa.

Similarly, from (25),

lim_{λ→∞} P_d = 1 if β < p̄_d, and 0 if β > p̄_d.

Therefore, when λ → ∞: if β < p_fa, then P_fa = P_d = 1; if p_fa < β < p̄_d, then P_fa = 0 and P_d = 1; and if β > p̄_d, then P_fa = P_d = 0. As a result, as long as β takes a value between p_fa and p̄_d, as λ → ∞, the WSN detection performance will be perfect, with P_d = 1 and P_fa = 0. In Figures 9 and 10, P_d and P_fa are plotted as functions of λ. It is clear that P_d converges to 1 and P_fa converges to 0 as λ increases. In this example, we set β = (p_fa + p̄_d)/2. Another conclusion is that when λ is large enough, even for a small SNR_0, the system can achieve a very good detection performance.
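The limiting behavior is easy to visualize numerically. Folding the Poisson spread of N into the statistic's variance (under H_0, Var(Λ) = λp_fa for a Poisson-sized sum of Bernoulli(p_fa) decisions) gives a one-line approximation of P_fa for T = βλ, which collapses to 0 or 1 as λ grows. This is a deliberate simplification of (18) for illustration only:

```python
import math

def q(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pfa_asym(beta, lam, pfa):
    """P_fa for the threshold T = beta*lam under a single Gaussian
    approximation with compound-Poisson variance lam*pfa."""
    return q((beta * lam - lam * pfa) / math.sqrt(lam * pfa))
```

With p_fa = 0.1 and β = 0.15 > p_fa, P_fa drops rapidly toward 0 as λ grows; with β = 0.05 < p_fa, it climbs to 1, matching the three regimes above.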

Optimality of the decision fusion rule
The proposed decision fusion rule (the counting rule) is a threshold test on the total number of detections made by local sensors, and it is intuitive. It is important to compare the performance of this fusion rule to that of the optimal decision fusion rule, which is also based on the total number of local detections.
As we know, Λ in (13) is a lattice-type RV [24], which takes equidistant values from 0 to N. Hence, according to the CLT [24], for a large N, the probability p_k = Pr{Λ = k | N} approximately equals a sample of the Gaussian density:

p_k ≈ (1 / (sqrt(2π) σ)) exp(−(k − µ)^2 / (2σ^2)), k = 0, 1, . . . , N, (29)

where µ and σ^2 are the mean and variance of Λ given N. Therefore, under hypothesis H_1, for a large λ, we have

Pr{Λ = k | H_1} ≈ Σ_{N=N_1}^{N_3} Pr{N} (1 / (sqrt(2π) σ_1(N))) exp(−(k − µ_1(N))^2 / (2σ_1^2(N))), (30)

where µ_1(N) = N p̄_d and σ_1(N) = sqrt(N σ̄^2). Similarly, under hypothesis H_0, for a large λ, we have

Pr{Λ = k | H_0} ≈ Σ_{N=N_1}^{N_3} Pr{N} (1 / (sqrt(2π) σ_0(N))) exp(−(k − µ_0(N))^2 / (2σ_0^2(N))), (31)

where µ_0(N) = N p_fa and σ_0(N) = sqrt(N p_fa(1 − p_fa)). Now it is easy to obtain the likelihood ratio of Λ:

L(Λ) = Pr{Λ | H_1} / Pr{Λ | H_0}. (32)

Hence, the optimal fusion rule at the fusion center is a likelihood ratio test: decide H_1 if L(Λ) ≥ η, and H_0 otherwise, where η is the test threshold. Note that the implementation of the proposed counting rule for a Neyman-Pearson detector with a given system-level P_fa requires only knowledge of λ and τ (or p_fa) in order to find the system-level threshold T through (18). To choose an optimal local threshold τ, as we will see later in this paper, knowledge of P_0 is required as well. However, the counting rule can still be implemented without an optimal τ, and a good choice of τ based on some prior knowledge of P_0 can always be made. As a result, exact knowledge of P_0 is not necessary for the implementation of the counting rule, even though it is needed in the evaluation of the system-level detection performance.
As for the implementation of the optimal fusion rule, we need exact knowledge of α, P_0, and b to calculate σ̄^2 and p̄_d. Hence, the optimal fusion rule requires much more information, especially knowledge of the signal power P_0, which is unknown in most cases. Furthermore, because of its dependence on exact knowledge of P_0, the optimal fusion rule is more sensitive to estimation errors in P_0. Therefore, in this paper, the optimal fusion rule has only theoretical importance; it is not very useful or robust in practical applications, where it is always difficult to estimate P_0.
As we can see from (32), L(Λ) is a nonlinear transformation of Λ. The threshold tests of Λ and L(Λ) will have identical detection performances if L(Λ) is a monotonically increasing transformation of Λ. In Figures 11 and 12, L as a function of Λ is plotted for different system parameters. As we can see, in all the cases, L(Λ) is a monotonically increasing function of Λ, meaning that the counting rule and the optimal fusion rule are equivalent in terms of detection performance. In addition to the cases shown in Figures 11 and 12, we have extensively investigated the relationship between L and Λ for various system parameters. For all the system parameters we have studied, L(Λ) is a monotonically increasing function of Λ.
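This equivalence can be probed numerically: build the two Gaussian-mixture approximations (30) and (31), form L(k), and check that it increases in k. In the sketch below the values of p_fa, p̄_d, and σ̄^2 are arbitrary illustrative choices (I take σ̄^2 = p̄_d(1 − p̄_d) for simplicity), and the function names are mine:

```python
import math

def poisson_pmf(k, lam):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def gauss_pdf(x, mu, sigma):
    return (math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))
            / (sigma * math.sqrt(2.0 * math.pi)))

def lr(k, lam, pfa, pd_bar, var_bar):
    """L(k) = Pr{Lambda = k | H1} / Pr{Lambda = k | H0} via the
    lattice-CLT approximations (30)-(31), truncating the Poisson sum
    to lam +/- 6*sqrt(lam) as in (17)."""
    n_lo = max(1, math.ceil(lam - 6.0 * math.sqrt(lam)))
    n_hi = math.floor(lam + 6.0 * math.sqrt(lam))
    num = den = 0.0
    for N in range(n_lo, n_hi + 1):
        w = poisson_pmf(N, lam)
        num += w * gauss_pdf(k, N * pd_bar, math.sqrt(N * var_bar))
        den += w * gauss_pdf(k, N * pfa, math.sqrt(N * pfa * (1.0 - pfa)))
    return num / den

# Check monotonicity over the central range of counts.
vals = [lr(k, 1000, 0.05, 0.10, 0.09) for k in range(40, 121, 10)]
```

For these parameters the sequence of L values is strictly increasing, so thresholding Λ itself is equivalent to thresholding L(Λ), in line with Figures 11 and 12.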
In Figure 13, the ROC curves obtained by simulations (based on 10^6 Monte Carlo runs) for the counting rule and the optimal fusion rule are shown. We can see that the ROC curves corresponding to the counting rule and those of the optimal fusion rule are indistinguishable.

THRESHOLD FOR LOCAL SENSORS
In addition to the ROC curve, one can also resort to the so-called deflection coefficient [25, 26] for performance comparison, especially when the statistical characterization of the signal and noise is limited to moments up to a given order. The deflection coefficient is defined as

D ≜ (E[Λ | H_1] − E[Λ | H_0])^2 / Var(Λ | H_0). (34)

In the case of Var(Λ | H_1) = Var(Λ | H_0), this is in essence the SNR of the detection statistic. It is worth noting that the use of the deflection criterion leads to the optimum LR receiver in many cases of practical importance [25]. For example, in the problem of detecting a Gaussian signal in Gaussian noise, an LR detector is obtained by maximizing the deflection measure. In the sections above, we have assumed that the threshold τ (or, equivalently, p_fa) is given. From (18) and (25), we know that both P_fa and P_d are functions of τ. Hence, τ is a parameter that can be designed to achieve a better system-level performance. In this paper, we find the optimum local sensor-level threshold τ by maximizing the deflection coefficient. The deflection coefficient for the detection problem in this paper is derived and stated in the following theorem.

Theorem 1. The deflection coefficient at the fusion center for the detection problem formulated in this paper is

D(τ) = λ (p̄_d − p_fa)^2 / p_fa. (35)

For the proof, see the appendix. The optimum τ can be found by maximizing D(τ) with respect to τ. As we can see in Figure 14, there exists an optimal τ (τ_opt = 0.7694) that maximizes the deflection coefficient D. By employing this optimum τ_opt, a significant improvement in D can be achieved.
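This optimization can be reproduced with a crude grid search. Since E[Λ | H_1] = λp̄_d, E[Λ | H_0] = λp_fa, and Var(Λ | H_0) = λp_fa for a Poisson-sized sum of Bernoulli decisions, the deflection takes the form D(τ) = λ(p̄_d(τ) − p_fa(τ))^2 / p_fa(τ). The sketch below estimates p̄_d by Monte Carlo integration and scans τ; all parameter values are illustrative toy choices, not those behind Figure 14, so the resulting maximizer will generally differ from 0.7694:

```python
import math
import random

def q(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def deflection(tau, lam=1000, b=100.0, P0=4.0, alpha=1.0, n=2,
               m=4000, seed=3):
    """D(tau) = lam * (pd_bar - pfa)**2 / pfa, with pd_bar estimated by
    Monte Carlo integration over a uniform sensor position and the
    target at the ROI center."""
    rng = random.Random(seed)
    pfa = q(tau)
    acc = 0.0
    for _ in range(m):
        d = math.hypot(rng.uniform(0.0, b) - b / 2.0,
                       rng.uniform(0.0, b) - b / 2.0)
        acc += q(tau - math.sqrt(P0 / (1.0 + alpha * d ** n)))
    pd_bar = acc / m
    return lam * (pd_bar - pfa) ** 2 / pfa

# Crude grid search over tau with step 0.1.
taus = [round(0.1 * i, 1) for i in range(2, 40)]
tau_opt = max(taus, key=deflection)
```

With these toy parameters the maximizer lands well below τ = 2, reflecting the same trade-off as Figure 14: too low a τ inflates p_fa, while too high a τ starves the count of detections.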
The system-level ROC curves for different τ are plotted in Figure 15. As we can see, the ROC curve corresponding to the optimal threshold τ_opt ≈ 0.77 is above those for the other thresholds, meaning that τ_opt provides the best system-level performance. In Figures 16 and 17, τ_opt and the corresponding optimal p_fa are shown as functions of SNR_0 and α. It is clear that τ_opt is a monotonically increasing function of SNR_0 and a monotonically decreasing function of α. This is because with a strong target signal (high SNR_0 and low α), by adopting a higher threshold, local sensors lower their false alarm rate while still attaining a relatively high probability of detection.

CONCLUSIONS
We have proposed and studied a decision fusion rule that is based on the total number of detections reported by local sensors for a WSN with a random number of sensors. Assuming that the number of sensors in the ROI follows a Poisson distribution, we have derived the system-level detection performance measures, namely the probabilities of detection and false alarm. We have shown that even at very low SNR, this fusion rule can achieve very good system-level detection performance, provided that, on average, a sufficiently large number of sensors are deployed in the ROI. The average number of sensors needed for a prespecified system-level performance can be calculated from our analytical expressions. Another important result is that, for all the different system parameters we have investigated, the proposed fusion rule is equivalent to the optimal fusion rule, which requires much more prior knowledge of the system parameters.
We have also shown that a better system performance can be achieved if we choose an optimum threshold at the local sensors by maximizing the deflection coefficient. If SNR_0 is high and α is small, a higher local sensor-level threshold τ should be chosen; otherwise, a lower τ should be employed to achieve a better performance.