Online Bayesian Data Fusion in Environment Monitoring Sensor Networks

Assuring reliable data collection in environment monitoring sensor networks is a major design challenge. This paper presents a novel Bayesian model to reliably monitor physical phenomena. We briefly review the errors on the data transfer channel between the sensor quantizing the physical phenomenon and the fusion node, and a discrete $K$-ary input and $K$-ary output channel is presented to model the data transfer channel, where $K$ is the number of quantization levels at the sensor. Then, discrete time series models are used to estimate the mean value of the physical phenomenon, and the estimation error is modeled as a Gaussian process. Finally, based on the transition probability of the proposed data transfer channel and the probability that the estimated value transits to specific quantization levels, the level with the maximum posterior probability is taken as the current value of the physical phenomenon. Evaluations based on real sensor data show that the proposed algorithms achieve a significant gain in environment monitoring sensor networks compared with channel-unaware algorithms.


Introduction
Advances in wireless communication technologies, together with aggressive feature size scaling in VLSI circuits, have enabled the massive deployment of ultrasmall, cost-efficient, and low-power sensor nodes. This has led to the blossoming of wireless sensor networks (WSNs) in a wide range of applications, such as environment monitoring, battlefield surveillance, health care, and home automation [1]. A common goal in most WSN applications is to reconstruct the underlying physical phenomenon (e.g., temperature and humidity) based on sensor observations [2,3]. However, the context of WSNs makes this task very challenging. First, each sensor is characterized by a low power supply and limited computation and communication capabilities due to design considerations such as small battery size, bandwidth, and cost. Second, the harsh environments where the sensors are deployed further degrade their reliability. Consequently, ensuring the reliability of the data is very challenging for WSN designers.
In WSNs, sensor observations are exposed to various sources of errors during sensing, processing, and communication. First of all, sensors may report readings with a time-invariant bias, known as systematic error, due to residual sensor calibration errors [4]. For example, in the case of a light sensor, biased readings can be caused by the sensor hardware or by external factors such as dust particles on the protective lens of the sensor. After sensing and quantization, the data can still be disturbed by various on-chip soft errors caused by thermal noise and cosmic ray radiation [5]. Afterward, the data samples suffer from wireless channel errors while being reported to the sink node [6]. These kinds of errors differ in severity, occurrence rate, and statistics.
Various error control strategies for WSNs have been investigated extensively in the literature. Schemes that calibrate single-sensor systematic biases by tuning each individual sensor based on ground truth information, and advanced collaborative schemes that handle systematic biases for large numbers of sensors, have been proposed in [7,8], respectively. To tackle on-chip soft errors and channel errors, the majority of existing methods introduce redundancy and handle these kinds of errors individually. For example, error correction codes (ECC) and triple modular redundancy (TMR) [9] have been used widely to control on-chip soft errors. Similarly, ECC together with automatic retransmission requests (ARQ) [10] have been applied to control wireless channel errors.
All the aforementioned algorithms address the reliable data reception problem by either introducing redundancy or fitting statistical models of the monitored phenomena, but they ignore the error information of the data transfer channel (DTC). Because channel awareness has been shown to yield considerable performance improvements in decision fusion [11,12], target tracking [13], and multiple access [14] scenarios, channel-aware algorithms have attracted wide interest. In this paper, we focus on reliable data reception from a sensor node that quantizes the monitored phenomenon into $K$ levels, which is the basic problem for WSNs. We wish to design a decision device that chooses a quantization level for the current phenomenon such that the probability of a correct decision is maximized. Both the temporal correlation of the physical phenomenon and the error information of the DTC are modeled as prior inputs to the decision device.
The rest of the paper is organized as follows. Section 2 introduces the system model of data reception at the sink node from a sensor node. Section 3 models the error-prone DTC by a discrete $K$-ary input and $K$-ary output channel, where $K$ is the number of quantization levels. Then, in Section 4, a discrete time series model is used to estimate the value of the monitored physical phenomenon, and the estimation error is modeled as a normal distribution. Section 5 gives results that demonstrate the efficiency of the proposed algorithms. Finally, conclusions and future extensions are given in Section 6.

System Models and Problem Formulation
The problem to be solved, including the network and the channel-aware Bayesian model, is first described formally.

Definition 1. The sensor node quantizes the range $[J_{\min}, J_{\max}]$ of the phenomenon J into $K$ subintervals $S = \{S_1, \ldots, S_K\}$, where each $S_i$ represents a subinterval of size $\Delta = (J_{\max} - J_{\min})/K$. We write $J_t = S_i$ ($i = 1, 2, \ldots, K$), where $J_t$ is the digitized version of J at the $t$th transmission phase. Each data sample is represented by $n$ bits.

Bayesian Data Reliability Model. As shown in Figure 1, at time $t$ the sensor node samples and quantizes its observation of the phenomenon J as $J_t$. After sensing and quantization, the observation is reported to the fusion node. After receiving the reported data $r_t$ from the sensor node, the fusion node applies a data detector that makes a decision based on the received data and the error probability of the data transfer channel such that the probability of a correct decision is maximized. Using Bayes' rule, the posterior probabilities can be expressed as

$$P(J_t = S_i / r_t = S_j) = \frac{P(r_t = S_j / J_t = S_i)\, P(J_t = S_i)}{P(r_t = S_j)}, \quad i = 1, 2, \ldots, K, \tag{1}$$

where $P(J_t = S_i / r_t = S_j)$ are the conditional probabilities of the phenomenon $J_t = S_i$ at time $t$, given that the data gathered at the fusion node from the sensor node at time $t$ is $S_j$. The decision criterion selects the subinterval corresponding to the maximum of the set of posterior probabilities $P(J_t = S_i / r_t = S_j)$. $P(r_t = S_j / J_t = S_i)$ are the conditional probabilities that the data gathered at the fusion node from the sensor node at time $t$ is $S_j$, given the phenomenon $J_t = S_i$ at time $t$; these are the prior probabilities that specify the data transfer channel. $P(J_t = S_i)$ are the probabilities of the phenomenon $J_t = S_i$ at time $t$; these are also prior probabilities, and they depend on the statistical properties of the phenomenon. The denominator of (1) can be expressed as

$$P(r_t = S_j) = \sum_{i=1}^{K} P(r_t = S_j / J_t = S_i)\, P(J_t = S_i). \tag{2}$$

From (1) and (2), we observe that the computation of the posterior probabilities $P(J_t = S_i / r_t = S_j)$ requires knowledge of the prior probabilities $P(r_t = S_j / J_t = S_i)$ and $P(J_t = S_i)$.
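The decision rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `posterior` and `map_decision`, the list-of-lists channel layout, and the toy numbers are all assumptions introduced here. `channel[i][j]` plays the role of the channel prior (probability of receiving level j when level i was sent) and `prior[i]` the role of the source prior.

```python
from typing import List, Sequence


def posterior(received: int,
              channel: Sequence[Sequence[float]],
              prior: Sequence[float]) -> List[float]:
    """Posterior over input levels given the index of the received level.

    channel[i][j] stands for P(output = level j / input = level i) and
    prior[i] for P(input = level i); both are illustrative inputs.
    """
    joint = [channel[i][received] * prior[i] for i in range(len(prior))]
    evidence = sum(joint)  # denominator of Bayes' rule
    if evidence == 0.0:
        raise ValueError("received level has zero probability under the model")
    return [value / evidence for value in joint]


def map_decision(received: int,
                 channel: Sequence[Sequence[float]],
                 prior: Sequence[float]) -> int:
    """Index of the level with the maximum posterior probability."""
    post = posterior(received, channel, prior)
    return max(range(len(post)), key=post.__getitem__)


# Toy numbers for K = 3 levels (made up for illustration only).
toy_channel = [[0.90, 0.05, 0.05],
               [0.05, 0.90, 0.05],
               [0.05, 0.05, 0.90]]
toy_prior = [0.98, 0.01, 0.01]
toy_decision = map_decision(2, toy_channel, toy_prior)
```

In the toy case the strong prior on level 0 outweighs the received level 2, so the detector overrides the raw reading; with a uniform prior it would simply return the received level.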

Data Transfer Channel Model for WSN
In Section 2, we showed that the computation of the posterior probabilities $P(J_t = S_i / r_t = S_j)$ requires knowledge of the prior probabilities $P(r_t = S_j / J_t = S_i)$, which depend on the statistical properties of the data transfer channel between the sensor that samples the phenomenon J and the fusion node, shown in Figure 2. In this section, we first investigate the error models for cosmic ray radiation induced errors, on-chip thermal noise induced errors, and wireless channel errors. Then, we propose a discrete $K$-ary input and $K$-ary output channel to model the data transfer channel, as shown in Figure 2.

Model for Radiation Induced Transient Error.
When a radioactive particle strikes a semiconductor device, it causes a transient pulse that may alter the logical state of the struck node. The generated transient glitch can propagate through the circuit; if the reversed value gets latched into a sequential element, it becomes an error. The soft error rate (SER) of a circuit is exponentially related to the critical charge $Q_{\mathrm{CRIT}}$, which is the minimum charge required to cause a soft error and is proportional to the system supply voltage:

$$\mathrm{SER} \propto F \times A \times \exp\left(-\frac{Q_{\mathrm{CRIT}}}{Q_S}\right), \tag{3}$$

where $F$ is the neutron flux (i.e., radiation intensity), which is related to the environment where the device operates, $A$ is the area of the circuit sensitive to particle strikes, and $Q_S$ is the charge collection efficiency of the device. According to [5],

$$Q_{\mathrm{CRIT}} \propto V_{dd}, \tag{4}$$

where $V_{dd}$ is the system supply voltage. As shown in (4), when the supply voltage decreases, the critical charge decreases, which increases the transient error rate. Besides, the flux of low energy particles is orders of magnitude higher than that of high energy particles [15]. Therefore, with a smaller critical charge, circuits are more vulnerable to transient errors induced by lower energy particle strikes.

Model for Thermal Noise Induced Transient Error.
Thermal noise in logic gates is a stationary Gaussian stochastic voltage fluctuation process with zero mean and standard deviation $\sigma$, which is determined by the Boltzmann constant $k$ and the temperature $T$ [16]. The logic gate model considered here is shown in Figure 2 and was first used in [17]. It consists of three cascaded stages. The first stage is the logic function, which computes the true value of the logic gate. The next stage is an additive white Gaussian noise (AWGN) channel, which models the noise added when the logic value transfers through the logic gate. The essence of this stage is to add thermal noise to the data transferred from the input to the output. The third stage is a threshold function, which restores the binary data read out from the logic gate. Whenever the noise voltage exceeds the threshold value $V_{th}$ of the logic gate, an error occurs. Clearly, the logic gate through which data is transferred can be modeled as a binary symmetric channel (BSC) [5]. The bit error rate (BER) of the BSC is $\varepsilon$:

$$\varepsilon = \frac{1}{2}\,\mathrm{erfc}\left(\frac{V_{th}}{\sqrt{2}\,\sigma}\right), \tag{5}$$

where $\mathrm{erfc}(x) = (2/\sqrt{\pi}) \int_{x}^{\infty} \exp(-u^2)\,du$ and we assume $V_{th} = (1/2) V_{dd}$, which is the typical case in [5].
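As a numerical illustration of the threshold-crossing argument above, the following sketch evaluates the probability that zero-mean Gaussian noise with standard deviation `sigma` exceeds a threshold `v_th`, using the half-supply threshold assumed in the text. The function names and the example voltages are hypothetical and chosen only for illustration.

```python
import math


def thermal_ber(v_th: float, sigma: float) -> float:
    """P(noise > v_th) for zero-mean Gaussian noise with std sigma."""
    return 0.5 * math.erfc(v_th / (math.sqrt(2.0) * sigma))


def thermal_ber_from_supply(v_dd: float, sigma: float) -> float:
    """Same quantity under the assumption v_th = v_dd / 2."""
    return thermal_ber(0.5 * v_dd, sigma)


# Hypothetical values: lowering the supply voltage raises the error rate.
ber_nominal = thermal_ber_from_supply(v_dd=1.0, sigma=0.1)
ber_scaled = thermal_ber_from_supply(v_dd=0.6, sigma=0.1)
```

The comparison of `ber_nominal` and `ber_scaled` mirrors the qualitative point of this section: for a fixed noise level, a lower supply voltage leaves less margin to the threshold.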

Model for Wireless Channel Error.
Most of the envisioned applications for WSNs assume that sensor nodes lie densely on the ground with their antennas a few centimeters above it. This near-ground feature makes the channel models for WSNs different from the channel models used in mobile wireless communication networks, where antennas are always about 1.5 meters above the ground. Many field measurement campaigns have verified and reported that one-slope and two-slope log-distance path loss models are suitable for near-ground channels in WSNs deployed in outdoor scenarios [18]. Compared with the one-slope log-distance path loss model, the two-slope log-distance path loss model provides lower model error at 868 MHz in the ISM band. Because most of the radio energy is intercepted by the ground within the Fresnel zone, the path loss exponent will be bigger than the path loss exponent beyond the Fresnel zone. However, considering the small size of the Fresnel zone, we think the one-slope log-distance path loss model, which can be fitted to measurements beyond the Fresnel zone, can accurately model the channel for WSNs:

$$PL(d) = PL(d_0) + 10\,\gamma \log_{10}\left(\frac{d}{d_0}\right) + X_s, \tag{6}$$

where $PL(d_0)$ is the path loss at a reference distance $d_0$ in dB, $\gamma$ is the path loss exponent, and $X_s$ denotes the shadow fading component, with $X_s \sim N(0, \sigma_s)$.

Definition 2. $h(d)$ is the one-hop channel response in the frequency domain for two sensors a distance $d$ apart.

Based on Definition 2, we define an intermediate variable called the channel fading factor, $h_{dB}(d) = 10 \log_{10}(\|h(d)\|)$, which measures the amplitude of $h(d)$ in dB.

Definition 3. The additive Gaussian noise at the receiver is $n_0 \sim N(0, \sqrt{N_0/2})$. We further assume that the sensor nodes transmit binary PAM signals, where the two signal waveforms are $s_1 = \sqrt{E_b}$ and $s_2 = -\sqrt{E_b}$ in the signal duration interval and zero elsewhere. The two signals are equally likely to be transmitted.

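Theorem 4. If Definitions 2 and 3 hold, the probability of error over one hop is

$$P_e = Q\left(\sqrt{\frac{2\|h\|^2 E_b}{N_0}}\right), \tag{8}$$

where $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$.

Proof. Let us assume $s_1$ is transmitted. Then, the received signal from the demodulator is

$$y = \sqrt{E_b'} + n_0',$$

where $E_b' = \|h\|^4 E_b$, $n_0' \sim N(0, \sqrt{N_0'/2})$, and $N_0' = \|h\|^2 N_0$. The optimal decision is made by the following rule: decide $s_1$ if $y > 0$ and $s_2$ if $y < 0$. The two conditional pdfs of $y$ are

$$p(y / s_1) = \frac{1}{\sqrt{\pi N_0'}} \exp\left(-\frac{(y - \sqrt{E_b'})^2}{N_0'}\right), \qquad p(y / s_2) = \frac{1}{\sqrt{\pi N_0'}} \exp\left(-\frac{(y + \sqrt{E_b'})^2}{N_0'}\right).$$

Given that $s_1$ is transmitted, the probability of error is simply the probability that $y < 0$; that is,

$$P(e / s_1) = \int_{-\infty}^{0} p(y / s_1)\,dy = Q\left(\sqrt{\frac{2E_b'}{N_0'}}\right) = Q\left(\sqrt{\frac{2\|h\|^2 E_b}{N_0}}\right). \tag{10}$$

Similarly, if we assume that $s_2$ is transmitted, the received signal from the demodulator is

$$y = -\sqrt{E_b'} + n_0'.$$

Given that $s_2$ is transmitted, the probability of error is simply the probability that $y > 0$; that is,

$$P(e / s_2) = \int_{0}^{\infty} p(y / s_2)\,dy = Q\left(\sqrt{\frac{2\|h\|^2 E_b}{N_0}}\right). \tag{12}$$

Based on (10) and (12), and because the two signals are equally likely, the probability of error over one hop is $P_e = \frac{1}{2} P(e / s_1) + \frac{1}{2} P(e / s_2)$, which is (8).

Theorem 5. For the $M$-hop WSN with amplify-and-forward relay nodes shown in Figure 3, the bit error probability is given by (16), where $h_i^t$, $x_i^t$, and $y_i^t$ denote the channel response of the $i$th hop and the transmitted and received signals of the $i$th relay node in the $t$th transmission phase, respectively. In particular, $x_0^t$ and $y_M^t$ are, respectively, the signal transmitted by the sensor node and the signal received at the sink node in the $t$th transmission phase.

Proof. First, we define that the $i$th relay node scales the received signal by a factor $\alpha_i^t = 1/\|h_i^t\|$ in transmission phase $t$. As shown in Figure 3, the received signal at the sink node in the $t$th transmission phase is $y_M^t = h_M^t x_{M-1}^t + n_0$, and the transmitted signal of this hop in this transmission phase is $x_{M-1}^t = \alpha_{M-1}^t y_{M-1}^t$; then,

$$y_M^t = h_M^t \alpha_{M-1}^t y_{M-1}^t + n_0. \tag{17}$$

The received signal of the first relay node in the $t$th transmission phase is

$$y_1^t = h_1^t x_0^t + n_0. \tag{18}$$

Based on (17) and (18), the received signal at the sink node in the $t$th transmission phase is given by (19), where $\alpha_0^t = 1$. Plugging the scale factor into (19), the received signal at the sink node in the $t$th transmission phase can be expressed as (20). Clearly, the second term of (20) can be upper-bounded as in (21). Based on (20) and (21), using an LS demodulator, the bit error probability of $M$-hop WSNs is (16).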
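The one-hop result of Theorem 4 can be checked numerically. The sketch below assumes the standard closed form for equally likely binary antipodal signals with a known real channel gain and Gaussian noise of variance N_0/2, and compares it with a small Monte Carlo simulation. The function names, the real-valued gain `h_abs`, and the unit test values are assumptions made for illustration.

```python
import math
import random


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def one_hop_error_probability(h_abs: float, e_b: float, n_0: float) -> float:
    """Closed form Q(sqrt(2 |h|^2 E_b / N_0)) for binary antipodal signals."""
    return q_function(math.sqrt(2.0 * h_abs ** 2 * e_b / n_0))


def simulate_one_hop(h_abs: float, e_b: float, n_0: float,
                     trials: int = 100_000, seed: int = 1) -> float:
    """Empirical error rate of sign detection over a fixed-gain channel."""
    rng = random.Random(seed)
    noise_std = math.sqrt(n_0 / 2.0)  # noise variance N_0 / 2
    errors = 0
    for _ in range(trials):
        sent_positive = rng.random() < 0.5          # equally likely signals
        symbol = math.sqrt(e_b) if sent_positive else -math.sqrt(e_b)
        received = h_abs * symbol + rng.gauss(0.0, noise_std)
        if (received > 0.0) != sent_positive:       # threshold at zero
            errors += 1
    return errors / trials


theory = one_hop_error_probability(1.0, 1.0, 1.0)
empirical = simulate_one_hop(1.0, 1.0, 1.0)
```

With these unit values the two numbers should agree to within Monte Carlo noise; the simulation does not model shadowing or the relay chain of Theorem 5.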

Data Transfer Model for WSN.
Observing (3), (5), and (8), the reliability of data transfer between any two sensor nodes within one hop is determined by the soft error rate induced by cosmic ray radiation, the bit error rate induced by on-chip thermal noise, and the error probability of one-hop transmission, as shown in Figure 2. Furthermore, all three kinds of errors are related to the power supply voltage. Given the power supply voltage, the three kinds of errors are independent of each other. For data transfer within one hop between the sensor that samples the phenomenon J and the fusion node, the bit error probability $p_b$ is given by (22). Then, on the same data transfer channel, the data error probability is

$$p_d = 1 - (1 - p_b)^{n}, \tag{23}$$

where $n$ is the number of bits in one data sample as defined in Definition 1. Since the sensor node quantizes the phenomenon J into $K$ subintervals $S = \{S_1, \ldots, S_K\}$ as in Definition 1, we can use the discrete $K$-ary input and $K$-ary output channel shown in Figure 4 to model the data transfer channel between the sensor that samples the phenomenon J and the fusion node shown in Figure 2. The channel model shown in Figure 4 is characterized by the following conditional probabilities:

$$P(S_j / S_i) = p_b^{\,\mathrm{dis}(S_i, S_j)}\,(1 - p_b)^{\,n - \mathrm{dis}(S_i, S_j)}, \quad i, j = 1, 2, \ldots, K, \tag{24}$$

where $\mathrm{dis}(S_i, S_j)$ is the Hamming distance between the binary representations of $S_i$ and $S_j$, which are, respectively, the $i$th and $j$th subintervals of the discrete $K$-ary input and $K$-ary output channel. $P(S_j / S_i)$ is the conditional probability that the output of the channel is $S_j$, given that the input is $S_i$.
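A transition matrix of this kind can be built directly from Hamming distances. The sketch below assumes that each level is transmitted as the natural binary code of its index and that bits are flipped independently with a common probability; both are illustrative assumptions, as are the function names.

```python
from typing import List


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two level indices."""
    return bin(a ^ b).count("1")


def channel_matrix(bits_per_sample: int, bit_error_prob: float) -> List[List[float]]:
    """matrix[i][j] = probability that level i is received as level j.

    Assumes natural binary coding of level indices and independent bit
    flips with probability bit_error_prob.
    """
    levels = 1 << bits_per_sample  # K = 2 ** bits_per_sample
    p = bit_error_prob
    matrix = []
    for i in range(levels):
        row = []
        for j in range(levels):
            d = hamming_distance(i, j)
            row.append(p ** d * (1.0 - p) ** (bits_per_sample - d))
        matrix.append(row)
    return matrix


toy_matrix = channel_matrix(bits_per_sample=2, bit_error_prob=0.1)
```

Each row of `toy_matrix` sums to one, and the diagonal entry equals the probability that no bit of the sample is flipped.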

Data Source Model for Physical Phenomenon
From (1) and (2) in Section 2, we observe that the computation of the posterior probabilities $P(J_t = S_i / r_t = S_j)$ requires knowledge of the prior probabilities $P(J_t = S_i)$, which depend on the statistical properties of the phenomenon J. In this section, we first investigate methods to model the data source of the phenomenon J. Then, methods to update the parameters of the data source models online are provided to ensure robustness. Consider an environment monitoring scenario where the sink node needs to continuously reconstruct the data field based on data gathered from all the sensor nodes, which are spatially and temporally correlated. That is, data at any given point in such a field are correlated not only with the data at nearby points but also with the previous values of the data measured at the same point.

Empirical Model for Data Source.
Models to explore the spatial and temporal correlation of a physical phenomenon are investigated in [13]. As in [13], the quantization is assumed to be very fine, and the quantization error is ignored. If, at time $t_2$, the sink node uses the data value gathered at point $x_1$ and time $t_1$ to estimate the current value at point $x_2$, the corresponding mean square error is assumed to follow (25). In (25), the spatial distortion is related to $|x_1 - x_2|$, and the temporal distortion is related to $|t_1 - t_2|$. The model has a spatial and a temporal correlation parameter, and higher values of these parameters specify a more weakly correlated field. In this section, we focus on the prediction of the current value using the historical data gathered at the same location and make the following definitions.

Definition 6. Define the prediction function as $f(\cdot)$. The estimate of the current value $J_t$ is denoted by $\bar{J}_t$, and then $\bar{J}_t = f(J_{t-1}, J_{t-2}, \ldots)$. We further define $\ell_t = J_t - \bar{J}_t$ as the prediction error.

Time series models are used to predict the future behavior of variables. These models account for the fact that observations have an internal structure (e.g., autocorrelation, trend, or seasonal variation). Two commonly used forms of these models are autoregressive (AR) models and moving average (MA) models.
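For concreteness, an AR predictor of the kind mentioned above can be fitted with the Yule-Walker equations, which the evaluation section also names as its fitting method. The following pure-Python sketch solves them with the Levinson-Durbin recursion; it is an assumption-laden illustration (biased autocovariance estimate, no handling of constant series), and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def autocovariance(x: Sequence[float], max_lag: int) -> List[float]:
    """Biased sample autocovariance for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t - lag] for t in range(lag, n)) / n
            for lag in range(max_lag + 1)]


def yule_walker(x: Sequence[float], order: int) -> Tuple[List[float], float]:
    """AR coefficients a_1..a_p and innovation variance (Levinson-Durbin)."""
    r = autocovariance(x, order)
    coeffs: List[float] = []
    err = r[0]  # assumes a non-constant series, so r[0] > 0
    for m in range(1, order + 1):
        acc = r[m] - sum(coeffs[k] * r[m - 1 - k] for k in range(m - 1))
        refl = acc / err
        coeffs = [coeffs[k] - refl * coeffs[m - 2 - k] for k in range(m - 1)] + [refl]
        err *= 1.0 - refl ** 2
    return coeffs, err


def predict_next(x: Sequence[float], coeffs: Sequence[float]) -> float:
    """One-step prediction from the most recent len(coeffs) samples."""
    mean = sum(x) / len(x)
    return mean + sum(a * (x[-1 - k] - mean) for k, a in enumerate(coeffs))


# Deterministic toy series: strict alternation gives a_1 close to -1.
toy_series = [1.0, -1.0] * 50
toy_coeffs, toy_err = yule_walker(toy_series, 1)
toy_prediction = predict_next(toy_series, toy_coeffs)
```

The difference between such a prediction and the value actually observed is the prediction error defined in Definition 6.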
Definition 7. The prediction error $\ell_t$ follows a normal distribution, with $\ell_t \sim N(0, \sigma^2)$.

Based on Definitions 6 and 7, we can give the data source model for the phenomenon J as shown in Figure 5. Denote $P(S_j / \bar{J}_t = S_i)$ as the probability that the current data gathered from the sensor node is $S_j$ under the condition that $\bar{J}_t = S_i$.

Given $J_{t-1} = S_i$, we observe that the parameter $P(J_t = S_j)$ of the Bayesian model (1) is equal to $P(J_t = S_j / J_{t-1} = S_i)$, as shown in Figure 5. This empirical model is characterized by one parameter, $\sigma$, to be determined. Comparing with (25), $\sigma$ depends on the sampling interval. In the runtime stage, we can update the parameter $\sigma$ according to the observations gathered at the fusion node as $\sigma = \max(|J_i - J_{i-1}|)$, with $i = t - W, \ldots, t - 1$, where $W$ is the window length and is less than $t$.
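One way to turn the Gaussian prediction error into a prior over quantization levels is to integrate the normal density over each subinterval. The sketch below does this and also implements the windowed maximum-difference update of the spread parameter described above. The renormalization over the quantizer range, the uniform fallback, and all names are assumptions made for this illustration.

```python
import math
from typing import List, Sequence


def normal_cdf(x: float, mean: float, sigma: float) -> float:
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def level_prior(prediction: float, sigma: float,
                lower: float, upper: float, levels: int) -> List[float]:
    """Probability mass of N(prediction, sigma^2) in each of `levels`
    equal subintervals of [lower, upper], renormalized to sum to one."""
    delta = (upper - lower) / levels
    probs = []
    for i in range(levels):
        lo = lower + i * delta
        hi = lo + delta
        probs.append(normal_cdf(hi, prediction, sigma) - normal_cdf(lo, prediction, sigma))
    total = sum(probs)
    if total == 0.0:  # prediction far outside the range: fall back to uniform
        return [1.0 / levels] * levels
    return [p / total for p in probs]


def update_sigma(history: Sequence[float], window: int) -> float:
    """Largest absolute one-step change over the last `window` differences."""
    recent = history[-(window + 1):]
    return max(abs(b - a) for a, b in zip(recent, recent[1:]))


toy_prior = level_prior(prediction=0.55, sigma=0.1, lower=0.0, upper=1.0, levels=4)
toy_sigma = update_sigma([1.0, 5.0, 6.0, 6.5], window=2)
```

In the toy case most of the mass falls on the subinterval that contains the prediction, with the remainder on its lower neighbor.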

Learning Model for Data Source.
Besides the empirical models, we can also learn the parameter $P(J_t)$ in the Bayesian model (1) from random experiments. That is, the fusion node needs to keep $K^2$ counters to log the occurrences of the pair $(J_{t-1}, J_t)$. Learning the model parameter by training is simple. However, enough random experiments are required before the learned model becomes reliable.
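The counter-based learning described above can be sketched as a small table of transition counts. The class name and the additive smoothing constant are assumptions introduced here (smoothing avoids zero probabilities for pairs that have not been observed yet); the source only specifies that pairs of consecutive levels are counted.

```python
from typing import List


class TransitionCounter:
    """Keeps K x K counters of (previous level, current level) pairs."""

    def __init__(self, levels: int, smoothing: float = 1.0) -> None:
        self.levels = levels
        # Every cell starts at `smoothing` (an assumption, not from the paper).
        self.counts = [[smoothing] * levels for _ in range(levels)]

    def observe(self, previous: int, current: int) -> None:
        self.counts[previous][current] += 1.0

    def prior(self, previous: int) -> List[float]:
        """Estimated distribution of the current level given the previous one."""
        row = self.counts[previous]
        total = sum(row)
        return [c / total for c in row]


counter = TransitionCounter(levels=3)
for prev, cur in [(0, 1), (0, 1), (0, 1), (0, 0)]:
    counter.observe(prev, cur)
learned = counter.prior(0)
```

With few observations the smoothing term dominates, which reflects the remark that enough experiments are needed before the learned prior is useful.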

Evaluation Setup.
The data are quantized as 8-bit values in our evaluations, including sensor data about water temperature, dissolved oxygen in river water, and river stage from the California Data Exchange Center (CDEC) [19]. As shown in Figures 6, 7, and 8, these data sets differ in terms of autocorrelation properties and degrees of stationarity.
Most of the envisioned applications of WSNs assume that sensor nodes lie densely on the ground with their antennas a few centimeters above it. This near-ground feature makes the channel in WSNs very different from the channel in mobile communication networks, where antennas are always about 1.5 meters above the ground. Many field measurement campaigns have verified and reported that the two-slope log-distance path loss model can accurately model the near-ground channel in WSNs deployed in outdoor scenarios in the ISM frequency band [20]. Considering the small Fresnel zone, we assume that no sensor node is located within the Fresnel zone of another sensor node, so we can use the one-slope path loss model shown in (28) as the channel between the transmitter and receiver of every hop:

$$PL(d) = PL_0 + 10\,\gamma \log_{10}\left(\frac{d}{d_0}\right) + X_s, \tag{28}$$

where $d_0$ is the breakpoint, $PL_0$ is the path loss at $d_0$, $\gamma$ is the path loss factor, $d$ is the distance between the transmitter and receiver, and $X_s$ is a normal random variable with standard deviation $\sigma_s$.
In experiments, we assume $d_0 = 2$ m, $PL_0 = 70$ dB, $\gamma = 4$, $\sigma_s = 6$ dB, the output power of the transmitter is −30 dBm, the noise power at the receiver is $N_0 = -174$ dBm, and the antennas of the transmitter and receiver have no gain. We further assume that the relay nodes and the sink node know perfect channel state information. The residual errors in the data after error correction at the sink node are evaluated in terms of the normalized mean square error (NMSE) defined in (29), where $N$ is the total number of samples. In our evaluations, an AR model is first fitted by an offline process with the Yule-Walker method [21]. If the order or the prediction error of the fitted AR model is too large, an RHW model [22] is used instead. At the runtime stage, the parameters of the AR and RHW models are updated with each window of observations using the Yule-Walker method [21] and the method detailed in [22], respectively. The residual errors $\mathrm{NMSE}_{out}$ are compared with the $\mathrm{NMSE}_{in}$ of the error-corrupted data received at the sink node. For example, we list the improvement factors ($\mathrm{NMSE}_{in}/\mathrm{NMSE}_{out}$) of the proposed channel-aware Bayesian model in Table 1, where the sensor node directly routes the observations of the monitored phenomenon J to the sink node and the distance between the sensor node and the sink node is 40 m. In Table 1, we also give the parameters of the data sets used in our evaluation.
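To give a feel for the numbers in this setup, the sketch below evaluates the one-slope path loss with the stated parameters and converts the resulting link budget into a bit error probability. Treating the ratio of received power to noise power as the per-bit signal-to-noise ratio of binary antipodal signaling is a simplifying assumption made here, as are the function names; shadowing is set to zero by default.

```python
import math


def path_loss_db(distance_m: float, d0_m: float = 2.0, pl0_db: float = 70.0,
                 exponent: float = 4.0, shadowing_db: float = 0.0) -> float:
    """One-slope log-distance path loss in dB."""
    return pl0_db + 10.0 * exponent * math.log10(distance_m / d0_m) + shadowing_db


def one_hop_ber(distance_m: float, tx_power_dbm: float = -30.0,
                noise_dbm: float = -174.0, shadowing_db: float = 0.0) -> float:
    """Bit error probability, assuming SNR = received power / noise power
    acts as E_b/N_0 for binary antipodal signaling (an assumption)."""
    rx_dbm = tx_power_dbm - path_loss_db(distance_m, shadowing_db=shadowing_db)
    snr = 10.0 ** ((rx_dbm - noise_dbm) / 10.0)
    return 0.5 * math.erfc(math.sqrt(snr))  # equals Q(sqrt(2 * snr))


loss_20m = path_loss_db(20.0)
ber_by_distance = {d: one_hop_ber(d) for d in (20.0, 40.0, 60.0, 80.0, 100.0)}
```

Under these assumptions the error probability is negligible at short range and rises steeply toward 100 m, which is the range of hop distances explored in the one-hop evaluation.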

Evaluation in One-Hop WSNs.
In this subsection, we evaluate the error resilience of the proposed Bayesian model, Reed-Solomon (RS) coding, and discrete time series models in one-hop WSNs, where the sensor node directly routes the observations of the monitored phenomenon J to the sink node and the distance between the sensor node and the sink node varies from 20 m to 100 m. Figures 9, 10, and 11 compare the NMSE of the data corrected by the proposed Bayesian model and by RS coding, and of the data estimated by discrete time series models (i.e., RHW or AR models), with the NMSE of the error-corrupted data received by the sink node. In Figures 9, 10, and 11, NMSE floors of the data estimated by discrete time series models are readily noticeable; they are dominated by estimation lags around relatively fast changes of the physical phenomena. The error-corrupted data received at the sink node and the data corrected by the Bayesian model and RS coding show NMSE floors as well; however, these floors are much lower than the error floor of the data estimated by RHW or AR models, and their common source is quantization error. From Theorem 4 and (28), we know that the bit error probability of one-hop transmission increases with the hop distance. Owing to its error correction ability, RS coding outperforms the proposed Bayesian model and the discrete time series models at very low channel error levels. However, this trend is reversed at high channel error levels, because the uncorrected decoding errors of RS coding incur a large NMSE in this case. The other side effect of RS coding is high communication overhead. For example, RS coding with a code rate of 0.43, which can correct nearly all errors at very low channel error levels, will increase the communication overhead by 57%.
Unlike FEC, the overhead of the proposed Bayesian model is increased computation at the sink node, which is caused by probability multiplications and can be reduced by using log-domain additions. Depending on the channel error level, the channel-aware Bayesian model assigns more confidence to the quantization levels with higher likelihood. In particular, at very low channel error levels (e.g., when the hop distance is 20 m), the quantization levels at small Hamming distances (i.e., 0 or 1) from the channel output have a significant chance of being taken as the channel input by the channel-aware Bayesian model. Hence, at very low channel error levels the channel-aware Bayesian model performs much better than the AR and RHW models, which only exploit temporal correlations among observations.
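The log-domain idea mentioned above can be sketched as follows: instead of multiplying the channel and source probabilities, their logarithms are added, so the per-level score depends only on the Hamming distance to the received word and the log prior. The sketch assumes natural binary coding of level indices, independent bit errors with probability strictly between 0 and 1, and strictly positive priors; the function name is hypothetical.

```python
import math
from typing import Sequence


def log_map_decision(received: int, bits_per_sample: int,
                     bit_error_prob: float, log_prior: Sequence[float]) -> int:
    """Level index maximizing log P(received / level) + log P(level)."""
    log_p = math.log(bit_error_prob)       # log-probability of a flipped bit
    log_q = math.log1p(-bit_error_prob)    # log-probability of a correct bit
    best_level, best_score = 0, -math.inf
    for level, lp in enumerate(log_prior):
        d = bin(level ^ received).count("1")  # Hamming distance
        score = d * log_p + (bits_per_sample - d) * log_q + lp
        if score > best_score:
            best_level, best_score = level, score
    return best_level


# Toy case with K = 4 levels (2 bits); the prior values are made up.
skewed_log_prior = [math.log(v) for v in (0.97, 0.01, 0.01, 0.01)]
uniform_log_prior = [math.log(0.25)] * 4
toy_choice = log_map_decision(1, 2, 0.1, skewed_log_prior)
```

Because only the ordering of the scores matters, the normalizing denominator of Bayes' rule never has to be computed in this form.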

Evaluation in Multihop WSNs with Amplify-and-Forward Relay Nodes.
In this subsection, we evaluate the error resilience of our model, RS coding, and discrete time series models in multihop WSNs, where the readings of the monitored phenomenon observed by the sensor node are routed to the sink node by $R$ ($R = 1, 2, 3, 4$) relay nodes. The distance between the sensor node and the sink node is 200 m, and the distance of every hop is $200/(R + 1)$ m. Figures 12, 13, and 14 compare the performance of our model with RS coding and discrete time series models. In multihop sensor networks where each relay node amplifies its received signal and forwards it to the next relay node or to the sink node, the bit error probability of the DTC is given by Theorem 5. As shown by both Theorem 5 and the experimental results, the reliability of the end-to-end link in multihop networks degrades considerably compared with one-hop networks with the same average distance per hop, because the noise amplified at every relay node along the transmission path accumulates at the sink node. However, for a given coverage field, deploying more AF relay nodes can still improve the reliability of data reception at the sink node. Similar to the results in one-hop networks, our model is still superior to the AR and RHW models and performs as well as RS coding in all cases. However, because of the increased noise level at the sink node in multihop networks with AF relay nodes, the performance gain of our model over RS coding at medium to large hop distances is increased, and the performance degradation of our model at small hop distances is decreased. Moreover, as in one-hop networks, our model brings no communication overhead; its overhead is increased computation at the sink node.

Conclusion and Future Works
In this paper, we have presented a channel-aware Bayesian model to reliably recover data from transient errors on the data transfer channel between the sensor node that monitors a phenomenon and the sink node, and a discrete $K$-ary input and $K$-ary output channel has been provided to formulate the error information on the data transfer channel. Using real sensor data from CDEC, we have evaluated the performance of our method in both one-hop and multihop sensor networks. In all scenarios, the channel-aware Bayesian model is clearly superior to the discrete time series models (i.e., RHW or AR models), which merely exploit the correlations among the observations of the monitored phenomena. Furthermore, the proposed channel-aware Bayesian model performs as well as RS coding with code rate 0.43; however, the latter increases the communication overhead by 57%. Besides, RS coding can only control wireless channel errors. The shortcoming of the channel-aware Bayesian model, which is the increased computation caused by probability multiplications at the sink node, can be addressed by log-domain algorithms. We have identified several future research avenues. First, we currently assume that the sink node and relay nodes know perfect channel state information, and in-field model performance still needs future evaluation. Moreover, combining the proposed algorithms with distributed data fusion and signal detection algorithms that consider noise at the sensor of each individual sensor node is the most important extension for future work.