Minimum Energy Decentralized Estimation in a Wireless Sensor Network with Correlated Sensor Noises

Consider the problem of estimating an unknown parameter by a sensor network with a fusion center (FC). Sensor observations are corrupted by additive noises with an arbitrary spatial correlation. Due to bandwidth and energy limitations, each sensor is only able to transmit a finite number of bits to the FC, while the latter must combine the received bits to estimate the unknown parameter. We require the decentralized estimator to have a mean-squared error (MSE) that is within a constant factor of that of the best linear unbiased estimator (BLUE). We minimize the total sensor transmitted energy by selecting sensor quantization levels using the knowledge of the noise covariance matrix while meeting the target MSE requirement. Computer simulations show that our designs can achieve energy savings of up to 70% when compared to the uniform quantization strategy whereby each sensor generates the same number of bits, irrespective of the quality of its observation and the condition of its channel to the FC.


INTRODUCTION
Wireless sensor networks (WSNs) are ideal for environmental monitoring applications because of their low implementation cost, agility, and robustness to sensor failures. A popular WSN architecture consists of a fusion center (FC) and a large number of spatially distributed sensors. The FC can be either a standard base station or a mobile access point such as an unmanned aerial vehicle hovering over the sensor field. Each sensor in a WSN is responsible for local data collection as well as occasional transmission of a summary of its observations to the FC via a wireless link. In a practical WSN, each sensor has only limited computation and communication capabilities due to various design considerations such as small battery size, bandwidth, and cost. As a result, it is difficult for sensors to send their entire real-valued observations to the FC. Instead, a more practical decentralized estimation scheme is to let each sensor quantize its real-valued local measurement to an appropriate length and send the resulting discrete message (typically short) to the FC, while the latter combines all the received messages to produce a final estimate of the unknown parameter. Naturally, the message lengths are dictated by the power and bandwidth limitations, the sensor noise characteristics, and the desired final estimation accuracy.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Recently, several decentralized estimation schemes (DESs) [1,2,3,4] have been proposed for parameter estimation in the presence of additive sensor noise. These DESs require each sensor to send only a few bits to the fusion center, with the message length determined by the sensor's local SNR. The performance of the resulting estimator is shown to be within a constant factor of the best linear unbiased estimator (BLUE) performance. While the designs suggested by [1,2,3,4] give a guaranteed estimation performance with a low bandwidth requirement, the effect of wireless channel distortion and the important issue of total sensor energy minimization were not directly modelled.
In a practical WSN, the wireless links from sensors to the FC may have different qualities, depending on the sensor locations relative to the FC. Intuitively, the local message length should depend not only on the quality of the sensor's observation (i.e., local SNR), but also on the quality of its wireless link to the FC. In particular, even if a sensor has a high-quality observation, it should not perform any local quantization or transmission when its wireless link to the FC is weak, in order to conserve sensor energy. In general, minimizing the total sensor energy consumption for a decentralized estimation task is essential to ensure a long lifespan of a WSN. Motivated by these considerations, the authors of [5,6] proposed optimal coded and uncoded transmission strategies for sensor networks which can minimize the required energy per transmitted bit, although no consideration was given to the quantization effect and the accuracy of the final estimate. In the recent work of [7,8], the authors considered the problem of optimal energy scheduling for decentralized estimation where sensor measurements are corrupted by additive noises, while communication links from sensors to the fusion center differ in quality. In particular, [7] used an adaptive modulation scheme with an exponential dependence of energy on the transmitted message size, and then derived optimal sensor power and quantization levels via convex optimization.
The aforementioned results all require an important assumption that sensor observation noises are spatially uncorrelated. Unfortunately, this assumption can be restrictive in a practical WSN, especially when sensors are densely deployed. In this paper, we consider distributed parameter estimation in situations where sensor observations are corrupted by correlated additive noises. Assuming a standard energy model [5,6], uniform quantization at sensors, and the knowledge of the sensor noise correlation matrix, we use convex optimization techniques to derive a nearly-optimal (modulo a minor relaxation) energy scheduling strategy with a mean-squared error performance guaranteed to be within a constant factor of that of the centralized BLUE estimator. Computer simulations show that our designs can achieve energy savings of up to 70% when compared to the uniform bit allocation strategy whereby each sensor generates the same number of bits.
Our sensor energy scheduling strategy is suitable for direct application when the sensor noise correlation matrix is available at the FC. In practice, the sensor noise correlation matrix may have to be determined in the sensor network calibration phase, possibly with the help of training signals. In the absence of this knowledge, our scheme is also useful as it provides an upper bound on the performance of all other energy scheduling schemes, both centralized and distributed. In fact, our scheme gives an estimate of the amount of energy "wasted" due to the lack of sensor noise correlation knowledge. The power schedules generated by our design also give insight into the design of distributed energy scheduling algorithms.
Our paper is organized as follows. In Section 2 we describe the DES and formulate the total energy minimization problem. In Section 3 we present a convex relaxation of the energy minimization problem and give a nearly-optimal solution in closed form. The performance of our energy-efficient design is analyzed in Section 4 by numerical simulation. Section 5 contains an extension of the work where we formulate an alternative problem of minimizing the maximal individual sensor energy and present an analytic solution. Final remarks are given in Section 6.
Throughout, we use the following notation. Matrices and vectors are denoted by boldface letters, capital and small correspondingly, whereas the same regular letters with indices denote their elements. The diagonal matrix with nonzero elements $a_1, \dots, a_N$ is denoted by $\operatorname{diag}(a_1, \dots, a_N)$. Logarithms denoted by $\log(\cdot)$ are taken to the base 2; for natural logarithms the notation $\ln(\cdot)$ is used. For any real number $x \in \mathbb{R}$, we use $\lceil x \rceil$ to denote the smallest integer greater than or equal to $x$. For any random variable $R$, we use $E_x R$ to denote the expected value of $R$ taken with respect to the random variable $x$, while $E_{x|y} R$ denotes the expected value of $R$ with respect to $x$ given $y$. Finally, $\operatorname{var} R$ denotes the variance of the random variable $R$.

PROBLEM FORMULATION
Consider the problem of estimating an unknown parameter $\theta$ by a sensor network consisting of $N$ sensors. The measurement $x_i$ of each sensor is corrupted by additive noise $n_i$ so that
$$x_i = \theta + n_i, \quad i = 1, \dots, N. \tag{1}$$
We assume that both $\theta$ and $n_i$ have finite range, so that all $x_i$ belong to a common finite interval $[-U, U]$, with $U > 0$ a known constant. The noises $n_i$ are assumed to be zero mean and correlated across sensors with covariance matrix $\mathbf{C}$, but their distribution is otherwise unknown. We assume $\mathbf{C}$ is known at the FC. Measurements $x_i$ are quantized to produce messages $m_i$ to be passed on to the fusion center; the latter then combines the received messages in order to estimate $\theta$, see Figure 1. The exact form of $m_i$ will be detailed later. We assume that each sensor sends messages to the FC using a separate channel. This can be achieved by using a multiple access technique such as TDMA or FDMA. Each channel is corrupted by additive white Gaussian noise (AWGN) with power spectral density $N_0/2$:
$$\tilde{m}_i = m_i + v_i, \tag{2}$$
where $\tilde{m}_i$ is the received message at the FC and $v_i$ is the AWGN. The signal power received at the FC is assumed to be inversely proportional to $d_i^{\kappa}$, where $d_i$ is the distance between sensor $i$ and the FC, and $\kappa$ is the path loss exponent. Suppose that message $m_i$ has length $b_i$ bits. We will assume that the energy $W_i$ required for transmission of $m_i$ is proportional to the number of bits in the message. This is the case, for example, if sensors use M-QAM or M-PSK modulation to transmit messages. For example, if M-QAM is used, $W_i$ takes the form [5,6]
$$W_i = w_i b_i, \tag{3}$$
where the per-bit energy $w_i$ grows as $d_i^{\kappa}$ and depends on the following quantities: $s = \log M$, the number of bits per symbol; $N_f$, the receiver noise figure; $P_b$, the required bit error probability; the noise density $N_0$; and $G_0$, the system constant defined as in [5].

Quantization strategy
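Suppose that sensor observation $x_i$ is bounded to a finite interval $[-U, U]$. Suppose further that we wish to quantize $x_i$ in such a way that the resulting message $m_i$ has length $b_i$ bits, where $b_i$ is to be determined later. We therefore have $K_i = 2^{b_i}$ quantization points $a_1^{(i)} < \dots < a_{K_i}^{(i)}$, spaced uniformly at distance $\Delta_i$ so that they divide the observation range into $K_i - 1$ intervals; it follows that $\Delta_i = 2U/(K_i - 1)$. Quantization is done in the following probabilistic manner. Suppose that $x_i \in [a_k^{(i)}, a_{k+1}^{(i)})$. Then $x_i$ is quantized to either $a_{k+1}^{(i)}$ or $a_k^{(i)}$ according to
$$P\big(m_i = a_k^{(i)}\big) = \frac{a_{k+1}^{(i)} - x_i}{\Delta_i}, \qquad P\big(m_i = a_{k+1}^{(i)}\big) = \frac{x_i - a_k^{(i)}}{\Delta_i}. \tag{4}$$
This probabilistic quantization produces a message $m_i$ whose expected value equals the observation itself:
$$E_{p_i} m_i = x_i, \tag{5}$$
where the expectation $E_{p_i}$ is taken with respect to the probabilistic quantization noise model (4). Next, we consider any fixed observation value of $x_i$, and bound the variance $\operatorname{var} m_i$ (taken with respect to the quantization noise) as follows. Suppose $x_i$ falls in the interval $[a_k^{(i)}, a_{k+1}^{(i)})$. We denote $r = a_{k+1}^{(i)} - x_i$ and obtain
$$\operatorname{var} m_i = (\Delta_i - r)^2 \frac{r}{\Delta_i} + r^2 \frac{\Delta_i - r}{\Delta_i} = r(\Delta_i - r) \le \frac{\Delta_i^2}{4}. \tag{6}$$
Thus, the maximum variance of $m_i$ is equal to $\Delta_i^2/4$ and is achieved when the observation $x_i$ falls in the middle of the quantization interval $[a_k^{(i)}, a_{k+1}^{(i)})$.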
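The probabilistic quantizer described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `quantize` and `exact_moments` are hypothetical, the grid is assumed to be $K = 2^{b}$ uniformly spaced points on $[-U, U]$, and at least one bit per message is assumed. The second function computes the exact mean and variance of the message for a fixed observation, so the unbiasedness property and the $\Delta^2/4$ variance bound can be checked directly.

```python
import math
import random


def _bracket(x, U, bits):
    """Return (lower, upper, delta): the two grid points around x and the grid step.

    Assumes bits >= 1 and -U <= x <= U (illustrative assumptions).
    """
    K = 2 ** bits                      # number of quantization points
    delta = 2.0 * U / (K - 1)          # spacing between adjacent points
    # Index of the grid point at or just below x, clamped so that k + 1 is valid.
    k = min(int(math.floor((x + U) / delta)), K - 2)
    lower = -U + k * delta
    return lower, lower + delta, delta


def quantize(x, U, bits, rng=random):
    """Randomly round x to one of its two neighbouring grid points."""
    lower, upper, delta = _bracket(x, U, bits)
    p_up = (x - lower) / delta         # probability of rounding up
    return upper if rng.random() < p_up else lower


def exact_moments(x, U, bits):
    """Exact mean and variance of the quantized message for a fixed x."""
    lower, upper, delta = _bracket(x, U, bits)
    p_up = (x - lower) / delta
    mean = p_up * upper + (1.0 - p_up) * lower
    var = p_up * (upper - mean) ** 2 + (1.0 - p_up) * (lower - mean) ** 2
    return mean, var
```

For any fixed observation, `exact_moments` should return a mean equal to the observation and a variance no larger than `delta ** 2 / 4`, with the largest variance at the midpoint of a quantization interval.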

A linear fusion rule
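The classical best linear unbiased estimator (BLUE) for $\theta$ is given by [9]
$$\bar\theta = \big(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\big)^{-1} \mathbf{1}^T \mathbf{C}^{-1} \mathbf{x}, \tag{7}$$
where $\mathbf{x} = (x_1, \dots, x_N)^T$ and $\mathbf{1}$ is the vector of all ones. Estimation performance is characterized by the variance of the estimator
$$\operatorname{var}\bar\theta = \big(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\big)^{-1}. \tag{8}$$
To implement BLUE exactly in a WSN setup, we must have $m_i = x_i$ (i.e., a real-valued message) and assume that the channel is distortion-less, both of which are unrealistic in practice. Nonetheless, the BLUE estimator serves as a good performance benchmark for the DES to be designed. Motivated by the centralized BLUE, we adopt the following fusion rule: upon receiving sensor messages $m_i$, the FC combines them into an estimator $\hat\theta$ given by
$$\hat\theta = \big(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\big)^{-1} \mathbf{1}^T \mathbf{C}^{-1} \mathbf{m}, \tag{9}$$
where $\mathbf{m} = (m_1, \dots, m_N)^T$. Equation (5) gives us an important property of $\hat\theta$: it is an unbiased estimator for $\theta$. Indeed, we have
$$E\hat\theta = E_{\mathbf{x}} E_p \hat\theta = E_{\mathbf{x}}\Big[\big(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\big)^{-1} \mathbf{1}^T \mathbf{C}^{-1} E_p \mathbf{m}\Big] = E_{\mathbf{x}}\Big[\big(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\big)^{-1} \mathbf{1}^T \mathbf{C}^{-1} \mathbf{x}\Big] = E_{\mathbf{x}} \bar\theta = \theta, \tag{10}$$
where $E_p$ denotes expectation taken with respect to all sensor quantization noises, and the last step is due to $E_{\mathbf{x}} \mathbf{x} = \theta\mathbf{1}$. The mean-squared error (MSE) of $\hat\theta$ can be expanded as follows:
$$\mathrm{MSE}(\hat\theta) = E(\hat\theta - \theta)^2 = E(\hat\theta - \bar\theta)^2 + E(\bar\theta - \theta)^2 + 2E(\hat\theta - \bar\theta)(\bar\theta - \theta). \tag{11}$$
Consider the third term in the last expression. We have
$$E(\hat\theta - \bar\theta)(\bar\theta - \theta) = E_{\mathbf{x}}\big[(\bar\theta - \theta)\, E_p(\hat\theta - \bar\theta)\big] = 0, \tag{12}$$
where the first step is due to the fact that $\bar\theta$ is independent of $\mathbf{m}$ for any fixed $\mathbf{x}$, and the last step follows from (10), which shows that $E_p \hat\theta = \bar\theta$ for any fixed $\mathbf{x}$. Thus, we can write
$$\mathrm{MSE}(\hat\theta) = \operatorname{var}\bar\theta + \big(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\big)^{-2}\, \mathbf{1}^T \mathbf{C}^{-1} \mathbf{Q} \mathbf{C}^{-1} \mathbf{1}, \tag{13}$$
where
$$\mathbf{Q} = E(\mathbf{m} - \mathbf{x})(\mathbf{m} - \mathbf{x})^T \tag{14}$$
is the quantization noise correlation matrix.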
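As an illustration of the linear fusion rule, the sketch below computes the weights $\mathbf{C}^{-1}\mathbf{1}$, the fused estimate, and the BLUE variance $1/(\mathbf{1}^T\mathbf{C}^{-1}\mathbf{1})$ for a small covariance matrix. It is a minimal example under stated assumptions: `solve` and `fuse` are hypothetical helper names, the covariance matrix is assumed to be nonsingular, and a plain Gaussian elimination stands in for whatever linear solver the FC would actually use.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting (A nonsingular)."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def fuse(C, m):
    """Return (estimate, blue_variance, c) for messages m and noise covariance C."""
    c = solve(C, [1.0] * len(m))       # c = C^{-1} 1
    total = sum(c)                     # 1^T C^{-1} 1
    estimate = sum(ci * mi for ci, mi in zip(c, m)) / total
    return estimate, 1.0 / total, c
```

Because the normalized weights sum to one, feeding identical messages into `fuse` returns that common value, which is the unbiasedness property in its simplest form.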
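In our formulation, we seek an energy-efficient DES which can deliver an MSE performance that is comparable to that of the centralized BLUE estimator. Specifically, we will minimize the transmission energy while keeping $\mathrm{MSE}(\hat\theta)$ within a constant factor of the BLUE performance, that is, $\mathrm{MSE}(\hat\theta) \le (1 + \alpha)\operatorname{var}\bar\theta$ for some constant $\alpha > 0$. Therefore, the following condition must hold:
$$\operatorname{var}\bar\theta + \big(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\big)^{-2}\, \mathbf{1}^T \mathbf{C}^{-1} \mathbf{Q} \mathbf{C}^{-1} \mathbf{1} \le (1 + \alpha)\operatorname{var}\bar\theta. \tag{15}$$
The total sensor transmission energy is equal to
$$W = \sum_{i=1}^{N} W_i = \sum_{i=1}^{N} w_i b_i, \tag{16}$$
where $w_i$ is the energy required for transmission of a single bit from sensor $i$ to the FC; see (3). Therefore, the minimum energy DES design problem becomes
$$\text{minimize } \sum_{i=1}^{N} w_i b_i \quad \text{subject to (15)}, \quad b_i \in \mathbb{N},\ i = 1, \dots, N, \tag{17}$$
where $\mathbb{N}$ denotes the set of nonnegative integers.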
To complete the formulation, we need to make explicit the dependence of Q on b i .The unbiasedness of our quantization strategy leads to the following important property on the quantization noise correlation matrix Q.
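Lemma 1. The quantization noise matrix $\mathbf{Q}$ is diagonal.
Proof. Consider any $(i, j)$th element of the matrix $\mathbf{Q}$, with $i \ne j$. We have
$$Q_{ij} = E(m_i - x_i)(m_j - x_j) = E_{\mathbf{x}}\big[E_{p_i}(m_i - x_i)\, E_{p_j}(m_j - x_j)\big] = 0. \tag{18}$$
Here we use the fact that the random variables $m_i$ and $m_j$ are conditionally independent given the corresponding observations $x_i$ and $x_j$, which together with (5) gives the desired result.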
Lemma 1 states that all the off-diagonal entries of $\mathbf{Q}$ must be zero. Let $Q_{ii}$ be the $i$th diagonal element of $\mathbf{Q}$. Recalling (6) and $\Delta_i = 2U/(2^{b_i} - 1)$, we obtain the following important bound on the diagonal entries of $\mathbf{Q}$:
$$Q_{ii} \le \frac{\Delta_i^2}{4} = \frac{U^2}{(2^{b_i} - 1)^2}, \tag{19}$$
where $b_i$ is the number of bits in $m_i$. This bound will be useful in our final formulation of the energy minimization problem.

Total energy minimization
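We introduce the notation $\mathbf{c} = \mathbf{C}^{-1}\mathbf{1}$ and $\beta = \alpha / \operatorname{var}\bar\theta$. Since $\operatorname{var}\bar\theta = 1/(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1})$, we can rewrite the MSE condition (15) as
$$\mathbf{c}^T \mathbf{Q} \mathbf{c} \le \beta. \tag{20}$$
This constraint ensures that the MSE performance of the DES is within a factor of $(1 + \alpha)$ of the BLUE performance. Since the distribution of $\mathbf{x}$ is unknown in general, we enforce a stronger condition, namely
$$\max_{\mathbf{x}}\ \mathbf{c}^T \mathbf{Q} \mathbf{c} \le \beta. \tag{21}$$
Recalling that $\mathbf{Q}$ is diagonal (cf. Lemma 1), we can use the bound (19) to rewrite the above condition as
$$\max_{\mathbf{x}}\ \mathbf{c}^T \mathbf{Q} \mathbf{c} \le \sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} \le \beta. \tag{22}$$
Now we can reformulate the original energy minimization problem (17) explicitly as follows:
$$\text{minimize } \sum_{i=1}^{N} w_i b_i \quad \text{subject to } \sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} \le \beta, \quad b_i \in \mathbb{N},\ i = 1, \dots, N. \tag{23}$$
To relate this formulation to physical parameters, we note that the wireless channel conditions, the choice of modulation and BER, and so forth will determine the values of the weighting factors $w_i$, as shown in (3). The values of $c_i$ are determined by the noise correlation matrix $\mathbf{C}$. Without loss of generality we assume $c_i \ne 0$ for all $i$. In case $c_i = 0$ for some sensors, we can exclude the corresponding $m_i$ from fusion consideration, as it does not contribute to the fusion estimate $\hat\theta$.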

CONVEX RELAXATION WITH A CLOSED-FORM NEARLY-OPTIMAL SOLUTION
Since $b_i$ can only take integer values, problem (23) is actually a nonlinear integer program, a class of problems that is NP-hard in general. To make this problem computationally tractable, we relax the integer constraints on $b_i$ to allow them to take real nonnegative values:
$$\text{minimize } \sum_{i=1}^{N} w_i b_i \quad \text{subject to } \sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} \le \beta, \quad b_i \ge 0,\ i = 1, \dots, N. \tag{24}$$
The relaxed problem (24) has a linear objective function and convex inequality constraints. Therefore, a solution to problem (24) can be efficiently found by the fusion center using convex optimization techniques such as interior point methods [10]. Once the optimal $b_i$'s are found, the fusion center can round each of them up to the nearest integer and broadcast the result to the sensors for power adjustment.
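In what follows, we will present an approximately optimal solution to the problem (24) in closed form. Such a closed-form solution not only simplifies the energy scheduling process, but also provides valuable insight into the optimal power-scheduling scheme. To begin, we first note that, by a simple monotonicity argument, the main MSE constraint will be active (i.e., holds with equality) at any optimum point, while the remaining nonnegativity constraints on $b_i$ will be inactive since $b_i = 0$ for some $i$ would violate the main MSE constraint. Therefore, we can ignore the nonnegativity constraints (since the Lagrangian multipliers associated with these constraints will be zero). Associating a multiplier $\lambda$ with the MSE constraint, we can write the Lagrangian for the problem (24) as follows:
$$L(\mathbf{b}, \lambda) = \sum_{i=1}^{N} w_i b_i + \lambda\left(\sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} - \beta\right). \tag{25}$$
At the point of optimum we must have $\partial L/\partial b_i = 0$ for $i = 1, \dots, N$, yielding the following set of conditions:
$$w_i - 2\lambda \ln 2\, \frac{c_i^2 U^2\, 2^{b_i}}{(2^{b_i} - 1)^3} = 0, \quad i = 1, \dots, N, \tag{26}$$
or alternatively
$$\frac{2^{b_i}}{(2^{b_i} - 1)^3} = \frac{\tilde\lambda w_i}{c_i^2 U^2}, \quad i = 1, \dots, N, \tag{27}$$
where $\tilde\lambda = 1/(2\lambda \ln 2)$. Also, the main MSE constraint holds with equality at the optimum point (as noted above), yielding
$$\sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} = \beta. \tag{28}$$
The optimal solutions $\{b_i, \tilde\lambda\}$ can be found from the nonlinear equations (27) and (28), which unfortunately cannot be solved in closed form. To facilitate a closed-form solution, we consider a slightly modified system in variables $\{b_i^*, \tilde\lambda^*\}$:
$$\sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i^*} - 1)^2} = \beta, \tag{29}$$
$$\frac{2^{b_i^*} - 1}{(2^{b_i^*} - 1)^3} = \frac{\tilde\lambda^* w_i}{c_i^2 U^2}, \quad i = 1, \dots, N. \tag{30}$$
The above system is almost identical to the original Karush-Kuhn-Tucker (KKT) system (27) and (28) except for the small change in the numerators of the left-hand sides of (30) and (27). Simple algebraic manipulation shows that (29) and (30) can be solved analytically, yielding
$$\tilde\lambda^* = \frac{\beta}{\sum_{j=1}^{N} w_j}. \tag{31}$$
Substituting this $\tilde\lambda^*$ into (30) gives the following feasible solution to the original energy scheduling problem (24):
$$b_i^* = \log\left(1 + \frac{U |c_i|}{\sqrt{\tilde\lambda^* w_i}}\right) = \log\left(1 + U |c_i| \sqrt{\frac{\sum_{j=1}^{N} w_j}{\beta w_i}}\right), \quad i = 1, \dots, N. \tag{32}$$
It remains to quantify the performance of this particular energy scheduling strategy. This is the content of the next two lemmas.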
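The closed-form schedule is simple enough to evaluate directly. The sketch below is a hedged illustration rather than the authors' code: it assumes the MSE constraint has the form $\sum_i c_i^2 U^2/(2^{b_i}-1)^2 \le \beta$, so that the closed-form multiplier is $\beta/\sum_j w_j$ and each message length is $\log_2\big(1 + U|c_i|/\sqrt{\tilde\lambda^* w_i}\big)$. The function names and the numbers in the usage note are made up for illustration.

```python
import math


def closed_form_bits(c, w, U, beta):
    """Real-valued message lengths from the closed-form rule (assumed form, see lead-in)."""
    lam_star = beta / sum(w)           # closed-form multiplier
    return [math.log2(1.0 + U * abs(ci) / math.sqrt(lam_star * wi))
            for ci, wi in zip(c, w)]


def mse_excess(c, b, U):
    """Left-hand side of the MSE constraint for a bit allocation b (all b_i > 0)."""
    return sum((ci * U / (2.0 ** bi - 1.0)) ** 2 for ci, bi in zip(c, b))


def total_energy(w, b):
    """Total transmission energy sum_i w_i * b_i."""
    return sum(wi * bi for wi, bi in zip(w, b))
```

With, say, `c = [1.0, 0.5, 2.0]`, `w = [1.0, 4.0, 0.5]`, `U = 1.0` and `beta = 0.1`, the real-valued allocation meets the constraint with equality, and rounding each entry up with `math.ceil` can only decrease `mse_excess`, so the rounded allocation stays feasible.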
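Lemma 2. Let $\{b_i, \tilde\lambda\}$ be the optimal solution to the problem (24) such that $b_i \ge 1$ for all $i$, and let $\{b_i^*, \tilde\lambda^*\}$ be its approximation defined by (29) and (30). Then
$$\tilde\lambda^* \le \tilde\lambda \le 2\tilde\lambda^*. \tag{33}$$
Proof. Since $2^{b_i} \ge 2^{b_i} - 1$, a lower bound on $\tilde\lambda$ (equivalently, an upper bound on $\lambda$) can be found using (27) as follows:
$$\tilde\lambda w_i = \frac{c_i^2 U^2\, 2^{b_i}}{(2^{b_i} - 1)^3} \ge \frac{c_i^2 U^2}{(2^{b_i} - 1)^2}.$$
Summing over $i$ and using (28) gives $\tilde\lambda \sum_i w_i \ge \beta$, and we conclude that $\tilde\lambda^* \le \tilde\lambda$. On the other hand, if all $b_i \ge 1$, then $2^{b_i}/(2^{b_i} - 1) \le 2$ and we can write
$$\frac{1}{(2^{b_i} - 1)^2} \le \frac{\tilde\lambda w_i}{c_i^2 U^2} \le \frac{2}{(2^{b_i} - 1)^2}. \tag{36}$$
Summing the right-hand inequality over $i$ and using (28) gives $\tilde\lambda \sum_i w_i \le 2\beta$, therefore $\tilde\lambda \le 2\tilde\lambda^*$, and the result of the lemma follows.
We now bound the difference $b_i - b_i^*$.
Lemma 3. Under the conditions of Lemma 2, $b_i^* - 1/2 \le b_i \le b_i^* + 1/2$ for all $i$.
Proof. Using the left-hand side of (36), $1/(2^{b_i} - 1)^2 \le \tilde\lambda w_i/(c_i^2 U^2)$, the right-hand side of (33), $\tilde\lambda \le 2\tilde\lambda^*$, and (30), we can write
$$\frac{1}{(2^{b_i} - 1)^2} \le \frac{2\tilde\lambda^* w_i}{c_i^2 U^2} = \frac{2}{(2^{b_i^*} - 1)^2},$$
which gives the lower bound on $b_i$:
$$b_i \ge \log\left(1 + \frac{2^{b_i^*} - 1}{\sqrt{2}}\right) \ge b_i^* - \frac{1}{2}.$$
By analogy, from the right-hand side of (36) and the left-hand side of (33) we have
$$\frac{1}{(2^{b_i^*} - 1)^2} = \frac{\tilde\lambda^* w_i}{c_i^2 U^2} \le \frac{\tilde\lambda w_i}{c_i^2 U^2} \le \frac{2}{(2^{b_i} - 1)^2},$$
which further implies
$$b_i \le \log\left(1 + \sqrt{2}\,(2^{b_i^*} - 1)\right) \le b_i^* + \frac{1}{2}.$$
This completes the proof.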

Lemma 3 implies that $|\lceil b_i \rceil - \lceil b_i^* \rceil| \le 1$. Thus, the rounded optimal solution $\lceil b_i \rceil$ is at most one bit away from $\lceil b_i^* \rceil$. We can interpret this result as follows: in situations when the $b_i$ are sufficiently large, for example, when high estimation precision is required, the optimal solution behaves approximately as $\log\big(1 + U|c_i|/\sqrt{\tilde\lambda^* w_i}\big)$. Notice that $c_i = \mathbf{e}_i^T \mathbf{C}^{-1} \mathbf{1}$ ($\mathbf{e}_i$ denotes the $i$th unit vector), so $c_i$ signifies the inverse of the "noisiness" of signal $x_i$ in relation to the other sensor observations. Recalling the definition of $\tilde\lambda^*$, we note that the product $\tilde\lambda^* w_i$ is proportional to the relative energy per bit $w_i / \sum_j w_j$, and the value of $1/\sqrt{\tilde\lambda^* w_i}$ can be interpreted as being proportional to the relative quality of the wireless link between sensor $i$ and the FC. Thus, the local message length $b_i^*$ can be intuitively interpreted as being proportional to the logarithm of the product of signal quality and channel quality at sensor $i$.
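The closeness of the closed-form schedule to the optimum of the relaxed problem can be checked numerically on a small instance. The sketch below is an illustration under assumptions, not part of the original analysis: it takes the MSE constraint to be $\sum_i c_i^2 U^2/(2^{b_i}-1)^2 \le \beta$, solves the stationarity condition $2^{b_i}/(2^{b_i}-1)^3 = \tilde\lambda w_i/(c_i^2 U^2)$ by nested bisection, and compares the result with $b_i^* = \log_2\big(1 + U|c_i|/\sqrt{\tilde\lambda^* w_i}\big)$, $\tilde\lambda^* = \beta/\sum_j w_j$. All function names are hypothetical, and the search bracket for the multiplier is simply assumed to be wide enough.

```python
import math


def closed_form(c, w, U, beta):
    """Closed-form approximation (b_star, lam_star) under the assumed constraint form."""
    lam_star = beta / sum(w)
    bits = [math.log2(1.0 + U * abs(ci) / math.sqrt(lam_star * wi))
            for ci, wi in zip(c, w)]
    return bits, lam_star


def solve_y(r):
    """Solve (y + 1) / y**3 = r for y = 2**b - 1 > 0 by bisection on a log scale."""
    lo, hi = 1e-9, 1e9
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if (mid + 1.0) / mid ** 3 > r:
            lo = mid                   # left side too large: y must grow
        else:
            hi = mid
    return math.sqrt(lo * hi)


def exact_relaxed(c, w, U, beta):
    """Numerically solve the KKT system of the relaxed problem; returns (b, lam)."""
    _, lam_star = closed_form(c, w, U, beta)
    lo, hi = lam_star, 1e6 * lam_star  # bracket assumed to contain the multiplier
    lam, ys = lam_star, []
    for _ in range(100):
        lam = math.sqrt(lo * hi)
        ys = [solve_y(lam * wi / (ci * U) ** 2) for ci, wi in zip(c, w)]
        mse = sum((ci * U / y) ** 2 for ci, y in zip(c, ys))
        if mse < beta:
            lo = lam                   # constraint slack: multiplier can grow
        else:
            hi = lam
    return [math.log2(1.0 + y) for y in ys], lam
```

On instances where every optimal message length is at least one bit, the two allocations returned by `closed_form` and `exact_relaxed` should differ by no more than half a bit per sensor, and the two multipliers by no more than a factor of two.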
We now consider a special case when the use of $\{b_i^*\}$ is especially appealing. Suppose that the covariance matrix $\mathbf{C}$ has a block-diagonal structure
$$\mathbf{C} = \begin{pmatrix} \mathbf{C}_1 & & \\ & \ddots & \\ & & \mathbf{C}_J \end{pmatrix},$$
where $J$ denotes the number of blocks. This situation may occur when sensors in the network are partitioned into several clusters in such a way that sensors within each group are placed relatively close to each other and far from the rest of the sensors. Thus, sensor observations are uncorrelated unless they are generated from the same cluster. In this case the matrix $\mathbf{C}^{-1}$ is also block-diagonal:
$$\mathbf{C}^{-1} = \begin{pmatrix} \mathbf{C}_1^{-1} & & \\ & \ddots & \\ & & \mathbf{C}_J^{-1} \end{pmatrix}.$$
We assume further that sensors within each group can cooperate to learn the corresponding covariance submatrix $\mathbf{C}_j$.
The value of $\tilde\lambda^*$ can be computed by the fusion center and broadcast back to the sensors. Thus, each sensor can easily compute $c_i = [\mathbf{C}_j^{-1}\mathbf{1}]_i$ and independently find its own quantization level $b_i^*$. The advantage of this method is that the fusion center needs to broadcast only one universal message for all sensors.
To conclude this section we observe that our strategy can be applied even if sensor noises have infinite range. Indeed, with an appropriate choice of $U$, that is, if the tails of the noise pdf are negligible, the pdf can be approximated by a finite support function. However, the estimator (9) will no longer be unbiased and the cross term $E(\hat\theta - \bar\theta)(\bar\theta - \theta)$ in the MSE expression will no longer be zero. Thus, inequality (15) only defines a lower bound on estimation performance for some $\alpha$, and the gap between the left-hand side of (15) and the actual MSE is determined by the noise pdf. Therefore, full knowledge of the pdf will be required in order to specify the constants $U$ and $\alpha$ and quantify the estimation bias.

NUMERICAL SIMULATIONS
In this section, we present numerical simulations to compare the transmission energy requirement for two energy scheduling strategies: (i) quantization using the closed-form approximate solution (32); (ii) uniform bit allocation, in which all sensors quantize their observations to the same number of bits to achieve the same MSE. We denote by $b$ the number of bits used in the case of uniform bit allocation. We can find the minimum of $b$ from the MSE constraint
$$\sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b} - 1)^2} \le \beta,$$
which gives
$$b \ge \log\left(1 + U\sqrt{\frac{\sum_{i=1}^{N} c_i^2}{\beta}}\right).$$
The number of bits can only take integer values, so the total minimal energy is given by
$$W_{\mathrm{unif}} = \left\lceil \log\left(1 + U\sqrt{\frac{\sum_{i=1}^{N} c_i^2}{\beta}}\right) \right\rceil \sum_{i=1}^{N} w_i.$$
Recall that we have relaxed $b_i$ to take real values to make the problem convex. Therefore, the optimal energy obtained by allowing $b_i$ to take on real values is a lower bound on the actual optimal energy. If we round $b_i$ up to the closest integer $\lceil b_i \rceil$, we can obtain an upper bound (denoted by $W_{\mathrm{opt}}$) on the actual energy. Even though we use $b_i^*$ to approximate the actual optimal solution, significant energy can be saved when compared with the uniform bit allocation strategy in order to achieve the same target distortion. The percentage of saving is defined as
$$\frac{W_{\mathrm{unif}} - W_{\mathrm{opt}}}{W_{\mathrm{unif}}} \times 100\%.$$
For a positive random variable $R$ we define the normalized deviation $\sqrt{\operatorname{var} R}/E R$, which will be used as a measure of the absolute heterogeneity of $R$. The sensor noise variances $\{\sigma_i^2\}$ are taken to be $\sigma_i^2 = 1 + a^2 Z_i$, where $Z_i$ are i.i.d. random variables with $Z_i \sim \chi_1^2(z)$. As can be easily verified, $\{\sigma_i^2\}$ are also i.i.d. with $\sigma_i^2 \sim \chi_1^2\big((x - 1)/a^2\big)$. We control the heterogeneity of sensor noise variances by varying the parameter $a$. In Figure 2a, we suppose that sensor noises have the tri-diagonal correlation matrix (49), where $\rho = 0.2$. In Figure 2b, we suppose that sensor noises have the correlation matrix (50). In all simulations, the total number of sensors is $N = 200$. Since all coefficients $w_i$ are scaled by a common factor, in our simulation, $\{w_i\}$ are taken to be the channel path losses
$$w_i = d_i^{\kappa}. \tag{51}$$
Assume that the target estimation performance is fixed. From Figure 2 we can see that the amount of energy saving becomes significant when the local noise variances become more and more heterogeneous, assuming that all sensors have identical $w_i$. In Figure 3, we plot the percentage of energy savings versus the heterogeneity of channel gains, supposing that sensors have the same observation noise variances with tri-diagonal structure as in (49), where $\sigma_i^2 = 1$ for all $i$, and $\rho = 0.2$. Here we suppose that all sensors are uniformly distributed inside a unit disk whose center is at the FC. It is easy to show that in this case the normalized deviation of $w_i$ depends only on $\kappa$ (cf. (51)). In our simulation, we choose $1 \le \kappa \le 8$. We observe that the percentage of saving depends more on the heterogeneity of sensor noise variances than on that of channel gains. This can be understood from expression (32) for $b_i^*$, where the quantity inside the logarithm depends on $c_i$ directly, but on $w_i$ only through $1/\sqrt{w_i}$.
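The comparison between the two strategies can be reproduced on a toy instance. The sketch below is illustrative only: it assumes the MSE constraint $\sum_i c_i^2 U^2/(2^{b_i}-1)^2 \le \beta$, uses the rounded closed-form lengths $\lceil \log_2(1 + U|c_i|/\sqrt{\tilde\lambda^* w_i}) \rceil$ with $\tilde\lambda^* = \beta/\sum_j w_j$ for the proposed scheme, and gives every sensor the smallest common integer length that meets the same constraint for the uniform scheme. The function name `energy_saving_percent` and the example numbers are not from the paper.

```python
import math


def energy_saving_percent(c, w, U, beta):
    """Percentage of energy saved by the rounded closed-form schedule vs. uniform bits."""
    lam_star = beta / sum(w)
    # Proposed scheme: per-sensor lengths, rounded up to integers.
    b_star = [math.ceil(math.log2(1.0 + U * abs(ci) / math.sqrt(lam_star * wi)))
              for ci, wi in zip(c, w)]
    w_opt = sum(wi * bi for wi, bi in zip(w, b_star))
    # Uniform scheme: one common length meeting the same MSE constraint.
    b_unif = math.ceil(math.log2(1.0 + U * math.sqrt(sum(ci * ci for ci in c) / beta)))
    w_unif = b_unif * sum(w)
    return 100.0 * (w_unif - w_opt) / w_unif
```

For a homogeneous network (equal `c` and equal `w`) the two schemes coincide and the saving is zero; for the heterogeneous toy instance `c = [1.0, 0.5, 2.0]`, `w = [1.0, 4.0, 0.5]`, `U = 1.0`, `beta = 0.01` the rounded closed-form schedule spends 20.5 energy units against 27.5 for uniform allocation.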

AN EXTENSION: MINIMAX FORMULATION
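Minimizing total transmission energy results in sensors having different lifetimes. This may induce frequent changes in the network topology. An alternative approach is to minimize the maximal energy $W_i$, which leads to maximum network lifetime. Relaxing $\{b_i\}$ as in (24), we can state the problem as follows:
$$\text{minimize } \max_i\, w_i b_i \quad \text{subject to } \sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} \le \beta, \quad b_i \ge 0,\ i = 1, \dots, N, \tag{52}$$
or alternatively
$$\text{minimize } t \quad \text{subject to } w_i b_i \le t,\ \ b_i \ge 0,\ i = 1, \dots, N, \quad \sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} \le \beta. \tag{53}$$
As in Section 3, we assume that $c_i \ne 0$ for all $i$ and ignore the nonnegativity constraints $b_i \ge 0$ (which must be inactive at the optimum). The Lagrangian for problem (53) is found to be
$$L(t, \mathbf{b}, \boldsymbol{\mu}, \lambda) = t + \sum_{i=1}^{N} \mu_i (w_i b_i - t) + \lambda\left(\sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{b_i} - 1)^2} - \beta\right).$$
Differentiating $L$ with respect to the primal variables we obtain the following conditions:
$$\sum_{i=1}^{N} \mu_i = 1,$$
$$\mu_i = \frac{1}{\tilde\lambda}\, \frac{c_i^2 U^2\, 2^{b_i}}{w_i (2^{b_i} - 1)^3}, \quad i = 1, \dots, N, \tag{57}$$
where as before $\tilde\lambda = 1/(2\lambda \ln 2)$. Taking the sum of (57) over all $i$ we obtain
$$\tilde\lambda = \sum_{i=1}^{N} \frac{c_i^2 U^2\, 2^{b_i}}{w_i (2^{b_i} - 1)^3}. \tag{58}$$
Since each term in the right-hand side sum in (58) is positive, we conclude that $\tilde\lambda > 0$, therefore $\mu_i > 0$, and the complementary slackness condition gives
$$w_i b_i = t, \quad i = 1, \dots, N. \tag{59}$$
Thus, the optimal value $t_{\mathrm{opt}}$ can be found as a solution to the following equation:
$$\sum_{i=1}^{N} \frac{c_i^2 U^2}{(2^{t/w_i} - 1)^2} = \beta. \tag{60}$$
The solution $t_{\mathrm{opt}}$ is unique due to the monotonicity of the left-hand side function in (60). The FC can solve (60) and broadcast $t_{\mathrm{opt}}$ to the sensors, which in turn can determine their quantization levels locally. In this case sensor lifetime is not affected by transmitted power.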
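Since the left-hand side of the equation for the common energy level is monotone in $t$, a scalar bisection is enough to solve it. The sketch below is a minimal illustration, assuming the equation has the form $\sum_i c_i^2 U^2/(2^{t/w_i}-1)^2 = \beta$ with every sensor spending the same energy $t = w_i b_i$; the function name `minimax_energy` and the bracketing strategy are choices made for this example, and very large ratios `t / w_i` (beyond roughly 1000) would overflow a Python float.

```python
def minimax_energy(c, w, U, beta):
    """Common per-sensor energy t solving sum_i (c_i U / (2**(t/w_i) - 1))**2 = beta."""

    def mse(t):
        return sum((ci * U / (2.0 ** (t / wi) - 1.0)) ** 2 for ci, wi in zip(c, w))

    lo, hi = 0.0, 1.0
    while mse(hi) > beta:              # grow the bracket until the constraint is met
        hi *= 2.0
    for _ in range(200):               # mse is decreasing in t, so bisection applies
        mid = 0.5 * (lo + hi)
        if mse(mid) > beta:
            lo = mid
        else:
            hi = mid
    return hi


def minimax_bits(c, w, U, beta):
    """Real-valued message lengths b_i = t / w_i at the minimax energy level."""
    t = minimax_energy(c, w, U, beta)
    return [t / wi for wi in w]
```

Each sensor can recover its own message length from the broadcast value as `t / w_i`, so sensors with costlier links send fewer bits while all spend the same energy.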

CONCLUSION
In this paper we have shown that the total energy consumption required for transmission in a sensor network can be minimized if the number of quantization levels for each sensor is determined jointly by the fusion center using information about the correlation of sensor observations. We have also presented a nearly-optimal solution in closed form to the energy minimization problem which can achieve the same target estimation performance as the optimal solution. It is shown by numerical simulations that, to attain the same MSE performance, our energy-efficient quantization scheme can achieve energy savings of up to 70% when compared to a simple uniform bit allocation scheme. We plan to consider various extensions of this work in the future. These include joint estimation of a common vector signal by a WSN, and distributed least squares and target tracking for dynamic targets.

Figure 2: Percentage of energy saving increases when sensor noise variances become more heterogeneous.

Figure 3: Percentage of energy saving increases when channel gains become more heterogeneous.
in operations research. In 1989, he joined the Department of Electrical and Computer Engineering, McMaster University, Hamilton, Canada, where he became a Professor in 1998 and held the Canada Research Chair in information processing since 2001. Starting April 2003, he has been a Professor in the Department of Electrical and Computer Engineering at the University of Minnesota, and holds an ADC Chair in digital technology. His research interests lie in the union of large-scale optimization, signal processing, data communications, and information theory. He is a Member of SIAM and MPS. He is presently serving as an Associate Editor for several international journals including SIAM Journal on Optimization, Mathematical Programming, Mathematics of Computation, and Mathematics of Operations Research.