An MEF-Based Localization Algorithm against Outliers in Wireless Sensor Networks

Wang, Dandan; Wan, Jiangwen; Wang, Meimei; Zhang, Qiang

doi:10.3390/s16071041

Open AccessArticle

An MEF-Based Localization Algorithm against Outliers in Wireless Sensor Networks

by

Dandan Wang

,

Jiangwen Wan

^*,

Meimei Wang

and

Qiang Zhang

School of Instrumentation Science and Opto-electronics Engineering, Beihang University, Xueyuan Road No.37, Haidian District, Beijing 100191, China

^*

Author to whom correspondence should be addressed.

Sensors 2016, 16(7), 1041; https://doi.org/10.3390/s16071041

Submission received: 31 March 2016 / Revised: 19 June 2016 / Accepted: 26 June 2016 / Published: 7 July 2016

(This article belongs to the Section Sensor Networks)

Download

Browse Figures

Versions Notes

Abstract

:

Precise localization has attracted considerable interest in Wireless Sensor Networks (WSNs) localization systems. Due to the internal or external disturbance, the existence of the outliers, including both the distance outliers and the anchor outliers, severely decreases the localization accuracy. In order to eliminate both kinds of outliers simultaneously, an outlier detection method is proposed based on the maximum entropy principle and fuzzy set theory. Since not all the outliers can be detected in the detection process, the Maximum Entropy Function (MEF) method is utilized to tolerate the errors and calculate the optimal estimated locations of unknown nodes. Simulation results demonstrate that the proposed localization method remains stable while the outliers vary. Moreover, the localization accuracy is highly improved by wisely rejecting outliers.

Keywords:

wireless sensor networks; localization; outliers; maximum entropy principle; fuzzy set theory

1. Introduction

Wireless Sensor Networks (WSNs), the networks of sensor nodes, have been widely used in many promising applications such as condition monitoring, target tracking, and home security. Precise localization plays an important role in WSNs localization systems. From the viewpoint of localization systems, there are two types of sensor nodes in WSNs. Anchor nodes, also known as beacon nodes, can obtain their location directly by using manual placement or Global Positioning System (GPS); unknown nodes, also known as regular nodes, derive their locations through localization methods. Up to now, most existing localization algorithms of WSNs could be classified as either range-based localization [1,2] or range-free localization [3,4]. Range-based localization algorithms use absolute point-to-point range measurements (distance or angle) to estimate unknown nodes’ locations, while range-free localization algorithms depend on the contents of received messages. In this paper, the range-based localization methods are taken into consideration, since they are normally of high localization accuracy [5].

In localization methods, the calculation of unknown node’s positions heavily relies on primary data, which are the distances between neighboring nodes and the position knowledge of anchors. In many applications of WSNs, sensor nodes are vulnerable to the internal or external disturbance. As a result of the inference, the measured distances and anchor positions can deviate from their true values. These inaccurate values are called outliers, including both the distance outliers and the anchor outliers. Due to the existence of outliers, the usage of such corrupted data can severely degrade the localization accuracy. Hence, the outlier detection process [6] is a necessary step to assure data quality in localization process. Up to the present, most existing outlier detection methods [7,8,9] simply assume that either distance or anchor position is the outlier, thus they are not comprehensive detecting methods. Furthermore, when the difference between the outlier value and the normal value is small enough, the outlier will not be detected, thus these methods will clearly be invalid. Therefore, an error-tolerant localization method is greatly needed to calculate the estimated locations of unknown nodes in the presence of undetected outliers. The error-tolerant localization method is a positioning refinement process which allows the existence of undetected outliers instead of discarding them. Through the error-tolerant localization process, the accuracy of localization will be improved efficiently.

In this paper, a novel secure localization method is developed to reject both the distance outliers and anchor outliers. Firstly, the uncertain value of the measured distances is obtained based on the maximum entropy theory in the lack of ranging error distribution. The uncertain value is served as the threshold in the membership function, which is compared with the difference between the Euclidean distance and the measured distance between every two neighboring anchor nodes. The Euclidean distance is calculated by the coordinates of the two anchors while the measured distance is obtained by the range-based methods. Secondly, a trust evaluation model is constructed based on the fuzzy set theory. In the trust evaluation model, a membership function is used to calculate the mutual trust values of anchor nodes. Through the data fusion, the trust value of each anchor node is obtained, and the lower trust value nodes are discarded. Finally, the Maximum Entropy Function (MEF) method is used to calculate the optimal estimated locations of unknown nodes by using the trustable data. Simulations demonstrate that the outliers can be detected effectively and the localization method can achieve high accuracy.

The rest of this paper is organized as follows. Section 2 reviews the related works. Section 3 shows a preliminary structure of sensor nodes localization system. Section 4 describes the outlier detection method. Section 5 presents the MEF method for calculating the optimal estimated locations of unknown nodes. Section 6 shows the simulation results. Finally, Section 7 concludes this paper.

2. Related Works

Generally speaking, the outliers have three anomalous causes: (1) hardware malfunctions; (2) environment interferences; or (3) malicious attacks [10]. For instance, in Time of Arrival (ToA) and Time Difference of Arrival (TDoA) systems, the transmission time or reception time of a packet can be delayed, thus resulting in distance enlargement or distance reduction [11]. In Received Signal Strength Indicator (RSSI)-based localization systems, the signal strength may be unstable or shadowed in the presence of natural or artificial interferences [12]. In malicious attacks, an attacker can increase or decrease the transmission power to make the measured distances deviate from their true values. In addition, the attacker can also capture anchor nodes to declare fake anchor positions to generate anchor outliers [13].

In practice, the existence of outliers is a fact that cannot be neglected for localization algorithms. In the case of distance outliers, a consistency check method [14] has been proposed to filter out the malicious beacon signals. The signals contain measured distance outliers on the basis of the “consistency”. However, if the attackers do not revise the measured distances randomly, but make the modified distances be consistent, the strategy mentioned above will be failed under this scenario. In literature [15], linear equations are used to describe the localization problem. Hence, the norm and linear programming are applied to detect the outliers and avoid the wild measurements in the final solution. To deal with noisy and outlier ranging results, a theoretical foundation [16] has been built to identify distance outliers based on graph embeddability and rigidity theory. However, rigidity theory requires high ranging accuracy and it is computationally intensive. By applying the rigidity theory, the concept of verifiable edges [17] has been presented and the conditions for an edge to be verifiable have been derived. On this basis, the paper designs outlier detection method which explicitly eliminates ranges with large errors. However, facing with the undetected small outliers, the method would lose efficacy. In summary, based on the detection target, the literatures [14,15,16,17] mentioned above ignore the influence of anchor outliers. Therefore, these methods are one-sided.

Regarding anchor outliers, a scheme named Localization Anomaly Detection (LAD) [18] is put forward to detect malicious anchor node. The scheme attempts to perform compromise resistant localization without removing the malicious anchors. To monitor and timely detect anchor outliers in large-scale WSNs, a rule-based anomaly detection system, called RADS [19], has been proposed. In conclusion, the results of the algorithms [18,19] which are committed to eliminate anchor outliers are not comprehensive without analyzing the influence of ranges with large errors.

With respect to both outliers, an innovative modular solution [20], featuring two lightweight modules, has been developed. One is attack detection module that harnesses simple geometric triangular rules and an efficient voting technique. The other is secure localization module that computes and clusters certain reference points to estimate the coordinate of the unknown nodes. In [21], a novel algorithm, called neighbor constraint assisted distributed localization (NCA-DL), has been proposed. The method introduces the geometric constraints to detect outliers. To make localization attack-tolerant, a robust statistical method [22] has been presented. By using an adaptive least squares and Least Median Squares (LMS) position estimator, the method is capable of switching to a robust mode when the outliers exist. As a summary of the foregoing, the methods [20,21,22] could dispel either the anchor outliers or distance outliers. If two kinds of outliers both exit, the geometric constraints and statistical method will become invalid to filter out malicious colluding beacons or the beacon whose measured distance and coordinates change at the same time. In addition, to reduce the impact of both outliers simultaneously, Jin et al. [23] has put forward a trilateral localization algorithm for outliers suppression. However, the study of the paper focuses on the error of the algorithm itself, and discusses the stability of equations. This outliers excluded are just a portion of malicious beacons. Reference [24] designs a Beta Reputation System-based Localization (BRSL) algorithm to mainly detect and eliminate both outliers, but the Taylor-series least squares localization algorithm utilized after trust evaluation phase can’t reach high accuracy.

3. Preliminaries

A WSN consists of two types of nodes, namely anchor nodes and unknown nodes. The anchor nodes are specially equipped and aware of their coordinates after deployment. The unknown nodes, whose positions are yet to be discovered, estimate their locations by measuring distances to neighboring anchor nodes. All the nodes are randomly deployed in a 2D spatial region. The communication radius of unknown or anchor nodes is R. Every node is capable of measuring the distance to any of its immediate neighbors through measurement techniques such as RSS, ToA or TDoA. As shown in Figure 1, when the unknown node

N_{u}

gets enough measured distances

d_{u i}^{'}

to anchor nodes

N_{i} (i = 1, 2, \dots, m)

,

m \geq 3

, a system of Euclidean equations can be set up according to the trilateration:

{\begin{matrix} {(x_{1} - x_{u})}^{2} + {(y_{1} - y_{u})}^{2} = d_{u 1}^{' 2} \\ {(x_{2} - x_{u})}^{2} + {(y_{2} - y_{u})}^{2} = d_{u 2}^{' 2} \\ ⋮ \\ {(x_{m} - x_{u})}^{2} + {(y_{m} - y_{u})}^{2} = d_{u m}^{' 2} \end{matrix}

(1)

where

X_{u} = {[x_{u}, y_{u}]}^{T}

is

N_{u}

’s coordinates that need to be estimated,

X_{i} = {[x_{i}, y_{i}]}^{T}

is anchor node

N_{i}

’s declared position, and

d_{u i}^{'}

is the measured distance between the anchor node and the unknown node.

Generally,

X_{u}

should be located in the intersection of

m

circles, of which the centers and radiuses are

X_{i}

and

d_{u i}^{'}

, respectively. The smaller the intersection is, the more accurately

X_{u}

can be pinpointed. When both

X_{u}

and

d_{u i}^{'}

are accurate, the

X_{u}

can be well estimated by solving Equation (1). However, if the distance outliers or the anchor outliers exist, the system would incorrectly estimate the

X_{u}

to a location that deviate far from its physical position.

The measured distance can be expressed as

d_{u i}^{'} = d_{u i} + e_{r}

, and the declared anchor position can be expressed as

X_{i}^{'} = [x_{i} + e_{p x}, y_{i} + e_{p y}]

, where

e_{r}

and

e_{p}

are the ranging error and the anchor position error, respectively. Ranging error

e_{r}

is the difference between real distance and measured distance between two sensor nodes. Position error

e_{p}

is the difference between the real position of the anchor node and the received position of the anchor node. If the distance-measuring process is disturbed, the measured distance

d_{u 2}^{'}

between anchor node

A_{2}

and unknown node

N_{u}

, as well as the measured distance between anchor node

A_{2}

and

A_{1}

, will be enlarged or reduced. Take the enlarged case for example. As shown in Figure 2a, the distance outlier is

d_{u 2}^{″} = d_{u 2} + e_{r} + d_{a} = d_{u 2}^{'} + d_{a}

, where

d_{a}

is the enlarged distance, and

d_{u 2}

is the real distance between anchor node

A_{2}

and unknown node

N_{u}

. Meanwhile, the Euclidean distance between

A_{1}

and

A_{2}

, i.e.,

‖ A_{1} A_{2} ‖

, is different from their measured distance

‖ A_{1} A_{2}^{'} ‖

. In addition, if the measured distance

d_{u 2}^{'}

, as well as the measured distance between anchor node

A_{2}

and

A_{1}

, is reduced, the computed distance between

A_{1}

and

A_{2}

, i.e.

‖ A_{1} A_{2} ‖

, is also different from its measured distance

‖ A_{1} A_{2}^{'} ‖

. As shown in Figure 2b, if anchor node

A_{2}

is malicious, the declared anchor position may deviate far from the true position. The anchor outlier is defined as

X_{2}^{″} = [x_{2} + e_{p x} + d_{x}, y_{2} + e_{p y} + d_{y}]

, where

[d_{x}, d_{y}]

is the offset distance. In addition, the Euclidean distance between

A_{1}

and

A_{2}

, i.e.

‖ A_{1} A_{2}^{'} ‖

, is not equal to their measurement distance

‖ A_{1} A_{2} ‖

. Based on the above discussion, these outliers will severely degrade the localization accuracy. Therefore, it is necessary to eliminate the outliers in localization systems.

The measured distance and anchor positions exist in pairs in the localization systems. Considering two neighboring anchor nodes around an unknown node, no matter the measured distance between the two anchor nodes is enlarged or the declared position of one anchor node deviates far from the true position, the Euclidean distance will be different from the measured distance between the two anchor nodes. Therefore, no matter the distance outliers or the anchor outliers exist, or how they are generated, the difference between the Euclidean distance and the measured distance, as well as the cooperation of anchor nodes, can be utilized to detect the outliers.

4. Outlier Detection Method

In this section, firstly, the uncertain value of measured distances is calculated based on maximum entropy theory by using the ranging error priori information. Then a trust evaluation model is constructed based on the fuzzy set theory by using the uncertain value and the difference between the Euclidean distance and the measured distance. In the trust evaluation model, the trust value of each anchor node can be obtained.

4.1. Calculation of the Entropy Uncertainty

Based on the maximum entropy theory, the uncertain value of the measured distance can be obtained by utilizing the mean and standard deviation of ranging error in this section. The information entropy

H (e_{r})

[25] of ranging error can be calculated as Formula (2),

H (e_{r}) = - \sum_{i = 1}^{n} p (e_{r i}) ln p (e_{r i})

(2)

where

p (e_{r})

is the probability density function of ranging error.

The ranging error

e_{r}

is assumed to appear in

[e_{r 1}, e_{r 2}]

with equal probability before measuring. Then, after measurement, the estimated ranging error

e_{r}^{'}

with bias

\pm U

is obtained, where

U

is the entropy uncertainty of ranging error. Hence the true value of ranging error appears in

[e_{r}^{'} - U, e_{r}^{'} + U]

. The information entropy of

e_{r}^{'}

can be calculated as Formula (3).

H (e_{r}^{'}) = - \int_{e_{r}^{'} - U}^{e_{r}^{'} + U} \frac{1}{2 U} ln \frac{1}{2 U} d e_{r} = ln 2 U

(3)

As we all know, the probability density function of Gaussian distribution

N (0, σ^{2})

is

p (e_{r}) = \frac{1}{\sqrt{2 π} σ} exp (- \frac{{e_{r}}^{2}}{2 σ^{2}})

(4)

Hence the information entropy of ranging error can be calculated as Formula (5).

H (e_{r}) = \int_{- \infty}^{+ \infty} p (e_{r}) ln p (e_{r}) d e_{r} = \frac{1}{2} ln [2 e π σ^{2}]

(5)

Let Formula (3) equal to Formula (5), the entropy uncertainty of Gaussian distribution

N (0, σ^{2})

can be derived as Formula (6).

U = \frac{\sqrt{2 π e}}{2} σ = 2.07 σ

(6)

The entropy coefficient of Gaussian distribution is 2.07. In general, let

U = k σ

, where k is called entropy coefficient and σ is the standard error deviation. The value of k depends on the error distribution. In this paper, the distribution of ranging error is unknown, so that k cannot be calculated directly. Definition 1 illustrates how to choose the value of k.

Definition 1.

Based on the maximum entropy principle [26] and the obtained partial information of the unknown distribution, the distribution with the maximum entropy should be selected. In all the distributions, Gaussian distribution has the maximum information entropy. Thus, the entropy coefficient of Gaussian distribution can be used to calculate the entropy uncertainty of ranging error in this paper. It is a relatively conservative but reasonable choice.

Proof of Definition 1.

Given the probability distribution of

p (x)

and

q (x)

, the in Equation (7) can be obtained by using the inequation of

log x \leq (x - 1)

,

\int p (x) log \frac{q (x)}{p (x)} d x \leq \int p (x) (\frac{q (x)}{p (x)} - 1) d x = \int q (x) d x - \int p (x) d x = 0

(7)

Meanwhile, since

log \frac{q (x)}{p (x)} = log q (x) + log \frac{1}{p (x)}

,

\int p (x) log \frac{q (x)}{p (x)} d x

can be changed into another format.

\int p (x) log \frac{q (x)}{p (x)} d x = - \int p (x) log p (x) d x + \int p (x) log q (x) d x

(8)

Through Formulas (7) and (8), the in Equation (9) can be obtained.

H (p) \leq - \int p (x) log q (x) d x

(9)

The Formula (9) is a famous conclusion that entropy of a probability distribution is always less than the relative entropy in the information theory. Only when

q (x) = p (x)

can the equality hold in Formula (9).

Let

q (x) = N (u, σ^{2})

, when

p (x)

is under the given condition of mean value

u

and variance

σ^{2}

, then the Formula (9) can be derived as follows.

\begin{array}{l} H (p) & \leq - \int p (x) log {\frac{1}{\sqrt{2 π} σ} e^{- \frac{{(x - u)}^{2}}{2 σ^{2}}}} d x \\ = \int p (x) {\frac{{(x - u)}^{2}}{2 σ^{2}} + log \sqrt{2 π} σ} d x \\ = \frac{1}{2 σ^{2}} \int p (x) {(x - u)}^{2} d x + log \sqrt{2 π} σ \end{array}

(10)

Under the limit of the mean value and the variance of

p (x)

:

\int p (x) {(x - u)}^{2} d x = σ^{2}

, the inequation of

H (p) \leq \frac{1}{2 σ^{2}} σ^{2} + log \sqrt{2 π} σ = \frac{1}{2} + log \sqrt{2 π} σ

can be obtained. When

p (x) = N (u, σ^{2})

, the equality of Formula (9) holds. Hence, the conclusion mentioned above in Definition 1 is verified.

Since the Gaussian distribution has the maximum information entropy, it has the maximum entropy coefficient. Choosing the entropy coefficient of Gaussian distribution is relatively conservative. However, based on the maximum entropy principle, the reasonable inference of the unknown distribution is the distribution which is most random and is in accord with the known information. Because this is the only choice which could be made impartially, and any other options mean that other constraints and assumptions would be added, which cannot be obtained based on the known information. Gaussian distribution is the most random distribution in nature, as we all know. Thus, the entropy coefficient of Gaussian distribution is a reasonable choice.

4.2. Foundation of the Trust Evaluation Model

After obtaining the entropy uncertainty of ranging error, the uncertain value of distance estimation can be written as

U_{d} = b + 2.07 σ

, where

b

is the mean of ranging error. More specifically,

b

is calculated as Formula (11),

b = \frac{\sum_{u = 1}^{n} (d_{u i}^{'} - \sqrt{{(x_{u}^{'} - x_{u})}^{2} + {(y_{u}^{'} - y_{u})}^{2}})}{n}

(11)

where

d_{u i}^{'}

is the measured distance between the anchor node and the unknown node,

\sqrt{{(x_{u}^{'} - x_{u})}^{2} + {(y_{u}^{'} - y_{u})}^{2}}

is the computed distance between the anchor node and the unknown node, and

n

is the number of unknown nodes. Because the real distance between the unknown node and anchor node cannot be known in the localization process, the distance outlier needs to be detected by using the cooperation of the neighboring anchor nodes around the unknown node.

The difference between Euclidean distances and measured distances is

D_{i j} = | d_{i j} - d_{i j}^{'} |

, where

d_{i j}

is the Euclidean distance between the anchor nodes and

d_{i j}^{'}

(i = 1, 2, 3,…, m; j = 1, 2, 3,…, m; i≠j) is the measured distance between the anchor nodes.

d_{i j}

is defined as

d_{i j} = \sqrt{{(x_{i}^{'} - x_{j}^{'})}^{2} + {(y_{i}^{'} - y_{j}^{'})}^{2}}

(i = 1, 2, 3, …, m; j = 1, 2, 3, …, m; i≠j), where

X_{i}^{'} = [x_{i}^{'}, y_{i}^{'}]

(i = 1, 2, 3, …, m),

X_{j}^{'} = [x_{j}^{'}, y_{j}^{'}]

(j = 1, 2, 3, …, m) are the declared coordinates of anchor nodes, and

m

is the number of anchor nodes. Based on the fuzzy set theory and the neighboring anchor nodes, a trust evaluation model is constructed. In the model, the fuzzy membership function is shown as Formula (12).

T_{i j} = {\begin{cases} 1 & D_{i j} \leq U_{d} \\ 0 & D_{i j} > U_{d} \end{cases}

(12)

Define

T_{i j}

as the trust value of anchor node

A_{i}

from anchor node

A_{j}

. All these mutual trust values calculated by Formula (12) comprise the fuzzy relation matrix

T

.

T = [\begin{matrix} 0 & T_{12} & \dots & T_{1 m} \\ T_{21} & 0 & \dots & T_{2 m} \\ ⋮ & ⋱ & ⋱ & ⋮ \\ T_{m 1} & T_{m 2} & \dots & 0 \end{matrix}]

(13)

Then give a weight matrix

W

to calculate the trust value of each anchor node.

W = [\begin{matrix} 0 & w_{12} & \dots & w_{1 m} \\ w_{21} & 0 & \dots & w_{2 m} \\ ⋮ & ⋱ & ⋱ & ⋮ \\ w_{m 1} & w_{m 2} & \dots & 0 \end{matrix}]

(14)

where

w_{i j} = 1 / m - 1

.

Through data fusion, an evaluation result vector

S

can be obtained

S = W \circ T = [\begin{matrix} s_{1} & s_{2} & \dots & s_{m} \end{matrix}]

, where

s_{i}

is the trust value of anchor node

A_{i}

and

s_{i} = \sum_{i \neq j, j = 1}^{m} w_{i j} T_{i j}

.

Definition 2.

Based on the majority principle, if the trust value of an anchor node is larger than 0.5, it can be concluded that this anchor node is trusted or normal. If the trust value of an anchor node is smaller than or equal to 0.5, it can be determined that the position of the anchor node is an outlier or the corresponding measured distance is an outlier. Discard the outliers and only use the trustable data to estimate the locations of unknown nodes.

Since the presence of moving obstacles and other special situations could generate outliers temporarily, the corresponding trust values will decrease at the same time. Throughout the lifetime of the network, this kind of trust value is not credible. Therefore, all the trust values are not stored into the sensor nodes in this paper. In every localization process, the trust values will be recalculated.

5. MEF-Based Location Estimation Method

5.1. Formulation of the Localization Problem

Overall, the localization process against outliers consists of two steps. Firstly, in order to eliminate both kinds of outliers simultaneously, an outlier detection method is proposed based on the maximum entropy principle and fuzzy set theory. The first step of the localization process, named as the initial localization phase or the detecting phase is the foundation of the follow-up positioning process. Then, since not all the outliers can be detected in the detection process, the Maximum Entropy Function (MEF) method is utilized to tolerate the errors and calculate the optimal estimated locations of unknown nodes. Both steps of the localization algorithm are indispensable. Only by utilizing both steps can the localization accuracy be highly improved.

In a word, the detection method mentioned above should be applied to eliminate distance and anchor outliers in the initial localization phase. After the detecting phase, the unknown nodes utilize all their multihop communication anchor nodes to estimate their coordinates. Based on the Formula (1) in Section 3, the nodes localization problem is shown as Formula (15),

min F (X_{u}) = min \sum_{i = 1}^{m} | f_{i} (X_{u}) | = min \sum_{i = 1}^{m} | \sqrt{{(x_{i} - x_{u})}^{2} + {(y_{i} - y_{u})}^{2}} - d_{i}^{'} |

(15)

where

X_{u} = {[x_{u}, y_{u}]}^{T}

is the coordinate of the estimated unknown node

N_{u}

in the localization,

X_{i} = {[x_{i}, y_{i}]}^{T}

is the anchor node

N_{i}

’s declared position, and

d_{i}^{'}

is the measured distance between the anchor node

N_{i}

and the unknown node

N_{u}

.

From Formula (1),

f_{i} (X_{u})

is assumed equal to zero. Due to the presence of errors and outliers,

f_{i} (X_{u})

is not equal to zero actually. By obtaining the minimum sum of

f_{i} (X_{u})

(i = 1, 2, 3, …, m), the impact of the comprehensive error on the localization will be minimized. Therefore, the estimated coordinate of unknown node with the minimum sum of

f_{i} (X_{u})

(i = 1, 2, 3, …, m) can be as the optimal estimated coordinate in the localization.

Note that

F (X_{u})

is a non-smooth function and is difficult to be minimized from Formula (15). Therefore, the MEF method, which is the least biased estimate possibility on the given information and mainly used to solve the non-smooth minimum optimization problem [27], is used to estimate the locations of unknown nodes in this paper. Using the MEF method,

F (X_{u})

can be changed into the entropy function

F_{p} (X_{u})

, which is smooth and obtained by the following formula,

F_{p} (X_{u}) = \frac{1}{p} \sum_{i = 1}^{m} ln [exp (p f_{i} (X_{u})) + exp (- p f_{i} (X_{u}))]

(16)

where

p

is called the maximum entropy factor.

Based on [27], the following properties of the entropy function

F_{p} (X_{u})

are listed as follows.

Theorem 1.

For any estimated coordinate

X_{u}

of unknown node, (1) when

p \to + \infty

,

F_{p} (X_{u}) \to F (X_{u})

; (2) For any

p

,

F (X_{u}) \leq F_{p} (X_{u}) \leq F (X_{u}) + (ln m) / p

.

Proof of Theorem 1.

(1): Given $X_{u}^{'} \in R^{2}$ , if there is a vector-valued function $V (X_{u}^{'})$ with components $v_{i} (X_{u}^{'}) = exp [f_{i} (X_{u}^{'})]$ , where $1 < i < m$ , then the l_p-norm of $V (X_{u}^{'})$ is.

${‖ V (X_{u}^{'}) ‖}_{p} = {\sum_{i = 1}^{m} {[v_{i} (X_{u}^{'})]}^{p}}^{1 / p} = {\sum_{i = 1}^{m} exp [p f_{i} (X_{u}^{'})]}^{1 / p}$

(17)

Hence, $F_{p} (X_{u}^{'}) = ln {‖ V (X_{u}^{'}) ‖}_{p}$ . Consequently,

$\lim_{p \to \infty} {‖ V (X_{u}^{'}) ‖}_{p} = \max_{1 \leq i \leq m} v_{i} (X_{u}^{'}) = exp [F (X_{u}^{'})]$

(18)

That is $\lim_{p \to \infty} F_{p} (X_{u}^{'}) = F (X_{u}^{'})$ .
(2): With the properties of l_p-norm, for $X_{u}^{'} \in R^{2}$ , $F_{p} (X_{u}^{'})$ is a monotonically decreasing function in terms of $p$ , hence

$\begin{array}{l} 0 \leq F_{p} (X_{u}^{'}) - F (X_{u}^{'}) & = ln {‖ V (X_{u}^{'}) ‖}_{p} - ln [exp F (X_{u}^{'})] \\ = (1 / p) ln \sum_{i = 1}^{m} exp {p [f_{i} (X_{u}^{'}) - F (X_{u}^{'})]} \\ \leq (1 / p) ln m \end{array}$

(19)

The theorem mentioned above describes the relationship between entropy function

F_{p} (X_{u})

and original function

F (X_{u})

when

p

changes.

F_{p} (X_{u})

converges to

F (X_{u})

point wisely on

X_{u}

, as

p

tends to infinity. Theoretically, under the given conditions, as long as

p

is sufficiently large, the error between the optimal solution of

F (X_{u})

and the optimal solution of

F_{p} (X_{u})

can be made arbitrarily small. However in terms of numeral calculations, when

p

is fairly large, the value of entropy function

F_{p} (X_{u})

is overflow. Therefore, in case of the overflow, Equation (16) is transformed into the following modus.

F_{p} (X_{u}) = \sum_{1}^{m} | f_{i} (x_{u}) | + \frac{1}{p} \sum_{1}^{m} l n [1 + exp (- 2 p | f_{i} (x_{u}) |)]

(20)

Derivation steps as follows.

\begin{array}{l} F_{p} (X_{u}) & = \frac{1}{p} \sum_{1}^{m} l n [e^{p f_{i} (x_{u})} + e^{- p f_{i} (x_{u})}] \\ = \sum_{1}^{m} l n {[e^{p f_{i} (x_{u})} + e^{- p f_{i} (x_{u})}]}^{\frac{1}{p}} \\ = {\sum_{1}^{m} l n [e^{p | f_{i} (x_{u}) |} (1 + e^{- 2 p | f_{i} (x_{u}) |})]}^{\frac{1}{p}} \\ = \sum_{1}^{m} | f_{i} (x_{u}) | + \frac{1}{p} \sum_{1}^{m} l n [1 + e^{- 2 p | f_{i} (x_{u}) |}] \end{array}

(21)

Hence, summarizing all results, the nodes localization problem can be described as Formula (22) when

p \to + \infty

,

min F_{p} (X_{u}) = min {\sum_{1}^{m} | f_{i} (x_{u}) | + \frac{1}{p} \sum_{1}^{m} ln [1 + exp (- 2 p | f_{i} (x_{u}) |)]}

(22)

where

X_{u}

is the estimated unknown node coordinate in the localization, and

p

is called the maximum entropy factor. Through minimizing the entropy function

F_{p} (X_{u})

, the estimated coordinate of the unknown node can be regarded as the optimal estimated coordinate in localization.

5.2. MEF-Based Localization Process

After removing the detected outliers, the MEF-based method, which has good error tolerance and calculation accuracy, is used to estimate the locations of unknown nodes in this paper. Meanwhile, it can also rapidly converge to the global optimal value by only iterating twice or three times. Based on the above discussion, the following definition can be concluded about the localization process:

Definition 3.

The entropy function

F_{p} (X_{u})

is the overall approximation to the localization function

F (X_{u})

. In the localization systems, by minimizing the entropy function

F_{p} (X_{u})

and increasing

p

, the minimum

F (X_{u})

can be indirectly obtained under certain accuracy after several iterations. Thus the optimal estimated locations of unknown nodes are obtained.

The detailed procedures of the MEF-based method for estimating the optimal locations of unknown nodes are presented in Table 1.

6. Performance Evaluation

In this section, simulation results are presented and discussed. For all of the simulations, the sensor nodes are uniformly distributed in a 150 m × 150 m square field. We assume a fixed transmission range R = 30 m for both anchor nodes and unknown nodes. The measured distance of the sensor nodes consists of two sections. One is the real distance between two nodes and the other is the measurement error. The measurement error obeys a Gaussian distribution with the mean of 0 and the variance of 1. Thus, the ranging error is set

e_{r} ~ N (0, 1)

. The distance outliers, which are the measured distance attacked or disturbed by external factors, can be described as

d^{″} = d^{'} (1 + α)

, where

d^{'}

is the measured distance without attacks or disturbance, and

α

is the disturbed distance percentage. In each simulation, the sensor nodes of the network are deployed 100 times to compute the average localization accuracy. The default parameters of the simulation are shown in Table 2.

LMS [22] and BRSL [24] are used to compare with proposed localization method. They are both aimed at solving the problem of locating the unknown nodes in the presence of outliers. LMS is an outlier tolerance method and BRSL is an outlier detection and elimination method. Compared with these two different methods, the advantage of our method is revealed clearly in the simulations. The Average Localization Error (ALE) using in the experiment is calculated as Formula (23).

A L E = \frac{\sum_{u = 1}^{n} \sqrt{{(x_{u}^{'} - x_{u})}^{2} + {(y_{u}^{'} - y_{u})}^{2}}}{n R}

(23)

where n is the number of unknown nodes, R is the network communication radius.

In the initial localization phase, no matter the measured distance between an anchor node and unknown node or the declared anchor position is outlier, the detection result is the same. Hence, the scenarios in which distance outliers are only considered are simulated. In the outlier detection phase of the simulation, based on the maximum entropy principle and fuzzy set theory, if the trust value of an anchor node is smaller than or equal to 0.5, it can be determined that the corresponding measured distance is an outlier. Discard the outliers and only use the trustable data to estimate the locations of unknown nodes. If the trustable data left are not enough to estimate the locations of unknown nodes, the information of the neighboring unknown nodes which have been located are used. In this section, all simulations are executed in MATLAB.

6.1. Impact of the Number of Distance Outliers

Figure 3 shows the ALE of our localization method and the compared methods under different numbers of distance outliers. In this simulation, set α = 50% and ranging error

e_{r} ~ N (0, 1)

. Simulation results show that all the detected percent of distance outliers are almost equal to 100%. With the increase of number of outliers, the ALE of LMS rises obviously, while that of MEF and BRSL remain stable, which declares that our localization is robust to the variation of distance outliers. Meanwhile, under the same numbers of distance outliers, our method can greatly improve the average localization accuracy than BRSL.

6.2. Impact of Disturbed Distance Percentage

Figure 4 presents the ALE of our localization method and the compared methods under different disturbed distance percentage. In this simulation, set the number of distances outliers as 10 and ranging error

e_{r} ~ N (0, 1)

. In this case, the detected percent is increased to almost 100% when

α = \pm 50 %

. As shown in Figure 4, no matter when

α > 0

or

α < 0

, the absolute difference

| d^{″} - d^{'} |

increases with the rise of

| α |

. In conclusion, the localization accuracy of our method decrease slowly under different disturbed distance percentages, which shows that our localization can effectively inhibit aggressive behaviors of malicious nodes and improve the localization accuracy of unknown nodes.

6.3. Impact of the Mean of Ranging Error

Figure 5 illustrates the performance of our method under different means of ranging error and disturbed distance percentages. In this simulation, set the number of distances outliers as 10. Because the distance outlier detection is based on the uncertain value of distance estimation, the detected percent of distance outliers is decreased with the increasing mean of ranging error when the variation in distance is small. From the Figure 5, the localization accuracy is increased with the decreasing detected percentand the increasing mean of ranging error in the curves of α = 30% and α = 50%. Thus, it can be concluded that when the difference between the distance outlier and estimated distance is small, our localization method is error-tolerant to the undetected distance outliers; when the estimated distance is large, our localization method can detect the distance outliers.

6.4. Impact of the Standard Deviationof Ranging Error

Figure 6 presents the performance of our localization method under different standard deviations of ranging errors and disturbed distance percentages. In the simulations, also set the number of distance outliers as 10. Compared with Figure 5, the standard deviation of ranging error has a larger impact on localization accuracy.

6.5. Impact of the Iteration Step Length

Note that MEF-based algorithm contains a variable parameter, i.e. the iteration step length l, the value of which will affect the performance of the algorithm. To this end, the simulation is applied to analyze the influence of step length value on algorithm performance and explain the rationality of the parameter value selection.

In this simulation, set the number of distances outliers to 10, p = 10, α = 50% and ranging error

e_{r} ~ N (0, 1)

. Figure 7 illustrates the change trend of the average localization error and average iteration times with l increases. The increasement of iteration step length contributes to improve the efficiency of the localization. However, on the contrary, it also results in the reduction of the localization accuracy. In summary, the small value of l will reduce the efficiency of iteration. Meanwhile, the quite large value of l will decrease the localization accuracy. It should be taken into consideration that the effect of value l to the localization accuracy and the localization efficiency, when deciding the appropriate value of l in the MEF-based iteration method.

7. Conclusions

This paper develops an error-tolerant localization method against distance outliers and anchor outliers. First, an outlier detection method is proposed based on the maximum entropy principle and fuzzy set theory. With the cooperation of the neighboring anchor nodes of unknown node, the outliers can be detected effectively. In order to tolerate the undetected outliers and achieve high localization accuracy, MEF method is used to estimate the locations of unknown nodes. Compared with the BRSL method and LMS method, simulation results show that our localization method has higher localization accuracy.

Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grant No.61371135 and Beihang University Innovation & Practice Fund for Graduate under Grant YCSJ-02-2015-07.

Author Contributions

Dandan Wang contributed to whole idea for this paper and mathematical development and paper written. Jiangwen Wan and Meimei Wang was involved in the simulations as well as drafting of the paper. Qiang Zhang was involved in data analyzing and critically reviewed the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

Tomic, S.; Beko, M.; Dinis, R. Distributed RSS-based localization in wireless sensor networks based on Second-order cone programming. Sensors 2014, 14, 18410–18432. [Google Scholar] [CrossRef] [PubMed]
Yin, J.H.; Wan, Q.; Yang, S.W.; Ho, K.C. A sample and accurate TDOA-AOA localization method using two stations. IEEE Signal Process. Lett. 2016, 23, 144–148. [Google Scholar] [CrossRef]
Gui, L.Q.; Val, T.; Wei, C.; Dalce, R. Improvement of range-free localization technology by a novel DV-hop protocol in wireless sensor networks. Ad Hoc Netw. 2015, 24, 55–73. [Google Scholar] [CrossRef] [Green Version]
Zhang, S.G.; Liu, X.; Wang, J.X.; Cao, J.N.; Min, G.Y. Accurate range-free localization for anisotropic wireless sensor networks. ACM Trans. Sens. Netw. 2015, 11, 1–28. [Google Scholar] [CrossRef] [Green Version]
Han, G.J.; Xu, H.H.; Doung, T.Q.; Jiang, J.F.; Hara, T. Localization algorithms of wireless sensor networks: A survey. Telecommun. Syst. 2013, 52, 2419–2436. [Google Scholar] [CrossRef]
Shahid, N.; Naqvi, I.H.; Qaisar, S.B. Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: A survey. Art. Intell. Rev. 2015, 43, 193–228. [Google Scholar] [CrossRef]
Zhong, S.; Jadliwala, M.; Upadhyaya, S.; Qiao, C. Towards a theory of robust localization against malicious beacon nodes. In Proceedings of the 27th Conference on Computer Communications, Phoenix, AZ, USA, 13–18 April 2008; pp. 2065–2073.
Garg, R.; Varna, A.L.; Wu, M. An efficient gradient descent approach to secure localization in resource constrained wireless sensor networks. IEEE Trans. Inf. Forensic Secur. 2012, 7, 717–730. [Google Scholar] [CrossRef]
Yu, N.; Zhang, L.R.; Ren, Y.J. A novel D-S based secure localization algorithm for wireless sensor networks. Secur. Commun. Netw. 2014, 7, 1945–1954. [Google Scholar] [CrossRef]
Xiao, Q.J.; Bu, K.; Wang, Z.J.; Xiao, B. Robust localization against outliers in wireless sensor networks. ACM Trans. Sens. Netw. 2013, 9, 1–26. [Google Scholar] [CrossRef]
So, H.C.; Chan, Y.T.; Chan, F.K.W. Closed-form formulae for time-difference-of-arrival estimation. IEEE Trans. Signal Process. 2008, 56, 2614–2620. [Google Scholar] [CrossRef]
Li, B.; Cui, W.; Wang, B. A robust wireless sensor network localization algorithm in mixed LOS/NLOS scenario. Sensors 2015, 15, 23536–23553. [Google Scholar] [CrossRef] [PubMed]
Boukerche, A.; Oliveira, A.B.F.; Nakamura, E.F.; Loureiroet, A.A.F. Secure localization algorithms for wireless sensor networks. IEEE Commun. Mag. 2008, 46, 96–101. [Google Scholar] [CrossRef]
Liu, D.G.; Ning, P.; Liu, A.; Wang, C.; Du, W.L. Attack-resistant location estimation in wireless sensor networks. ACM Trans. Inf. Syst. Secur. 2008, 11, 1–22. [Google Scholar] [CrossRef]
Picard, J.S.; Weiss, A.J. Bounds on the number of identifiable outliers in source localization by linear programming. IEEE Trans. Signal Process. 2010, 58, 2884–2895. [Google Scholar] [CrossRef]
Yang, Z. Beyond triangle inequality: Sifting noisy and outlier distance measurements for localization. ACM Trans. Sens. Netw. 2013, 9, 1–26. [Google Scholar] [CrossRef]
Yang, Z.; Jian, L.R.; Wu, C.S.; Liu, Y.H. Detecting outlier measurements based on graph rigidity for wireless sensor network localization. IEEE Trans. Veh. Technol. 2013, 62, 374–383. [Google Scholar] [CrossRef]
Du, W.L.; Fang, L.; Peng, N. LAD: Localization anomaly detection for wireless sensor networks. J. Parallel Distrib. Comput. 2006, 66, 874–886. [Google Scholar] [CrossRef]
Sarigiannidis, P.; Karapistoli, E.; Economides, A.A. Detecting Sybil attacks in wireless sensor networks using UWB ranging-based information. Expert Syst. Appl. 2015, 42, 7560–7572. [Google Scholar] [CrossRef]
Zhu, W.T.; Xiang, Y.; Zhou, J.Y.; Deng, R.H.; Bao, F. Secure localization with attack detection in wireless sensor networks. Int. J. Inf. Secur. 2011, 10, 155–171. [Google Scholar] [CrossRef]
Wen, L.F.; Cui, L.G.; Chai, S.C.; Zhang, B.H. Neighbor constraint assisted distributed localization for wireless sensor networks. Math. Problem Eng. 2014, 2014. [Google Scholar] [CrossRef]
Li, Z.; Trappe, W.; Zhang, Y.Y.; Nath, B. Robust statistical methods for securing wireless localization in sensor networks. In Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, Los Angeles, CA, USA, 24 April 2005; pp. 91–98.
Jin, R.C.; Che, Z.P.; Xu, H.; Wang, Z.; Wang, L.D. An RSSI-based localization algorithm for outliers suppression in wireless sensor networks. Wirel. Netw. 2015, 21, 2561–2569. [Google Scholar] [CrossRef]
Yu, N.; Zhang, L.R.; Ren, Y.J. BRS-based robust secure localization algorithm for wireless sensor networks. Int. J. Distrib. Sens. Netw. 2013, 2013. [Google Scholar] [CrossRef]
Edwards, S. Elements of Information Theory, 2nd ed.; Wiley-Interscience: New York, NY, USA, 2008; pp. 400–401. [Google Scholar]
Jaynes, E.T. Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620–630. [Google Scholar] [CrossRef]
Li, X.S.; Fang, S.C. On the entropic regularization method for solving min-max problems with applications. Math. Methods Op. Res. 1997, 46, 119–130. [Google Scholar] [CrossRef]

Figure 1. The measured distance

d_{u 2}^{'}

and declared anchor position

X_{2}^{'}

is normal.

Figure 1. The measured distance

d_{u 2}^{'}

and declared anchor position

X_{2}^{'}

is normal.

Figure 2. (a) The distance-measuring process is disturbed; (b) the declared position of the anchor node is inaccurate.

Figure 3. Average Localization Error (ALE) under different numbers of distance outliers.

Figure 4. (a) ALE under different disturbed distance percentages when α > 0; (b) ALE under different disturbed distance percentages when α < 0.

Figure 5. (a) ALE under different means of ranging errors and disturbed distance percentages; (b) Detected percent of distance outliers under different means of ranging errors and disturbed distance percentages.

Figure 6. (a) ALE under different standard deviations of ranging error and disturbed distance percentages; (b) Detected percent of attacked distance estimations under different standard deviation of ranging errors and distance attacked percentages.

Figure 7. (a) ALE under different iteration step length; (b) The average iteration times under different iteration step length.

Table 1. The Maximum Entropy Function (MEF)-based method.

**Table 1.** The Maximum Entropy Function (MEF)-based method.
1: set maximum entropy factor $p = 10$ , multiple (iteration step length) l = 3, threshold ε = 1e-6
2: calculate the lower limit of the unknown node’s coordinate $L_{l} = {[m a x (x_{a}^{'} - d_{a}^{'}), m a x (y_{a}^{'} - d_{a}^{'})]}^{T}$
3: calculate the upper limit of the unknown node’s coordinate $L_{u} = {[m i n (x_{a}^{'} + d_{a}^{'}), m i n (y_{a}^{'} + d_{a}^{'})]}^{T}$
4: calculate the initial coordinate of unknown node $X_{u}^{(0)} = (L_{l} + L_{u}) / 2$
5: while 1
6: minimize $F_{p} (X_{u}^{(j)})$ and get the next iterative coordinate $X_{u}^{(j + 1)} (j = 0, 1, 2, \dots)$
7: //determine whether $X_{u}^{(j + 1)}$ is the optimal solution
8: if $\| F_{p} (X_{u}^{(j + 1)}) - F (X_{u}^{(j + 1)}) \| \leq ε$
9: get the optimal estimated coordinate $X_{u}^{(j + 1)}$
10: break
11: end if
12: change the iterative number: $j = j + 1$
13: change the maximum entropy factor: $p = l \times p$
14: end while

Table 2. Default simulation parameters.

**Table 2.** Default simulation parameters.
Parameters	Values
Network size	150 m × 150 m
Number of sensor nodes	150
Percent of anchor nodes	30%
Communication radius (R)	30 m
Hop count	2

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, D.; Wan, J.; Wang, M.; Zhang, Q. An MEF-Based Localization Algorithm against Outliers in Wireless Sensor Networks. Sensors 2016, 16, 1041. https://doi.org/10.3390/s16071041

AMA Style

Wang D, Wan J, Wang M, Zhang Q. An MEF-Based Localization Algorithm against Outliers in Wireless Sensor Networks. Sensors. 2016; 16(7):1041. https://doi.org/10.3390/s16071041

Chicago/Turabian Style

Wang, Dandan, Jiangwen Wan, Meimei Wang, and Qiang Zhang. 2016. "An MEF-Based Localization Algorithm against Outliers in Wireless Sensor Networks" Sensors 16, no. 7: 1041. https://doi.org/10.3390/s16071041

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

An MEF-Based Localization Algorithm against Outliers in Wireless Sensor Networks

Abstract

1. Introduction

2. Related Works

3. Preliminaries

4. Outlier Detection Method

4.1. Calculation of the Entropy Uncertainty

4.2. Foundation of the Trust Evaluation Model

5. MEF-Based Location Estimation Method

5.1. Formulation of the Localization Problem

5.2. MEF-Based Localization Process

6. Performance Evaluation

6.1. Impact of the Number of Distance Outliers

6.2. Impact of Disturbed Distance Percentage

6.3. Impact of the Mean of Ranging Error

6.4. Impact of the Standard Deviationof Ranging Error

6.5. Impact of the Iteration Step Length

7. Conclusions

Acknowledgments

Author Contributions

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI