Geometric Range-Free Localization Algorithm Based on Optimal Anchor Node Selection in Wireless Sensor Networks

In range-free localization schemes for wireless sensor networks, the position of an unknown node is commonly computed by estimating its distance to anchor nodes whose actual locations are known. Because the range-free scheme relies on topology information, the accuracy of distance estimation is considerably affected by node density and node deployment. In this paper, we propose a geometric range-free localization algorithm that estimates unknown positions geometrically from topological information, without distance estimation. To achieve this, we propose an optimal anchor node selection algorithm that selects anchor nodes that are topologically well connected for geometric location estimation. Simulation results show that the proposed algorithm offers considerably improved performance compared with existing studies.


Introduction
Wireless sensor networks have gained attention in recent years because they can be applied in various fields such as environmental monitoring, medical care, military monitoring, and disaster relief. Since most of these applications need the physical position of the wireless sensor nodes, localization (positioning) has been an important issue for wireless sensor networks [1]. The range-free scheme is one of the existing localization techniques; it estimates a sensor node's unknown position from relative connectivity information (e.g., the hop count between sensor nodes) [2]. Generally, sensor nodes whose positions are known are called anchor nodes, while the others are called unknown nodes. In this scheme, unknown nodes calculate their positions from topology information: the hop counts of the shortest hop paths between anchor nodes, the hop counts of the shortest hop paths between anchor nodes and the unknown node, and the positions of the anchor nodes.
Most of the existing range-free algorithms estimate the Euclidean distance to the anchor nodes in order to obtain an unknown node's position. After estimating the distances, each sensor node calculates its location by the multilateration technique [3]. Hence, the most important issue in a range-free algorithm is to precisely estimate the Euclidean distance between the anchor node and the unknown node (called the "2 distance" in this paper). DV-Hop [4] is a well-known range-free algorithm that uses a metric (called the "average hop length" in this paper) to estimate the 2 distance. It estimates the 2 distance by multiplying the average hop length by the hop count of the corresponding shortest hop path. In DV-Hop, the average hop length is obtained by considering the entire network. Hence, the distance estimate contains large errors if the shape of the shortest hop path differs from the network-wide average.
Since then, many studies have tried to improve the accuracy of 2 distance estimation. References [5][6][7] proposed algorithms that improve the accuracy of the average hop length. The authors of [5] calculated the average hop length stochastically by considering the number of neighboring nodes. The authors of [6] computed the optimal average hop length by minimizing the sum of squares of the distance errors between all anchor nodes. The authors of [7] recalculated the average hop length by considering the number of neighboring nodes. The authors of [8,9] proposed refinement algorithms based on optimization algorithms.
References [10,11] proposed schemes that select specific anchor nodes offering well-estimated 2 distance values, unlike DV-Hop, which uses all anchor nodes to estimate the 2 distance. The authors of [10] use the fact that the accuracy of 2 distance estimation becomes higher when the hop count of the shortest hop path between the unknown node and the anchor node is smaller. Hence, in this scheme, each unknown node estimates the distance to the three anchor nodes that have the minimum hop count. The authors of [11] use the fact that the accuracy of 2 distance estimation becomes higher when the shortest hop path between the anchor node and the unknown node is a straight line. Hence, in this scheme, the unknown nodes do not estimate the distance to anchor nodes whose shortest hop path is curved.
International Journal of Distributed Sensor Networks
In [12], a new 2 distance estimation scheme (referred to as "PDRL" in this paper) was proposed. In this scheme, the unknown node estimates the Euclidean distance to a distant anchor node by using another anchor node that is geometrically nearby. However, the performance of this method can be degraded by an impractical assumption and imprecise information.
In [13], we proposed an algorithm (referred to as "RARL" in this paper) that selects optimal anchor nodes with a geometrical approach to improve the estimation accuracy of the 2 distance. In this scheme, each unknown node selects three optimal anchor nodes based on two conditions. First, the hop count of the shortest hop paths between the unknown node and the anchor nodes should be smaller than 5 hops. Second, the unknown node should be located within the triangle formed by the three selected anchor nodes. With anchor nodes selected in this way, each unknown node can obtain well-estimated 2 distance values. However, this method has some drawbacks in the selection procedure.
In the range-free algorithms described above, after the 2 distances are estimated, the multilateration technique is applied to calculate the positions of the unknown nodes. Multilateration requires at least three estimated 2 distance values. Hence, large localization errors result if any of the 2 distances is estimated incorrectly. Although many schemes have been proposed to improve the accuracy of 2 distance estimation, they do not resolve the fundamental issue that the distance error accumulates as the hop count increases. To solve this problem, we propose a new algorithm that computes the position of unknown nodes directly, without 2 distance estimation. In the proposed algorithm, each unknown node calculates its position geometrically by selecting optimal anchor nodes that are well connected to it.
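To make the dependence on three distance estimates concrete, the following is a minimal sketch of multilateration with exactly three anchors. It is not taken from the cited works: the function name `trilaterate` and the example coordinates are illustrative assumptions, and the circle equations are linearized by subtracting the third one from the first two.

```python
import math


def trilaterate(anchors, dists):
    """Estimate (x, y) from three anchor positions and three distance estimates.

    Subtracting the third circle equation from the first two leaves a
    2x2 linear system in (x, y), solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = dists
    a11, a12 = 2 * (x1 - x3), 2 * (y1 - y3)
    a21, a22 = 2 * (x2 - x3), 2 * (y2 - y3)
    b1 = x1**2 - x3**2 + y1**2 - y3**2 - d1**2 + d3**2
    b2 = x2**2 - x3**2 + y2**2 - y3**2 - d2**2 + d3**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Illustrative anchors and a node at (1, 1) (assumed values).
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
exact = [math.dist((1.0, 1.0), a) for a in anchors]
position = trilaterate(anchors, exact)

# Overestimating a single distance by 30% shifts the estimated position.
skewed = trilaterate(anchors, [exact[0] * 1.3, exact[1], exact[2]])
```

With exact distances the node is recovered; the `skewed` call shows how an error in one distance alone moves the result, which is the sensitivity the paragraph above describes.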
The main contribution of this paper is as follows. We try to maximize the localization accuracy of the specific unknown nodes that are well connected with the anchor nodes. By improving the accuracy for these nodes, we expect the average localization performance of the network to improve. Unknown nodes whose positions cannot be estimated by our algorithm are estimated by other well-known algorithms such as DV-Hop, RARL, and PDRL.
The rest of the paper is organized as follows. We review the related range-free schemes in Section 2. In Section 3, we propose a geometric location estimation algorithm. Simulation results are shown in Section 4. Finally, we give our concluding remarks in Section 5.

Related Study
First, we state the assumptions and definitions used to describe the related study and the proposed algorithm. The network contains two kinds of sensor nodes: unknown nodes and anchor nodes. Every unknown node and anchor node is assumed to have an identical transmission range $R$. Each unknown node and anchor node obtains, through a flooding scheme, the hop count of its shortest hop path from every other anchor node, together with the coordinates of those anchor nodes. Let $d_{ij}$ be the Euclidean distance between the sensor nodes $i$ and $j$, and let $\hat{d}_{ij}$ be the estimated value of $d_{ij}$. $h_{ij}$ is the hop count of the shortest hop path between the sensor nodes $i$ and $j$. $\Omega$ is the set of anchor nodes in the network.
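The hop counts that the flooding scheme delivers to each node can be reproduced offline with a breadth-first search from every anchor. The sketch below is a centralized illustration under the unit-disk assumption stated above (two nodes are neighbors when they are within the transmission range); the names `build_neighbors` and `hop_counts` and the example coordinates are assumptions, and the code does not model the message exchange itself.

```python
from collections import deque
import math


def build_neighbors(positions, radius):
    """Adjacency lists: two nodes are neighbors if within `radius` of each other."""
    n = len(positions)
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                nbrs[i].append(j)
                nbrs[j].append(i)
    return nbrs


def hop_counts(nbrs, source):
    """Shortest-path hop count from `source` to every reachable node (BFS)."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in nbrs[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


# Three nodes in a chain plus one isolated node (assumed coordinates).
positions = [(0.0, 0.0), (0.9, 0.0), (1.8, 0.0), (5.0, 5.0)]
hops_from_0 = hop_counts(build_neighbors(positions, radius=1.0), source=0)
```

Nodes that never receive the flood (node 3 here) simply have no entry in the returned dictionary.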
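In this section, we introduce three algorithms, DV-Hop, RARL, and PDRL, which are compared with our proposed algorithm in the performance evaluation section. In DV-Hop, each anchor node calculates its average hop length and then broadcasts it. With $d_{ij}$ the Euclidean distance and $h_{ij}$ the shortest-path hop count between nodes $i$ and $j$, and $\Omega$ the set of anchor nodes, the anchor node $i$ calculates its average hop length $c_i$ by

$$c_i = \frac{\sum_{j \in \Omega,\, j \neq i} d_{ij}}{\sum_{j \in \Omega,\, j \neq i} h_{ij}}. \qquad (1)$$

Once the average hop length is known, each unknown node estimates the 2 distance by multiplying the average hop length by the hop count of the corresponding shortest hop path. That is, the unknown node $u$ estimates the distance to the anchor node $i$ by

$$\hat{d}_{iu} = h_{iu} \times c_i. \qquad (2)$$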
This 2 distance estimation approach has the drawback that the estimation error accumulates with the hop count. After estimating the distances to all the anchor nodes, the unknown node determines its position by the multilateration technique. The DV-Hop approach is accurate only when the node density is high and the node deployment is uniform, so that the shortest paths between anchor nodes and unknown nodes approximate their Euclidean distances.
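The DV-Hop estimate can be illustrated with a few lines of code. This is a hedged sketch of the standard DV-Hop computation (sum of anchor-to-anchor distances divided by sum of anchor-to-anchor hop counts), not code from the paper; the function names and the anchor layout are assumptions chosen for the example.

```python
import math


def average_hop_length(i, anchor_pos, anchor_hops):
    """Average hop length of anchor i: total distance to the other anchors
    divided by the total hop count to the other anchors."""
    total_dist = 0.0
    total_hops = 0
    for j, pos in anchor_pos.items():
        if j == i:
            continue
        total_dist += math.dist(anchor_pos[i], pos)
        total_hops += anchor_hops[i][j]
    return total_dist / total_hops


def dv_hop_distance(avg_hop_len, hops_to_anchor):
    """Distance estimate: hop count times the anchor's average hop length."""
    return hops_to_anchor * avg_hop_len


# Assumed example: anchor A is 4 hops from B and 5 hops from C.
anchor_pos = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (0.0, 4.0)}
anchor_hops = {"A": {"B": 4, "C": 5}}
c_a = average_hop_length("A", anchor_pos, anchor_hops)  # (3 + 4) / (4 + 5)
d_hat = dv_hop_distance(c_a, hops_to_anchor=2)
```

Because the same per-hop length is applied to every hop, any mismatch between the average and the actual path shape is multiplied by the hop count, which is the accumulation effect noted above.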
In case of multilateration-based localization scheme, an unknown position can be computed by three estimated 2 distances. However, the DV-Hop utilizes all of the estimated 2 distances, and so the localization accuracy is degraded. In RARL method, three optimal anchor nodes selection algorithm has been proposed to improve the accuracy of the 2 distance estimation as shown in Figure 1. An optimization model to select the three optimal anchor nodes, , and of unknown node is defined as follows: Area( , , ) is the area of a triangle formed by three anchor nodes , , and .̂,̂, and̂mean the estimated distance between the unknown node and the three anchor nodes , , and , respectively. Here, the distance estimation method is the same as DV-Hop. , , and mean the Euclidean distance between one anchor node and the line connecting the other two anchor nodes, respectively. Here, (5) is a constraint to judge whether the triangle formed by three anchor nodes , , and includes the unknown node or not. The method has a few limitations. When node density is low or node deployment is irregular, there may be a possibility of choosing the incorrect anchor node because of using the estimated distance values (̂,̂, and̂).
In PDRL method, the authors proposed a new 2 distance estimation algorithm based on the geometrical approach. In this scheme, when unknown node estimates the Euclidean distance to the distant anchor node, each unknown node utilizes one nearby anchor node ("reference anchor node") which can form a triangle with the selected anchor and the distant anchor as shown in Figure 2.
In Figure 2, the distance between the distant anchor node and the unknown node can be determined bŷ Instead of estimating the distance to the distant anchor node directly, the scheme estimates the distances (̂,̂in Figure 2) which can be computed comparatively well and calculates the distance to the distant anchor node in the triangle geometrically. The detailed procedures are explained in [12]. Even though the method offers an improved performance in estimating the 2 distance geometrically, the method has few limitations by using impractical assumption and imprecise information which are used to obtain the unknown distanceŝ. In order to calculate the distance, it assumes that the hop count between the anchor nodes and and the hop count between the anchor node and the point are same. This assumption is quite unrealistic.
These three algorithms use imprecise information to estimate the 2 distance and the unknown location. In this paper, we propose a geometric location estimation algorithm that uses only precise information and computes the unknown position directly, without estimating the 2 distance. To compute the unknown position, each unknown node selects three anchor nodes topologically and then computes its position geometrically from the information offered by the selected anchor nodes.

Proposed Algorithm
3.1. The Basic Idea. In the proposed algorithm, the aim of each unknown node is to select three anchor nodes that form a triangle as shown in Figure 3. Here, the unknown node $u$ is located within a triangle formed by a 1-hop anchor node $i$ and two other anchor nodes $j$ and $k$. The unknown node therefore lies within the intersection of the triangle and the transmission region of the 1-hop anchor node $i$. From this fact, we can compute the position of the unknown node as the intersection point of two perpendicular lines passing through the points $p$ and $q$, respectively, as shown in Figure 3, where $p$ lies on the line from $i$ to $j$ at distance $d_{ip}$ from $i$, and $q$ lies on the line from $i$ to $k$ at distance $d_{iq}$ from $i$.
The main issues of the proposed algorithm are as follows: (1) selecting three anchor nodes that form the shape shown in Figure 3, and (2) computing $d_{ip}$ and $d_{iq}$ to obtain the two perpendicular lines shown in Figure 3. The unknown node $u$ randomly selects a 1-hop anchor node $i$ among the anchor nodes located within its transmission range, and the two anchor nodes $j$ and $k$ are selected relative to the 1-hop anchor node $i$. These two anchor nodes are selected among the other anchor nodes subject to one condition: the unknown node $u$ should be included in each of the shortest hop paths between the 1-hop anchor node $i$ and the two anchor nodes $j$ and $k$. Under this condition, we can obtain the two unknown distance values $d_{ip}$ and $d_{iq}$ by

$$d_{ip} = \frac{d_{ij}}{h_{ij}}, \qquad d_{iq} = \frac{d_{ik}}{h_{ik}}. \qquad (7)$$

This computation is the average hop length of the specific shortest hop path between the two nodes. Because the unknown node $u$ is 1 hop from the anchor node $i$ and is included in each of the shortest hop paths between the anchor node $i$ and the two anchor nodes $j$ and $k$, it is reasonable to obtain the unknown distances $d_{ip}$ and $d_{iq}$ in this way. One problem remains: how the unknown node $u$ selects the two anchor nodes $j$ and $k$ so that the node deployment has the shape shown in Figure 3. There are three main conditions. First, the estimated position of the unknown node should be located within the triangle formed by the three anchor nodes. Second, the estimated position of the unknown node should be located within the transmission range of the 1-hop anchor node $i$. Last, the area of the triangle should be small to improve the accuracy of the geometric estimation. Considering these conditions, we propose an optimal anchor node selection algorithm.
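The geometric step above can be sketched in code as follows. This is an illustrative reading of the construction, not the authors' implementation: it assumes the two projected distances equal the per-path average hop lengths (anchor-to-anchor distance divided by anchor-to-anchor hop count), and it finds the point whose projections onto the two triangle sides leaving the 1-hop anchor match those distances. The function name `estimate_position` and the coordinates are hypothetical.

```python
import math


def estimate_position(first, second, third, hops_second, hops_third):
    """Intersection of the two perpendicular lines of the construction.

    first, second, third: (x, y) of the 1-hop anchor and the two other anchors.
    hops_second, hops_third: shortest-path hop counts from the 1-hop anchor
    to the second and third anchors.
    """
    d2 = math.dist(first, second)
    d3 = math.dist(first, third)
    p = d2 / hops_second  # projected distance along the side toward `second`
    q = d3 / hops_third   # projected distance along the side toward `third`
    e1 = ((second[0] - first[0]) / d2, (second[1] - first[1]) / d2)
    e2 = ((third[0] - first[0]) / d3, (third[1] - first[1]) / d3)
    # Solve (x - first) . e1 = p and (x - first) . e2 = q with Cramer's rule.
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if abs(det) < 1e-12:
        raise ValueError("the three anchors are collinear")
    dx = (p * e2[1] - e1[1] * q) / det
    dy = (e1[0] * q - e2[0] * p) / det
    return (first[0] + dx, first[1] + dy)


# Assumed layout: 1-hop anchor at the origin, other anchors 8 and 5 hops away.
est = estimate_position((0.0, 0.0), (4.0, 0.0), (0.0, 3.0), 8, 5)
```

In this example the projected distances are 4/8 = 0.5 and 3/5 = 0.6, so the estimate is (0.5, 0.6), which lies about 0.78 from the 1-hop anchor and thus inside a transmission range of 1.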

Optimal Anchor Nodes Selection Algorithm.
Unknown node having 1-hop anchor node will select two optimal anchor nodes (second anchor node and third anchor node) sequentially by the two optimization models mentioned below. The optimization model to select the second anchor node is as follows: where is the average hop length between the anchor nodes and . In this model, the unknown node selects the anchor node which has the maximum average hop length among the anchor nodes satisfying (9). It is the constraint to judge whether the unknown node is included in the shortest hop path between the 1-hop anchor node and the anchor node or not. The reason why we select the anchor node which has the maximum average hop length is that when the average hop length becomes longer, the shape of the shortest hop path between the two anchor nodes will be close to the straight form. In this case, the possibility that the intermediate node (unknown node ) of this shortest hop path between the two anchor nodes is located nearby of the straight line between them becomes higher. After selecting the second anchor node , the unknown node selects the third anchor node with the 1-hop anchor node and the second anchor node . The optimization model to select the third anchor node is as follows: ≥ cos , and mean a straight line between the anchor node and anchor node , the anchor node and anchor node , respectively.
is the angle between the two lines and . Since we know the coordinates of the three anchor nodes, we can calculate . In this model, the estimation accuracy of geometric algorithm is obtained by minimizing the area of the triangle formed by three anchor nodes. To achieve this, unknown node selects third anchor node which minimizes the anchor . Here, the third anchor node is selected among the anchor nodes satisfying the following conditions. First, the unknown node is included in the shortest hop path between the 1-hop anchor node and the anchor node (11). Second, the estimated position should be located within the triangle made by three anchor nodes (12) and should be located within the transmission range of the 1-hop anchor node (13). We prove the constraints (12) and (13) with the following two Theorems 1 and 2.

Theorem 1. The estimated position of unknown node is located within a triangle made by three anchor nodes , , and as shown in Figure 3 if the following equation holds:
Proof. Let us assume that the estimated position of unknown node is located outside of the triangle. In this case, is longer than as shown in Figure 4, and cos is derived by From (15), we can get the following: Here, we proved the theorem by showing that the negation of (14) is valid when the unknown is located outside of the triangle.

Theorem 2. The estimated position of unknown node is located within the transmission range of the 1-hop anchor node as shown in Figure 5(b), if the following equation holds:
Proof. To prove this theorem, assume that the unknown node is located exactly on the transmission range of the 1-hop anchor node as shown in Figure 5(a). In this case, the following equation is satisfied: Next, let us assume that the unknown node is located outside of the transmission range , as shown in Figure 5(b). In this case the following equation is satisfied: Here, we cannot calculate . However, in Figure 5(b), we can get an important fact. The fact is that That is, it means that when the unknown node is located outside of the transmission range , the following equation is satisfied from (18): Here, we proved the theorem by showing that the negation of (17) is valid when the unknown is located outside of the transmission range.
Using the above two optimization models, each unknown node selects two optimal anchor nodes relative to its 1-hop anchor node and then estimates its position geometrically as shown in Figure 3. Since each unknown node has the coordinates of the anchor nodes and the hop counts between the anchor nodes, the two perpendicular lines and their intersection point can be easily calculated.
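The two selection steps can be sketched as a pair of exhaustive searches. Several parts of this sketch are assumptions rather than the paper's formulation: the path-inclusion test is taken to be "the hop count from the 1-hop anchor equals one plus the hop count from the unknown node", the third anchor is chosen by minimum triangle area, and the triangle and transmission-range conditions are checked directly on the estimated point instead of through the cosine-based constraints (12) and (13). All function names and the example data are hypothetical.

```python
import math


def estimate_position(a, b, c, h_ab, h_ac):
    """Point whose projections on sides a->b and a->c equal the per-path hop lengths."""
    d_ab, d_ac = math.dist(a, b), math.dist(a, c)
    p, q = d_ab / h_ab, d_ac / h_ac
    e1 = ((b[0] - a[0]) / d_ab, (b[1] - a[1]) / d_ab)
    e2 = ((c[0] - a[0]) / d_ac, (c[1] - a[1]) / d_ac)
    det = e1[0] * e2[1] - e1[1] * e2[0]
    dx = (p * e2[1] - e1[1] * q) / det
    dy = (e1[0] * q - e2[0] * p) / det
    return (a[0] + dx, a[1] + dy)


def triangle_area(a, b, c):
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def in_triangle(pt, a, b, c):
    """True if pt is inside or on the boundary of triangle abc."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    s = (cross(a, b, pt), cross(b, c, pt), cross(c, a, pt))
    return all(x >= 0 for x in s) or all(x <= 0 for x in s)


def select_anchors(first, anchor_pos, hops_from_first, hops_from_unknown, radius):
    """Return (second, third, estimated_position), or None if no valid pair exists."""
    a = anchor_pos[first]
    # Assumed inclusion test: routing through the unknown node adds no extra hop.
    cands = [n for n in anchor_pos
             if n != first and 1 + hops_from_unknown[n] == hops_from_first[n]]
    if not cands:
        return None
    # Step 1: second anchor = candidate with the longest per-path average hop length.
    second = max(cands, key=lambda n: math.dist(a, anchor_pos[n]) / hops_from_first[n])
    b = anchor_pos[second]
    # Step 2: third anchor = smallest triangle that still contains the estimate.
    best = None
    for k in cands:
        c = anchor_pos[k]
        area = triangle_area(a, b, c)
        if k == second or area == 0.0:
            continue
        est = estimate_position(a, b, c, hops_from_first[second], hops_from_first[k])
        if in_triangle(est, a, b, c) and math.dist(est, a) <= radius:
            if best is None or area < best[1]:
                best = (k, area, est)
    return None if best is None else (second, best[0], best[2])


# Assumed example: anchor A is the 1-hop anchor of the unknown node.
anchor_pos = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (0.0, 3.0),
              "D": (3.0, 3.0), "E": (0.0, 6.0)}
hops_from_first = {"B": 8, "C": 5, "D": 6, "E": 12}
hops_from_unknown = {"B": 7, "C": 4, "D": 6, "E": 11}
result = select_anchors("A", anchor_pos, hops_from_first, hops_from_unknown, radius=1.0)
```

Here D fails the inclusion test, C has the longest per-path hop length (0.6) and becomes the second anchor, E is collinear with A and C, and B is therefore chosen as the third anchor.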
In the proposed algorithm, some unknown nodes may be unable to estimate their position, because each unknown node must have a 1-hop anchor node to calculate its position. In this paper, we focus on improving the localization accuracy of the specific unknown nodes that have favorable information for geometric localization. The number of unknown nodes whose positions can be estimated by our algorithm increases with node density.
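The 1-hop anchor requirement can be checked directly for a given deployment. The sketch below counts the fraction of unknown nodes that have at least one anchor within the transmission range; this is only the necessary condition stated above, so it is an upper bound on the fraction the algorithm can actually localize. The function name and the coordinates are assumptions.

```python
import math


def one_hop_anchor_ratio(unknown_pos, anchor_pos, radius):
    """Fraction of unknown nodes with at least one anchor within `radius`."""
    covered = sum(
        1 for u in unknown_pos
        if any(math.dist(u, a) <= radius for a in anchor_pos)
    )
    return covered / len(unknown_pos)


# Assumed example: three unknown nodes and a single anchor.
unknown_pos = [(0.0, 0.0), (5.0, 5.0), (0.5, 0.0)]
anchor_pos = [(0.0, 0.8)]
ratio = one_hop_anchor_ratio(unknown_pos, anchor_pos, radius=1.0)
```

Two of the three unknown nodes are within range of the anchor in this example, so the ratio is 2/3.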

Performance Evaluation
In this section, we discuss the analytical and simulation results on communication overhead and estimation accuracy.
The proposed algorithm has a similar communication overhead with other algorithms: DV-Hop, PDRL, and RARL. The overall communication cost of DV-Hop, PDRL, and RARL is 2 × ( ), where is the number of anchor nodes and is the number of unknown nodes. Basically, these algorithms have two kinds of cost. One is to get hop count information between unknown nodes and anchor nodes (a flooding of a × matrix, communication is ( )), and the other is to get average hop length of each anchor nodes (a flooding of a × matrix, communication is ( )). In case of our algorithm, unknown nodes need additional information of 1-hop anchor node to select the optimal anchor nodes. Hence, broadcasting of 1-hop anchor node is additionally needed compared to the other algorithms. However, the cost is quite less compared to the cost of a flooding. An additional discussion about the computational cost issue is as follows. In the proposed algorithm, in order to select the optimal anchor nodes, each unknown node should perform two exhaustive searches to solve the optimization problem. However, our algorithm does not utilize multilateration which consumes much computational cost. Hence, our algorithm has a similar computational cost with DV-Hop, RARL, and PDRL.
We compare the performance of the proposed algorithm with three algorithms, DV-Hop, RARL, and PDRL, through MATLAB simulation. To evaluate the performance of these algorithms, we measured the location estimation error, defined as the Euclidean distance between the actual position and the estimated position of an unknown node. Since the range-free scheme uses topology information to estimate the unknown position, the localization result is affected by the node deployment. Hence, to investigate the impact of node deployment, we compare the four algorithms in Regular-shaped, C-shaped, and O-shaped topologies as shown in Figure 6.
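The error metric can be written out as follows. This is a minimal sketch that averages the per-node Euclidean error over the nodes that received an estimate; the function name and the sample positions are assumptions, and the original evaluation was carried out in MATLAB rather than Python.

```python
import math


def mean_location_error(actual, estimated):
    """Average Euclidean distance between actual and estimated positions,
    taken over the nodes that have an estimate."""
    errors = [math.dist(actual[n], estimated[n]) for n in estimated]
    return sum(errors) / len(errors)


# Assumed example: node 3 has no estimate and is excluded from the average.
actual = {1: (0.0, 0.0), 2: (1.0, 1.0), 3: (2.0, 2.0)}
estimated = {1: (0.3, 0.4), 2: (1.0, 1.0)}
err = mean_location_error(actual, estimated)
```

The errors of nodes 1 and 2 are 0.5 and 0, so the mean is 0.25.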
The deployment area is 10 × 10 m², and every unknown node and anchor node has an identical transmission range of 1 m. Before performing the detailed simulation, we ran one experiment to compare the localization results of the algorithms visually. 200 sensor nodes are randomly distributed in the Regular-shaped topology, and 170 sensor nodes are randomly distributed in each of the C-shaped and O-shaped topologies. We set the anchor node ratio to 20% of the total sensor nodes; the rest are unknown nodes. In the same node deployment of each topology, the positions of the unknown nodes are estimated by the four algorithms (DV-Hop, RARL, PDRL, and the proposed algorithm), respectively.  cannot be computed exactly in C- and O-shaped topologies. Like DV-Hop, PDRL also uses the average hop length to estimate the 2 distance, but in a geometric manner. Hence, it performs worse in C- and O-shaped topologies than RARL and the proposed algorithm. The proposed algorithm performs better than the RARL algorithm, because RARL uses estimated information to select the optimal anchor nodes and uses the average hop length. An important point here is that RARL and the proposed algorithm cannot estimate the position of every unknown node, because some unknown nodes may have no selectable anchor nodes. In these examples, RARL and the proposed algorithm estimate 89 and 77 unknown nodes in the Regular-shaped topology, 82 and 56 unknown nodes in the C-shaped topology, and 85 and 76 unknown nodes in the O-shaped topology, respectively.
In the next simulation, we compare and analyze the performance of the four algorithms in more detail. The simulation settings are as follows. Sensor nodes are randomly deployed in Regular-, C-, and O-shaped topologies. To capture the effect of the node density and the number of anchor nodes, we varied the node density from 2.0 to 3.4 nodes/m² and set the anchor ratio to 15% and 20% of the total deployed sensor nodes for each case. We ran 100 simulations for each node setting and averaged the results. In every simulation, the sensor nodes (unknown and anchor nodes) are redistributed randomly. To compare the performance fairly, we analyzed the localization results only for the unknown nodes whose positions are estimated by the proposed algorithm. That is, we ran the DV-Hop, RARL, and PDRL algorithms on the unknown nodes that are estimated by the proposed algorithm (see Table 1).
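As a hedged illustration of one deployment in the Regular-shaped case, the sketch below draws nodes uniformly in a square and marks a fraction of them as anchors. It is not the authors' MATLAB setup: the function name, the uniform sampling, and the seed are assumptions, and the C-shaped and O-shaped regions are not modeled because their exact geometry is not given in the text.

```python
import random


def deploy_regular(density, side, anchor_ratio, seed=None):
    """Uniform random deployment in a side x side square.

    Returns the node positions and the set of indices chosen as anchors.
    """
    rng = random.Random(seed)
    n_nodes = round(density * side * side)
    positions = [(rng.uniform(0.0, side), rng.uniform(0.0, side))
                 for _ in range(n_nodes)]
    n_anchors = round(anchor_ratio * n_nodes)
    anchor_ids = set(rng.sample(range(n_nodes), n_anchors))
    return positions, anchor_ids


# Density 2.0 nodes/m^2 on a 10 m x 10 m area with a 20% anchor ratio.
positions, anchor_ids = deploy_regular(2.0, 10.0, 0.20, seed=1)
```

At a density of 2.0 nodes/m² the 10 × 10 m² area holds 200 nodes, of which 40 are anchors at a 20% ratio; repeating the call with different seeds corresponds to redistributing the nodes in every run.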
The simulation results are as follows. Figures 8 and 9 show the location estimation performance of the algorithms in Regular-, C-, and O-shaped topologies. The main difference between the proposed algorithm and the other two algorithms is whether the average hop length calculated over the entire network is used to estimate the unknown position. DV-Hop and RARL use the average hop length to estimate the 2 distances. The accuracy of the average hop length increases with the node density and decreases as the node density falls, because a heavily curved shortest hop path becomes more likely when the node density is low. Hence, the performance of DV-Hop in the Regular-shaped topology is poor at low node density, and it is degraded considerably in the C-shaped and O-shaped topologies, because the average hop length is calculated inaccurately.
The RARL algorithm estimates the distance only to the selected anchor nodes, which are located close to the unknown node, so its estimate of the 2 distance is comparatively accurate. Hence, in most cases, this algorithm performs better than the DV-Hop algorithm. However, it also uses the average hop length to estimate the 2 distance, so its performance can be degraded in C-shaped and O-shaped topologies, where the average hop length can be computed inaccurately. Hence, its performance is worse than that of the proposed algorithm. PDRL, like DV-Hop, also uses the average hop length to estimate the 2 distance, but in a geometric manner. Although it performs better than DV-Hop because of the geometric distance estimation, using the average hop length to estimate the projected distance still causes errors. Hence, its location estimation performance can be degraded in C-shaped and O-shaped topologies. In addition, when the node density is low, the estimation accuracy of the projected distance (̂in Figure 2) is degraded because of the impractical hop count assumption mentioned in the related study, which also causes large estimation errors.
The proposed algorithm shows stable performance in all cases. First, the proposed algorithm does not estimate the 2 distance; it computes the position geometrically from the topology information. Hence, the location error does not accumulate with the hop count and is not affected by 2 distance estimation errors. Second, the proposed algorithm does not use the average hop length calculated on the whole network to estimate the positions of unknown nodes. Hence, the localization results are not affected by node density. Third, the proposed algorithm estimates the position of an unknown node geometrically within the transmission range of its 1-hop anchor node. Hence, the estimated positions are confined to the transmission range of the 1-hop anchor node. For these reasons, the proposed algorithm offers stable and robust performance in every case, regardless of the node density and the topology shape.

Figure 10: The ratio of estimated unknown nodes in Regular-shaped, C-shaped, and O-shaped topologies when the anchor node ratio is 15% and 20%.
As mentioned above, the proposed algorithm cannot estimate the position of every unknown node. Figure 10 shows the ratio of unknown nodes estimated by the proposed algorithm in each of the simulation settings. More than 20% of the unknown nodes are estimated by our algorithm, and the ratio becomes higher when the node density or the anchor node ratio is higher.

Conclusion
Many studies on range-free algorithms have aimed to improve 2 distance estimation. However, most of them did not resolve the problem that the accuracy of 2 distance estimation is affected by node density and topology shape. In this paper, we proposed a geometric location estimation algorithm that computes the unknown position directly, without estimating the 2 distance. In the proposed algorithm, each unknown node selects three anchor nodes topologically and then computes its position geometrically from the topological information offered by the selected anchor nodes. The analytical and simulation results show that the proposed algorithm offers considerably improved performance compared with existing studies.