Split Distributed Computing in Wireless Sensor Networks

We have designed a novel method intended to improve the performance of distributed computing in wireless sensor networks. The method aims to speed up distributed computing and to decrease the number of messages a network requires to achieve the desired result. In our analysis, we chose the Average consensus algorithm. In this case, the desired result is that every node obtains the average of all the initial values in a reduced number of iterations. Our method is based on the idea that fragmenting a network into small geographical structures which execute the distributed calculations in parallel significantly affects the performance.


Introduction
The main goal of this paper is to present a novel method which improves the efficiency of distributed computing in wireless sensor networks (WSN). WSN are defined as networks consisting of spatially distributed devices monitoring an environmental quantity [1], [2]. In general, the devices are labeled as nodes. The nodes are assumed to be limited in terms of their computing capabilities and available energy. We implemented the Average consensus algorithm, a distributed algorithm intended to calculate the average of the nodes' initial values. Because of its simplicity and iterative nature, this algorithm is suitable for implementation in WSN, as shown in [3], [4]. On the other hand, the performance of the algorithm may vary [3]. In this paper, we show how significantly our proposed method improves the performance of the algorithm. It is based on the idea that fragmenting a network into small geographical structures which perform distributed computing in parallel significantly affects the performance. The longer a computation is executed, the more time the nodes have to spend in the active mode and the more messages they have to send or receive, which results in significant energy requirements. The authors of [5] described the impacts of battery exhaustion on WSN. Since our method reduces both the time required to execute distributed calculations and the number of transmitted messages, it may be used to decrease energy consumption and increase the network's lifetime (as described in [6]). Because propagating the inner states between nodes separated by a long distance takes a long time, we expected, before executing the numerical experiments, that our method would significantly decrease both the number of iterations necessary for the whole network to converge (i.e. the time necessary to converge) and the overall number of messages in the network.
In the first part of this paper, we introduce the features of distributed computing in WSN and the method of splitting a network into packs. In the next part, we explain the terms 'local consensus' and 'global consensus' and introduce the mathematical tools proposed to describe our method. In the experimental part, we describe the performed numerical experiments and analyze the obtained results.

Distributed Computing in WSN
In order to describe the properties of WSN, we use a set of graph theory tools [7][8][9][10]. We assume that a WSN is an undirected graph defined as follows:

$$NET = (V, E) \qquad (1)$$

NET is the label of the network. Each vertex $v_i$ represents a particular node [11] according to its identity number, which varies from 1 to $N$. $V$ represents the set of all vertices, i.e. $V = \{v_1, v_2, \ldots, v_N\}$. Here, the index $N$ also defines the number of vertices and therefore the size of the network. Some vertices are connected to each other, and such a connection is referred to as a path (the path between the vertices $v_i$ and $v_j$ is labeled $e_{i,j}$). $E$ is the set of all paths and forms a subset of the Cartesian product, $E \subseteq V \times V$. Since we use an undirected graph to describe networks, the following statement is valid:

$$e_{i,j} \in E \Leftrightarrow e_{j,i} \in E \qquad (2)$$

As mentioned above, we chose Average consensus as the algorithm on which we demonstrate our proposed method. Average consensus is a distributed algorithm calculating the average of a set of initial values [12]. This means that every node forming the network converges to $x_i(k_l)$ defined as follows:

$$x_i(k_l) = \frac{1}{N} \sum_{j=1}^{N} x_j(0) \qquad (3)$$

Here, $i$ and $j$ represent the indices of the corresponding vertices, i.e. $v_i \in V$ and $v_j \in V$. The value of $k_l$ represents the last iteration, in which the network achieves the consensus. The vector $x \in \mathbb{R}^N$ contains the inner values of the nodes and is updated for each $k$.
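The graph model above can be illustrated with a minimal Python sketch. The helper name `adjacency_matrix`, the 4-node line network and the initial values are assumptions made for illustration only; they do not come from the experiments in this paper.

```python
import numpy as np

def adjacency_matrix(n_nodes, paths):
    """Build a symmetric 0/1 adjacency matrix from paths (i, j) with 1-based node IDs."""
    A = np.zeros((n_nodes, n_nodes), dtype=int)
    for i, j in paths:
        A[i - 1, j - 1] = 1
        A[j - 1, i - 1] = 1  # undirected graph: a path i-j is also a path j-i
    return A

# Hypothetical line network 1-2-3-4 with made-up initial values.
A = adjacency_matrix(4, [(1, 2), (2, 3), (3, 4)])
x0 = np.array([10.0, 20.0, 30.0, 40.0])
target = x0.mean()  # the average that every node is expected to reach
```

Storing both `A[i, j]` and `A[j, i]` mirrors the symmetry of the undirected graph, so later sketches can read either entry.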
The consensus is reached in an iterative, distributed manner [12]: every node updates its inner value in each iteration according to the rule given in [13]. In this rule, $A \in \{0, 1\}^{N \times N}$ is the adjacency matrix [14], [15]. If $v_i$ and $v_j$ are neighbors, then $A_{ij} = A_{ji} = 1$; otherwise, both entries are equal to 0. Only the inner value of a node and the values sent by its adjacent nodes are locally available. The value of the mixing parameter used in this rule determines the speed of convergence of the algorithm [16].
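As a hedged illustration of such an iterative update, the sketch below uses the form commonly found in the consensus literature, x_i(k+1) = x_i(k) + epsilon * sum_j A_ij (x_j(k) - x_i(k)). This form, the name `epsilon`, the value 0.3 and the small line network are assumptions; the exact rule used in this paper is the one referenced in [13].

```python
import numpy as np

def consensus_step(x, A, epsilon):
    """One synchronous iteration of the assumed update rule."""
    w = A.sum(axis=1)                     # number of neighbors of each node
    return x + epsilon * (A @ x - w * x)  # move each value towards its neighbors' values

# Hypothetical line network 1-2-3-4 with made-up initial values.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
x = np.array([10.0, 20.0, 30.0, 40.0])
for _ in range(200):
    x = consensus_step(x, A, epsilon=0.3)
```

Because `A` is symmetric, the sum of all inner values stays constant in every step, which is why the common limit is the average of the initial values.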
The range of this parameter for which the algorithm converges is determined by the weights of the nodes. Here, $w_i$ is the weight of node $i$, i.e. the number of nodes adjacent to it. For the node represented by the vertex $v_i$, the weight is computed from the adjacency matrix $A$ and an all-ones matrix $J$ [17].
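For orientation, a form of these two expressions that is common in the consensus literature is sketched below. It is an assumption, not necessarily the authors' exact notation: $\varepsilon$ stands for the mixing parameter and $J$ is taken to be an all-ones column of length $N$.

$$w = A J, \qquad w_i = \sum_{j=1}^{N} A_{ij}$$

$$0 < \varepsilon < \frac{1}{\max_i w_i}$$

Under this assumption, a network whose best-connected node has three neighbors would admit any $\varepsilon$ below $1/3$.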
Mathematically, the convergence is characterized by the condition given in [18]. It is not possible to fulfill this condition exactly; therefore, we have defined an event at which a node is considered to be converged. We used the distributed mechanism described in [3], which allows every node to detect the event of convergence (i.e. consensus) in a distributed manner. After a node reaches the consensus, it no longer updates its inner value. We defined the parameter $d$, whose value determines the precision of the algorithm. A node compares its current inner value with the one from the previous iteration. When their difference is smaller than $d$ in three consecutive iterations, the node considers itself to be converged.
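The stopping rule can be sketched as a small per-node counter. The class name `ConvergenceDetector` is hypothetical, and the sketch assumes that "three following iterations" means three consecutive iterations; the mechanism actually used is the one described in [3].

```python
class ConvergenceDetector:
    """Per-node detector: converged after `required` consecutive small changes."""

    def __init__(self, d, required=3):
        self.d = d                # precision parameter d
        self.required = required  # number of consecutive small changes needed
        self.streak = 0
        self.previous = None

    def update(self, value):
        """Feed the inner value of the current iteration; return True once converged."""
        if self.previous is not None and abs(value - self.previous) < self.d:
            self.streak += 1
        else:
            self.streak = 0       # a large change resets the count
        self.previous = value
        return self.streak >= self.required

# Made-up sequence of inner values of one node.
detector = ConvergenceDetector(d=0.1)
results = [detector.update(v) for v in [1.0, 2.0, 2.05, 2.06, 2.061]]
```

A smaller `d` makes the node wait longer before it declares convergence, which trades iterations for precision.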

Split Distributed Computing
Split distributed computing is based on the idea that nodes situated in a geographically close area are gathered into a pack, i.e. an entity consisting of geographically proximate nodes. In the case of Average consensus, each pack converges to the local average determined by the initial values of the nodes forming this pack. Every pack appoints a head, i.e. the node with the highest connectivity within the pack. In this paper, a node's connectivity is defined by its weight $w_i$.
A node $v_i$ is appointed as the head of its pack if it fulfills the following condition: there is no other node $v_j$ in the same pack whose weight is larger than the weight of $v_i$. Here, $PK$ denotes the label of a pack, and $H$ is the vector whose entry is 1 when $v_i$ is a head and 0 when it is not. In other words, the node whose weight is the highest within a pack is appointed as the head, and the corresponding entry of the vector $H$ is set to 1.
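The head selection can be sketched as follows. The function name `elect_heads`, the example network and the packs are hypothetical. The weight is taken as the node's number of neighbors in the whole network, which is an assumption, and ties are resolved in favor of the lower identity number, as done in the tree experiment of this paper.

```python
import numpy as np

def elect_heads(A, packs):
    """Return the 0/1 vector H marking the head of every pack (1-based node IDs)."""
    w = A.sum(axis=1)                    # weight of every node
    H = np.zeros(A.shape[0], dtype=int)
    for pack in packs:
        # Highest weight wins; among equal weights the lowest ID wins.
        head = min(pack, key=lambda i: (-w[i - 1], i))
        H[head - 1] = 1
    return H

# Hypothetical 6-node network with paths 1-2, 2-3, 3-4, 4-5, 5-6 and 2-5.
A = np.zeros((6, 6), dtype=int)
for i, j in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (2, 5)]:
    A[i - 1, j - 1] = A[j - 1, i - 1] = 1

H = elect_heads(A, [[1, 2, 3], [4, 5, 6]])  # nodes 2 and 5 have three neighbors each
H_tie = elect_heads(A, [[3, 4]])            # equal weights: node 3 wins by lower ID
```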
After the local consensus is reached within each pack, the phase of reaching the global consensus begins. In this phase, only the heads of the particular packs communicate with one another and converge to the average. Each node $v_i$ has to fulfill the following condition: if $v_i \in PK_x$, then $v_i \notin PK_y$ for every $y \neq x$.
This means that every node is included in exactly one pack and therefore affects the average value of exactly one pack. If it were present in more than one pack, the final result would differ from the expected result defined by (3). Thus, splitting a network into packs has to fulfill these conditions:

$$\bigcup_{x} PK_x = V, \qquad PK_x \cap PK_y = \emptyset \quad \text{for } x \neq y$$

That is, the union of all the packs forms the whole set $V$, and the intersection of any two packs is an empty set because no pack shares a node with any other pack. If a node were assigned to more than one pack or to none, an incorrect result would be obtained. Only nodes forming the same pack communicate with one another in the phase of reaching the local consensus. Messages from nodes forming other packs are rejected and do not affect the pack's inner state.
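The two requirements on the packs (every node is covered, and no node is shared) can be checked with a short sketch. The function name `is_valid_split` and the example packs are hypothetical.

```python
def is_valid_split(nodes, packs):
    """True if the packs cover every node and no node appears in two packs."""
    seen = set()
    for pack in packs:
        pack = set(pack)
        if seen & pack:          # a node is shared with an earlier pack
            return False
        seen |= pack
    return seen == set(nodes)    # the union of all packs must equal V

nodes = [1, 2, 3, 4]
ok = is_valid_split(nodes, [[1, 2], [3, 4]])
shared = is_valid_split(nodes, [[1, 2], [2, 3, 4]])  # node 2 is in two packs
missing = is_valid_split(nodes, [[1, 2], [3]])       # node 4 is in no pack
```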

Distributed Reaching the Local Consensus
In this part, we focus on reaching the local consensus. This phase is executed independently and in parallel in each pack. We assume that a node sends its inner value only to the adjacent nodes forming the same pack. Only the nodes fulfilling condition (11), i.e. the nodes that are adjacent to $v_i$ and belong to the same pack, communicate with the node labeled $v_i$. Here, $p$ is the matrix determining whether two nodes are in the same pack (if they are, $p_{ij} = p_{ji} = 1$), and $NP$ is the set containing all the nodes with which $v_i$ communicates during the phase of reaching the local consensus.
Each node converges iteratively to the local average (12), i.e. the average of the initial values of the nodes forming its pack. A value close to (12) is reached in a distributed manner: each node applies the Average consensus update using only the values received from the nodes in $NP$.
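A minimal sketch of the same-pack matrix and of the resulting communication set is given below. The function names, the line network and the pack assignment are hypothetical, and the diagonal of `p` is left at 1 here because the text does not specify it.

```python
import numpy as np

def same_pack_matrix(pack_of):
    """p[i, j] = 1 if nodes i and j carry the same pack label, else 0."""
    pack_of = np.asarray(pack_of)
    return (pack_of[:, None] == pack_of[None, :]).astype(int)

def local_neighbors(A, p, i):
    """1-based IDs of the nodes that are adjacent to node i and share its pack."""
    mask = (A[i - 1] == 1) & (p[i - 1] == 1)
    return [int(j) + 1 for j in np.nonzero(mask)[0]]

# Hypothetical line network 1-2-3-4 split into the packs {1, 2} and {3, 4}.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
p = same_pack_matrix([1, 1, 2, 2])
NP_2 = local_neighbors(A, p, 2)  # node 3 is adjacent but belongs to the other pack
NP_3 = local_neighbors(A, p, 3)
```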
Let $K \in \mathbb{N}^Q$ be the vector containing, for each pack, the number $k_x$ of the iteration in which this pack reaches the local consensus. The parameter $Q$ is the number of packs in the network. All packs reach their local consensus in parallel, so after all of them converge, there are $Q$ local consensuses. Only then can the network as a whole start converging to the global consensus. The moment of this transition is defined by condition (16), and we label the $k_x$ fulfilling this condition as $k_{lc}$. Since the phase of reaching the global consensus can begin only after all the packs converge, the local consensus which took the longest time determines the length of the first phase: the maximal value within $K$ determines $k_{lc}$ and therefore the duration of the whole phase of reaching the local consensuses. It is the number of iterations necessary for the slowest pack in the network to reach the consensus. At this moment, all packs have reached their local averages.
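The local phase can be sketched as a small simulation. It is a simplification under stated assumptions: the update rule is the common form x + epsilon * (A x - w x) restricted to same-pack links, a pack is frozen only when all of its nodes have seen three consecutive changes below `d`, and the names `local_phase`, `epsilon`, the network and the values are hypothetical.

```python
import numpy as np

def local_phase(x0, A, pack_of, epsilon, d, max_iter=10_000):
    """Run the consensus inside every pack; return final values and K (pack -> iteration)."""
    x = np.asarray(x0, dtype=float).copy()
    pack_of = np.asarray(pack_of)
    A_local = A * (pack_of[:, None] == pack_of[None, :])  # keep same-pack links only
    w_local = A_local.sum(axis=1)
    streak = np.zeros(len(x), dtype=int)
    K = {}
    for k in range(1, max_iter + 1):
        active = ~np.isin(pack_of, list(K))               # packs still iterating
        if not active.any():
            break
        x_new = x + epsilon * (A_local @ x - w_local * x)
        streak = np.where(np.abs(x_new - x) < d, streak + 1, 0)
        x = np.where(active, x_new, x)                    # converged packs stay frozen
        for q in np.unique(pack_of[active]):
            if (streak[pack_of == q] >= 3).all():
                K[int(q)] = k                             # iteration in which pack q converged
    return x, K

# Hypothetical line network 1-2-3-4 split into the packs {1, 2} and {3, 4}.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
x, K = local_phase([10.0, 20.0, 30.0, 50.0], A, [0, 0, 1, 1], epsilon=0.3, d=1e-3)
k_lc = max(K.values())  # the slowest pack determines the length of the local phase
```

Taking the maximum over `K` reflects that the global phase cannot start before the slowest pack has finished.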

Distributed Reaching the Global Consensus
In this phase, only the heads continue to communicate and send messages to each other. The initial value of each head is determined by the local consensus of its pack. Each head converges to the average calculated from all the local consensuses. The heads reach this value by updating their inner values in a distributed manner, in which a two-dimensional matrix $P$ determines whether the heads of two packs are adjacent. Two heads are considered adjacent only if at least one node of one pack is adjacent to a node of the other pack. We assume that transmitting a message to an adjacent head requires a single iteration. After this phase is completed, the whole process is completed and the network reaches the consensus.
In other words, if there is a node $v_i$ from the pack $PK_x$ and a node $v_j$ from the pack $PK_y$ and they are adjacent, the entries of the matrix $P$ belonging to all the nodes from both these packs are set to 1.
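The adjacency of heads can be sketched at the level of packs. Note the simplification: the text describes $P$ through the indices of the nodes of both packs, whereas the hypothetical `head_adjacency` below condenses it into one row and one column per pack. The example network and pack labels are made up.

```python
import numpy as np

def head_adjacency(A, pack_of, n_packs):
    """P[x, y] = 1 if some node of pack x is adjacent to some node of pack y."""
    P = np.zeros((n_packs, n_packs), dtype=int)
    rows, cols = np.nonzero(A)
    for i, j in zip(rows, cols):
        x, y = pack_of[i], pack_of[j]
        if x != y:               # a link crossing the border between two packs
            P[x, y] = 1
            P[y, x] = 1
    return P

# Hypothetical line network 1-2-3-4-5-6 split into three packs of two nodes.
A = np.zeros((6, 6), dtype=int)
for i in range(5):
    A[i, i + 1] = A[i + 1, i] = 1
P = head_adjacency(A, [0, 0, 1, 1, 2, 2], n_packs=3)
```

In this example the first and the last pack are not adjacent, so their heads would exchange values only through the head of the middle pack.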

Numerical Experiments
In this section, our method is verified using the numerical experiments performed in Matlab.

Network with a Tree Topology
In the first numerical experiment, we demonstrated our method on a network with a tree topology of 15 nodes and explained the particular steps one by one. The topology of this network is shown in Fig. 4. The network is formed by nodes which have either three neighbors or just one: nine of them have only one neighbor and the remaining six have three neighbors. The average number of neighbors is therefore just 1.8 per node, so this topology is considered to be sparsely connected.

Iterations Minimization
As the first step, we divided the network into small packs whose nodes then converged to the local consensuses. (The local consensuses of these packs differ from each other if the initial values of the nodes forming these packs are not the same.) Following the pack description presented in Sec. 2, it was also necessary to choose the head of each pack. We can see in Fig. 5 that choosing the head $h$ according to the parameter $w$ is unambiguous for $PK_2$, $PK_3$, $PK_4$ and $PK_5$. A small problem occurs when the nodes try to appoint the head of $PK_1$ because there are two nodes with the maximal $w$. In such a case, the node with the lower identity number becomes the head, which resolves the ambiguity. We executed the Average consensus algorithm twice in this network. The first time, we used the Standard method, and the algorithm converged in the way described in [3]: the network converges as a whole to the average calculated from the initial values, and a node reaches the average using only the messages sent by its adjacent nodes and its inner state from the previous iteration. The values of the particular nodes in every iteration are shown in Fig. 2, parts #1-#5. Communication among the nodes is depicted by solid lines in Fig. 5.
In the second case, we used the Partial method, where every PK reached the local consensus according to (18).
After this phase was completed, the topology effectively changed to the one shown in Fig. 5 (communication among the heads is depicted by dashed lines), where each $PK$ was substituted by the corresponding head $h$, whose initial value was determined by the particular local consensus. The evolution of the values of the particular heads over the iterations is shown in Fig. 2, part #6 (each head obtains the value of the local consensus).
We can see from the results shown in Tab. 1 that the pack labeled $PK_5$ was the last to reach the local consensus, so the phase of reaching the local consensus ended in the 86th iteration. Subsequently, the second phase began and resulted in the global consensus, i.e. the average value calculated from the initial values of the nodes. This phase lasted only 27 iterations. The overall number of iterations $k_l$ is then obtained as the sum

$$k_l = k_{lc} + k_{gc}$$

Here, $k_{gc}$ represents the number of iterations required to achieve the global consensus, and $k_{lc}$ the number of iterations required by the phase of reaching the local consensus. The results in Tab. 1 show that our method substantially decreased both the number of iterations and the number of sent messages: we saved 747 iterations and 11478 messages. In [19][20], the algorithm requires hundreds of iterations for a network to reach the consensus (i.e. $k_l$), just as in our experiments with the Standard method. Comparing two networks with significantly different attributes is ambiguous because numerous aspects affect $k_l$, such as the topology of the network, its connectivity, the maximal hop distance, etc. A detailed analysis is presented in [21][22].
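The reported savings can be retraced from the numbers given in this paper. The value 860 is the iteration count of the Standard method used in the message calculation ($15 \cdot 860 = 12900$). The last line is a derived figure, assuming the saving of 11478 messages is measured against those 12900 messages.

$$
\begin{aligned}
k_l &= k_{lc} + k_{gc} = 86 + 27 = 113 \\
860 - 113 &= 747 \quad \text{(iterations saved)} \\
12900 - 11478 &= 1422 \quad \text{(messages remaining with the split method)}
\end{aligned}
$$

In relative terms, this corresponds to roughly $747 / 860 \approx 86.9\,\%$ fewer iterations and $11478 / 12900 \approx 89.0\,\%$ fewer messages for this tree network.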
Messaging is explained in detail in the next section.

Analysis of Sent Messages
We assume that nodes send broadcast messages in order to transmit information about their inner state. Therefore, in the Standard method, every node sends one message per iteration, and the overall number of messages $sm$ is determined as $sm = N \cdot k_l$. In our numerical experiment, the value of $sm$ is calculated as $sm = 15 \cdot 860 = 12900$.
In the Partial method, the number of messages is determined differently. In the phase of reaching the local consensus, a node sends messages until its pack achieves the consensus. During the phase of reaching the global consensus, the heads may not be mutually adjacent; therefore, delivering information from one head to another could require sending more than one message. When the head of $PK_1$ (Head 1) wants to inform the other heads about its inner state, it sends a broadcast message to all its adjacent nodes (in our case, Head 2, Head 3 and Node 1), by which it transmits the information to Heads 2 and 3. Subsequently, the message has to be retransmitted by Node 1 to Node 3, which retransmits it again in order to deliver the information to Heads 4 and 5. As we assume a broadcast transmission mode, just one message is sufficient for the information to be delivered to both Head 4 and Head 5 from Node 1. Thus, three messages are necessary for Head 1 to send information about its inner state to all the other heads. This procedure is shown in Fig. 4.
According to the previous description, we calculated the overall number of sent messages for the Partial method.
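The counting of broadcast messages for one head can be sketched as follows. The neighbor lists are a hypothetical reconstruction of only the links named in the description above, not the full topology of the figure, and `broadcasts_needed` is an illustrative helper that assumes a connected network and relays along breadth-first paths.

```python
from collections import deque

def broadcasts_needed(neighbors, source, heads):
    """Count the nodes that must broadcast so that `source` reaches all other heads."""
    parent = {source: None}
    queue = deque([source])
    while queue:                          # breadth-first search from the source head
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    transmitters = set()
    for head in heads:
        if head == source:
            continue
        node = parent[head]
        while node is not None:           # every node on the path before the head transmits
            transmitters.add(node)
            node = parent[node]
    return len(transmitters)

# Links named in the text: Head 1 - Head 2, Head 3, Node 1; Node 1 - Node 3; Node 3 - Head 4, Head 5.
neighbors = {
    "H1": ["H2", "H3", "N1"],
    "H2": ["H1"],
    "H3": ["H1"],
    "N1": ["H1", "N3"],
    "N3": ["N1", "H4", "H5"],
    "H4": ["N3"],
    "H5": ["N3"],
}
count = broadcasts_needed(neighbors, "H1", ["H1", "H2", "H3", "H4", "H5"])
```

The three transmitters found here are Head 1, Node 1 and Node 3, which agrees with the three messages counted in the description.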

Networks with a Random Topology
In the second numerical experiment, we used the generator described in [23] to generate a network with a random topology. We set its size to 24 nodes, and the size of the network's area and the nodes' communication range were set to such values that the network can be classified as an average-density network. The network topology is always connected, i.e. every node is able to communicate (single-hop or multi-hop) with any other node. An example is shown in Fig. 6. We present the results of seven numerical experiments. For each numerical experiment, we used the same topology and changed the sizes of the packs. The parameter was set to 0.08. The network's division into packs is shown in Fig. 7.
Scenario 1 is the case in which the algorithm was executed without splitting the network. In scenario 2, the network is divided into eight packs of three nodes each. In scenario 3, we created six packs consisting of four nodes. In scenario 4, the network consists of five packs of varying sizes: four packs of five nodes and one pack of four nodes. In scenario 5, the network consists of four packs of six nodes each. In scenario 6, the network has three packs of eight nodes each. In the last scenario, the network consists of two packs, both containing twelve nodes.
From the results shown in Tab. 1, it is evident that our proposed method significantly decreases the number of iterations $k_l$ necessary for a WSN to converge. When the network is divided into more packs, reaching the local consensus requires fewer iterations. However, the phase of reaching the global consensus was much faster for the networks containing a few large packs. Overall, split distributed computing is most effective for a network divided into smaller packs. However, we are not able to claim which scenario is the best in general because many factors affect $k_l$. The best results were obtained for scenario 2, where the network needs 72.09% fewer iterations than in the case when the network was not split. This is a significant improvement of distributed computing without negatively affecting the precision of the result. In the networks containing bigger packs, we decreased $k_l$ by approximately 50%. For the network formed by the largest packs, the method is less effective: we saved just 15.72% of the iterations. We repeated this procedure for 19 other networks of the same size and with the same features, and calculated the average saving of $k_l$ over all 20 networks. We can see from the results shown in Tab. 3 that the method decreases $k_l$ regardless of the number of packs, but the more packs the network is divided into, the more effective the method is.

Conclusion
In this paper, we introduced a novel method to accelerate distributed computing in WSN and analyzed its performance. The main idea is that a fragmented network consisting of a group of smaller elements performs the distributed computation much more efficiently. Thus, we divided the network into packs. We varied the number and the size of the packs and compared the results obtained using the well-known distributed algorithm. We presented a detailed analysis for one network. Then we repeated the same procedure for 19 other networks and calculated the average of the saved iterations. We can see from Tab. 2 that the best results were achieved when the network was divided into a larger number of smaller packs. In that case, we achieved a 71.42% reduction of iterations on average and an 83.25% reduction of the number of sent messages. As the size of the packs increased, the reduction decreased. We can also see another important feature of this method: $k_{lc}$ is the smallest for networks formed by many packs, whereas $k_{gc}$ is the smallest for networks formed by a small number of packs. These results encourage us to try to improve this aspect as well.

Fig. 2. The graphs depicting behavior of inner values within particular packs.

Fig. 3. The graphs depicting behavior of inner values when the Standard method was used.

Fig. 7. The division of the network into packs in each scenario executed in this paper.