The Performance of Multimessage Algebraic Gossip in a Random Geometric Graph

Gossip algorithms have been widely regarded as a simple and efficient way to improve quality of service (QoS) in large-scale networks that require rapid information dissemination. In this paper, information dissemination based on algebraic gossip in a random geometric graph (RGG) is considered. Each of the $n$ nodes initially knows only its own content. In every time slot, each node communicates with a neighbor chosen at random. The goal is to disseminate all of the messages rapidly among the nodes. We show that network coding yields a convergence-time gain of $O(n^{1/2}\log\varepsilon^{-1}/\log^{1/2} n)$. Simulation results show that these bounds are valid for the random geometric graph and demonstrate that the improvement from network coding becomes more significant as the number of users increases.


Introduction
With the increased demand for network resources and the growing variety of information resources, gossip-based information dissemination has become an effective way to improve quality of service (QoS) in large-scale network environments. A gossip algorithm is simple and efficient, and its extensibility and robustness suit large-scale network environments well; it also reduces redundant retransmissions when disseminating information in a wide range of distributed applications. Recently, gossip algorithms have been proposed for many distributed applications, including peer-to-peer (P2P) networks, broadcast and multicast [1,2], ad hoc network routing [3], and distributed computation and parameter estimation [4,5].
Network coding [6] was first proposed by Ahlswede et al. in 2000 as an alternative to traditional store-and-forward routing. Network coding brings many potential advantages for multicast networks, such as improved network throughput, balanced network load, high bandwidth utilization, and increased robustness and adaptability. Chen and Tseng [7] proposed a distributed algorithm with random linear network coding [8], which was developed to effectively detect, locate, and isolate Byzantine attackers in a wireless ad hoc network.
In 2006, Boyd et al. proposed a randomized gossip algorithm [9]. They showed that the convergence time in the random geometric graph (RGG) was $O(r^{-2}\log n)$, where $r$ is the transmission radius threshold and $n$ is the number of nodes. Based on the advantage of random linear network coding, Deb et al. [10] suggested an algebraic gossip algorithm that integrated network coding with a gossip algorithm and proved that information dissemination converges faster than with a randomized gossip algorithm, needing only O( log ) time to disseminate the messages in a complete graph. Subsequently, Mosk-Aoyama and Shah [11] generalized the results of [10] and obtained the convergence time bound O( log ) in expectation using an algebraic gossip algorithm in an arbitrary graph. Vasudevan and Kudekar [12] developed a uniform strong bound of O( log 2 ) with algebraic gossip for an arbitrary graph with high probability. Borokhovich et al. used queuing theory as a novel approach for analyzing algebraic gossip [13]. They obtained an upper bound of O(Δ) rounds for a graph with a maximum degree of Δ for all-to-all communication. An average connectivity degree cluster (ACDC) gossip scheme that improves the convergence speed and the accuracy of the consensus is proposed in [14]. The proposed algorithm can yield results that are 50% closer to the real average value than the referenced standard gossip and grid cluster gossip algorithms.
International Journal of Distributed Sensor Networks
However, realistic networks are so complex that, among traditional topology graphs, only the RGG matches them well. We therefore analyze the performance of multimessage algebraic gossip in the RGG. We utilize the concept of conductance [15] to analyze the convergence time bounds. The conductance of a graph is a measure of the connectivity of the graph, which determines how fast a random walk on the graph converges to a uniform distribution. Our results show that the gain of the convergence time is $O(n^{1/2}\log\varepsilon^{-1}/\log^{1/2} n)$ with network coding.
The rest of this paper is structured as follows. In Section 2, we describe the model, the algorithm, and the protocols with a few preliminaries. We state our main theoretical results and provide the essential proofs in Section 3. In Section 4, simulation results and detailed analysis are given. Section 5 concludes this paper.

Preliminaries and Models
In this network model, there are $n$ nodes, and each node has only one message. We assume that the set of messages is $M = \{m_1, m_2, \ldots, m_n\}$, where node $i$ stores the message $m_i$ in the initial time slot $t = 0$.
The RGG is constructed in two steps.
(i) Place $n$ vertices in the region uniformly at random and independently.
(ii) Connect two vertices $(u, v)$ if and only if the distance between them is less than or equal to a threshold; that is, $d(u, v) \le r$, where $r$ is a transmission radius threshold.
Obviously, whether two nodes are connected depends on their Euclidean distance.
We use the adjacency matrix $A$ to represent the graph structure: $A_{ij} = 1$ if nodes $i$ and $j$ are neighbors, for $i \ne j$, and $A_{ij} = 0$ otherwise. Moreover, $N_i = \{j \in \{1, 2, \ldots, n\} : A_{ij} \ne 0\}$ represents the set of neighbor nodes of node $i$, and $d_i = |N_i|$ represents the degree of node $i$, that is, its number of neighbor nodes.
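The construction and the neighbor sets described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the region is taken to be the unit square, the radius is set to $\sqrt{\log n/n}$ with the hidden constant equal to 1, and the function names `build_rgg` and `neighbor_sets` are hypothetical.

```python
import math
import random

def build_rgg(n, radius, seed=None):
    """Place n points uniformly in the unit square (assumed region) and
    connect every pair whose Euclidean distance is at most `radius`."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                adj[i][j] = adj[j][i] = 1  # undirected edge, zero diagonal
    return points, adj

def neighbor_sets(adj):
    """N_i = { j : A_ij != 0 } for every node i."""
    return [[j for j, a in enumerate(row) if a != 0] for row in adj]

n = 50
radius = math.sqrt(math.log(n) / n)  # constant factor 1 is an assumption
points, adj = build_rgg(n, radius, seed=1)
neighbors = neighbor_sets(adj)
degrees = [len(nb) for nb in neighbors]  # d_i = |N_i|
print("min/max degree:", min(degrees), max(degrees))
```

The quadratic pair loop is adequate for the node counts used in this paper (100 to 200); a spatial grid would be the usual choice for much larger graphs.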

Synchronous Time Model.
A time model determines the time of multimessage dissemination. We use the synchronous time model as in [11]. In gossip algorithms, a round is the time unit during which nodes exchange their messages with each other. Assume one round is divided into consecutive time slots. In each time slot, each node selects one of its neighbor nodes randomly and independently and shares information with it.

Gossip Algorithm.
In time slot $t$, any node $i$ selects one of its neighbor nodes uniformly at random and independently, with probability $p = 1/d$, where $d$ represents the node's degree [13]. In the RGG, the transition probability matrix of the node [17] is given by (1), where $N_i$ is the set of node $i$'s neighbors. Equation (1) presents the transition probability in the RGG without network coding. When node $j$ is a neighbor of node $i$, (1) ensures that this neighbor is selected randomly and uniformly. The expected transition probability of node $i$ is equal to that of node $j$ when it is chosen as a transmitting node. We can then use it to derive the transition probability with network coding in the RGG.
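As a small illustration of uniform neighbor selection, the sketch below builds a row-stochastic matrix in which node $i$ picks each neighbor with probability $1/d_i$. That this is the exact form of (1) is an assumption, and the names `transition_matrix` and `pick_partner` are hypothetical.

```python
import random

def transition_matrix(neighbors):
    """P[i][j] = 1/d_i if j is a neighbor of i, else 0 (assumed form of (1)).
    A node without neighbors keeps an all-zero row."""
    n = len(neighbors)
    P = [[0.0] * n for _ in range(n)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            P[i][j] = 1.0 / len(nbrs)
    return P

def pick_partner(i, neighbors, rng):
    """Draw one neighbor of node i uniformly at random."""
    return rng.choice(neighbors[i])

# Toy graph: node 1 is connected to nodes 0, 2, and 3.
neighbors = [[1], [0, 2, 3], [1], [1]]
P = transition_matrix(neighbors)
rng = random.Random(0)
print(P[1], pick_partner(1, neighbors, rng))
```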
There are three kinds of gossip algorithms, Push, Pull, and Exchange, which are defined as follows.
(i) Push: in this model, a message is transmitted from a transmitting node (the caller) to a receiving node (the called). Thus, the communication process is initiated by the transmitting node.
(ii) Pull: in this model, a message is transmitted from a transmitting node (the called) to a receiving node (the caller). Thus, the communication process is initiated by the receiving node.
(iii) Exchange: here, a message is transmitted from the caller to the called, and at the same time, another message is transmitted from the called to the caller.
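The three variants can be compared with a minimal uncoded simulation. The sketch below assumes that each transmission carries one message chosen uniformly from what the sender held at the start of the slot; the function names `gossip_slot` and `slots_until_done` are hypothetical.

```python
import random

def gossip_slot(known, neighbors, mode, rng):
    """Run one synchronous time slot; `known[i]` is node i's set of messages.
    Every node calls one uniformly chosen neighbor, and all transmissions
    use the state from the start of the slot."""
    snapshot = [set(s) for s in known]
    for caller, nbrs in enumerate(neighbors):
        if not nbrs:
            continue
        called = rng.choice(nbrs)
        if mode in ("push", "exchange"):   # caller -> called
            known[called].add(rng.choice(sorted(snapshot[caller])))
        if mode in ("pull", "exchange"):   # called -> caller
            known[caller].add(rng.choice(sorted(snapshot[called])))
    return known

def slots_until_done(neighbors, mode, seed=0, max_slots=100000):
    """Number of slots until every node holds all n messages, or None."""
    rng = random.Random(seed)
    n = len(neighbors)
    known = [{i} for i in range(n)]  # node i starts with its own message
    for slot in range(1, max_slots + 1):
        gossip_slot(known, neighbors, mode, rng)
        if all(len(s) == n for s in known):
            return slot
    return None

ring = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
for mode in ("push", "pull", "exchange"):
    print(mode, slots_until_done(ring, mode, seed=3))
```

In this sketch a node gains at most one message per slot under Pull, since only the caller receives, which is one reason Exchange is the variant combined with coding in the rest of the paper.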

Gossip Protocol.
Here we use random linear network coding as the gossip protocol. Random linear network coding encodes each message before sending it. We take each message as a vector over the finite field $\mathbb{F}_q$ of size $q$. So if the message size is $b$ bits, the dimension of the message vector is $\lceil b/\log_2 q \rceil$. All the initial messages are linearly independent vectors over $\mathbb{F}_q$, and we take $A_i(t)$ to represent the matrix of coding coefficient vectors of node $i$ in time slot $t$.
When the dimension of the coding coefficient vectors held by a node $i$ increases to $n$, node $i$ can recover all the messages; once this holds for all nodes, the wireless communication process has achieved convergence.
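The rank condition can be made concrete with a short sketch of random linear network coding over a prime field. It tracks only the coding coefficient vectors, not the payloads, and combines them with the Exchange rule. The field size $q = 257$, the restriction to a prime $q$, and the function names are assumptions made for illustration.

```python
import random

def rank_mod_q(rows, q):
    """Rank of a list of vectors over the prime field GF(q),
    computed by Gaussian elimination."""
    m = [list(r) for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % q), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, q)           # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % q for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % q:
                f = m[r][col]
                m[r] = [(x - f * y) % q for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def random_combination(rows, q, rng):
    """Random linear combination of the stored coefficient vectors."""
    coeffs = [rng.randrange(q) for _ in rows]
    return [sum(c * r[k] for c, r in zip(coeffs, rows)) % q
            for k in range(len(rows[0]))]

def algebraic_exchange_slots(neighbors, q=257, seed=0, max_slots=10000):
    """Slots until every node's coefficient matrix has rank n, or None."""
    rng = random.Random(seed)
    n = len(neighbors)
    # Node i starts with the unit vector e_i, so its rank is 1.
    store = [[[1 if k == i else 0 for k in range(n)]] for i in range(n)]
    for slot in range(1, max_slots + 1):
        snapshot = [list(rows) for rows in store]
        for caller, nbrs in enumerate(neighbors):
            if not nbrs:
                continue
            called = rng.choice(nbrs)
            for src, dst in ((caller, called), (called, caller)):
                vec = random_combination(snapshot[src], q, rng)
                # Keep the vector only if it raises the receiver's rank.
                if rank_mod_q(store[dst] + [vec], q) > len(store[dst]):
                    store[dst].append(vec)
        if all(len(rows) == n for rows in store):
            return slot
    return None

complete = [[j for j in range(5) if j != i] for i in range(5)]
print(algebraic_exchange_slots(complete, seed=1))
```

Because each node stores only vectors that increased its rank, the length of its store equals the rank of its coefficient matrix, so the stopping test is a simple length check.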
In the rest of this paper, we integrate the exchange algorithm with random linear network coding to analyze the convergence time of multimessage dissemination.

Main Results
Based on the above model, we analyze the benefit of network coding in multimessage dissemination in the RGG. Before transmission, each sender and receiver encodes the necessary messages stored in its buffer. At each time slot, each sender $i$ randomly selects an encoded message to send to one of its neighbors $j$ and receives an encoded message from node $j$ at the same time.
Firstly, we use the following definitions similar to [17].
Definition 1. The "Basic Class" $B(t)$ is the set of matrices satisfying the following properties.
(ii) nonzero submatrices of $A_i(t)$ and $A_j(t)$ have full column rank.
If the matrices satisfy condition (i) but do not satisfy condition (ii), then the set of them is defined as the "Virtual Class", denoted by $\tilde{V}(t)$.
If neither condition (i) nor condition (ii) is satisfied, then the matrices belong to the set defined as the "Residue Class", denoted by $R(t)$.
Let a node $j \in N_i$. Suppose that $P^{R}_{ij}$ represents the transition probability of the nodes $i$, $j$ that satisfy $A_i(t), A_j(t) \in R(t)$ and $P^{V}_{ij}$ represents the transition probability of the nodes $i$, $j$ that satisfy $A_i(t), A_j(t) \in \tilde{V}(t)$. Therefore, the relation between $P^{R}_{ij}$ and $P^{V}_{ij}$ satisfies the following.

Lemma 2. Consider
Proof of Lemma 2. According to the definition of the RGG and (1), we can easily derive $P^{V}_{ij} \approx \dfrac{\Delta_i(t)}{d_i - \Delta_i(t)}\, P^{R}_{ij}$, where $\Delta_i(t)$ is the number of "Virtual Class" members in the set $N_i$ and $d_i$ is the maximum degree of node $i$.
Theorem 3. For the RGG, the conductance of the static algebraic gossip algorithm with multimessage dissemination is
Proof. According to the definition of conductance, we can have the $k$-conductance for any $k \in [1, n]$ as follows:
Here $\pi(S)$ is the probability density of $S$ under the stationary distribution $\pi$ and $Q(S, \bar{S})$ is the sum of $Q(v, u)$ over all $(v, u) \in S \times \bar{S}$.
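For intuition, the ratio $Q(S, \bar{S})/\pi(S)$ can be evaluated by brute force on a small graph. The sketch below assumes the uniform-neighbor random walk, whose stationary distribution on a connected undirected graph is proportional to the node degrees, and it assumes that the $k$-conductance is the minimum of this ratio over nonempty proper subsets with at most $k$ nodes; the exact definition used in the proof may differ. The function name `k_conductance` is hypothetical.

```python
from itertools import combinations

def k_conductance(neighbors, k):
    """Minimum of Q(S, S-bar) / pi(S) over nonempty proper subsets S
    with |S| <= k, for the uniform-neighbor random walk on a connected
    undirected graph (pi_v = d_v / sum of degrees)."""
    n = len(neighbors)
    total_degree = sum(len(nb) for nb in neighbors)
    best = float("inf")
    for size in range(1, min(k, n - 1) + 1):
        for subset in combinations(range(n), size):
            S = set(subset)
            pi_S = sum(len(neighbors[v]) for v in S) / total_degree
            # Q(v, u) = pi_v * P(v, u) = 1 / total_degree for each edge (v, u).
            cut_edges = sum(1 for v in S for u in neighbors[v] if u not in S)
            best = min(best, (cut_edges / total_degree) / pi_S)
    return best

ring = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
print(k_conductance(ring, 3))
```

The enumeration is exponential in $k$, so this is only useful as a sanity check on graphs with a handful of nodes.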
According to the cut size $\mathrm{Cut}_{\Phi}(S, \bar{S})$ between $S$ and $\bar{S}$ given in [19], (4) can be converted as follows:
From [20], we can obtain the wireless transmission radius in the two-dimensional RGG as $r(n) = O(\sqrt{\log n/n})$. Then, from (5) we can obtain the $k$-conductance as follows:
Therefore, the conductance is
Theorem 4. For any $\varepsilon > 0$, the convergence time of the static algebraic gossip algorithm with multimessage dissemination is
Proof. Assume that in the initial time slot $t = 0$, $\mathrm{rank}(A_i(t)) = 1$ for all $i \in V$, where $V$ is the set of all of the nodes, $T_k = \inf\{t : C(t) \ge k\}$, and $C(t)$ is the largest cardinality in $B(t)$.
Our goal is to make $\mathrm{rank}(A_i(T_n)) = n$ for all $i \in V$.
According to the definition of $T_k$, we divide the total convergence time into time periods as follows:
Let $I_{ij}$ be a characteristic function which takes the value of one if node $i$ transmits a message to node $j$; otherwise it is zero. When node $i$ exchanges messages with node $j$, we can obtain the inequality as follows:
Now consider the general time slot $t \in [T_k, T_{k+1})$; the accumulative rank $\Delta(t) = \sum_{i=1}^{n} (\mathrm{rank}(A_i(t)) - 1)$ should satisfy the following inequality:
By taking expectations on both sides of the above inequality and using (11), we have
For the convenience of our analysis, we use $Y(t) = \Delta(t+1) - \Delta(t) - (1 + 2/(n-2))\,\Phi_k$. Therefore, we quote the definition of $Z_k(t)$ in [20], $Z_k(t) = \sum_{i=T_k}^{t-1} Y(i)\,\mathbf{1}_{\{i < T_{k+1}\}}$; then we can easily get $Z_k(T_k) = 0$.
We unfold the above inequality to get the result as
According to (14) and Theorem 5 in [11], (12) can be derived as follows
By Markov's inequality, (15) implies that
Therefore we can conclude that the convergence time of the static algebraic gossip network model is expressed as
According to (6) and (7), we can conclude that the convergence time of the mobile algebraic gossip model is
Because the conductance is the same as in (6), we conclude that the convergence time of the static gossip algorithm is as follows.
Corollary 5. For any $\varepsilon > 0$, the convergence time of the static gossip network model is
The results are summarized in Tables 1 and 2 for clarity. From these tables, we observe that network coding directly affects the convergence time. Next, we will check and verify these results by simulation.
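To get a feel for the orders involved, the sketch below evaluates the expressions quoted in the Abstract and the Conclusion: $n^{3/2}\log\varepsilon^{-1}/\log^{1/2} n$ for static gossip and a gain of $n^{1/2}\log\varepsilon^{-1}/\log^{1/2} n$ with network coding. The hidden constants of the $O(\cdot)$ notation are set to 1, which is an assumption, so only the growth trend is meaningful; the function names are hypothetical.

```python
import math

def static_gossip_bound(n, eps):
    """n^(3/2) * log(1/eps) / sqrt(log n), with unit constant (assumption)."""
    return n ** 1.5 * math.log(1 / eps) / math.sqrt(math.log(n))

def coding_gain(n, eps):
    """n^(1/2) * log(1/eps) / sqrt(log n), with unit constant (assumption)."""
    return math.sqrt(n) * math.log(1 / eps) / math.sqrt(math.log(n))

def algebraic_gossip_bound(n, eps):
    """Static gossip bound minus the network coding gain."""
    return static_gossip_bound(n, eps) - coding_gain(n, eps)

eps = 0.1
for n in range(100, 201, 50):
    print(n,
          round(static_gossip_bound(n, eps), 1),
          round(algebraic_gossip_bound(n, eps), 1),
          round(coding_gain(n, eps), 2))
```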

Simulation
In this section, simulation results are presented to support our theoretical analysis. In our RGG simulation, the maximal communication range of each node is set to $O(\sqrt{\log n/n})$. Nodes that lie outside a node's communication radius cannot act as its message relay. The simulation network is implemented in a fully distributed way, which guarantees that nodes have no knowledge of each other's state (e.g., location, caller/called state, etc.). Therefore, both successful and unsuccessful communication attempts are counted in the convergence time. The simulation parameters are set as follows:
(i) simulation software: MATLAB R2010b,
(ii) simulation area: unit square,
(iii) number of nodes: 100-200 with step 10,
(iv) maximum number of rounds for each case: 2000,
(v) runs for each case: 100,
(vi) time unit: round number.
Here, the meanings of the abbreviations are as follows:
(i) SAG: static algebraic gossip,
(ii) SG: static gossip,
(iii) GCT: global convergence time, measured in rounds,
(iv) LCT: local convergence time, measured in rounds,
(v) W.H.P: with high probability.
To analyze the performance of the static algebraic gossip algorithm, we built a simulator to investigate the relation between convergence time and other typical parameters. The simulation results are as follows.
Firstly, we analyze the relation between the global convergence time and W.H.P $1 - \varepsilon$. From Figure 1, it can be observed that the global convergence time increases as the parameter $\varepsilon$ decreases. Moreover, the GCT of each algorithm grows rapidly as W.H.P $1 - \varepsilon$ increases from 0.1 to 1. When W.H.P $1 - \varepsilon$ is less than or equal to 0.7, the gap in global convergence time between the two algorithms in Figure 1 becomes smaller as W.H.P $1 - \varepsilon$ increases, which indicates that network coding has a great advantage in convergence time in this regime. However, when W.H.P $1 - \varepsilon$ is greater than 0.7, the gap remains relatively stable, meaning that network coding has little influence on the convergence time in this regime.
Thus, we choose $\varepsilon = 0.1$ for the convenience and precision of the following simulations.
Secondly, we analyze the relation between the global convergence time, the local convergence time, and the number of nodes in Figures 2 and 3.
As shown in Figures 2 and 3, both the global convergence time and the local convergence time increase with the number of nodes. As the number of nodes increases, the global and local convergence times of the static algebraic gossip algorithm remain less than those of the static gossip algorithm. This results from network coding, which reduces the GCT and LCT more noticeably as the number of nodes becomes larger. It can be observed from Figures 2 and 3 that network coding has a relatively larger benefit on the local convergence time than on the global convergence time.
Finally, we can draw some conclusions from the figures. In the early periods of communication, network coding significantly lowers the retransmission probability. This explains why the gap is relatively larger in the initial phase in Figure 1 and why network coding has a relatively larger impact on the local convergence time than on the global convergence time in Figures 2 and 3. However, in the subsequent periods of communication, new messages are increasingly difficult to transmit, so network coding helps less. This is illustrated by the constant gap in the later phase in Figure 1 and by the differences between the global convergence time and the local convergence time in Figures 2 and 3, respectively.

Conclusion and Future Work
In this paper, we studied the performance of multimessage dissemination based on the algebraic gossip algorithm in the RGG. We considered a network with $n$ nodes, each of which only has knowledge of its own content. In every time slot, each node communicates with a randomly chosen neighbor by the exchange algorithm. Our goal is to disseminate all of the messages rapidly among the nodes. We utilized the concept of conductance to analyze the convergence time bounds. The results showed that the bounds are significantly improved from $O(n^{3/2}\log\varepsilon^{-1}/\log^{1/2} n)$ to $O((n^{3/2}\log\varepsilon^{-1} - n^{1/2}\log\varepsilon^{-1})/\log^{1/2} n)$. Next, through the simulations, we verified that these bounds are valid for the random geometric graph and that network coding significantly improves the convergence time bounds of multimessage dissemination in the RGG.
However, there are still a number of open questions that require further research. One is to find a quantitative description of the local convergence time; deriving some related rules may be a feasible first step. The other important direction is to study the influence of node mobility. In a static network environment, the connectivity between nodes is fixed. This may lead to situations where some neighbors of a node have very low transition probabilities, resulting in a sharp increase in propagation time. Node mobility could bring many potential opportunities, such as improving network throughput and increasing network coverage. A growing body of research uses node mobility to address the problem of information dissemination. Therefore, we will analyze the case with node mobility in future work. The starting point is whether node mobility can change the connectivity between nodes over time and thereby affect the transition probability.

Communication Criteria.
First, consider the local convergence time $T^{\mathrm{EX,RLC}}_{\mathrm{local}}(n) = \min\{t : \dim(A_i(t)) = n, \exists i \in V\}$, representing the shortest time at which there is a node $i$ containing all of the messages in the set $M$. Second, consider the global convergence time [18] $T^{\mathrm{EX,RLC}}_{\mathrm{global}}(n, \varepsilon) = \inf\{t : \Pr(\bigcup_{i=1}^{n} \{\dim(A_i(t)) \ne n\}) < \varepsilon, \exists i \in V\}$, representing the time point at which all of the nodes receive all of the messages with high probability $1 - \varepsilon$, where $\varepsilon > 0$.
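Both criteria can be estimated from repeated simulation runs. The sketch below assumes that each run records, for every node, the first time slot at which that node held all $n$ messages, and it replaces the probability in the global criterion by the empirical fraction of runs in which some node is still incomplete. The function names and the toy data are hypothetical.

```python
def local_convergence_time(node_times):
    """First time at which some node holds all messages (one run)."""
    return min(node_times)

def global_convergence_time(runs, eps):
    """Smallest observed time t such that the fraction of runs with an
    incomplete node at time t is below eps (empirical estimate).
    `runs` must be a nonempty list of per-node completion-time lists."""
    finish = sorted(max(node_times) for node_times in runs)
    total = len(finish)
    for t in finish:
        unfinished = sum(1 for f in finish if f > t) / total
        if unfinished < eps:
            return t
    return finish[-1]

# Toy data: 4 runs, 3 nodes, completion time of each node in rounds.
runs = [[3, 5, 4], [2, 6, 6], [4, 4, 7], [5, 5, 5]]
print([local_convergence_time(r) for r in runs])
print(global_convergence_time(runs, 0.3))
```

Only the observed finishing times need to be tested as candidates, because the empirical fraction changes only at those values.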

Figure 3: The relation between LCT and the number of nodes.

Table 1: Comparison of algorithms in the RGG.

Table 2: Gossip in different graph models.