Network Coded Wireless Cooperative Multicast with Minimum Transmission Cost

We study multicasting over wireless lossy links. Instead of downloading all the data from the source node, we allow the destination nodes themselves to locally exchange the packets, as local communication within a cluster achieves higher packet reception probability with less transmission cost. However, when shall we stop the transmission from the source node? If the source stops too early, the destination nodes locally cannot reconstruct all the original packets, while if the source stops too late, the benefit of cooperative data exchange cannot be fully exploited. In this paper, we propose a network coded hybrid source and cooperative exchange scheme to determine when to stop the source sending and start the exchange process, so as to minimize the total transmission cost. For the case when the clusters are predefined, we derive the expected total transmission cost with our hybrid scheme. Our theoretical results show that under a special condition, the source node should keep sending the packets until all the destinations get the complete information. For the case when the clusters are not predefined, we propose a cluster division algorithm such that the destination nodes within each cluster can conduct data exchange locally with energy efficiency. Finally, simulation results demonstrate the effectiveness of the proposed scheme.


Introduction
Over the past decades, wireless sensor networks have attracted a great deal of research attention [1][2][3]. Once deployed, sensors are expected to operate for a long period of time, and it is impractical to reach these sensors physically. However, it is often necessary to update the software running on those sensors or to add new functionalities to them [4,5], which requires reliably multicasting large data objects with energy efficiency [5,6]. In particular, in lossy wireless sensor networks, multicasting packets from a single source node remains a challenging problem because the lossy links to the destination nodes are heterogeneous. To satisfy all the destination nodes, the source node needs to keep sending until the destination node with the worst packet reception link successfully receives all the packets, which is inefficient for the source node.
Recently, cooperative data exchange [7][8][9][10][11] has become a promising approach to efficient data communication. Instead of downloading all the packets from the source node (e.g., the server) [12,13], cooperative data exchange allows the destination nodes to exchange their received packets among themselves once they collectively hold all the packets. Compared with pure source-dominated multicast, cooperative data exchange has two main benefits: (a) short-range communication among destination nodes is often more reliable and incurs less transmission cost, and (b) the bandwidth saved at the source node can be used to serve other nodes in the system.
However, most existing works on the cooperative data exchange problem do not consider when to stop downloading data from the source node. On the one hand, if the source node stops too early, the destination nodes may be unable to collectively reconstruct the complete packets. On the other hand, if the source node stops too late, the benefit of cooperative data exchange cannot be fully utilized. In addition, although the average transmission cost of cooperative data exchange is lower than that of source transmission, the sum of the transmission costs within multiple clusters may exceed the cost of pure source sending. To sum up, for wireless multicasting, it is important to consider when to stop the source sending and start the cooperative data exchange, so as to reduce the total transmission cost or energy consumption [1-3, 14, 15].
So far, only the work in [16] studies hybrid source transmission and cooperative data exchange for wireless multicasting, and it stops the source sending exactly once the destination nodes within each cluster collectively hold the complete information. Although the scheme in [16] performs well in reducing the transmission delay, the cooperative data exchange in multiple clusters may still consume more transmission cost or energy than the source-dominated scheme. As discussed above, the cooperative data exchange needs to be conducted separately in each cluster, and thus the sum of the cluster transmission costs may exceed the transmission cost of pure source transmission. In addition, the predefined clusters in [16] might not perform well, as the destination nodes in some clusters may collect the complete information more quickly than the destinations in other clusters, so that the source transmission stops too late for the clusters that finished early.
Recent work also shows that network coding [17][18][19] can improve network throughput and reliability, especially in lossy wireless networks. Instead of sending or forwarding the original packets directly, network coding allows the source or transmitting node to linearly combine multiple packets. With this approach, each transmitted packet contributes almost equally to reconstructing the original packets, which improves wireless reliability [20]. Considering the benefits of network coding, we assume that the packets sent from the source node and the packets exchanged among the destination nodes are all linearly encoded at the transmitting node before sending.
In this paper, given the heterogeneous link loss probabilities and the transmission costs of the source transmission and of the data exchange among the destinations, we aim to design a hybrid source and cooperative data exchange scheme that minimizes the total transmission cost consumed during multicasting. For the case when the clusters are predefined, we determine when the source should stop sending the packets to the destination nodes and the cooperative data exchange should start. For the case when the clusters are not predefined, we consider how to divide the destinations into clusters. The main contributions of the paper can be summarized as follows.
(i) We theoretically derive the expected total transmission cost required with traditional source-dominated scheme and our hybrid transmission scheme.
(ii) Our analysis shows that under a special condition, the source node should keep sending the packets until all the destinations get the complete information, so as to reduce the total transmission cost.
(iii) We also propose an efficient algorithm to determine how to group the destination nodes into the clusters, so as to make sure that the destination nodes within each cluster can collectively recover all the original packets with energy efficiency.
(iv) We compare the performance of the proposed scheme with some existing schemes. Simulation results show that the proposed scheme can significantly reduce the total transmission cost.
The rest of the paper is organized as follows. In Section 2, we introduce the background and some related works. The system model and the problem description are presented in Section 3. The expected total transmission costs with the source-dominated scheme and our scheme are discussed in Section 4. In Section 5, we consider how to group the destinations into clusters after they have received a sufficient number of packets from the source. The simulation results are presented in Section 6. We conclude the paper in Section 7.

Background and Related Work
In this section, we provide a brief introduction to the existing wireless network coded multicast scheme and summarize some related works.

Wireless Network Coding.
Network coding was originally proposed in information theory [17] and has recently become a promising approach to improving network performance in terms of throughput [21,22], reliability [19,23], security [18,24], and so forth. Instead of forwarding the original packets directly, network coding allows the source node or an intermediate node to combine multiple packets before sending them out. The work in [25] shows that all the encoded packets generated by the same peer are linearly independent with high probability if linear network coding is used over a sufficiently large field.
Network coding in lossy wireless networks has also been considered in the literature. The work in [21] proposed the first wireless network coding architecture, named COPE. By exploiting the broadcast nature of the wireless medium, each node stores the overheard packets for a while. When a node transmits a packet, it uses its knowledge of what its neighbors have overheard to perform opportunistic coding [21,22]. After receiving an encoded packet, multiple neighbors can decode their wanted packets with the help of their overheard packets. In other words, the transmitting node can deliver "multiple packets" to different neighbors in a single transmission, which improves the throughput. The work in [19,23] shows theoretically that network coding significantly reduces the expected number of retransmissions in lossy networks compared with the traditional ARQ scheme. The work in [5] considers the impact of both unreliable wireless communication and sleep scheduling of sensor nodes and proposes a deterministic code design at the source node so as to complete the data dissemination process at the earliest time.
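The opportunistic coding idea described above can be illustrated with a minimal sketch. It assumes a hypothetical two-neighbor scenario in which each neighbor has overheard the packet wanted by the other; the packet contents and the helper name `xor_bytes` are illustrative only and are not taken from [21].

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical packets: neighbor A wants p2 and has overheard p1,
# neighbor B wants p1 and has overheard p2.
p1 = b"alpha"
p2 = b"bravo"

# The sender broadcasts a single coded packet instead of two packets.
coded = xor_bytes(p1, p2)

# Each neighbor removes the packet it already holds to get the one it wants.
decoded_at_a = xor_bytes(coded, p1)
decoded_at_b = xor_bytes(coded, p2)
```

One broadcast therefore serves both neighbors, which is the throughput benefit the paragraph refers to.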

Reliable Multicast in Lossy Wireless Networks.
Traditionally, wireless single-hop multicast mainly focuses on source-dominated transmission, where the source node keeps sending the packets until all the destination nodes in the system obtain the complete information. However, the performance of such a source-dominated transmission degrades significantly when the packet reception probabilities of the destination nodes are heterogeneous. In this case, the source node cannot stop sending the packets until the destination node with the worst link state successfully gets the packets, even if all the other destination nodes received the packets much earlier.
Recently, cooperative data exchange [7] among users (e.g., mobile users) has become one of the most promising approaches to designing efficient data transmissions. In cooperative data exchange, each client initially holds a subset of the packets and wants all the packets that the others have. In the literature, it is assumed that there is a common communication channel through which each user can send packets to, and receive packets from, all other users. The goal of cooperative data exchange is to make sure that each client can finally obtain the complete packets by exchanging packets with the others through the common channel. Compared with source-dominated transmissions, cooperative data exchange appears to have two benefits: (1) short-range communications among the users are often more reliable and faster, and (2) the bandwidth saved at the server can be used to serve other clients in the system.
Most recent works on cooperative data exchange mainly focus on how to minimize the total number of packets to be exchanged among the users [7][8][9], or the total transmission cost consumed at the users [10,11], such that all the users in the system can finally recover the complete packets. Current works also show that network coding can reduce the number of transmissions or the total transmission cost required for the cooperative data exchange process [7][8][9][10][11].
However, most works in the literature focus either on pure source-dominated multicast or on cooperative data exchange under the assumption that each user initially holds a subset of the packets. The only work that considers the hybrid architecture of source transmission and data exchange is [16]. Specifically, the source node is set to stop sending the packets once the destinations in each cluster can collectively reconstruct the original packets, followed by the cooperative data exchange within each cluster. The numerical results show that with this hybrid transmission, the transmission delay of the multicast can be reduced compared with pure source-dominated multicast. However, the source node may stop too early, as the cooperative data exchange within multiple clusters may in total consume more energy than continued source transmission, which is inefficient for wireless sensor networks.

System Model and Problem Description
In this section, we first introduce the system model of our problem. Then, we discuss the problem to be solved in two different cases: with predefined clusters and without predefined clusters.

System Model.
In this paper, we consider a multicast application, where a source node $s$ needs to send $n$ packets to $m$ destination nodes in $D = \{d_1, d_2, \ldots, d_m\}$. Before transmitting the packets, the source node may generate more than $n$ encoded packets from the original $n$ packets. It is typical to assume that with network coding, after receiving any $n$ encoded packets, a destination node can recover the original $n$ packets.
Suppose that the destination nodes are formed into multiple clusters in $C = \{C_1, C_2, \ldots\}$, where $C_k \in C$ denotes the set of destination nodes in cluster $k$. Without loss of generality, we assume that the clusters are pairwise disjoint, that is, $C_i \cap C_j = \emptyset$ for $i \ne j$. As in cooperative data exchange, we assume that local data exchange within each cluster is conducted through a common channel.
Our multicasting process consists of two transmission stages. In the first stage, the source node sends the packets to all the destination nodes in $D$. In the second stage, the destination nodes within the same cluster perform the cooperative data exchange among themselves. These two stages need to make sure that each destination can finally reconstruct the original $n$ packets. Due to unreliable wireless communications, suppose that the packet loss probability from the source node to every destination node is $l_s$, and the packet loss probability between the destination nodes in cluster $C_k$ is $l_k$. We also assume that the transmission cost of the source node is $t_s$, and the transmission cost for the data exchange within cluster $C_k$ is $t(C_k)$. Without loss of generality, we assume that the transmission cost of the source node is not lower than the transmission cost of the destination nodes, that is, $t_s \ge t(C_k)$ for all $k$, and that the reception probability of a packet sent from $s$ is not higher than the reception probability of a packet exchanged within the same cluster, that is, $l_s \ge l_k$.
In this paper, we aim to minimize the total transmission cost consumed during transmissions, while ensuring that all the destination nodes in $D$ can successfully reconstruct the original $n$ packets. Let $x_{hs}$ be the number of packets that the source node sends, and $x_k$ be the number of packets sent by the destination nodes within the cluster $C_k$. Thus, the total transmission cost can be written as
$$T = t_s x_{hs} + \sum_{C_k \in C} t(C_k)\, x_k, \tag{1}$$
which needs to be minimized. An example of the system model is shown in Figure 1. The source node $s$ needs to send three packets to the destinations in $D = \{d_1, d_2, \ldots, d_7\}$, and the original packets are encoded into three packets $p_1$, $p_2$, and $p_3$. The set of packets received at each destination after the source node sends $p_1$, $p_2$, $p_3$ is given near the node; for example, the set of packets received by node $d_4$ is $\{p_3\}$. In the next subsections, we will discuss the problems to be considered in this example.
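As a small illustration of the total transmission cost described above (source cost per packet times the number of source packets, plus the per-cluster exchange cost times the packets exchanged in each cluster), the following sketch evaluates it for given counts. The function name, argument names, and the numbers in the usage line are hypothetical.

```python
def total_transmission_cost(t_s, x_hs, cluster_costs, cluster_packets):
    """Return t_s * x_hs + sum over clusters of t(C_k) * x_k.

    cluster_costs[k] plays the role of t(C_k) and cluster_packets[k]
    the role of x_k for the k-th cluster.
    """
    if len(cluster_costs) != len(cluster_packets):
        raise ValueError("one cost and one packet count are needed per cluster")
    exchange_cost = sum(t * x for t, x in zip(cluster_costs, cluster_packets))
    return t_s * x_hs + exchange_cost


# Hypothetical numbers: the source sends 4 packets at cost 6 each, and two
# clusters exchange 3 packets and 1 packet at cost 2 each.
example_cost = total_transmission_cost(6.0, 4, [2.0, 2.0], [3, 1])
```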

Problem Description with Predefined Clusters.
In this section, we consider the case when the destination nodes in each cluster are predetermined.
Due to unreliable wireless communications, some packets may be lost at some destination nodes. One challenging problem is therefore to determine when the source node $s$ should stop sending the packets, such that the destination nodes within each cluster can exchange the packets among themselves and the total transmission cost defined in (1) is minimized.
(i) If the source node s stops too early, the destination nodes in some clusters may be unable to collectively recover all the original packets.
(ii) If the source node s stops too late, the low-cost transmission among the destination nodes themselves cannot be fully utilized, which may incur high transmission cost because of high packet loss probability and high transmission cost of the source transmission.
(iii) The sum of the transmission costs of the data exchange in all the clusters may exceed the transmission cost of pure source transmission. In this case, the source node should keep sending until all the destination nodes obtain the complete information.
As shown in Figure 1, if the source node $s$ stops after sending $p_1$, $p_2$, $p_3$, the destination nodes in the left cluster (within a circle) cannot collectively reconstruct the original three packets. On the contrary, if the source node $s$ sends too many packets, all the destination nodes within each cluster may obtain the complete packets from the source, which does not utilize the low-cost cooperative data exchange. However, if the sum of the transmission costs in these two clusters is higher than the transmission cost from the source node, it is better to let the source node keep sending until all the destination nodes get the complete information.
We refer to the above problem of determining when to stop the source transmission, as minimizing the multicast transmission cost problem with predefined clusters.

Problem Description with Nonpredefined Clusters.
In the above section, we assume that the destination nodes in each cluster are given. However, the predefined clusters may not perform well, as the destination nodes in some clusters may collectively hold the complete information much later than the destination nodes in other clusters, which delays the starting time of the cooperative data exchange in the other clusters.
Take Figure 1 again as an example, and suppose that we group the destinations in $\{d_1, d_2, \ldots, d_7\}$ into two clusters such that the destinations in the same circle belong to the same cluster. After the source node $s$ sends three packets $p_1$, $p_2$, $p_3$, the destination nodes in the left cluster cannot recover the original packets, as they have collectively received only two packets so far. However, if we move destination node $d_4$ into the left cluster, the destination nodes in both clusters can reconstruct all three original packets, and then $s$ can stop sending the packets.
Thus, after the source node sends a sufficient number of packets, we need to consider how to group the destination nodes into multiple clusters such that the destination nodes within each cluster can collectively reconstruct the original packets with energy efficiency. In this case, the number of clusters in $C$ and the set of destinations in each cluster $C_k$ need to be determined by our algorithm.

Minimum Multicast Transmission Cost with Predefined Clusters
In this section, we consider the case when the clusters are predefined. We first analyze the expected total transmission costs with the source-dominated scheme and with our hybrid transmission scheme. We then formulate the problem of minimizing the total transmission cost with our hybrid transmission scheme as an integer program.
To ease understanding, the main notations used in the paper are given in Table 1.

Transmission Cost with Source-Dominated Scheme.
With the source-dominated scheme, the source node $s$ keeps sending until all the destination nodes successfully decode the original $n$ packets. With linear network coding, after receiving any $n$ encoded packets from $s$, a destination node can decode the original $n$ packets. Suppose that $x_s$ is the expected number of packets that $s$ sends with the source-dominated scheme. To make sure that all the destinations can recover the complete packets, the number of packets that $s$ sends should be at least the maximum number of packets needed to satisfy the decoding requirement of each destination. That is,
$$x_s \ge \max_{d_i \in D} x_i, \tag{2}$$
where $x_i$ is the expected number of packets that $s$ must send to guarantee that destination $d_i$ can decode all the $n$ original packets. Since the packet loss probability from $s$ to each destination node is $l_s$, we can derive $x_i$ as follows:
$$x_i = \frac{n}{1 - l_s}. \tag{3}$$
In other words, we have
$$x_s = \frac{n}{1 - l_s}. \tag{4}$$
With the above equations, we can obtain the expected transmission cost of the source-dominated scheme, denoted as $T_s$, as follows:
$$T_s = t_s x_s = \frac{t_s\, n}{1 - l_s}. \tag{5}$$
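A minimal numerical sketch of the source-dominated analysis: each destination loses a packet with probability $l_s$, so on average $n/(1 - l_s)$ source packets are needed, and the expected cost is $t_s$ times that number. The function name and the example values are assumptions made for illustration.

```python
def source_dominated_cost(n, l_s, t_s):
    """Return (expected packets sent by s, expected transmission cost).

    n   : number of original packets
    l_s : packet loss probability from the source to every destination
    t_s : transmission cost of one source packet
    """
    if not 0.0 <= l_s < 1.0:
        raise ValueError("l_s must lie in [0, 1)")
    expected_packets = n / (1.0 - l_s)
    return expected_packets, t_s * expected_packets


# Example values chosen for illustration: n = 10, l_s = 0.5, t_s = 6.
packets, cost = source_dominated_cost(10, 0.5, 6.0)
```

With these values the source is expected to send 20 packets at a total cost of 120.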

Transmission Cost with Hybrid Transmission Scheme.
With the hybrid transmission scheme, there are two stages. In the first stage, the source node $s$ sends $x_{hs}$ encoded packets. In the second stage, the destination nodes in the same cluster cooperatively exchange the packets that they received before, until each of them can decode all the packets.
We now consider the first stage. For the second stage to succeed, the number of packets sent by the source $s$ should make sure that the destination nodes within each cluster can collectively reconstruct all the original packets. The probability that at least one of the destinations in cluster $C_k$ receives the current transmission is $1 - l_k^{|C_k|}$. Thus, to guarantee that at least $n$ transmitted packets are successfully received by the collection of the destination nodes in $C_k$, the expected number of packets sent by the source node $s$, denoted as $x_{hs}$, should satisfy the following condition:
$$x_{hs} \ge \frac{n}{1 - l_k^{|C_k|}}. \tag{6}$$
When considering all the clusters, (6) can be written as
$$x_{hs} \ge \max_{C_k \in C} \frac{n}{1 - l_k^{|C_k|}}. \tag{7}$$
Note that if we set $x_{hs}$ equal to the right-hand side of (7), the source node $s$ stops sending the packets exactly once the destination nodes in each cluster can collectively recover the complete packets.
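The first-stage requirement can be sketched numerically as follows. The code uses the per-cluster expression $n/(1 - l^{|C_k|})$ exactly as it appears in the text and takes the largest value over the clusters, so that every cluster is satisfied. Rounding up to an integer number of packets is an assumption of this sketch, and the function and argument names are hypothetical.

```python
import math


def min_source_packets(n, clusters):
    """Smallest integer x_hs meeting the per-cluster requirement.

    clusters is a list of (loss_probability, cluster_size) pairs, one
    pair per cluster C_k.
    """
    bounds = []
    for loss, size in clusters:
        if not 0.0 <= loss < 1.0 or size < 1:
            raise ValueError("need 0 <= loss < 1 and size >= 1")
        # Probability that at least one node of the cluster receives a packet.
        p_any = 1.0 - loss ** size
        bounds.append(n / p_any)
    return math.ceil(max(bounds))


# Hypothetical setting: n = 3 packets, two clusters of sizes 2 and 3,
# both with loss probability 0.5.
x_hs_min = min_source_packets(3, [(0.5, 2), (0.5, 3)])
```

Here the smaller cluster is the bottleneck: it needs 3 / 0.75 = 4 source packets on average, whereas the larger cluster needs fewer.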
We then consider the second stage. After the source node sends $x_{hs}$ packets, the expected number of packets still required by each destination in cluster $C_k$ is
$$n - x_{hs}(1 - l_s). \tag{8}$$
Then, the expected number of packets to be sent by the other destinations through cooperative data exchange such that one specific destination in cluster $C_k$ can decode, denoted as $M_k$, is
$$M_k = \frac{n - x_{hs}(1 - l_s)}{1 - l_k}. \tag{9}$$
According to the work in [16], the expected number of packets required to be exchanged within cluster $C_k$ such that all the destination nodes in $C_k$ can obtain their required $M_k$ packets is
$$x_k = M_k + W_{|C_k|}\, \sigma_{M_k}, \tag{10}$$
where $W_{|C_k|}$ is the expectation of the $|C_k|$-th order statistic of a sequence of $|C_k|$ normal random variables, and $\sigma^2_{M_k}$ (given in [16]) is the variance of $M_k$.
Thus, the total transmission cost with the hybrid transmission scheme, denoted as $T_h$, can be formulated as follows:
$$T_h = t_s x_{hs} + \sum_{C_k \in C} t(C_k)\, x_k. \tag{11}$$
If we assume that the packet loss probability within each cluster is the same, that is, $l_k = l_r$ for all $C_k \in C$, and the transmission cost within each cluster is also the same, that is, $t(C_k) = t_r$ for all $C_k \in C$, we can obtain the following theorem.
Theorem 1. If $|C|\, t_r (1 - l_s) \ge t_s (1 - l_r)$, then, to minimize the expected total transmission cost, it is better to let the source node keep sending the packets until all the destination nodes in $D$ successfully obtain the original $n$ packets, that is, $x_k = 0$.
Proof. Let $x_{hs}$ be the number of packets sent by the source node $s$ in a hybrid transmission scheme. Since we assume that $|C|\, t_r (1 - l_s) \ge t_s (1 - l_r)$, we have
$$\frac{|C|\, t_r}{1 - l_r} \ge \frac{t_s}{1 - l_s}. \tag{12}$$
Based on (12), we then compare the expected total transmission cost required by the source-dominated scheme with that required by the hybrid transmission scheme for any $x_k > 0$, which gives (13). From (13), we can see that the expected total transmission cost with the hybrid scheme is not lower than that with the source-dominated transmission scheme when $|C|\, t_r (1 - l_s) \ge t_s (1 - l_r)$. In other words, to reduce the total transmission cost, the source node should keep sending until all the destination nodes can successfully decode the complete information, that is, $x_k = 0$, which proves the theorem.
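The condition of the theorem is easy to check numerically. The sketch below evaluates $|C|\, t_r (1 - l_s) \ge t_s (1 - l_r)$ for the parameter values used later in the simulation section ($|C| = 2$, $l_s = 0.5$, $l_r = 0.2$, $t_s = 6$); the function name is hypothetical. Note that the theorem states a sufficient condition only, so a `False` result does not by itself mean that data exchange is cheaper.

```python
def source_only_is_optimal(num_clusters, t_r, l_r, t_s, l_s):
    """True when |C| * t_r * (1 - l_s) >= t_s * (1 - l_r) holds."""
    return num_clusters * t_r * (1.0 - l_s) >= t_s * (1.0 - l_r)


# With |C| = 2, l_s = 0.5, l_r = 0.2 and t_s = 6 the condition reduces to
# t_r >= 4.8, so it holds for t_r = 5 but not for t_r = 4.
holds_for_5 = source_only_is_optimal(2, 5.0, 0.2, 6.0, 0.5)
holds_for_4 = source_only_is_optimal(2, 4.0, 0.2, 6.0, 0.5)
```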

Hybrid Transmission Scheme with Minimum Total Transmission Cost.
As discussed above, in some cases, the sum of the transmission costs within all the clusters may be more than the transmission cost of the source transmission. Under such circumstances, the source node should keep sending the packets until all the destination nodes get the complete information, which is a special case of our hybrid scheme, that is, $x_{hs} = x_s$.
To determine $x_{hs}$, the problem of minimizing the total transmission cost with the hybrid transmission scheme can be formulated as follows:
$$\min_{x_{hs}} \; t_s x_{hs} + \sum_{C_k \in C} t(C_k)\, x_k \tag{14}$$
subject to (7), where the number of exchanged packets $x_k$ in each cluster depends on $x_{hs}$ through the second-stage analysis above.
With the above integer formulation, we can obtain the best time to stop the source transmission and start the cooperative data exchange process.
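One way to picture this optimization is an exhaustive search over integer values of $x_{hs}$. The sketch below is an illustration under stated assumptions, not the formulation solved in the paper: it assumes that the packets still missing at a destination after the first stage number $n - x_{hs}(1 - l_s)$ on average, that delivering them inside cluster $C_k$ takes that amount divided by $1 - l_k$, and that the order-statistic correction from [16] is supplied by a hypothetical callable `spread` (zero by default). The lower end of the search range, `x_min`, must be chosen so that the first-stage requirement is met.

```python
import math


def best_source_packets(n, l_s, t_s, clusters, x_min, spread=lambda x, k: 0.0):
    """Search integer x_hs in [x_min, ceil(n / (1 - l_s))] for the lowest cost.

    clusters : list of (t_k, l_k) pairs, one per cluster
    spread   : hypothetical stand-in for the variance-related term of [16]
    Returns (best x_hs, expected total cost at that x_hs).
    """
    x_max = math.ceil(n / (1.0 - l_s))
    best = None
    for x in range(x_min, x_max + 1):
        # Expected number of packets a destination still lacks after stage one.
        missing = max(0.0, n - x * (1.0 - l_s))
        cost = t_s * x
        for k, (t_k, l_k) in enumerate(clusters):
            cost += t_k * (missing / (1.0 - l_k) + spread(x, k))
        if best is None or cost < best[1]:
            best = (x, cost)
    return best


# Two clusters, n = 10, l_s = 0.5, t_s = 6, l_k = 0.2 (illustrative values).
cheap_exchange = best_source_packets(10, 0.5, 6.0, [(1.0, 0.2)] * 2, x_min=10)
costly_exchange = best_source_packets(10, 0.5, 6.0, [(5.0, 0.2)] * 2, x_min=10)
```

Under these assumptions, a low exchange cost makes the search stop the source as early as allowed, while a high exchange cost pushes $x_{hs}$ up to the source-dominated value of 20 packets.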

Cluster Determination in Cooperative Data Exchange Stage
As discussed in Section 3.3, after the source transmissions in the first stage, the destination nodes in some clusters may be unable to reconstruct the original packets if the clusters are predefined. In this section, we consider how to group the destination nodes into multiple clusters after the source node sends $x_{hs}$ packets in the first stage, where $x_{hs} \ge n$, so as to make sure that the destination nodes in each cluster can collectively recover all the original packets. Assume that after the source node sends $x_{hs}$ packets, the set of packets received by destination node $d_i$ is denoted as $H_i$; for example, $H_6 = \{p_2, p_3\}$ in Figure 1. Since the total number of packets sent by the source node $s$ in the first stage is $x_{hs}$, we have $H_i \subseteq \{p_1, p_2, \ldots, p_{x_{hs}}\}$. Before describing how to form the clusters, we first discuss how many transmissions are required by the data exchange process within a local cluster, given the exact packet reception states of the destination nodes.

Number of Transmissions Required for Data Exchange Process within a Local Cluster.
We now discuss the number of transmissions required for data exchange within a specific cluster $C_k$, given the exact packet reception sets of the destination nodes in $C_k$.
Without loss of generality, we assume that after the source $s$ sends $x_{hs}$ packets, all the destination nodes in cluster $C_k$ can collectively recover the original $n$ packets. As in the above section, we assume that the packets sent by the source node $s$ are linearly independent of each other, that is, after receiving any $n$ packets from $s$, a node can decode the original $n$ packets. In other words, after the first stage, the total number of packets collectively received by the destination nodes within each cluster $C_k$ should satisfy
$$\Bigl|\bigcup_{d_i \in C_k} H_i\Bigr| \ge n. \tag{15}$$
Before describing the number of transmissions for the data exchange process within cluster $C_k$, we first introduce a useful existing result from [26].
Theorem 2 (see [26]). Provided that the encoding field size is large enough, the minimum number of packets to be exchanged is given by (16) and (17) in terms of the rank of the packets held by each subset of nodes.
Note that in our problem, the packets sent by the source node $s$ are linearly independent of each other. Thus, for a given cluster $C_k$, the rank of the packets held by a subset of nodes equals the number of distinct packets that these nodes have collectively received. Given the packets received by the destination nodes in $C_k$, according to Theorem 2, we can then derive the number of packets to be exchanged among the destinations in $C_k$ as in (18), where $P$ is a disjoint partition of the nodes in $C_k$ and $S_j \in P$ is the $j$th subset of the nodes in the partition $P$.
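The partition-based count can be explored with a brute-force sketch. It assumes that the bound takes the common partition form from the cooperative data exchange literature: for every partition $P$ of the cluster with at least two subsets, sum the number of packets missing from each subset $S_j$ (that is, $n - |\bigcup_{d_i \in S_j} H_i|$), divide by $|P| - 1$, round up, and take the maximum over all partitions. This form is an assumption of the sketch and is not transcribed from [26]; the function names are hypothetical. The enumeration grows very quickly with the cluster size, which is consistent with the complexity concern raised for this calculation.

```python
import math


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def exchange_bound(n, received):
    """Assumed partition-form bound on packets exchanged within a cluster.

    received maps each node of the cluster to the set of packets H_i it holds.
    """
    best = 0
    for partition in set_partitions(list(received)):
        if len(partition) < 2:
            continue
        missing = sum(
            n - len(set().union(*(received[d] for d in block)))
            for block in partition
        )
        best = max(best, math.ceil(missing / (len(partition) - 1)))
    return best


# Hypothetical reception states for n = 3 packets.
two_nodes = exchange_bound(3, {"d1": {"p1", "p2"}, "d2": {"p2", "p3"}})
three_nodes = exchange_bound(3, {"d1": {"p1"}, "d2": {"p2"}, "d3": {"p3"}})
```

In the two-node case each node must send one packet, and in the three-node case each node must send its only packet, which matches the values the sketch returns.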

Optimal Cluster Division with Minimum Transmission Cost in the Data Exchange Process.
We now consider how to divide all the destination nodes into multiple clusters, so as to minimize the total transmission cost in the second stage.
For $m$ destination nodes in $D$, the maximum number of clusters that can be formed is $m$. The problem of minimizing the total transmission cost of the data exchange within the clusters by cluster division can be formulated as follows:
$$\min_{C} \sum_{C_k \in C} t(C_k)\, x_k \tag{19}$$
subject to
$$\bigcup_{C_k \in C} C_k = D, \tag{20}$$
$$C_i \cap C_j = \emptyset, \quad \forall C_i, C_j \in C,\ i \ne j, \tag{21}$$
$$\Bigl|\bigcup_{d_i \in C_k} H_i\Bigr| \ge n, \quad \forall C_k \in C. \tag{22}$$
In (19), $t(C_k)$ is the transmission cost if the destination nodes in $C_k$ form a cluster, and $x_k$ is the number of packets to be sent by the data exchange within cluster $C_k$, which can be calculated with (18). Thus, the objective is to minimize the total transmission cost of the data exchange within all the clusters by determining each $C_k$. The constraints in (20) and (21) state that each destination node should be located in one and only one cluster. The constraint in (22) states that destination nodes can form a cluster if and only if they can collectively recover all the $n$ original packets.
Although the above formulation yields the optimal cluster division that minimizes the total transmission cost in the data exchange process, the complexity of calculating $x_k$ is too high, as we need to enumerate all the possible partitions $\{P\}$ of the nodes in $C_k$. This makes the formulation difficult to solve. Thus, in the next subsection, we propose a suboptimal algorithm to divide the clusters.

Algorithm Design.
In this section, given the packet reception state of each destination node, we consider how to group all the destination nodes into multiple clusters.
Without loss of generality, we suppose that after $x_{hs}$ transmissions from source node $s$, all the destinations can collectively recover the original $n$ packets. Otherwise, the source node $s$ should send more packets before starting the second stage (i.e., cooperative data exchange). Let $U$ be the set of all the destinations left in the system; initially, $U = D$.
We then introduce how to select destinations from $U$ for the $k$th cluster $C_k$. When adding a new destination $d_i$ from $U$ into $C_k$, we apply the following rules.
(1) The destination $d_i$ should be able to contribute at least one "innovative" packet to the destination nodes that have been added to $C_k$ so far.
(2) Deleting $d_i$ from $U$ will not sacrifice the decoding capability of the destinations left in $U$, that is, $\bigl|\bigcup_{d_j \in U \setminus \{d_i\}} H_j\bigr| \ge n$.
(3) The sum of the transmission costs of the new cluster $C_k$ (including $d_i$) and the cluster formed by the remaining destinations in $U$ (excluding $d_i$) is smaller than the transmission cost of the cluster formed by $U \cup C_k$.
We then describe the process of adding new destinations to C k as follows.
(i) We start from the destination node $d_i \in U$ that satisfies the above three rules and incurs the least transmission cost if it is added into $C_k$. If no such node exists in $U$, the destinations in $U$ can only form one cluster, that is, $C_k = U$. In this case, the algorithm terminates.
(ii) If we can find a feasible destination $d_i$, we add $d_i$ into $C_k$ and correspondingly delete $d_i$ from $U$. We then check whether the destinations in $C_k$ can collectively recover all the original packets. If they can, we finish the cluster $C_k$ and repeat the above process to find the $(k+1)$th cluster $C_{k+1}$ from $U$.
(iii) If they still cannot recover all the original packets, we continue the above two steps.
(iv) The algorithm continues until we cannot find a feasible node $d_i \in U$ that satisfies rules (1), (2), and (3). In this case, the destination nodes left in $U$ should all be added to the cluster $C_k$ currently under construction.
The detailed process of the above algorithm is shown in Algorithm 1. With the above algorithm, we can make sure that the destination nodes within each cluster $C_k \in C$ can collectively reconstruct all the original packets.
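The steps and rules above can be sketched in code as follows. This is an illustrative reading of the procedure, not a transcription of Algorithm 1: the cost model is passed in as a hypothetical callable `cost`, ties are broken by the order in which the nodes are listed, and the example places four hypothetical nodes on a line with a cluster cost equal to the squared distance between its two farthest nodes.

```python
def held(nodes, received):
    """Union of the packet sets held by the given nodes."""
    packets = set()
    for d in nodes:
        packets |= received[d]
    return packets


def greedy_clusters(received, n, cost):
    """Group nodes into clusters that can each recover all n packets.

    received : dict mapping node -> set of packets H_i
    cost     : callable giving the transmission cost of a list of nodes
    Assumes that all nodes together hold at least n distinct packets.
    """
    remaining = list(received)  # the set U
    clusters = []
    while remaining:
        cluster = []  # the cluster C_k under construction
        while True:
            candidates = []
            for d in remaining:
                rest = [u for u in remaining if u != d]
                # Rule (1): d brings at least one packet new to the cluster.
                innovative = bool(received[d] - held(cluster, received))
                # Rule (2): the nodes left in U can still recover everything.
                rest_decodes = len(held(rest, received)) >= n
                # Rule (3): splitting is cheaper than one cluster on U and C_k.
                cheaper = rest_decodes and (
                    cost(cluster + [d]) + cost(rest) < cost(cluster + remaining)
                )
                if innovative and rest_decodes and cheaper:
                    candidates.append(d)
            if not candidates:
                # No feasible node: everything left joins the current cluster.
                cluster.extend(remaining)
                remaining = []
                break
            best = min(candidates, key=lambda d: cost(cluster + [d]))
            cluster.append(best)
            remaining.remove(best)
            if len(held(cluster, received)) >= n:
                break
        clusters.append(cluster)
    return clusters


# Hypothetical instance: four nodes on a line, two packets to recover.
positions = {"a": 0.0, "b": 1.0, "c": 10.0, "d": 11.0}


def squared_span(nodes):
    """Squared distance between the two farthest nodes (0 for fewer than two)."""
    if len(nodes) < 2:
        return 0.0
    xs = [positions[d] for d in nodes]
    return (max(xs) - min(xs)) ** 2


example_received = {"a": {"p1"}, "b": {"p2"}, "c": {"p1"}, "d": {"p2"}}
example_clusters = greedy_clusters(example_received, 2, squared_span)
```

In this instance the two nearby pairs end up in separate clusters, each holding both packets, instead of one expensive cluster spanning all four nodes.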

Illustration Example.
We take Figure 2 as an example to illustrate how Algorithm 1 works, where five destination nodes in {d 1 , d 2 , . . ., d 5 } need to get three original packets.Assume that the first stage stops after the source node s sends three packets {p 1 , p 2 , p 3 }.The set of the packets received by each destination in the first stage is given near the node.To simplify the understanding, we define the transmission cost of the transmission within a cluster C k as the square of the maximum distance between every two destinations within the same cluster [25,27], that is, where |d i − d i | denotes the distance between two nodes d i and d i (given on the edge between two nodes).Note that the above model is only an example, and our algorithm does not restrict the model of the transmission cost.
According to the Algorithm 1, initially we have U = {d 1 , d 2 , . . ., d 5 }.We can easily check that if all the nodes in U form only one cluster, its transmission cost is 4 2 = 16.
We first construct the cluster C 1 , by starting from any node d 1 .Since d 1 satisfies all the three rules defined in the above section, we add d 1 into C 1 and delete it from U, that is, C 1 = {d 1 }, U = {d 2 , d 3 , . . ., d 5 }.Since the nodes in C 1 cannot reconstruct the original three packets, we need to add more nodes to C 1 .Note that although adding d 3 incurs the least transmission cost of C 1 , d 3 violates the rule (1), as the packet it has is not innovative to the nodes in C 1 .We then find that d 2 can be added to C 1 and it incurs the least transmission cost among the nodes left in U \ {d 3 }, that is, With a similar approach, d 4 will be added to C 1 , that is, Since the destinations in C 1 now can collectively reconstruct the original three packets, the construction of cluster C 1 terminates, and the transmission cost of C 1 is t(C 1 ) = 3 2 = 9.
We then construct the cluster $C_2$. We can easily check that no node in $U = \{d_3, d_5\}$ satisfies rule (2). Thus, the destination nodes in $U$ cannot be divided into multiple clusters. In other words, all the destinations in $U$ are added into $C_2$, that is, $C_2 = \{d_3, d_5\}$, and the transmission cost of $C_2$ is $t(C_2) = 2.5^2 = 6.25$.
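The walk-through above can be sketched in Python. This is a minimal illustration, not a reproduction of Algorithm 1: the distances and received-packet sets below are hypothetical values chosen only to be consistent with the costs quoted in the example (the labels of Figure 2 are not available here), received packets are treated as plain sets rather than coded combinations, rule (1) is read as "the node holds a packet that is innovative to the cluster", rule (2) is read as "the nodes remaining in $U$ can still reconstruct all packets", and rule (3) is not modeled. Among the feasible nodes, the sketch picks the one giving the least cluster cost. All function names are illustrative.

```python
from itertools import combinations

def cluster_cost(nodes, dist):
    """Squared maximum pairwise distance in the cluster (0 for fewer than two nodes)."""
    return max((dist[frozenset(pair)] for pair in combinations(nodes, 2)), default=0) ** 2

def packets_held(nodes, have):
    """Union of the packets held by the given nodes."""
    held = set()
    for d in nodes:
        held |= have[d]
    return held

def divide_clusters(nodes, have, dist, all_packets):
    """Greedy cluster division under the assumed readings of rules (1) and (2)."""
    unassigned = list(nodes)
    clusters = []
    while unassigned:
        cluster = []
        while unassigned and packets_held(cluster, have) != all_packets:
            held = packets_held(cluster, have)
            feasible = [
                d for d in unassigned
                # Assumed rule (1): d holds a packet the cluster still lacks.
                if not have[d] <= held
                # Assumed rule (2): the remaining nodes can still reconstruct everything.
                and packets_held([u for u in unassigned if u != d], have) == all_packets
            ]
            if not feasible:
                # No feasible node: all remaining nodes join the current cluster.
                cluster.extend(unassigned)
                unassigned = []
                break
            best = min(feasible, key=lambda d: cluster_cost(cluster + [d], dist))
            cluster.append(best)
            unassigned.remove(best)
        clusters.append(cluster)
    return clusters

# Hypothetical inputs (NOT taken from Figure 2).
have = {"d1": {"p1"}, "d2": {"p2"}, "d3": {"p1"}, "d4": {"p3"}, "d5": {"p2", "p3"}}
edges = {
    ("d1", "d2"): 2, ("d1", "d3"): 1, ("d1", "d4"): 3, ("d1", "d5"): 4,
    ("d2", "d3"): 2, ("d2", "d4"): 2.5, ("d2", "d5"): 3.5,
    ("d3", "d4"): 3, ("d3", "d5"): 2.5, ("d4", "d5"): 3,
}
dist = {frozenset(pair): length for pair, length in edges.items()}
clusters = divide_clusters(["d1", "d2", "d3", "d4", "d5"], have, dist, {"p1", "p2", "p3"})
print(clusters, [cluster_cost(c, dist) for c in clusters])
```

With these hypothetical inputs, the sketch groups $d_1, d_2, d_4$ into one cluster (cost 9) and $d_3, d_5$ into the other (cost 6.25), which agrees with the example. A different distance table or a different reading of the rules may of course give a different division.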

Simulation Results
In the simulation, we study a connected network graph, where nodes are randomly deployed in a two-dimensional (2D) space. We use $l_s$ to denote the packet loss probability on the links from the source node $s$ to the destination nodes, and $l_k$ to denote the packet loss probability of the data exchange within cluster $C_k$. For the transmission cost, we use $t_s$ and $t(C_k)$ to denote the transmission cost of a packet sent from the source node and of a packet exchanged within cluster $C_k$, respectively. Generally, we set $l_s \geq l_k$ and $t_s \geq t(C_k)$.
To demonstrate the performance of our proposed scheme, we also evaluate two baseline schemes: the source-dominated scheme and the hybrid scheme in [16]. As introduced before, in the source-dominated scheme, the source node keeps sending until all the destination nodes get the complete packets. The difference between the hybrid scheme proposed in [16] and ours is that the scheme in [16] stops the source sending as soon as the destination nodes in each cluster can collectively reconstruct the full packets.
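The source-dominated baseline can be illustrated with a small Monte Carlo sketch. It assumes independent packet losses with the same probability on every source-to-destination link and, as an idealization of network coding over a large field, that every received coded packet is innovative, so a destination is done once it has received $n$ packets. The function name and the parameter values are hypothetical and are not the simulation setup of this paper.

```python
import random

def source_dominated_transmissions(n, num_dest, loss, rng):
    """Number of source transmissions until every destination has received n packets."""
    received = [0] * num_dest
    sent = 0
    while min(received) < n:
        sent += 1
        for i in range(num_dest):
            # Each destination independently loses the packet with probability `loss`.
            if rng.random() >= loss:
                received[i] += 1
    return sent

rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [source_dominated_transmissions(20, 5, 0.5, rng) for _ in range(200)]
t_s = 6  # cost per source packet, as in the settings of this section
print("average source-dominated cost:", t_s * sum(runs) / len(runs))
```

Because the source must wait for the slowest destination, the average number of transmissions exceeds the $n / (1 - l_s)$ that a single destination would need on average.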
We define the cost gain, denoted as $\delta$, as the ratio of the difference between the total transmission cost of the source-dominated scheme and that of a hybrid scheme (our hybrid scheme or the scheme in [16]) to the total transmission cost of the source-dominated scheme. For example, the cost gain of our hybrid scheme can be written as

$$\delta = \frac{\text{total cost (source-dominated)} - \text{total cost (our hybrid scheme)}}{\text{total cost (source-dominated)}}.$$

The Impact of the Transmission Cost.
In this section, we conduct the simulation to investigate the impact of the transmission cost on the cost gain. In this setting, we set $|C| = 2$, $l_s = 0.5$, $l_k = 0.2$, $t_s = 6$, $|C_k| = 5$ and vary the transmission cost of the data exchange within each cluster in $[1, 6]$. As shown in Figure 3, the cost gain with our hybrid scheme is much better than with the scheme proposed in [16]. We can observe that in some cases, for example, $t(C_k) \geq 3$, the total transmission cost with the scheme in [16] is even higher than with the source-dominated scheme. This is because, when the transmission cost of the packet exchange within a cluster is high, the sum of the transmission costs in multiple clusters exceeds the cost consumed by pure source sending. In other words, the hybrid scheme in [16] stops the source transmission too early. From the figure, we can also find that when $t(C_k)$ is no more than 3, the transmission cost with our hybrid scheme is lower than with the source-dominated scheme. However, when $t(C_k) \geq 4$, it is the same as with the source-dominated scheme, which verifies the results of Theorem 1. In this case, the programming used in our hybrid scheme can detect this condition and keep the source sending until all the nodes receive the complete packets.
From Figure 3, we can also observe that with the increase of the transmission cost within a cluster, the cost gain decreases. This is reasonable, as when the transmission cost within a cluster increases, the sum of the total transmission costs of multiple clusters increases quickly and correspondingly reduces the gain.
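The cost gain used throughout this section is simple to compute once the two total costs are known. The sketch below uses made-up cost values purely to show the sign convention: a positive gain means the hybrid scheme is cheaper than the source-dominated scheme, zero means they coincide, and a negative gain corresponds to the case discussed above in which a hybrid scheme stops the source too early. The function name is hypothetical.

```python
def cost_gain(source_dominated_cost, hybrid_cost):
    """Relative saving of a hybrid scheme with respect to the source-dominated scheme."""
    if source_dominated_cost <= 0:
        raise ValueError("source-dominated cost must be positive")
    return (source_dominated_cost - hybrid_cost) / source_dominated_cost

# Made-up totals, for illustration only.
print(cost_gain(100.0, 80.0))   # hybrid is cheaper: positive gain
print(cost_gain(100.0, 100.0))  # hybrid falls back to pure source sending: zero gain
print(cost_gain(100.0, 120.0))  # hybrid stops the source too early: negative gain
```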

The Impact of Packet Loss Probability.
We now investigate the impact of the packet loss probability on the total transmission cost. We fix $n = 100$, $l_k = 0.2$, $|C| = 2$, $t_s = 6$, $t(C_k) = 3$ and vary the packet loss probability on the link from the source to the destination in $[0.4, 0.9]$.
As shown in Figure 4, the cost gain with our hybrid scheme is better than with the scheme in [16], and our hybrid scheme always consumes less transmission cost than the source-dominated scheme. From the figure, we observe that as the packet loss probability from the source to the destination nodes increases, the cost gains of both hybrid schemes increase. This is because, when the links from the source to the destination nodes are very lossy, the source node should stop sending packets as early as possible.
In addition, when the packet loss probability $l_s \geq 0.8$, the cost gains with both hybrid schemes are almost the same, as our hybrid scheme may stop the source transmission at almost the same time as in [16], that is, after the destination nodes in each cluster can reconstruct the complete packets.

The Impact of the Number of Clusters.
We now conduct the simulation to investigate the impact of the number of clusters on the total transmission cost of the proposed schemes. In this setting, we fix $n = 100$, $l_s = 0.6$, $l_k = 0.2$, and vary the number of clusters. As shown in Figure 5, when the number of clusters is small, the total transmission costs with both hybrid schemes are lower than with the source-dominated scheme. This is because, when the number of clusters is small, the sum of their transmission costs does not exceed the source transmission cost. However, when the number of clusters increases, for example, to more than 5, pure source transmission is better, since the sum of the transmission costs in the clusters is more than the transmission cost from the source. Nevertheless, we can observe that the transmission cost with our scheme is never higher than with the source-dominated scheme, as our hybrid scheme always chooses the best time to stop the source transmission.

The Impact of the Number of Receiver Nodes within Clusters.
Finally, we investigate the impact of the number of destination nodes within each cluster on the performance of our transmission scheme.
As shown in Figure 6, we fix $n = 100$, $t_s = 6$, $t(C_k) = 2$, $l_k = 0.2$ and vary the number of destination nodes within each cluster in $[2, 10]$. From the figure, we can see that as the number of destination nodes within each cluster increases, the cost gain with our hybrid scheme increases. This is because, with more destination nodes in a cluster, the destinations in the cluster can collectively reconstruct the original $n$ packets much earlier. Thus, the second stage of our hybrid scheme, the data exchange process, can start earlier, which decreases the number of high-cost transmissions from the source node $s$.
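A rough calculation shows why larger clusters allow the exchange stage to start earlier. It is a sketch under idealized assumptions that are not part of the analysis in this paper: losses are independent across destinations with the same probability $l_s$, and a coded packet is useful to the cluster as soon as at least one member receives it (large coding field). Under these assumptions a source packet reaches a cluster of $m$ destinations with probability $1 - l_s^m$, so the expected number of source transmissions before the cluster can collectively reconstruct $n$ packets is $n / (1 - l_s^m)$. The function name is hypothetical.

```python
def expected_source_packets(n, loss, cluster_size):
    """Expected source transmissions until a cluster collectively holds n useful packets.

    Assumes independent losses and that any packet received by at least one
    member is innovative for the cluster; requires loss < 1.
    """
    reach_prob = 1.0 - loss ** cluster_size  # at least one member receives the packet
    return n / reach_prob

for m in (2, 5, 10):
    print(m, expected_source_packets(100, 0.6, m), expected_source_packets(100, 0.8, m))
```

The value decreases toward $n$ as the cluster grows, which is in line with the trend described above. It only bounds the earliest moment at which the exchange stage becomes possible and says nothing about when stopping the source is actually cost-optimal.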
From Figure 6, we also observe that the cost gain with the setting $l_s = 0.8$ is much higher than with the setting $l_s = 0.6$. The reason is that with the source-dominated scheme, a large number of transmissions are wasted when the links from $s$ to the destination nodes are poor. In this case, the cooperative data exchange process in our hybrid scheme should start as early as possible, as it consumes less transmission cost. In other words, our hybrid scheme performs especially well when the packet reception from the source node is poor.
By comparing Figures 6(a) and 6(b), we can also observe that when the number of clusters in the network increases, the cost gain of our hybrid scheme decreases. As each cluster incurs its own transmission cost in the cooperative data exchange process, the sum of the transmission costs of the data exchange over all the clusters increases, which decreases the cost gain.

Conclusion
In this paper, we proposed a hybrid source and cooperative data exchange transmission scheme for reliable multicasting over wireless lossy links. Our hybrid scheme determines when to stop the source sending and start the cooperative data exchange, so as to minimize the total transmission cost. We theoretically derived the total transmission cost required with the traditional source-dominated scheme and with our hybrid scheme. We gave a condition under which the source node should not stop sending the packets until all the destination nodes successfully get the complete information, so as to reduce the total transmission cost. For the case when the clusters are not predefined, we proposed an efficient algorithm to divide the destination nodes into multiple clusters, such that the destination nodes within each cluster can conduct the cooperative data exchange separately with energy efficiency. Finally, simulation results demonstrated the effectiveness of the proposed scheme in reducing the total transmission cost.

Figure 1: An example of the system model after sending three packets in $\{p_1, p_2, p_3\}$ from source node $s$ to receiver nodes in $D = \{d_1, d_2, \ldots, d_7\}$.

Figure 2: An example of five destination nodes.

Figure 3: The impact of transmission cost of each cluster on the transmission cost gain ratio.

Figure 4: The impact of packet loss probability on the transmission cost gain ratio.

Figure 5: The impact of the number of clusters on the transmission cost gain ratio.

[Figure 6, panels (a) and (b). x-axis: the number of destinations within a cluster; y-axis: the cost gain by our scheme; curves: cost gain with $l_s = 0.8$ and cost gain with $l_s = 0.6$.]

Figure 6: The impact of the number of destinations within the cluster on cost gain by our scheme.

Table 1: Main notations and their descriptions.
Source-Dominated Transmission. With source-dominated scheme, the source node $s$ keeps sending the encoded packets that are generated based on $n$ original packets, until all the destination nodes in $D$

begin
// after sending $x_{hs}$ packets from the source node $s$
if the destinations in $D$ cannot recover the original packets then
The source $s$ should send more packets before the

With the above operation, the destination nodes in $D$ form two clusters, $C_1 = \{d_1, d_2, d_4\}$, $C_2 = \{d_3, d_5\}$. Note that the algorithm can start from any node; for example, if starting from node $d_2$, we can also get two clusters $\{d_2, d_4, d_1\}$, $\{d_3, d_5\}$.