Applying physical layer network coding in wireless networks

A main distinguishing feature of a wireless network compared with a wired network is its broadcast nature, in which the signal transmitted by a node may reach several other nodes, and a node may receive signals from several other nodes, simultaneously. Rather than a blessing, this feature is treated more as an interference-inducing nuisance in most wireless networks today (e.g., IEEE 802.11). This paper shows that the concept of network coding can be applied at the physical layer to turn the broadcast property into a capacity-boosting advantage in wireless ad hoc networks. Specifically, we propose a physical-layer network coding (PNC) scheme to coordinate transmissions among nodes. In contrast to "straightforward" network coding, which performs coding arithmetic on digital bit streams after they have been received, PNC makes use of the additive nature of simultaneously arriving electromagnetic (EM) waves for an equivalent coding operation. In doing so, PNC can potentially achieve 100% and 50% throughput increases compared with traditional transmission and straightforward network coding, respectively, in 1-D regular linear networks with multiple random flows. The throughput improvements are even larger in 2-D regular networks: 200% and 100%, respectively.


I. INTRODUCTION
One of the biggest challenges in wireless communication is how to deal with the interference at the receiver when signals from multiple sources arrive simultaneously. In the radio channel of the physical layer of wireless networks, data are transmitted through electromagnetic (EM) waves in a broadcast manner. The interference between these EM waves causes the data to be scrambled.
To overcome its negative impact, most schemes attempt to find ways to either reduce or avoid interference through receiver design or transmission scheduling [1]. For example, in 802.11 networks, the carrier-sensing mechanism allows at most one source to transmit or receive at any time within a carrier-sensing range. This is obviously inefficient when multiple nodes have data to transmit.
While interference causes throughput degradation on …
In the second category, PNC and channel coding are studied jointly. In [14][15][16], PNC was combined with lattice codes or LDPC codes, and it was proved that the capacity of the two-way relay channel can be approached at high SNR and at low SNR. In [14][15][16], channel coding and PNC mapping are performed independently (i.e., successively). In [17], we proposed a novel scheme that treats channel coding and PNC in an integrated manner, and showed that joint channel-PNC decoding can significantly outperform the previous schemes.
In the third category, the focus is on the performance impact and significance of PNC in large-scale wireless networks. For one-dimensional wireless networks, [18] showed that PNC can improve the capacity by a fixed factor, although it does not change the scaling law. For two-dimensional wireless networks, [19] showed that PNC can increase capacity by a factor of 2.5 for rectangular networks and a factor of 2 for hexagonal networks. However, the result in [18] is based on a rough scheduling scheme built on traditional network coding rather than physical-layer network coding (the special properties of PNC are ignored). Our paper also discusses the application of PNC in large-scale wireless networks. It differs from [18] in that we construct an explicit PNC scheduling algorithm (specially designed for PNC), upon which all our results are established. Compared with [19], we consider the many-to-many scenario with multiple sources and destinations, while [19] considered only the one-to-many scenario with one source.
The rest of this paper is organized as follows. Section II gives an overview of the basic idea of PNC using a linear 3-node multi-hop network. Section III and Section IV investigate the application of PNC in the 1-D regular linear network and the 2-D regular grid network, respectively. Section V concludes the paper.

II. ILLUSTRATING EXAMPLE: A THREE-NODE WIRELESS LINEAR NETWORK
Consider the three-node linear network in Fig. 1. $N_1$ (Node 1) and $N_3$ (Node 3) are nodes that exchange information, but they are out of each other's transmission range. $N_2$ (Node 2) is the relay node between them. This three-node wireless network is a basic unit for cooperative transmission and has previously been investigated extensively [20][21][22][23][24][25]. In cooperative transmission, the relay node $N_2$ can choose different transmission strategies, such as Amplify-and-Forward or Decode-and-Forward [22], depending on the signal-to-noise ratio (SNR). This paper focuses on the Decode-and-Forward strategy. We consider frame-based communication in which a time slot is defined as the time required for the transmission of one fixed-size frame. Each node is equipped with an omni-directional antenna, and the channel is half duplex, so that transmission and reception at a particular node must occur in different time slots. Slow fading is assumed throughout this paper for ease of synchronization.
Before introducing the PNC transmission scheme, we first describe the traditional transmission scheduling scheme and the "straightforward" network-coding scheme for mutual exchange of a frame in the three-node network [20,25].

A. Traditional Transmission Scheduling Scheme
In traditional networks, interference is usually avoided by prohibiting the overlapping of signals from $N_1$ and $N_3$ to $N_2$ in the same time slot. A possible transmission schedule is given in Fig. 2. Let $S_i$ denote the frame initiated by $N_i$. $N_1$ first sends $S_1$ to $N_2$, and then $N_2$ relays $S_1$ to $N_3$. After that, $N_3$ sends $S_3$ in the reverse direction. A total of four time slots are needed for the exchange of two frames in opposite directions.

B. Straightforward Network Coding Scheme
Refs. [20] and [25] outline the straightforward way of applying network coding in the three-node wireless network. Fig. 3 illustrates the idea. First, $N_1$ sends $S_1$ to $N_2$, and then $N_3$ sends frame $S_3$ to $N_2$. After receiving $S_1$ and $S_3$, $N_2$ encodes frame $S_2$ as follows:
$$S_2 = S_1 \oplus S_3$$
where $\oplus$ denotes the bitwise exclusive-OR operation applied over the entire frames $S_1$ and $S_3$. $N_2$ then broadcasts $S_2$ to both $N_1$ and $N_3$. When $N_1$ receives $S_2$, it extracts $S_3$ from $S_2$ using the local information $S_1$, as follows:
$$S_1 \oplus S_2 = S_1 \oplus (S_1 \oplus S_3) = S_3$$
Similarly, $N_3$ can extract $S_1$. A total of three time slots are needed, for a throughput improvement of 33% over the traditional transmission scheduling scheme.
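The three-slot exchange can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the implementation of [20] or [25]: frames are modeled as equal-length byte strings, the channel is error-free, and the helper name `xor_frames` and the sample payloads are hypothetical.

```python
def xor_frames(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length frames."""
    if len(a) != len(b):
        raise ValueError("frames must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

s1 = b"frame from N1"   # hypothetical payload of N1
s3 = b"frame from N3"   # hypothetical payload of N3 (same length)

# Slots 1 and 2: the relay N2 receives s1 and s3 one after the other.
# Slot 3: N2 broadcasts the coded frame s2.
s2 = xor_frames(s1, s3)

# Each end node removes its own frame from s2 to recover the other one.
recovered_at_n1 = xor_frames(s2, s1)
recovered_at_n3 = xor_frames(s2, s3)
```

Two frames are exchanged in three slots instead of four, which is where the 33% improvement comes from.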

C. Physical-Layer Network Coding (PNC)
We now introduce PNC. Let us assume the use of BPSK modulation at all the nodes. We further assume symbol-level and carrier-phase synchronization, and the use of power control, so that the frames from $N_1$ and $N_3$ arrive at $N_2$ with the same phase and amplitude. The combined bandpass signal received by $N_2$ during one symbol period is
$$r_2(t) = s_1(t) + s_3(t) = a_1\cos(\omega t) + a_3\cos(\omega t) = (a_1 + a_3)\cos(\omega t)$$
where $s_i(t)$, $i = 1$ or $3$, is the bandpass signal transmitted by $N_i$ and $r_2(t)$ is the bandpass signal received by $N_2$ during one symbol period; $a_i$ is the BPSK-modulated information bit of $N_i$; and $\omega$ is the carrier frequency. Then, $N_2$ will obtain a baseband signal $a_1 + a_3$. Note that $N_2$ cannot extract the individual information transmitted by $N_1$ and $N_3$, i.e., $a_1$ and $a_3$, from the combined signal $a_1 + a_3$. However, $N_2$ is just a relay node. As long as $N_2$ can transmit the necessary information to $N_1$ and $N_3$ for extraction of $a_1$ and $a_3$ over there, the end-to-end delivery of information will be successful. For this, all we need is a special modulation/demodulation mapping scheme, referred to as PNC mapping in this paper, to obtain the equivalent of the GF(2) summation of bits from $N_1$ and $N_3$ at the physical layer.
The BER analysis in [6] shows that the end-to-end BER for the three schemes is similar when the per-hop BER is low. Ignoring the slight BER difference, we have the following conclusion. For a frame exchange, PNC requires two time slots, the traditional scheme (e.g., 802.11) requires four, and straightforward network coding requires three. Therefore, PNC can improve the system throughput of the three-node wireless network by 100% and 50% relative to traditional transmission scheduling and straightforward network coding, respectively.
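A noise-free sketch of the mapping idea may help. The bit-to-symbol convention (bit 0 to -1, bit 1 to +1) and the function names `bpsk`, `pnc_map` and `two_slot_exchange` are assumptions made for this illustration; the text above only requires that the relay obtain the GF(2) sum of the two bits from the superimposed signal.

```python
from itertools import product

def bpsk(bit: int) -> int:
    """Assumed BPSK convention: bit 0 -> -1, bit 1 -> +1."""
    return 2 * bit - 1

def pnc_map(baseband_sum: int) -> int:
    """Map the noise-free sum a1 + a3 at the relay to the XOR of the two bits.

    A sum of 0 means the bits differ (XOR = 1); -2 or +2 means they agree (XOR = 0).
    """
    return 1 if baseband_sum == 0 else 0

def two_slot_exchange(b1: int, b3: int) -> tuple[int, int]:
    """Return (bit recovered at N1, bit recovered at N3) after two time slots."""
    # Slot 1: N1 and N3 transmit simultaneously; N2 sees the superposition.
    relay_bit = pnc_map(bpsk(b1) + bpsk(b3))
    # Slot 2: N2 broadcasts relay_bit; each end node XORs it with its own bit.
    return relay_bit ^ b1, relay_bit ^ b3

# Exhaustive check over the four bit combinations.
for b1, b3 in product((0, 1), repeat=2):
    assert two_slot_exchange(b1, b3) == (b3, b1)
```

The relay never learns $a_1$ or $a_3$ individually, yet both end nodes recover the other side's bit in two slots.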

III. APPLYING PNC IN REGULAR 1-D NETWORKS
Our discussion so far has focused only on the simple 3-node network with one bidirectional flow. In this section, we discuss the application of PNC in more general networks.

A. Regular linear network with one bidirectional flow
Consider a regular linear network with $N$ nodes with equal spacing between adjacent nodes. Label the nodes as node 1, node 2, …, node $N$ successively, with nodes 1 and $N$ being the two source and destination nodes, respectively. Fig. 5 shows a network with $N = 5$. Suppose that node 1 is to transmit frames $X_1, X_2, \ldots$ to node $N$, and node $N$ is to transmit frames $Y_1, Y_2, \ldots$ to node 1.
We divide the time slots into two types: odd slots and even slots. In the odd time slots, the odd-numbered nodes transmit and the even-numbered nodes receive. In the even time slots, the even-numbered nodes transmit and the odd-numbered nodes receive. Fig. 5 shows the sequence of frames being transmitted by the nodes in a 5-node network. In slot 1, node 1 transmits $X_1$ to node 2 and node 5 transmits $Y_1$ to node 4 at the same time. In slot 2, node 2 and node 4 transmit $X_1$ and $Y_1$ to node 3 simultaneously; node 2 and node 4 also store a copy of $X_1$ and $Y_1$, respectively, in their buffers. In slot 3, node 1 transmits $X_2$ to node 2, node 5 transmits $Y_2$ to node 4, and node 3 broadcasts $X_1 \oplus Y_1$ simultaneously; node 3 stores a copy of $X_1 \oplus Y_1$ in its buffer. Node 2 thus receives the superposition of $X_2$ and $X_1 \oplus Y_1$ and obtains $X_2 \oplus X_1 \oplus Y_1$ by PNC mapping. Adding the stored $X_1$ to $X_2 \oplus X_1 \oplus Y_1$, node 2 obtains $X_2 \oplus Y_1$. Node 4 can obtain $Y_2 \oplus X_1$ similarly. In slot 4, node 2 and node 4 broadcast $X_2 \oplus Y_1$ and $Y_2 \oplus X_1$, respectively. In this way, node 5 receives a copy of $X_1$ and node 1 receives $Y_1$ in slot 4. Also, in slot 4, node 3 obtains $Y_2 \oplus X_2$ by adding the stored packet $X_1 \oplus Y_1$ to the received packet $(X_2 \oplus Y_1) \oplus (Y_2 \oplus X_1)$.
With reference to Fig. 5, we see that a relay node forwards two frames, one in each direction, every two time slots. So, the throughput is 0.5 frame/time slot in each direction. Due to the half-duplex assumption, this is the maximum possible throughput we can achieve.
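The first four slots of the 5-node schedule can be traced step by step. The sketch below is a noise-free illustration only: frames are modeled as small integers, integer XOR stands in for the PNC mapping of two superimposed frames, and the helper `pnc` and the variable names are hypothetical. It covers just the four slots described in the text and makes no claim about the buffer rule in later slots.

```python
def pnc(a: int, b: int) -> int:
    """Noise-free stand-in for PNC mapping of two superimposed frames."""
    return a ^ b

X1, X2 = 0x1A, 0x2B   # frames injected by node 1 (arbitrary sample values)
Y1, Y2 = 0x3C, 0x4D   # frames injected by node 5 (arbitrary sample values)

# Slot 1: node 1 -> node 2 and node 5 -> node 4.
buf2, buf4 = X1, Y1

# Slot 2: nodes 2 and 4 transmit to node 3 simultaneously (and keep copies).
buf3 = pnc(buf2, buf4)            # X1 xor Y1

# Slot 3: nodes 1, 3 and 5 transmit; nodes 2 and 4 each hear two of them.
rx2 = pnc(X2, buf3)               # X2 xor X1 xor Y1
rx4 = pnc(Y2, buf3)               # Y2 xor X1 xor Y1
tx2 = rx2 ^ buf2                  # remove stored X1 -> X2 xor Y1
tx4 = rx4 ^ buf4                  # remove stored Y1 -> Y2 xor X1

# Slot 4: nodes 2 and 4 broadcast; the end nodes strip their own last frame.
at_node1 = tx2 ^ X2               # node 1 recovers Y1
at_node5 = tx4 ^ Y2               # node 5 recovers X1
node3_next = pnc(tx2, tx4) ^ buf3 # node 3 now holds X2 xor Y2
```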
As detailed above, when applying PNC on the linear network, each node transmits and receives alternately in successive time slots; and when a node transmits, its adjacent nodes receive, and vice versa (see Fig. 5). Let us investigate the signal-to-interference ratio (SIR) given this transmission pattern to make sure that the interference is not excessive. Consider the worst-case scenario of an infinite chain. We note the following characteristics of PNC from a receiving node's point of view: a) The interfering nodes are symmetric on both sides. b) The simultaneous signals received from the two adjacent nodes do not interfere due to the nature of PNC. c) The nodes that are two hops away are also receiving at the same time, and therefore will not interfere with the node. Therefore, the two nearest interfering nodes are three hops away. We have the following SIR:
$$\mathrm{SIR} = \frac{P_0 / d^{\alpha}}{2\sum_{l=1}^{\infty} P_0 / [(2l+1)d]^{\alpha}}$$
where $P_0$ is the common transmitting power of the nodes, $d$ is the distance between adjacent nodes, and $\alpha$ is the path-loss exponent. Assume the two-ray transmission model where $\alpha = 4$. The resulting SIR is about 16 dB, and the impact of the interference on BER is negligible for BPSK based on [26] (the capture threshold is often set to 10 dB in wireless networks [3]). More generally, a thorough treatment should take into account the actual modulation scheme used, the difference between the effects of interference and noise, and whether or not channel coding is used. However, we can conclude that as far as the SIR is concerned, PNC is not worse than traditional scheduling (see Section V) when generalized to the N-node linear network.
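The order of magnitude of this SIR can be checked numerically. The sketch below assumes unit node spacing, a common transmit power, a desired signal received from one hop away, and interferers located 3, 5, 7, … hops away on both sides of the receiver; the function name `sir_db` and the truncation parameter `n_pairs` are hypothetical.

```python
import math

def sir_db(alpha: float = 4.0, n_pairs: int = 1000) -> float:
    """SIR in dB at a receiver of a long chain with unit spacing.

    Assumption: interferers sit (2l + 1) hops away, l = 1..n_pairs, on both sides,
    and the common transmit power cancels out of the ratio.
    """
    signal = 1.0
    interference = 2.0 * sum((2 * l + 1) ** -alpha for l in range(1, n_pairs + 1))
    return 10.0 * math.log10(signal / interference)

nearest_pair_only = sir_db(n_pairs=1)   # only the two interferers three hops away
long_chain = sir_db()                   # many interferer pairs
```

With $\alpha = 4$ the two nearest interferers alone give roughly 16 dB, and adding the more distant interferers lowers the figure by less than 1 dB, so the nearest pair dominates.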

B. Regular linear network with multiple flows
Part A considers only one bidirectional flow. Here we consider a general setting in which there are K unidirectional flows in the N-node linear network. Note that this generalization includes the scenario in which there is a combination of unidirectional and bidirectional flows in the network, since each bidirectional flow can be considered as two unidirectional flows.
To allow PNC to be applied, we compose bidirectional flows out of the K unidirectional flows by matching pairs of unidirectional flows in opposite directions. The bidirectional flows can then make use of PNC for transmission, while the remaining unmatched unidirectional flows make use of the traditional strategy of multi-hop data transmission.
The optimal way to compose the bidirectional flows and schedule the transmission of the links in the flows is a tough problem. Here we consider a simple heuristic which is asymptotically optimal for the regular N-node linear network when N goes to infinity as shown in Part C. For simplicity, we assume all flows have equal traffic.
We define the following terms with respect to the linear network. Let us label the nodes from left to right by 1 to $N$ sequentially. Let ( , ) …
Fig. 6 shows an example of a dual packing. Flows 2 and 3 form a right packing, and Flow 1 forms a left packing. Note that some of the nodes are traversed by both a right-bound flow and a left-bound flow. Let us call these nodes the common nodes, and the other nodes the non-common nodes. A sequence of adjacent common nodes, flanked by but not including two non-common nodes at the two ends (an ellipse in Fig. 6), forms a PNC unit, and we can use the PNC mechanism for transporting the bidirectional traffic over it. A sequence of adjacent non-common nodes, together with the two common nodes flanking them (a rectangle in Fig. 6), may or may not have traffic flowing over them. When there is traffic, the traffic is in one direction only, and the traditional multi-hop communication technique can be used to carry the unidirectional traffic. Essentially, by forming a dual packing, we also form many "virtual" bidirectional flows (each corresponding to a PNC unit) on which PNC can be applied.
Scheduling can then be performed as follows. Let us refer to the time needed for all the $K$ unidirectional flows to transfer one packet from source to destination as one frame. Each link (hop) of a flow is allocated one time slot for transmission within a frame. A frame is further divided into two intervals, as follows:
1) The first interval is dedicated to the PNC units (i.e., ellipses). If there are $M$ dual packings, $2M$ time slots are needed in the worst case, in which different dual packings use different time slots to transmit and two time slots are needed for each dual packing (see footnote 1).
2) The second interval is dedicated to the non-PNC units (i.e., rectangles). The nodes of all rectangles of all dual packings are scheduled to transmit using the conventional scheme.
The number of time slots needed in the second interval depends on both the number and the lengths of the rectangles. As will be shown in Part C, it can be ignored compared to the time slots needed in the first interval as N goes to infinity.
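One possible way to locate the PNC units of a dual packing is sketched below. It rests on an interpretation that is an assumption of this sketch: a flow is a hypothetical `(source, destination)` pair of node labels, a link between nodes k and k+1 is treated as bidirectional when it is crossed by both a right-bound and a left-bound flow, and a PNC unit is a maximal run of consecutive bidirectional links. The example packing is invented and is not the one in Fig. 6.

```python
def links(flows):
    """Set of links crossed by the flows; link k joins node k and node k + 1."""
    crossed = set()
    for src, dst in flows:
        crossed.update(range(min(src, dst), max(src, dst)))
    return crossed

def pnc_units(right_packing, left_packing):
    """Node lists of the PNC units: maximal runs of links used in both directions."""
    common = sorted(links(right_packing) & links(left_packing))
    units = []
    for k in common:
        if units and units[-1][-1] == k:
            units[-1].append(k + 1)      # link k extends the current unit
        else:
            units.append([k, k + 1])     # link k starts a new unit
    return units

def first_interval_slots(n_dual_packings: int) -> int:
    """Worst-case slots of the first interval: two per dual packing."""
    return 2 * n_dual_packings

right = [(1, 4), (6, 9)]   # hypothetical right-bound flows
left = [(8, 3)]            # hypothetical left-bound flow
units = pnc_units(right, left)
```

In this example the unit `[3, 4]` is a two-node unit and `[6, 7, 8]` is a three-node unit; the remaining stretches carry traffic in one direction only and would be served in the second interval.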

C. Throughput of 1-D network with PNC
We now show that the packing and scheduling strategies presented in Part B allow the upper-bound capacity of the 1-D network to be approached when the number of nodes $N$ goes to infinity. Furthermore, compared with the conventional schemes discussed in [27], PNC can achieve a constant factor of throughput improvement.
Footnote 1 (to the scheduling description in Part B): Two caveats are in order. The first is that, according to our construction, there could be "trivial" PNC units with two nodes only. In this case, the PNC mechanism is not needed, and each node gets to transmit directly to the other node. Regardless of whether the PNC unit is trivial or not, two time slots are needed for the bidirectional flows. The second caveat is that there could be two PNC units in the same dual packing next to each other. For example, suppose nodes 1, 2, and 3 form a PNC unit, and nodes 4, 5, and 6 form another. To avoid conflict, the scheduling of the transmissions on these two PNC units should be such that nodes 1, 3, 4 and 6 transmit in one time slot while nodes 2 and 5 transmit in another time slot. Again, two time slots are needed.
We first detail the system model. To avoid edge effects, we consider a "large" circle instead of a line. The $N$ nodes are uniformly distributed over the circle with a constant distance between adjacent nodes. Without loss of generality, let the distance between two adjacent nodes be a unit distance. Each transmission is over only one unit distance (i.e., a node only transmits to its two adjacent nodes). Consider the receiver of a link. We assume that simultaneous transmission by another link whose transmitter is two or more hops away from the receiver of the first link will not cause a collision to the first link.
In our model, $N/2$ nodes are randomly chosen as the source nodes. The remaining $N/2$ nodes are the potential destination nodes. For each source node, a unique destination node is chosen among the $N/2$ potential destination nodes with equal probability. We assume matching without replacement, in that the destination node chosen for a source node will not be put back into the pool before the destination node of another source is chosen.
The route for a source-destination pair is also predetermined in a random way (note: there are two routes from a source to its destination, one in the clockwise direction and the other in the counterclockwise direction).
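The traffic model can be generated in a few lines. This is a sketch under the stated model only: nodes are labeled 0 to N-1 around the circle, the clockwise direction is taken as increasing label, and the function name `random_traffic` and the tuple layout are hypothetical.

```python
import random

def random_traffic(n_nodes: int, rng: random.Random):
    """Draw N/2 source-destination pairs with a random route direction each.

    Returns tuples (source, destination, direction, hops).
    """
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)
    half = n_nodes // 2
    sources, dests = nodes[:half], nodes[half:2 * half]
    rng.shuffle(dests)                      # matching without replacement
    flows = []
    for s, d in zip(sources, dests):
        direction = rng.choice(("cw", "ccw"))
        hops = (d - s) % n_nodes if direction == "cw" else (s - d) % n_nodes
        flows.append((s, d, direction, hops))
    return flows

flows = random_traffic(16, random.Random(1))
```

Every node ends up as either a source or a destination, which is the property used later in the appendix.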
The analytical results for the traditional transmission scheme and the straightforward network coding scheme in our circular model are similar to those for the 1-D linear network in [27] when $N$ goes to infinity. Using a similar approach, it is not difficult to obtain the respective per-flow throughputs in our circular network as … where unit link bandwidth is assumed.
Let us now focus on the PNC throughput. We will show that PNC can achieve the per-flow throughput $4/N - \varepsilon$ for any small positive value $\varepsilon$ as $N$ goes to infinity. Let us first provide further details on the scheduling strategy presented in Part B.
The packing and scheduling are as follows. For packing, we first unwrap the circle into a non-circular linear network by randomly selecting the source node of a clockwise flow, labeled s, on the circle as the start point of the linear network. The node adjacent to the selected source node in the counterclockwise direction on the circle, labeled e, will serve as the end point of the linear network. Next, we obtain one packing of the clockwise flows according to the packing algorithm in Part B. It is possible that the last selected flow crosses the start point. In that case, we cut the flow into two sub-flows by performing the cut between the start point and the end point, and only consider the first sub-flow in the aforementioned packing. After forming the above clockwise unidirectional packing, we form a matching counterclockwise unidirectional packing by choosing e as the start point and s as the end point. If there is an existing counterclockwise flow with e as its source node, we will start with this flow in the unidirectional packing. If not, we will choose the next flow with source node closest to e in the counterclockwise direction in our packing.
For "traffic balance", after getting the first dual packing as above, for the next dual packing, we will start with forming the counterclockwise unidirectional packing first (i.e., s and e will be defined with respect to the counterclockwise packing) before constructing the matching clockwise packing. Repeating the above procedure allows us to form a series of dual packings.
The scheduling of transmissions is the same as that in Part B except that here we also have to consider the transmission across the two sub-flows cut as above, if any. We assume the traffic from the destination of a preceding sub-flow to the source of its corresponding sub-flow is transmitted using the conventional scheme in the second interval.
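With the above packing and scheduling strategies, we have the following theorem on the per-flow throughput of the 1-D circular network when $N$ goes to infinity.
Theorem 1: With the above packing and scheduling strategies, PNC can achieve a per-flow throughput of $4/N - \varepsilon$ in the 1-D circular network, where the small positive quantity $\varepsilon$ goes to zero as $N$ goes to infinity; moreover, $4/N$ is an upper bound on the per-flow throughput.
The proof is given in the Appendix. In outline, the number of time slots needed in the first interval is at most $(1+\varepsilon_1)N/4$, where the small positive quantity $\varepsilon_1$ goes to zero as $N$ goes to infinity. The number of time slots needed in the second interval, on the other hand, is $\varepsilon_2 N$, where the small positive quantity $\varepsilon_2$ goes to zero as $N$ goes to infinity. Then we can obtain the per-flow throughput with PNC:
$$\frac{1}{(1+\varepsilon_1)N/4 + \varepsilon_2 N} = \frac{4}{N} - \varepsilon.$$
A corollary of Theorem 1 is that PNC can improve the throughput of the 1-D network by a factor of 2 and 1.5 relative to the traditional transmission scheme and the SNC scheme (7), respectively.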
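The way the two intervals combine into a per-flow throughput can be illustrated numerically. The sketch assumes that each flow delivers one packet per frame, that the first interval takes $(1+\varepsilon_1)N/4$ slots (the bound derived in the Appendix) and the second $\varepsilon_2 N$ slots; the values of `eps1` and `eps2` are arbitrary sample values, and the two baseline figures are simply what the corollary's factors of 2 and 1.5 imply, not independently derived results.

```python
def pnc_per_flow(n: int, eps1: float = 0.0, eps2: float = 0.0) -> float:
    """Per-flow throughput (packets per slot) when one frame delivers one packet per flow."""
    k1 = (1.0 + eps1) * n / 4.0   # slots in the first interval (PNC units)
    k2 = eps2 * n                 # slots in the second interval (non-PNC units)
    return 1.0 / (k1 + k2)

n = 10_000
limit = 4.0 / n                               # upper bound on per-flow throughput
t_pnc = pnc_per_flow(n, eps1=0.05, eps2=0.01) # sample epsilons, for illustration only
t_trad_implied = limit / 2.0                  # implied by the factor-of-2 corollary
t_snc_implied = limit / 1.5                   # implied by the factor-of-1.5 corollary
```

As `eps1` and `eps2` shrink, `t_pnc` approaches `limit` from below.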

IV. APPLYING PNC IN 2-D GRID NETWORK
Section III focused on the 1-D regular network. This section investigates the application of PNC in a 2-D regular grid network. We assume the same transmission protocol as in Section III. Fig. 7 shows the grid network under consideration, in which $N$ nodes are uniformly located at the cross points as shown. In this part, we first consider the case in which each line (horizontal or vertical) on the grid has one and only one bidirectional flow. Specifically, the two end nodes in each line, node 1 and node $\sqrt{N}$, exchange information through the relay nodes in between.

A. 2-D Grid Network with one bidirectional flow in each line
The flows transmit with the following PNC schedule. Consider the horizontal lines (a similar schedule applies to the vertical lines). The first two time slots are dedicated to transmissions on lines $1, J+1, 2J+1, \ldots$; the next two time slots are dedicated to transmissions on lines $2, J+2, 2J+2, \ldots$; and so on. The separation $J$ must be large enough for acceptable SIR. In the example of Fig. 7, $J = 4$.
For a group of simultaneously active lines, to reduce interference, when the odd nodes transmit on one active line, the even nodes transmit on its two adjacent active lines, as shown in Fig. 7. … where $P_0$, $l$, $d = 1$, and $\alpha$ are defined as in Section III-A. Without loss of generality, suppose that the receiver is an even node. The interference from the other active lines whose odd nodes are transmitting is … The SIR in (9) is about 13.5 dB, 12.3 dB, and 10.0 dB for $J$ equal to 5, 4, and 3, respectively. With an assumed 10 dB target, $J = 3$ is enough to guarantee successful transmission.
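The line schedule can be made concrete with a small sketch. It assumes that lines are numbered from 1, that the lines served together are exactly those whose indices differ by a multiple of $J$, and that neighbouring active lines use opposite node parities; the function names `active_lines` and `odd_nodes_transmit` are hypothetical.

```python
def active_lines(group: int, n_lines: int, J: int) -> list[int]:
    """1-based indices of the lines served in the two slots given to `group` (0 <= group < J)."""
    return list(range(group + 1, n_lines + 1, J))

def odd_nodes_transmit(line: int, J: int, slot: int) -> bool:
    """True if the odd-numbered nodes of an active line transmit in `slot` (0 or 1).

    Neighbouring active lines get opposite parity, and each line flips between its two slots.
    """
    rank = (line - 1) // J   # position of the line among its simultaneously active peers
    return (rank + slot) % 2 == 0

lines = active_lines(0, 12, 4)   # J = 4, as in the example of Fig. 7
parities = [odd_nodes_transmit(line, 4, 0) for line in lines]
```

Here lines 1, 5 and 9 are active together, and in the same slot the odd nodes transmit on lines 1 and 9 while the even nodes transmit on line 5.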

B. 2-D Grid network with multiple random flows
Let us now investigate the application of PNC in the 2-D grid network with a more general traffic pattern. With respect to Fig. 7, we now randomly choose N/2 of the nodes as the source nodes. The remaining N/2 nodes are the destination nodes.
Here we apply a simple routing scheme, as in [27]. When $N$ goes to infinity, the number of nodes in each line or column, $\sqrt{N}$, also goes to infinity, and the per-flow PNC throughput in each line or column will approach $4/\sqrt{N}$, as argued in Section III. Since the horizontal transmissions and vertical transmissions are scheduled in different time intervals, and in each interval every $J$ lines (columns) transmit simultaneously, the per-flow throughput of PNC in the 2-D grid network can approach …
For comparison purposes, let us look at the per-flow throughput under the traditional transmission strategy and under the straightforward network coding strategy. With the routing/scheduling strategy and the corresponding throughput analysis in [27], we can show that the traditional transmission scheme and the SNC scheme can achieve the following throughputs, respectively: …
In the 2-D grid network, the nodes are more tightly packed than in the 1-D network, and the interfering nodes must be kept at least 3 hops away, i.e., $\Delta = 2$, to obtain an SIR of no less than 10 dB (note: in the 1-D network, $\Delta$ could be 1 for an SIR of about 10 dB). When $\Delta = 2$, we can verify that throughputs better than (11) cannot be achieved. In other words, the throughput in (11) is also the upper bound for the traditional transmission scheme and the SNC scheme under all possible schedulings.
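The scaling argument for the 2-D PNC throughput can be sketched numerically. The reading used here is an assumption of the sketch: the per-line rate $4/\sqrt{N}$ is shared between a horizontal and a vertical interval (a factor of 1/2) and among $J$ groups of lines (a factor of 1/J). The function name `pnc_2d_per_flow` is hypothetical, and the baseline values are only what the improvement factors of 3 and 2 reported for $J = 3$ would imply.

```python
import math

def pnc_2d_per_flow(n_nodes: int, J: int) -> float:
    """Asymptotic per-flow PNC throughput on the grid under the assumed reading."""
    per_line = 4.0 / math.sqrt(n_nodes)   # per-flow rate within one line or column
    return per_line / (2 * J)             # two intervals, J line groups per interval

n_nodes = 900                              # a 30 x 30 grid, sample value
t_pnc = pnc_2d_per_flow(n_nodes, J=3)
t_trad_implied = t_pnc / 3.0               # implied by the reported factor of 3
t_snc_implied = t_pnc / 2.0                # implied by the reported factor of 2
```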
Therefore, setting $J = 3$ in (10), we conclude that PNC can achieve a throughput improvement factor of 3 and 2 relative to the traditional transmission scheme and the SNC scheme, respectively. Note that the improvement factors under the 2-D network are larger than those under the 1-D network, which are 2 and 1.5, respectively (see Section III).

V. CONCLUSION
This paper has introduced a novel scheme called Physical-layer Network Coding (PNC) that significantly enhances the throughput performance of multi-hop wireless networks. Instead of avoiding interference caused by simultaneous electromagnetic waves transmitted from multiple sources, PNC embraces interference to effect network-coding operation directly from physical-layer signal modulation and demodulation. With PNC, signal scrambling due to interference, which causes packet collisions in the MAC layer protocol of traditional wireless networks (e.g., IEEE 802.11), can be eliminated.
We have proposed explicit scheduling algorithms for PNC in 1-D and 2-D regular networks with multiple random flows. It is shown that PNC can potentially achieve 100% and 50% throughput increases compared with traditional transmission and straightforward network coding, respectively, in the 1-D regular linear network. The throughput improvements are even larger in the 2-D regular network: 200% and 100%, respectively. In particular, PNC can allow the upper-bound throughput of the 1-D regular network to be approached as the number of nodes goes to infinity.

Appendix: Proof of Theorem 1
This appendix proves Theorem 1 in three steps. First, the fact that $4/N$ is the upper bound for the throughput of the 1-D circular network can be argued as follows. Let us consider the number of time slots needed so that each flow can transport one packet from its source to its destination. Due to the half-duplex constraint, there can be at most $N/2$ transmitting nodes in a time slot. In general, each transmitting node can transmit to at most two of its adjacent nodes simultaneously. Hence, in total, there can be at most $N$ one-hop transmissions in a time slot. …
Next, we prove that the number of time slots needed in the second interval is negligible compared to $N$. We denote it by $\varepsilon_2 N$, where $\varepsilon_2$ is a small positive quantity that goes to zero as $N$ goes to infinity. The total one-hop transmissions in the second interval can be divided into two parts: the one-hop transmissions in the rectangles and the one-hop transmissions between sub-flows (created when we unwrap the circular network into a linear network).
Let us first consider the rectangles. As shown in Fig. A-1, within a dual packing, the rectangles do not overlap. Furthermore, each of the two end nodes in a rectangle must be either a source or a destination node of some flow. As a proof technique, let us artificially divide the rectangles into two groups according to the dual packings containing them. Recall that the dual packings are formed successively in our packing algorithm. Consider the first $(1-\varepsilon_3)$ fraction of all flows (including the original flows and the generated sub-flows) that are included successively into the dual packings. The first group of rectangles arises from these flows. The second group of rectangles belongs to the remaining $\varepsilon_3$ fraction of the flows. We set $\varepsilon_3 = 1/\log N$.
As discussed in Section III-C, when we perform packing on the circular network by unwrapping it into a linear network, it is possible for a flow to be cut into two subflows. Each clockwise unidirectional packing contains at least one flow that does not generate subflows (a flow cannot have more than $N$ hops). As a corollary, if the clockwise packing contains a flow that has been cut into two subflows, then the packing must contain at least two flows to start with. One of these subflows will be relegated to a future packing exercise. So, each clockwise packing reduces the number of remaining flows to be packed by at least one. For the matching counterclockwise packing, at most one flow will be cut into two subflows. Thus, the matching counterclockwise packing does not increase the number of remaining counterclockwise flows. Recall from the discussion in Section III-C that, for "traffic balance", successive dual packings will start with clockwise and counterclockwise packings in an alternating manner. Thus, successive dual packings will reduce the numbers of remaining clockwise and counterclockwise flows by at least one alternately.
In the beginning, there are $N/2$ original flows ($N/4$ of which are clockwise and $N/4$ of which are counterclockwise). From the argument in the previous paragraph, there are altogether at most $N/2$ dual packings. Each dual packing generates at most two extra flows for the flow pool (because of the cut between s and e). Thus, altogether there could be at most $N$ extra flows being generated. Hence, the total number of flows (including the original flows and the subflows) is at most $3N/2$.
In general, since the two end nodes of a rectangle must be either a source or a destination of some flow, the number of rectangles in a dual packing is no more than the number of flows in that dual packing (note: some non-end nodes within a rectangle could also be sources or destinations; thus the "no more than" rather than "equal to"). Therefore, the number of rectangles in the first group is no more than $(1-\varepsilon_3)N$. For these rectangles, as shown in Lemma 2 at the end of this appendix, the number of nodes in each group-1 rectangle is no more than $(1-\varepsilon_4)\log(N) + \varepsilon_4 N$ w.h.p., where $\varepsilon_4$ is a small positive quantity that goes to zero when $N$ goes to infinity. Similarly, the number of rectangles in the second group is upper bounded by $\varepsilon_3 N$. As a trivial bound, we upper-bound the number of nodes in each group-2 rectangle by $N$. Note that each node transmits at most once within a rectangle (group-1 or group-2) for traffic forwarding. Thus, the total number of one-hop transmissions needed for the rectangles is upper bounded by …
Now, consider the transmissions across sub-flows. A one-hop transmission is needed for two adjacent sub-flows generated by the cut when we unwrap the circular network into a corresponding linear network. In other words, there is a one-hop transmission whenever there is an extra sub-flow, the number of which is upper bounded by $N/2$ according to the above argument. Thus, the total number of one-hop transmissions between all adjacent sub-flows is upper bounded by … Here $\varepsilon_2$ is determined by $\varepsilon_3$, $\varepsilon_4$ and $N$. It is easy to show that $\varepsilon_2$ will go to zero as $N$ goes to infinity.
Finally, we prove that the number of time slots needed in the first interval is less than $(1+\varepsilon_1)N/4$. In a unidirectional packing, a residual node is an idle node through which no packet passes (i.e., none of the flows of the unidirectional packing passes through the node).
Thus, the number of nodes through which one packet passes in one unidirectional packing is $N$ minus the number of residual nodes. Consider a dual packing to which group-1 rectangles belong. According to Lemma 1, which follows the proof of Theorem 1, the number of residual nodes in each of the unidirectional packings of the dual packing is less than $\log(N)$ w.h.p. That is, the number of non-residual nodes in a unidirectional packing is more than $N - \log(N)$ w.h.p., and the number of non-residual nodes in the two unidirectional packings of the dual packing is more than $2(N - \log N)$. That is, the traffic handled by each dual packing (in terms of packet flows across all nodes in the dual packing) is more than $2(N - \log N)$.
Now, consider an arbitrary node in the network. According to our model, it is either the source or the destination of some flow. The packet of that flow passes through it with probability 1. For the other $N/2 - 1$ original flows, a packet passes through the node with probability 1/2. By the Chernoff-Hoeffding theorem, the number of packets that go through each node is … Considering all $N$ nodes, the number of packets passing through them is … Note that this is the total traffic, which is more than the traffic in the dual packings to which group-1 rectangles belong. Therefore, the number of dual packings to which the group-1 rectangles belong is upper bounded by …
Similar to the argument for group-1 rectangles, for the flows containing the group-2 rectangles, there are at most $\varepsilon_3 N$ flows, which will generate at most $\varepsilon_3 N$ unidirectional packings, i.e., $\varepsilon_3 N / 2$ dual packings. Then we can obtain that the total number of dual packings is no more than $(1+\varepsilon_1)N/8$ with high probability, where $\varepsilon_1$ is determined by $\varepsilon_3$ and $N$. It is easy to verify that $\varepsilon_1$ goes to zero as $N$ goes to infinity. Since each dual packing needs at most two time slots, the number of time slots needed for the first interval is at most $k_1 = (1+\varepsilon_1)N/4$.
With the help of $k_1$ and $k_2$, we can obtain the lower bound of the per-flow throughput as [...], where $\varepsilon$ can be obtained from $\varepsilon_1$, $\varepsilon_2$ and $N$, and it goes to zero as $N$ goes to infinity. This proves Theorem 1.

Lemma 1:
For any clockwise (counterclockwise) unidirectional packing contained in the dual packings to which group-1 rectangles belong, the number of residual nodes is less than $\log(N)$ w.h.p.
Proof: Let $P$ denote the set of dual packings to which group-1 rectangles belong. Let us focus on one clockwise unidirectional packing $p$ in $P$; the proof for the counterclockwise case is similar. Let $P_c$ be the set of clockwise packings in $P$, and let $m$ denote the number of clockwise flows in $P_c$. According to our way of partitioning the rectangles into the two groups, we have $m \le (1-\varepsilon_3)N_1$, where $N_1$ is the total number of clockwise flows.
Recall that in our traffic model, we randomly select $N/2$ nodes to be sources and $N/2$ nodes to be destinations. In other words, every one of the $N$ nodes is either a source or a destination. This applies to any residual node in $p$ as well. In particular, a residual node in $p$ is either 1) a destination node (of a clockwise or counterclockwise flow); 2) a source node of a counterclockwise flow; or 3) a source node of a clockwise flow. In case 3, since the node is residual in $p$, it must be the source node of a clockwise flow already packed (i.e., already belonging to $P_c$) prior to packing $p$.
For a unidirectional packing, consider the first flow from the start point $s$. Suppose this flow ends at node $i$. Let us consider the probability of node $(i+1)$ being a residual node with respect to this unidirectional packing. Due to the randomness of our packing procedure and our random selection of sources and destinations for flows, node $(i+1)$ is a destination node with probability $p_1 = 1/2$, it is a source node of a counterclockwise flow with probability $p_2 = 1/4$ w.h.p., and it is a source node of a pre-packed clockwise flow with probability $p_3 \le (1-\varepsilon_3)/4$ w.h.p. Then the probability that node $(i+1)$ is a residual node given that node $i$ is not a residual node is [...]. In our notation above, the 1 in $P(1|0)$ refers to the fact that we have found one residual node thus far, and the 0 refers to the fact that no residual node had been found before it. Given that node $(i+1)$ is a residual node, the probability that node $(i+2)$ is also a residual node is $P(2|1) \le P(1|0)$ (due to sampling without replacement). The probability of a sequence of $l$ or more residual nodes is given by [...], which approaches zero. Thus, Lemma 1 is proved.
Figure A-1. An example of a dual packing, where flows 1 and 2 belong to the clockwise unidirectional packing and flows 3 and 4 belong to the counterclockwise unidirectional packing. The white nodes are non-residual nodes, the red nodes are the residual nodes of the clockwise unidirectional packing, the green nodes are the residual nodes of the counterclockwise unidirectional packing, and the blue nodes are the residual nodes of both unidirectional packings. The nodes in the rectangles are the uncommon nodes.
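The chain of conditional probabilities used in the proof of Lemma 1 can be written out as follows. This is only a sketch: the notation $P(j+1|j)$ for general $j$ is introduced here for illustration, and the assumption that $P(j+1|j) \le P(1|0)$ holds for every $j$ extends the single case $P(2|1) \le P(1|0)$ stated in the proof.

```latex
\begin{align*}
\Pr[\text{nodes } i+1, \dots, i+l \text{ are all residual}]
  &= P(1|0)\, P(2|1) \cdots P(l|l-1) \\
  &\le P(1|0)^{\,l}
  \qquad \text{(assuming } P(j+1|j) \le P(1|0) \text{ for all } j\text{)}.
\end{align*}
```

Under this assumption the bound decays geometrically in $l$, so it tends to zero as $l$ grows provided $P(1|0)$ stays bounded below 1.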

Lemma 2:
For group-1 rectangles, the number of nodes in each rectangle is no more than $2\log(N)$ with probability $1-\varepsilon_4$, where $\varepsilon_4$ is a small positive quantity that goes to zero as $N$ goes to infinity.
Proof: With respect to Fig. A-1