Relay Ability Estimation and Topology Control Using Multidimensional Context Parameters for Mobile P2P Multicast

We focus on mobile P2P multicast, in which mobile end nodes not only act as receivers but also relay the received stream forward to others. In mobile P2P multicast, negative effects caused by changes in available bandwidth and the disconnection of mobile nodes are propagated to the downstream nodes. To solve this problem, we developed a novel node-allocation framework using the multidimensional context parameters of each mobile node, which include available bandwidth, disconnection rate, and remaining battery capacity. Considering the significance of each parameter, our method integrates these parameters into a single parameter called relay ability. Taking the relay ability into account, each node is allocated to the multicast topology to minimize the negative effects mentioned above. To test our method, we applied our framework to conventional P2P multicast topologies and show the results of comparative evaluations through computer simulation.


Introduction
The demand for large-scale live streaming services, in which live content is simultaneously distributed to a large number of users, has been increasing. Enhancements to transmission speed and mobile-node capability in wireless access networks enable people to use such services via their personal mobile devices. While those services are still provided using server-client techniques in mobile networks, they have been shifting to peer-to-peer (P2P)-based technology in wired networks [1][2][3]. P2P multicast is a method of sending a stream over the application layer, in which end nodes not only act as receivers but also relay the received stream forward to others. This peer-to-peer solution enables us to quickly and easily deploy multicast applications without upgrading any routers. However, P2P multicast has rarely been applied to mobile networks because the transmission speed remains much slower than that in fixed wired networks. Even when many mobile nodes simultaneously request the same content, one centralized server can handle the requests, because the required bit rate for each mobile node is not high in conventional mobile networks. However, in the next five to ten years, mobile users will require higher bit rates as wireless access networks continue to increase transmission speed to a few Mbps. P2P multicast will then be required in mobile networks to handle the large numbers of requests from mobile nodes.
However, several problems remain to be solved before P2P multicast can be applied to mobile networks. First, the available bandwidth of a node in mobile networks is unstable; it is determined mainly by the radio signal strength of the node, which changes dynamically depending on the position and the movement of the node. Second, nodes are easily disconnected when they move out of the wireless coverage range or when their batteries run out [4]. In P2P multicast, the negative effects caused by disconnection and changes in available bandwidth are propagated to the nodes that receive the forwarded stream [5]. Therefore, one key challenge is how to place better nodes as close to the content source as possible to suppress the negative impact on the whole P2P network. Previous efforts, including our previous work, have discussed how to place nodes in a P2P multicast topology on the basis of only a single context parameter [6][7][8][9]. However, a single context parameter is not enough to place nodes appropriately in the topology; in mobile networks, we need to consider at least three context parameters: available bandwidth, movement, and remaining battery capacity, all of which can be observed by the mobile devices. Moreover, we cannot easily determine which node is better because the above parameters are multidimensional; for example, which is better for a network: a node with wide available bandwidth or one with large remaining battery capacity?
To address this, we developed a novel node-allocation framework using the multidimensional context parameters of mobile nodes for mobile P2P multicast topology. First, to deal with multidimensional parameters, our method analyzes the statistical relationship between each context parameter of a node and the service quality experienced by the node. To assess how significantly each context parameter affects the experienced service quality, we introduce two metrics: the number of received bits, which is how many bits a node has received within a period, and receiving time, which is how long a node has been able to decode the video stream. Second, considering the significance of each parameter, our method integrates these parameters into a single parameter called relay ability. Taking the relay ability into account, each node is allocated to the P2P multicast topology to minimize negative effects caused by dynamic changes in available bandwidth and frequent disconnection. Finally, to test our method, we applied our framework to P2P multicast topologies, and we show the results of comparative evaluations through computer simulation.

Overlay Topology Construction for P2P Multicast.
One of the major issues in P2P multicast is how to construct overlay topologies. Tree-based topology construction is popular because it is simple and because it enables load balancing among nodes [1,6]. However, in tree topologies, a stream is forwarded from parents to their children in a hop-by-hop manner; the stream is easily terminated and downstream nodes cannot receive it when an upstream node is disconnected. Therefore, to improve robustness, a multitree multicast has been proposed in [8], while mTreebone [6] is a hybrid tree/mesh design.
Moreover, there have been several studies on overlay multicast structures that take mobile terminals into consideration. An overlay multicast architecture that places unstable mobile nodes at the outskirts of the multicast tree can reduce the effect of bandwidth instability on overlay multicast [7]. However, when the majority of nodes in an overlay network are mobile, which we assume in this paper, it would be hard to apply this method to the network.

Statistical Analysis in Conventional Research.
As mentioned in Section 1, our method uses statistical analysis for dynamic P2P topology control. To the best of our knowledge, statistical analysis has been conventionally used only for static network analysis and design, not for dynamic control. We introduce several conventional studies below.
The work in [10] measures radio channels and analyzes general channel characteristics using statistical analysis. The work in [11] takes a statistical analysis approach to optimal design in mobile satellite broadcast systems. The proposed scheme in [12] performs QoS mapping between the application level and the user level by multiple regression analysis, which is a statistical analysis method. The work in [13] analyzes the characteristics of the multicast tree for many-to-many communications through multiple regression analysis. The study found that the index that most affects the performance of the tree depends on the number of multicast members.

System Model.
Figure 1 illustrates the mobile P2P multicast system we assume in this paper. It is based on a hybrid P2P structure, which is often adopted [2,3], especially when the capability of the peers is limited, as in mobile P2P networks [1]. In Figure 1, each mobile node opportunistically establishes a physical link with a wireless base station and is allocated to one of the distribution stubs in the overlay layer. The distribution tree is split into many stubs to limit the maximum hop count in each distribution tree; a small hop count suppresses the propagation of negative effects, including disconnections and delay jitter [14]. The system server in Figure 1 sends a stream to the first receiver in each distribution stub and exchanges control messages directly with nodes to maintain the P2P topologies. The P2P topologies are updated at regular intervals.
In our method, the system server has a database for storing the context parameters observed in every node and can estimate their relay abilities. In this paper, we assume that nodes ideally observe their own context parameters and inform the system server of them via background control paths, and we consider control traffic negligible compared to the stream rate, which is a few Mbps.

Integration of Multidimensional Context Parameters to Relay Ability.
We can imagine, for example, that rich available bandwidth, link stability, and battery capacity provide high throughput, improve robustness, and increase service lifetime, respectively. However, it is difficult to integrate these parameters into a single parameter because we cannot simply add or multiply them directly. One simple solution is to add two or more parameters with appropriate weights:
$$Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_m x_{im}. \tag{1}$$
This means that multiple context parameters of node $i$, $x_{i1}, x_{i2}, \ldots, x_{im}$, are integrated into the relay ability of node $i$, $Y_i$, using appropriate weights $\beta_1, \beta_2, \ldots, \beta_m$. The next question is how to determine $\beta_k$. Our key idea is to set the weight $\beta_k$ according to how significantly parameter $x_{ik}$ affects the experienced service quality. Multiple linear regression analysis (MLRA) [15], which is a multivariate analysis method, derives $\beta_k$ as follows.
(1) The statistical database, which is in the system server, manages the parameter set $(y_i, x_{i1}, x_{i2}, \ldots, x_{im})$ of every node $i$, where $y_i$ is the actually observed relay ability of node $i$.
(2) The system server obtains $\beta_k$ ($k = 1, 2, \ldots, m$) by applying MLRA to the parameter sets stored in the statistical database.
(3) For every $i$, the system server estimates the relay ability of node $i$, $Y_i$, by assigning $\beta_k$ and $x_{ik}$ ($k = 1, 2, \ldots, m$) to the regression equation shown in (1).
(4) The system server constructs the P2P multicast topology based on the estimated relay abilities of the nodes.
(5) After a certain service interval (topology update interval), for every $i$, node $i$ observes its actual relay ability $y_i$ and its context parameters $x_{i1}, x_{i2}, \ldots, x_{im}$ and sends the updated parameter set to the statistical database; the procedure then returns to Step (2).
In general, the computational complexity of MLRA is $O(nm^2)$ when the numbers of nodes and context parameters are $n$ and $m$, respectively. Since $m$ is at most 10 and there is no exponential relation between $n$ and $m$, this should not be a problem.
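As a minimal sketch of Steps (1) to (3), the following code fits the weights by ordinary least squares and then estimates a relay ability. It assumes a weighted sum without an intercept term and solves the normal equations directly; the function names `fit_weights` and `estimate_relay_ability` and the sample values are hypothetical and are not taken from our simulator.

```python
def fit_weights(X, y):
    """Least-squares weights beta for y ~ sum_k beta_k * x_k (no intercept).

    X: list of n rows, each with m context parameters (hypothetical layout).
    y: list of n observed relay abilities.
    """
    m = len(X[0])
    # Normal equations A beta = b with A = X^T X and b = X^T y: O(n m^2) work.
    A = [[sum(row[j] * row[k] for row in X) for k in range(m)] for j in range(m)]
    b = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(m)]
    # Gaussian elimination with partial pivoting on the m x m system.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("context parameters are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    beta = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(A[r][c] * beta[c] for c in range(r + 1, m))
        beta[r] = (b[r] - s) / A[r][r]
    return beta


def estimate_relay_ability(beta, x):
    """Estimated relay ability Y_i as the weighted sum of context parameters."""
    return sum(bk * xk for bk, xk in zip(beta, x))


# Illustrative data only: four nodes, three context parameters each.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y = [2.0, 0.5, -1.0, 1.5]
beta = fit_weights(X, y)
```

In practice the context parameters have different units, so some normalization before fitting would probably be needed; that choice is left open here.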

Topology Construction Using Relay Ability.
Our framework is illustrated in Figure 2. We first integrate the multidimensional context parameters of the nodes into relay abilities, as explained above. Then the P2P multicast topology is built on the basis of the estimated relay abilities of the nodes. Each node's position in the topology is based on its estimated relay ability. Nodes with higher estimated relay ability are located at more upstream positions in the trees, while nodes with lower estimated relay ability are positioned downstream to avoid the propagation of negative effects. One of the benefits of our method is that it allows us to use conventional topology construction algorithms that use a single context parameter, as introduced in Section 1, by replacing the single context parameter with our integrated parameter, relay ability. Note that the context parameter most likely to be dominant is used as the initial relay ability. For instance, in the following simulation, we set the available bandwidth as the initial relay ability.
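The placement rule can be sketched as follows: nodes are sorted by estimated relay ability and a tree is filled level by level, so that higher-ability nodes sit closer to the source. The fixed fan-out and the names `build_tree` and `depth` are assumptions made for illustration; any conventional construction algorithm driven by a single score could be substituted.

```python
def build_tree(relay_ability, fanout=2):
    """Return {node: parent}; the first receiver of the stub has parent None.

    relay_ability: dict mapping node id -> estimated relay ability.
    fanout: assumed maximum number of children per node (hypothetical).
    """
    order = sorted(relay_ability, key=relay_ability.get, reverse=True)
    parent = {}
    for idx, node in enumerate(order):
        # Breadth-first filling: position idx hangs under position (idx-1)//fanout.
        parent[node] = None if idx == 0 else order[(idx - 1) // fanout]
    return parent


def depth(parent, node):
    """Number of hops from the first receiver down to the given node."""
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d


# Illustrative abilities only.
abilities = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7, "e": 0.3}
tree = build_tree(abilities)
```

With these sample values, node "a" becomes the first receiver and the weakest node "b" ends up two hops below it.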

Simulation Model
The main simulation parameters are listed in Table 1. The details of our simulation model are as follows.

Proposed and Compared Methods.
In our method, we consider the average available bandwidth, the remaining battery capacity, and the moving distance of each node as the context parameters $x_{i1}$ to $x_{i3}$ in Section 3.2. Moving distance means how far the node moved during the topology update interval. Then, the actual relay ability $y_i$ is defined as follows:
$$y_i = \int_{t_0}^{t_0+\tau} \theta_i^f(t)\,dt, \tag{2}$$
where $t_0$, $\tau$, and $\theta_i^f(t)$ represent the start time of the current update interval, the duration of the update interval, and the forwarding bit rate of node $i$ at time $t$, respectively. That is, (2) is equivalent to the number of bits forwarded by node $i$ within the update interval. We call our method that uses the definition in (2) MLRA-AR (Amount of Relayed data). Since MLRA-AR places nodes that forward a larger number of bits upstream in the tree, we can expect the total received data in the whole network to increase. On the other hand, the actual relay ability $y_i$ can also be defined as the time duration for which node $i$ forwards the stream within the update interval. We call our method using this definition MLRA-RD (Relaying time Duration). This method increases the time during which nodes can continuously receive at least the minimum number of received bits, because it places frequently disconnected nodes downstream. Note that available bandwidth is used as the initial relay ability in both methods, because at the beginning users have not started moving and batteries have not been consumed.
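The two definitions of the actual relay ability can be sketched from a sampled forwarding-rate trace as follows. The fixed sampling step `dt`, the rectangle-rule integration, and the threshold `min_rate_bps` that decides when a node counts as forwarding are assumptions introduced for illustration.

```python
def relay_ability_ar(rates_bps, dt):
    """MLRA-AR style value: bits forwarded within the update interval.

    rates_bps: forwarding bit rate sampled every dt seconds (assumed layout).
    """
    return sum(r * dt for r in rates_bps)


def relay_ability_rd(rates_bps, dt, min_rate_bps=0.0):
    """MLRA-RD style value: seconds during which the node was forwarding.

    A sample counts as forwarding when its rate exceeds min_rate_bps
    (hypothetical threshold).
    """
    return sum(dt for r in rates_bps if r > min_rate_bps)


# Illustrative trace: 4 samples, 1 s apart, one of them disconnected.
trace = [2e6, 2e6, 0.0, 1e6]
ar = relay_ability_ar(trace, dt=1.0)
rd = relay_ability_rd(trace, dt=1.0)
```

The sample trace shows why the two values can rank nodes differently: a node that forwards slowly but without interruption scores well on duration while scoring poorly on the amount of relayed data.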
We compare our methods with the three following methods.
(i) BW: available bandwidth is used as the only context parameter.
(ii) MULTIPLY: the product $\prod_{k=1}^{m} x_{ik}$ is used as the relay ability. Multiplying is one of the simplest ways to integrate multiple context parameters, because we do not need to take care of differences in scale and dimension between parameters. This method uses available bandwidth as the initial value of relay ability, as our methods do.
(iii) RANDOM: nodes are located in the topology at random. The RANDOM method is a good benchmark of the lower-bound performance.
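The three compared methods reduce to three scoring functions over a node's context parameters, sketched below. The ordering of the parameter vector (available bandwidth first) and the function names are assumptions for illustration.

```python
import math
import random


def score_bw(x):
    """BW: only available bandwidth, assumed to be the first parameter."""
    return x[0]


def score_multiply(x):
    """MULTIPLY: product of all context parameters."""
    return math.prod(x)


def score_random(x, rng=random):
    """RANDOM: a score unrelated to the context parameters."""
    return rng.random()


# Illustrative parameter vector only.
x = [2.0, 0.5, 3.0]
```

Any of these scores can be passed to the same topology construction step in place of the estimated relay ability, which keeps the comparison restricted to how nodes are scored.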

Bandwidth.
The wireless channel quality depends on multiple factors, including fading, shadowing, interference, channel contention, handover, and traffic. However, we do not need an overly complicated model to initially assess a new mechanism. We built an event-driven simulator written in C++. In the simulation field, each access point (AP) is located at the center of a hexagonal cell as in [16]. Nodes move around the field according to a human walk model and perform handover so that each node is always connected to its closest AP. The parameters related to the wireless channel were set as shown in Table 2. In this simulation, available bandwidth is calculated based on the model presented in [17], in which signal-to-noise ratio, path loss, and MAC layer overheads are taken into account. The human walk is patterned on the model proposed in [18] with some simplifications; we only placed waypoints for nodes in a rectangular pattern. However, to capture the effect of link stability on performance, we assumed two types of nodes: those with large (unstable) and those with small (stable) moving ranges. We define the bandwidth available to each node on its wireless link as its available bandwidth. Additionally, to simplify the problem, we made two reasonable assumptions. First, the transfer rate of an overlay link between two users is equal to the minimum of the bandwidths they spare for the link. Second, each node can optimally adjust its transfer rate according to the link bandwidth.
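Two of the rules above are simple enough to sketch directly: connecting to the closest AP and limiting an overlay link to the smaller of the two spared bandwidths. The coordinates, the helper names, and the use of plain Euclidean distance are assumptions for illustration; the bandwidth model of [17] is not reproduced here.

```python
import math


def closest_ap(node_xy, ap_positions):
    """Index of the AP nearest to the node (handover target)."""
    return min(range(len(ap_positions)),
               key=lambda i: math.dist(node_xy, ap_positions[i]))


def overlay_link_rate(sender_spare_bps, receiver_spare_bps):
    """Overlay link rate: the minimum bandwidth the two users spare for it."""
    return min(sender_spare_bps, receiver_spare_bps)


# Illustrative values only.
aps = [(0.0, 0.0), (5.0, 5.0)]
ap_index = closest_ap((1.0, 1.0), aps)
rate = overlay_link_rate(3e6, 1.5e6)
```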

Energy Consumption.
In mobile P2P multicast, energy is consumed by three factors: stream receiving, stream forwarding, and other processing, including encoding and image displaying. Since an accurate energy consumption model is outside the scope of this paper, we used a simplified model for our evaluation. In our energy consumption model, $E_c$, $E_f$, and $E_r$ represent the normalized energy consumed per second for keeping the device on, for forwarding, and for receiving, respectively. Nodes constantly consume 1 (normalized) unit of energy per second without forwarding and receiving. In recent mobile devices, the energy consumption for receiving and decoding, $E_r$, can be 10 to 100 times as large as that for only keeping the device on, $E_c$. In our simulation, we set both $E_f$ and $E_r$ to 15. The ratio of $E_f$ to $E_r$ can also differ from device to device. Therefore, we use 15 as the baseline value of $E_f$ and discuss the effect of this ratio on the performance later with Table 3. We set the initial battery capacity of each node to a random value between zero and the maximum (6000 normalized units of energy).
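A rough sketch of such a simplified model is given below. It assumes that the three costs add up and do not depend on the bit rate, which is only one possible reading of the description; the constant and function names are hypothetical.

```python
# Normalized per-second costs taken from the simulation settings.
E_C = 1.0   # keeping the device on
E_R = 15.0  # receiving (and decoding)
E_F = 15.0  # forwarding


def energy_per_second(receiving, forwarding, e_c=E_C, e_r=E_R, e_f=E_F):
    """Energy drained in one second, assuming additive, rate-independent costs."""
    return e_c + (e_r if receiving else 0.0) + (e_f if forwarding else 0.0)


def lifetime_seconds(battery, receiving, forwarding):
    """Time until the battery runs out if the node's activity stays constant."""
    return battery / energy_per_second(receiving, forwarding)
```

Under these assumptions, a node with a full battery of 6000 units that only receives would last 375 s, and one that also forwards would last roughly half as long, which illustrates why upstream positions are costly for the nodes that hold them.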

User Behavior.
In each simulation trial, the initial number of nodes is n. Then, at a regular interval shown in Table 1, a new node joins the network. In general, nodes with less remaining battery capacity try to reduce their forwarding data rate to other nodes. Therefore, we assumed that, although nodes accept forwarding at double their receiving data rate, they reduce it to the same rate as their receiving data rate when their remaining battery capacities fall below half the maximum. Nodes leave the service when their batteries run out.
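The forwarding rule described above can be written as a small function. The name `forwarding_limit` and the choice to return zero for a node whose battery has run out are assumptions for illustration.

```python
def forwarding_limit(receiving_rate, battery, max_battery):
    """Maximum forwarding rate a node accepts, given its remaining battery."""
    if battery <= 0:
        return 0.0                   # the node has left the service
    if battery < max_battery / 2:
        return receiving_rate        # low battery: forward at the receiving rate
    return 2 * receiving_rate        # otherwise: up to double the receiving rate
```

For example, with a maximum capacity of 6000 units, a node receiving at 2.0 Mbps accepts up to 4.0 Mbps of forwarding until its battery drops below 3000 units, and 2.0 Mbps afterwards.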

P2P Multicast Topologies.
As we mentioned in Section 3.3, we can use conventional topologies in our framework. We used tree and mTreebone topologies in our simulations. The topology parameters are included in Table 1. mTreebone has both tree and mesh parts in its topology. To capture the effect of node allocation clearly, we simply set the ratio of tree nodes to 0.5. We set the number of assigned links for every mesh node to 3. As this number increases, the route diversity effect increases, but session handling and video decoding become more complex. We assume that, considering the maximum stream quality, each node is equipped with a receive buffer of a few Mbytes. Under this assumption, the time difference between streams received from different routes does not cause any problems.
Considering the system overhead, we should not make the topology update interval too short, while it would be hard to track changes in the wireless channels if we set it too long. In this simulation, we varied the topology update interval from 10 to 50 s.

Evaluated Metrics.
The first metric, the total number of received bits, is the total number of bits received in the system during the simulation period $T$.
The second metric, the receiving ratio, is defined in (7) in terms of $R_i(t)$ and $B_i(t)$, which represent the transmission rate and the remaining battery capacity of node $i$, respectively. This metric indicates the system stability, because (7) is the ratio of the time during which nodes can receive data to the time during which nodes can connect to the network.
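Both metrics can be sketched from sampled per-node traces as follows. The sketch assumes that a node counts as receiving when its rate is positive, that it counts as able to connect while its battery is not empty, and that the ratio is taken over times summed across all nodes; these are interpretations of the verbal description rather than the exact definitions, and the function names are hypothetical.

```python
def total_received_bits(rates, dt):
    """Total bits received in the system.

    rates[i][t]: bit rate at which node i receives in sample t (assumed layout).
    """
    return sum(r * dt for trace in rates for r in trace)


def receiving_ratio(rates, batteries, dt):
    """Time spent receiving divided by time spent able to connect, over all nodes.

    batteries[i][t]: remaining battery capacity of node i in sample t.
    """
    receiving = sum(dt
                    for rate_trace, batt_trace in zip(rates, batteries)
                    for r, b in zip(rate_trace, batt_trace)
                    if r > 0 and b > 0)
    connected = sum(dt for batt_trace in batteries for b in batt_trace if b > 0)
    return receiving / connected if connected else 0.0


# Illustrative traces: two nodes, three 1 s samples each.
rates = [[1e6, 0.0, 1e6], [2e6, 2e6, 0.0]]
batteries = [[3.0, 2.0, 1.0], [2.0, 1.0, 0.0]]
bits = total_received_bits(rates, dt=1.0)
ratio = receiving_ratio(rates, batteries, dt=1.0)
```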
In the above metrics, the received bit rate is considered to be continuously changing. However, it is more realistic that, in streaming services, the bit rate changes discretely. Therefore, we assume the forwarded bit rate decreases or increases in steps of 0.2 Mbps, and 4.0 Mbps corresponds to the highest quality. Furthermore, in mTreebone, mesh nodes can receive streams from two or more parent nodes. Therefore, we assumed that multiple description coding (MDC) technology [20,21] allows us to ideally combine the multiple streams.

Simulation Result in Tree Topology.
Figure 3(a) shows the total number of received bits versus the topology update interval, which varied from 10 to 50 seconds. In general, if the update interval is too short, the parameter estimation becomes inaccurate, while an update interval that is too long fails to track dynamic changes. We used a tree topology in Figure 3. This figure shows that the two proposed methods, MLRA-AR and MLRA-RD, are far superior to the others in terms of this metric. In particular, as we expected, MLRA-AR works better than MLRA-RD here because, as described in Section 4.1, MLRA-AR considers "forwarded bits" of nodes as the relay ability and therefore places nodes with a large number of forwarded bits upstream in the topology, which increases the total number of received bits.
Figure 3(b) shows the receiving ratio, defined in Section 4.6, versus the topology update interval. Our methods maintain performance superior to that of the other methods. In addition, unlike in Figure 3(a), MLRA-RD is better than MLRA-AR here. This is because MLRA-RD considers "forwarding time duration" to be the relay ability and places nodes with long forwarding times upstream, which increases the receiving time. This implies that we can control the objective of the network topology by changing how we define relay ability. However, MLRA-RD keeps its superiority over MLRA-AR at shorter topology update intervals. This means that the number of forwarded bits is easier to estimate accurately than the forwarding time duration.
Figure 3(c) shows the total number of received bits divided by the total consumed energy versus the topology update interval. This is an energy efficiency metric. Moreover, the characteristics in Figure 3 [...] consumed energy between different methods here. The result showed that the energy efficiency of our methods is superior to that of the other methods.
Another thing we can learn from Figure 3 is that, as the topology update interval decreases, basically, the performance gradually decreases. The reason BW is the most sensitive to the topology update interval is that, within a topology update interval, nodes with large bandwidth are allocated upstream in this method. Because BW gives no consideration to their remaining battery capacities, these upstream nodes leave the network when their batteries run out.

Performance versus Topology Update Interval in mTreebone.
Figure 4 shows the total number of received bits and the receiving ratio versus the topology update interval in mTreebone. Even in mTreebone, our methods are superior to the other methods. Compared with the result in Figure 3(b), the receiving ratio of MLRA-AR in Figure 4(b) is about 5% larger, which shows that mesh nodes in mTreebone improve stability through route diversity.
We also observe how the total number of received bits and the receiving ratio depend on the ratio of battery consumption of receiving to forwarding, which was defined in Section 4.3. Table 3 summarizes the results of our methods and the MULTIPLY method, which gave the best performance among the compared methods in Figure 4. As the table shows, our methods are better than MULTIPLY for various values of this ratio.

Performance versus Node Stability in mTreebone.
In this section, we observe the total number of received bits and the receiving ratio as a function of the ratio of the number of stable nodes to the total number of nodes. As we defined in Section 4.2, the difference between stable and unstable nodes is their moving range. Both Figures 5(a) and 5(b) show that, as the ratio of stable nodes increases, the performance increases, as we can easily imagine. As we explained in the previous evaluations, MLRA-AR and MLRA-RD basically work best when our objectives are improving the total number of received bits and the receiving time ratio, respectively. However, in Figure 5(b), MLRA-RD is slightly inferior to MLRA-AR, counter to our expectation, when the ratio of stable nodes is larger than 0.5. This is because the number of forwarded bits, which is used in MLRA-AR as the relay ability, enables us to capture the difference in relay ability between any two nodes more precisely than the forwarding time, which is used in MLRA-RD. In other words, forwarding times include time-dimensional information only, while the number of forwarded bits reflects both the time and bandwidth information of nodes. As we noted in the previous evaluations, we can control the network by changing the definition of relay ability, but we also need to consider how much information the definition of relay ability reflects.
We also observe the receiving ratio as a function of the number of initial nodes $n$ in the distribution stub we are observing. We found that the superiority of our MLRA methods over the conventional methods does not depend on $n$. However, Figure 6 shows that, as $n$ increases, the receiving ratio decreases. That is because negative effects, such as decreases in bandwidth and disconnections, are propagated downstream over a larger number of hops when $n$ is larger.

Individual Service Quality.
In this section, we discuss fairness of service quality between users. Figure 7 represents the cumulative distribution function of receiving time, which is defined in (7) and is the time during which a user can receive data. We do not show the function of the RANDOM method because of its poor performance. The parameters listed in Table 1 were used, except that we set the initial battery capacity for every user to the maximum. In Figure 7, MLRA-AR is the fairest of the four methods because, if the remaining battery capacity of an upstream node has been reduced because of forwarding, MLRA-AR immediately relocates the node downstream before the battery is fully consumed. However, the receiving time for users in MLRA-RD is longer than that in MLRA-AR.

Alternative Multivariate Analysis Methods
We used MLRA, which is a basic multivariate analysis method, to integrate multiple context parameters into relay ability. It is also possible to use other multivariate analysis methods, and comparing MLRA with them would be meaningful. Although such a comparison is left for future work, we introduce several alternative methods here.
A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional independences [22]. A Bayesian network first builds a directed graph in which the initial multiple parameters are integrated into a smaller number of new parameters. Then, it produces a conditional probability table that represents the transition probability from an initial parameter to a new parameter. By repeating this, the probabilistic graph finally enables us to estimate the correlation between the initial parameters and the final integrated parameter, such as relay ability in our study.
Conjoint analysis [23] was originally used for marketing, but it is now also used to analyze user preference. A user's set of overall responses to factorial designed stimuli can be decomposed so that the utility of each stimulus attribute can be inferred from the user's overall evaluation of the stimuli.

Conclusion
We developed a novel node-allocation framework that uses multidimensional context parameters of mobile nodes for mobile P2P multicast. We introduced a statistical analysis method, MLRA, to integrate multidimensional context parameters into a single parameter that represents relay ability. Relay ability is used to allocate nodes to appropriate positions in the P2P multicast topology. To test our framework, we compared it to several other methods through simulation in terms of total throughput, stability, and energy efficiency. The simulation results showed that our framework works better than the other methods and that the definition of relay ability can be changed depending on the objective, including total throughput, stability, and fairness. For future work, we will evaluate alternative multivariate analysis methods in our framework. Also, there remain several common issues in the research field of P2P applications: selfish user behaviors [24] and high-churn networks [25], in which users leave and join very frequently.