Self-Organization Scheme for Balanced Routing in Large-Scale Multi-Hop Networks

We propose a self-organization scheme for cost-effective and load-balanced routing in multi-hop networks. To avoid overloading nodes that provide favourable routing conditions, we assign each node a cost function that penalizes high loads. Finding routes to sink nodes is thus formulated as an optimization problem in which the global objective function strikes a balance between route costs and node loads. We apply belief propagation (its min-sum version) to solve the network optimization problem and obtain a distributed algorithm whereby the nodes collectively discover globally optimal routes by performing low-complexity computations and exchanging messages with their neighbours. We prove that the proposed method converges to the global optimum after a finite number of local message exchanges. Finally, we numerically demonstrate our framework's efficacy in balancing the node loads and study the trade-off between load reduction and total cost minimization.


I. INTRODUCTION
Large-scale wireless networks employing multi-hop transmissions are an integral component of the Internet of Things [1]. For example, such networks can consist of a massive number of sensors that collect data from the environment and send it to central controllers. Since each wireless node in a multi-hop network can relay other nodes' messages, it is important to direct the information flows from the source nodes to the destinations efficiently in terms of, e.g., energy consumption or reliability. Sending the flows along the minimum-cost paths towards the destinations can overload the nodes that provide favourable routes, which can cause quick battery depletion or decrease the resilience of the network against node failures [2]-[5]. Therefore, information should be routed through the network so as to minimize costs while balancing the node loads. Moreover, given their scale, such networks must be designed to be self-organizing and adaptive.
There is a large body of work studying energy efficient routing protocols (see, e.g., the survey [6]).
A typical objective is to maximize the network lifetime by maximizing the minimum lifetime over all nodes, where the lifetime of a node is defined as the ratio between its residual energy and its energy expenditure [2]-[4]. However, the network lifetime objective does not account for the total routing cost (total energy in this case) and thus can be inefficient in this respect, just as minimum-cost routing is suboptimal for node balancing. It is therefore relevant to investigate objectives that favour solutions lying "in-between" these two extremes.
In this work, we propose an algorithmic strategy for distributed multi-hop networking whereby the nodes coordinate and organize themselves so as to route the information to the destinations in an efficient and balanced way. To this end, we model balanced routing as the minimization of a network objective function, which includes the overall cost of the routes (given by generic link costs) and an additional term that penalizes the node loads. The objective function provides a tunable trade-off between total cost efficiency and fairness of the distribution of the node loads. The possible routes from source nodes to destinations are coupled in the objective function, which creates a competition for the shortest (i.e., least-cost) routes to the sinks. To solve the optimization problem, we use the min-sum version of the belief propagation (BP) method [7]. In this way, we obtain a distributed algorithm which finds globally optimal routes in a decentralized manner with low-complexity local computations and message exchanges between neighbouring nodes. We also show that the proposed method converges to the global optimum in a finite number of iterations.

II. NETWORK MODEL AND PROBLEM FORMULATION
We assume a data collection scenario in which a set $V_s = \{1, \dots, n\}$ of $n$ nodes generate and/or relay information that has to be delivered to any subset of the $m$ destination nodes (e.g., gateways, access points) in $V_d = \{n+1, \dots, n+m\}$. The nodes in $V_s$ are simple devices with constrained resources (energy, memory, processing capabilities, etc.) and can participate in routing each other's packets towards the destination nodes. Packets generated by a source node in $V_s$ can travel to a destination in $V_d$ over different routes; moreover, they can be delivered to different destination nodes.
We model the wireless network as a directed graph $G(V, E)$, with $V = V_s \cup V_d$ and $E$ being the set of edges (links). An edge $(i, j) \in E$ indicates that node $i$ can transmit to node $j$ directly. For each $i \in V$, $E_i$ denotes the set of all edges incident to $i$, while $E_i^{\mathrm{out}}$ and $E_i^{\mathrm{in}}$ stand for the sets of its outgoing and incoming edges, respectively. Node $i \in V_s$ generates information at a rate of $r_i$ units (we assume a certain unit rate $[r]$), where $r_i \in \mathbb{N}$; if $r_i = 0$, the node is just a relay node. The capacity of edge $e \in E$ is $u_e$ units, $u_e \in \mathbb{N}_{>0}$, such that the flow $x_e$ carried by $e$ satisfies $0 \le x_e \le u_e$. The assumption that the rates and capacities are integer multiples of $[r]$ is not restrictive, because any finite set of rational numbers can be expressed in this way by choosing an appropriate unit $[r]$. Moreover, irrational rates or capacities have to be approximated by rational numbers in any case in order to be represented on a computer. We associate each link $e \in E$ with the weight $c_e > 0$ representing the cost of transferring one unit of flow over edge $e$. For example, the cost can be the transmit power required to ensure a certain data rate, the expected transmission count (ETX), or the hop count (when $c_e = 1$). We further assume that the network is in the unsaturated traffic regime and that packets are transferred between neighbours according to a medium access scheme, which is outside the scope of this work.
The routing solution space consists of those configurations $\{x_e\}_{e \in E}$ which satisfy the flow conservation constraints
$$\sum_{e \in E_i^{\mathrm{out}}} x_e - \sum_{e \in E_i^{\mathrm{in}}} x_e = r_i, \quad \text{for all } i \in V_s,$$
and the capacity constraints $0 \le x_e \le u_e$, for all $e \in E$. The two constraints ensure that all generated flows are delivered to the destinations and that the edge flows do not exceed the respective capacities. We assume that the solution space is non-empty. The total cost of a configuration $\{x_e\}_{e \in E}$ is $\sum_{e \in E} c_e x_e$.
Furthermore, we define the load of node $i$ to be the amount of flow $\sum_{e \in E_i^{\mathrm{out}}} x_e$ it has to forward.
In general, there are many feasible configurations, each implying different sets of routes, path lengths, total costs, distributions of node loads, etc. A common objective is to minimize the total cost, which turns data collection into a (linear) minimum-cost network flow problem [8]. However, such an approach may yield solutions wherein some nodes that provide low-cost forwarding edges experience high loads. We are therefore interested in balancing the node loads in a cost-effective manner.
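To make the quantities of this section concrete, the following is a minimal sketch that checks the capacity and flow conservation constraints of a candidate configuration and returns its total cost and node loads. The function name `evaluate_flow`, the dictionary-based graph representation and the three-node toy network are illustrative assumptions, not part of the proposed scheme.

```python
from collections import defaultdict

def evaluate_flow(edges, flow, rates, sinks):
    """Check feasibility of a flow and return (total cost, node loads).

    edges: {(i, j): (cost c_e, capacity u_e)}   (hypothetical representation)
    flow:  {(i, j): x_e}, rates: {i: r_i}, sinks: iterable of destination nodes
    """
    out_flow = defaultdict(int)
    in_flow = defaultdict(int)
    for (i, j), x in flow.items():
        _, capacity = edges[(i, j)]
        if not 0 <= x <= capacity:
            raise ValueError(f"capacity constraint violated on edge {(i, j)}")
        out_flow[i] += x
        in_flow[j] += x
    sources = {v for e in edges for v in e} - set(sinks)
    for v in sources:
        # Flow conservation: outgoing minus incoming flow equals the rate r_v.
        if out_flow[v] - in_flow[v] != rates.get(v, 0):
            raise ValueError(f"flow conservation violated at node {v}")
    total_cost = sum(edges[e][0] * x for e, x in flow.items())
    loads = {v: out_flow[v] for v in sources}  # load = total outgoing flow
    return total_cost, loads

# Toy network (assumed for illustration): nodes 1 and 2 are sources, node 3 is the sink.
edges = {(1, 2): (1.0, 2), (1, 3): (2.5, 2), (2, 3): (1.0, 2)}
rates = {1: 1, 2: 1}
min_cost_flow = {(1, 2): 1, (1, 3): 0, (2, 3): 2}
balanced_flow = {(1, 2): 0, (1, 3): 1, (2, 3): 1}
cost_a, loads_a = evaluate_flow(edges, min_cost_flow, rates, sinks=[3])
cost_b, loads_b = evaluate_flow(edges, balanced_flow, rates, sinks=[3])
```

In this toy instance the cheaper configuration (cost 3.0) puts a load of 2 on node 2, whereas the slightly more expensive one (cost 3.5) gives both sources a load of 1, which is the kind of trade-off discussed above.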

III. PROPOSED OBJECTIVE FOR LOAD BALANCING
We seek a trade-off between minimization of the total cost and minimization of the loads of individual nodes. To this end, for each $i \in V_s$ we introduce a strictly increasing convex function $\phi_i$ to penalize the load of the $i$th node. The functions can vary over the nodes to reflect their different load tolerances depending on residual energies, capabilities, etc. Now, we formulate the optimization problem
$$\begin{aligned} \text{minimize} \quad & (1-w) \sum_{e \in E} c_e x_e + w \sum_{i \in V_s} \phi_i\Big(\sum_{e \in E_i^{\mathrm{out}}} x_e\Big) \\ \text{subject to} \quad & \sum_{e \in E_i^{\mathrm{out}}} x_e - \sum_{e \in E_i^{\mathrm{in}}} x_e = r_i, \quad i \in V_s, \\ & 0 \le x_e \le u_e, \quad e \in E, \end{aligned} \tag{2}$$
where $w$ is a parameter that balances cost-efficiency and load minimization. When $w = 0$, we recover the linear minimum-cost flow problem [8], which gives the most cost-efficient flow configuration; however, this setting usually does not provide well-balanced loads and therefore we focus on $w > 0$.
In the following, we assume that the functions $\phi_i$ are piecewise-linear convex (PLC) with integral breakpoints, which is very convenient for obtaining a simple message-passing algorithm with provable convergence to the correct solution, as we show next in Prop. 1 and Prop. 2. An example of such a function is one that takes the value $y^\alpha$, with $\alpha > 1$, at each breakpoint $y \in \mathbb{N}$ and varies linearly between consecutive breakpoints; the higher the value of $\alpha$, the more strongly the load $y$ is penalized. Such a choice provides a simple way to select the efficiency-fairness trade-off by tuning the parameter $\alpha$.
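The example penalty can be sketched in a few lines. The snippet below builds the PLC function that equals $y^\alpha$ at integer breakpoints and evaluates a weighted objective; it assumes that the total cost is weighted by $1 - w$ and the penalties by $w$, as suggested by the residual edge costs used in the proof of Prop. 2. The names `make_plc_penalty` and `balanced_objective` are hypothetical.

```python
import math

def make_plc_penalty(alpha):
    """Return phi with phi(y) = y**alpha at integer y, linear in between."""
    def phi(y):
        lower = math.floor(y)
        frac = y - lower
        # Linear interpolation between the two neighbouring integer breakpoints.
        return (1 - frac) * lower ** alpha + frac * (lower + 1) ** alpha
    return phi

def balanced_objective(total_cost, loads, w, phi):
    """Assumed form of the objective: (1 - w) * total cost + w * sum of penalties."""
    return (1 - w) * total_cost + w * sum(phi(y) for y in loads)

phi = make_plc_penalty(alpha=2)
# Two hypothetical configurations: (total cost, node loads).
objective_unbalanced = balanced_objective(3.0, [1, 2], w=0.5, phi=phi)
objective_balanced = balanced_objective(3.5, [1, 1], w=0.5, phi=phi)
```

With $w = 0.5$ and $\alpha = 2$, the configuration with the higher total cost but equal loads attains the lower objective value, while setting $w = 0$ would rank the two configurations by total cost alone.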

IV. BP ALGORITHM FOR BALANCED ROUTING
BP is a generic message-passing algorithm for solving large-scale inference and optimization problems in graphical models. It has a distributed nature whereby the nodes of the graph perform simple local computations and exchange messages with their neighbours. While BP provides correct solutions when the underlying graph is a tree, its correctness and convergence cannot be generally guaranteed for graphs with cycles, with few exceptions [7], [9]. Nonetheless, for graphs with cycles, the BP heuristic often performs very well. In network problems, the min-sum algorithm is applied to find the shortest path between two nodes [10] or to minimize path lengths and link congestion [11]. For the min-cost network flow problem with linear or PLC costs on edges, BP was shown in [9] to converge to the correct solution (if the solution is unique). Compared to [9], our objective (2) (with $w > 0$) additionally includes node costs given by the PLC functions $\{\phi_i\}$; therefore, the application of BP gives the novel algorithm described next (see footnote 1).
For each node $i \in V_s$, we define a function $\psi_i$ that reflects the flow conservation constraint at node $i$, i.e., it maps each vector $z \in \mathbb{R}_+^{|E_i|}$ of edge flows to
$$\psi_i(z) = \begin{cases} 0, & \text{if } \sum_{e \in E_i^{\mathrm{out}}} z_e - \sum_{e \in E_i^{\mathrm{in}}} z_e = r_i, \\ \infty, & \text{otherwise.} \end{cases}$$
Furthermore, we define
$$f_i(z) = \psi_i(z) + w\, \phi_i\Big(\sum_{e \in E_i^{\mathrm{out}}} z_e\Big),$$
which additionally includes the load penalty for node $i \in V_s$. In contrast, destination nodes do not have any constraints and "accept" any flows on their incoming edges, so we set $f_i(z) = 0$, for any $i \in V_d$ and $z \in \mathbb{R}_+^{|E_i|}$. Next, we capture the cost and capacity constraint of edge $e \in E$ by introducing the function $g_e : \mathbb{R} \to \mathbb{R} \cup \{\infty\}$ given by
$$g_e(z) = \begin{cases} (1-w)\, c_e z, & \text{if } 0 \le z \le u_e, \\ \infty, & \text{otherwise.} \end{cases}$$
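We can now reformulate (2) as the equivalent problem
$$\underset{x}{\text{minimize}} \quad \sum_{i \in V} f_i(x_{E_i}) + \sum_{e \in E} g_e(x_e), \tag{3}$$
where $x_{E_i}$ includes those components of $x$ with indices in $E_i$.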
We apply the min-sum version of BP to solve (3). Given that each edge variable node has exactly two neighbour function nodes from the set $\{f_i\}$, we simplify the standard message updates by defining the messages (4) in Algorithm 1. At iteration $t$, for each node $i \in V_s$ and incident edge $e \in E_i$, where either $e = (i, j) \in E_i^{\mathrm{out}}$ or $e = (j, i) \in E_i^{\mathrm{in}}$, the algorithm computes the message $m^t_{i \to e}$, which becomes an input to neighbour $j$ at the next iteration. Since $f_i$ is the zero function for all $i \in V_d$, the messages computed by destination nodes do not change with $t$ and thus are not updated.
Footnote 1: Alternatively, by using the node splitting technique [8, p. 41], one can transform (2) into a min-cost network flow problem with PLC costs on edges, which can be solved using BP [9, Th. 6.1]. However, BP on the transformed graph is different from Algorithm 1 that we obtain here, see footnote 2.
August 18, 2018 DRAFT

Algorithm 1 Distributed algorithm for balanced routing.
Input: The graph $G(V, E)$, edge costs $\{c_e\}$ and capacities $\{u_e\}$, data rates $\{r_i\}$, parameters $\alpha$ and $w$
Output: Estimates $\{\hat{x}_e\}_{e \in E}$ of the optimal edge flows of (2)
1: Initialize $m^0_{i \to e}(z) = g_e(z)$, for all $i \in V$, $e \in E_i$, $z \in \mathbb{R}_+$
2: for $t = 1$ to $T$ do
3: For each $i \in V_s$ and $e \in E_i$, update
$$m^t_{i \to e}(z) = g_e(z) + \min_{z' \in \mathbb{R}_+^{|E_i|} :\, z'_e = z} \Big\{ f_i(z') + \sum_{e' \in E_i \setminus e} m^{t-1}_{k \to e'}(z'_{e'}) \Big\} \tag{4}$$
for all $z \in \mathbb{R}_+$, where $e' = (i, k)$ or $(k, i)$.
4: end for
5: For each $e = (i, j) \in E$, compute the belief function $b_e(z) = m^T_{i \to e}(z) + m^T_{j \to e}(z) - g_e(z)$ and determine its minimizer $\hat{x}_e = \arg\min_z b_e(z)$.

Algorithm 1 has the following interpretation. Every node seeks to determine the flow on each of its incident edges while satisfying its local flow conservation constraint and minimizing its load. The message $m^t_{i \to e}(z)$ can be viewed as a local cost that node $i$ attributes to allocating $z$ units to edge $e$; thus, the message is a function of the flow. For any $z$, the message update (4) includes: (i) the cost of sending flow $z$ over edge $e$ and (ii) the minimum cost of allocating flows to the rest of the edges that are incident to $i$ such that flow conservation is ensured. The latter cost is the result of a local optimization, which looks for the feasible configuration of the flows on the incident edges that minimizes an objective function that includes the cost of the load of node $i$ and the local costs (messages) estimated by the neighbouring nodes (see footnote 2). The message updates have low complexity, as we show next.
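The message-passing loop can be illustrated with the following sketch, written under several assumptions that go beyond the text: messages are stored only at integer flow values (which relies on the PLC functions having integral breakpoints), the local minimization is done by brute force rather than by the efficient interpolation discussed in Prop. 1, the edge cost is taken as $(1 - w) c_e z$ and the load penalty as $w \phi_i$, and the belief is taken in the usual min-sum form (sum of the two incoming messages minus the doubly counted edge cost). The names `min_plus` and `balanced_routing_bp` are hypothetical.

```python
INF = float("inf")

def min_plus(a, b):
    """Brute-force min-plus convolution of two cost tables indexed by integer flow."""
    out = [INF] * (len(a) + len(b) - 1)
    for s, av in enumerate(a):
        for t, bv in enumerate(b):
            out[s + t] = min(out[s + t], av + bv)
    return out

def balanced_routing_bp(edges, rates, sinks, w, phi, T):
    """edges: {(i, j): (cost, capacity)}; rates: {node: r_i}; phi: load penalty."""
    # Edge cost tables g_e(z) for z = 0, ..., u_e (assumed weighting by 1 - w).
    g = {e: [(1 - w) * c * z for z in range(u + 1)] for e, (c, u) in edges.items()}
    sources = {v for e in edges for v in e} - set(sinks)
    # Step 1: every message starts as g_e; sink messages are never updated.
    msg = {(v, e): list(g[e]) for e in edges for v in e}
    for _ in range(T):
        new = dict(msg)
        for i in sources:
            r = rates.get(i, 0)
            for e in (e for e in edges if i in e):
                # Combine the neighbours' previous messages on the other edges.
                cost_in, cost_out = [0.0], [0.0]
                for e2 in edges:
                    if e2 == e or i not in e2:
                        continue
                    if e2[1] == i:   # other incoming edge (k, i)
                        cost_in = min_plus(cost_in, msg[(e2[0], e2)])
                    else:            # other outgoing edge (i, k)
                        cost_out = min_plus(cost_out, msg[(e2[1], e2)])
                is_out = e[0] == i
                table = []
                for z in range(edges[e][1] + 1):
                    best = INF
                    for s_in, c_in in enumerate(cost_in):
                        y = r + s_in + (0 if is_out else z)   # load of node i
                        s_out = y - (z if is_out else 0)      # flow left for other outgoing edges
                        if 0 <= s_out < len(cost_out):
                            best = min(best, w * phi(y) + c_in + cost_out[s_out])
                    table.append(g[e][z] + best)
                new[(i, e)] = table
        msg = new
    # Step 5: beliefs and their minimizers.
    flows = {}
    for (i, j), (_, u) in edges.items():
        e = (i, j)
        belief = [msg[(i, e)][z] + msg[(j, e)][z] - g[e][z] for z in range(u + 1)]
        flows[e] = min(range(u + 1), key=belief.__getitem__)
    return flows

# Toy network (assumed): sources 1 and 2 with unit rates, sink 3.
toy_edges = {(1, 2): (1.0, 2), (1, 3): (2.5, 2), (2, 3): (1.0, 2)}
toy_rates = {1: 1, 2: 1}
balanced = balanced_routing_bp(toy_edges, toy_rates, [3], 0.5, lambda y: y ** 2, T=3)
min_cost = balanced_routing_bp(toy_edges, toy_rates, [3], 0.0, lambda y: y ** 2, T=3)
```

On this toy instance, the sketch routes node 1 directly to the sink when $w = 0.5$ and through node 2 when $w = 0$. The brute-force local search is only meant to expose the structure of the update; it does not attain the complexity stated in Prop. 1.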

Proposition 1 (Complexity):
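For each $i \in V_s$, $e \in E_i$ and $t \ge 1$, the message $m^t_{i \to e}$ is a piecewise-linear convex (PLC) function with breakpoints in $\{0, 1, \dots, u_e\}$. The complexity of its update (4) is linear in the total capacity of the input and output edges of node $i$ and logarithmic in $|E_i^{\mathrm{in}}|$ and $|E_i^{\mathrm{out}}|$.
Proof: The proof is by induction on $t$. At $t = 0$, Algorithm 1 initializes the messages to trivial PLC functions. Suppose that at iteration $t-1$ all messages are PLC functions with integral breakpoints. We provide the proof for $m^t_{i \to e}$ with $e \in E_i^{\mathrm{out}}$, as the case $e \in E_i^{\mathrm{in}}$ is very similar. Denoting by $y$ the load of node $i$, let
$$\psi^{(1)}(y, z^{\mathrm{in}}) = \begin{cases} 0, & \text{if } \sum_{e' \in E_i^{\mathrm{in}}} z_{e'} = y - r_i, \\ \infty, & \text{otherwise,} \end{cases}$$
and define
$$\psi^{(2)}(y, z^{\mathrm{out}}) = \begin{cases} 0, & \text{if } \sum_{e' \in E_i^{\mathrm{out}}} z_{e'} = y, \\ \infty, & \text{otherwise,} \end{cases}$$
so that $\psi_i(z) = \min_y \{\psi^{(1)}(y, z^{\mathrm{in}}) + \psi^{(2)}(y, z^{\mathrm{out}})\}$. Now, we define the function
$$h(y) = w\, \phi_i(y) + \min_{z^{\mathrm{in}}} \Big\{ \psi^{(1)}(y, z^{\mathrm{in}}) + \sum_{e' \in E_i^{\mathrm{in}}} m^{t-1}_{k \to e'}(z_{e'}) \Big\}.$$
The minimization in the r.h.s. is a so-called interpolation of PLC functions whose complexity is logarithmic in the number of functions and linear in the total number of their linear pieces [9]. Since $m^{t-1}_{k \to e'}$ has breakpoints in $\{0, 1, \dots, u_{e'}\}$ and $\phi_i$ is also PLC with integral breakpoints, it follows that the function $h$ is itself PLC with integral breakpoints and at most $U_i^{\mathrm{in}}$ pieces, where $U_i^{\mathrm{in}} = \sum_{e' \in E_i^{\mathrm{in}}} u_{e'}$; its computation takes $O(U_i^{\mathrm{in}} \log |E_i^{\mathrm{in}}|)$ operations. Now, we write (4) as
$$\begin{aligned} m^t_{i \to e}(z) &= g_e(z) + \min_{z' :\, z'_e = z} \Big\{ f_i(z') + \sum_{e' \in E_i \setminus e} m^{t-1}_{k \to e'}(z'_{e'}) \Big\} \\ &= g_e(z) + \min_{y,\, z^{\mathrm{out}} :\, z_e = z} \Big\{ \psi^{(2)}(y, z^{\mathrm{out}}) + h(y) + \sum_{e' \in E_i^{\mathrm{out}} \setminus e} m^{t-1}_{k \to e'}(z_{e'}) \Big\}. \end{aligned}$$
Given that $h$ and the messages at $t-1$ are PLC with integral breakpoints, the interpolation in the second line gives again a PLC function; its computation takes $O(U_i^e \log |E_i^{\mathrm{out}}|)$ operations, where $U_i^e = \sum_{e' \in E_i \setminus e} u_{e'}$. The addition of $g_e$, which is linear in $[0, u_e]$, makes $m^t_{i \to e}$ PLC with integral breakpoints.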
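The interpolation step used in the proof can be sketched as follows for convex functions sampled at integer points: because each function is convex, its unit-step slopes are already sorted, so the slopes of the interpolation are obtained by merging the sorted slope lists, which costs time linear in the number of pieces and logarithmic in the number of functions. This is a minimal illustration under the assumption that all sampled values are finite; the function name `interpolate_convex` is hypothetical, and `heapq.merge` is the standard-library k-way merge.

```python
import heapq

def interpolate_convex(functions):
    """Min-plus convolution of convex functions given by their values at 0, 1, 2, ...

    Each entry of `functions` is a list [f(0), f(1), ..., f(u)] of finite values
    with non-decreasing increments (convexity).
    """
    # Unit-step slopes of every function; each list is sorted by convexity.
    slope_lists = [[f[k + 1] - f[k] for k in range(len(f) - 1)] for f in functions]
    result = [sum(f[0] for f in functions)]
    # Taking the globally cheapest remaining slope at each step is optimal
    # for convex functions, so a k-way merge yields the interpolation.
    for slope in heapq.merge(*slope_lists):
        result.append(result[-1] + slope)
    return result

two = interpolate_convex([[0, 1, 3], [0, 2, 5]])
three = interpolate_convex([[0, 1, 3], [0, 2, 5], [0, 0]])
```

The output has one value per total amount of flow, so its number of linear pieces equals the sum of the numbers of pieces of the inputs, in line with the piece counts used in the proof.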
We establish that Algorithm 1 outputs the optimal solution after a finite number of iterations.
Proposition 2 (Convergence): Suppose (2) has a unique optimal solution $x^*$ (see footnote 3). Then, there exists a finite integer $T^*$ such that the output of Algorithm 1 satisfies $\hat{x}^t = x^*$, for any $t \ge T^*$.
Footnote 3: When the costs $\{c_e\}$ are generic (e.g., random), it is highly likely that (2) has a unique solution. Otherwise, it is possible to add small noise to the costs such that the modified problem has a unique solution which very closely approximates the solution of the original problem [9].
Proof: Although our objective function (2) is different from that of the min-cost network flow problem with linear (or PLC) edge costs, we can use the same proof strategy as in [9, Th. 4.1, Th. 6.1]. The difference is that we need to define an appropriate residual graph [8]. Denote by $G(x)$ the residual graph of $G(V, E)$ with respect to the flow $x \in \mathbb{R}^{|E|}$. $G(x)$ has the same vertices $V$, while we define its edges and their costs as follows: for any $e = (i, j) \in E$, if $x_e < u_e$, then $e$ is also an edge in $G(x)$ with capacity $u_e - x_e$ and cost $c^x_e = (1-w) c_e + w \lim_{z \to 0^+} (\phi_i(y+z) - \phi_i(y))/z$, where $y = \sum_{e' \in E_i^{\mathrm{out}}} x_{e'}$ is the load of node $i$; if $x_e > 0$, then $G(x)$ additionally includes the directed edge $e' = (j, i)$ with capacity $x_e$ and cost $c^x_{e'} = -(1-w) c_e + w \lim_{z \to 0^-} (\phi_i(y+z) - \phi_i(y))/z$. At the unique optimal solution $x^*$, all the directed cycles of the residual graph $G(x^*)$ must have positive costs (according to the negative cycle optimality criterion [8]). The proof relies on this property and follows the same steps as that of [9, Th. 4.1]; therefore we omit the details.

V. NUMERICAL RESULTS
We consider $n = 50$ nodes independently and uniformly distributed inside the unit square and $m = 1$ sink node at the center of the square. Any two nodes $i$ and $j$ that are spaced by less than $1.6/\sqrt{n} \approx 0.23$ are connected by the directed edges $(i, j)$ and $(j, i)$. We discard the network realizations that are not connected. For each realization, we randomly select $k$ sources out of the $n$ nodes; the sources generate information at unit rate, while the remaining $n - k$ nodes act as relays. The cost associated with each link is the expected transmission count (ETX), which is drawn uniformly at random from the interval $[1, 3]$.
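The simulation setup can be reproduced along the following lines. This is a sketch under stated assumptions: the two directions of a link receive independent ETX draws, connectivity is checked by a breadth-first search over the symmetric links, and the function names and the default $k = 15$ are illustrative choices that are not specified in the text.

```python
import math
import random
from collections import deque

def sample_network(n, rng):
    """Place n nodes uniformly in the unit square plus one sink at the center."""
    radius = 1.6 / math.sqrt(n)
    pos = {i: (rng.random(), rng.random()) for i in range(1, n + 1)}
    sink = n + 1
    pos[sink] = (0.5, 0.5)
    edges = {}
    for i in pos:
        for j in pos:
            if i < j and math.dist(pos[i], pos[j]) < radius:
                # Assumption: each direction gets its own ETX drawn from [1, 3].
                edges[(i, j)] = rng.uniform(1.0, 3.0)
                edges[(j, i)] = rng.uniform(1.0, 3.0)
    return pos, edges, sink

def is_connected(pos, edges):
    """Breadth-first search; links are symmetric, so one traversal suffices."""
    neighbours = {v: [] for v in pos}
    for i, j in edges:
        neighbours[i].append(j)
    start = next(iter(pos))
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for u in neighbours[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return len(seen) == len(pos)

def sample_connected_network(n=50, k=15, seed=0):
    """Rejection-sample a connected realization and pick k unit-rate sources."""
    rng = random.Random(seed)
    while True:
        pos, edges, sink = sample_network(n, rng)
        if is_connected(pos, edges):
            break
    sources = set(rng.sample(range(1, n + 1), k))
    rates = {i: (1 if i in sources else 0) for i in range(1, n + 1)}
    return pos, edges, sink, rates

pos, edges, sink, rates = sample_connected_network(seed=1)
```

Edge capacities are not part of this sketch because the text does not state the values used in the experiments.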
For the proposed balanced routing scheme (Algorithm 1 with $w = 0.5$ and power $\alpha > 1$), we evaluate the total cost, the maximum of the node loads $\{y_i\}_{i=1}^n$, Jain's index, $\big(\sum_{i=1}^n y_i\big)^2 / \big(n \sum_{i=1}^n y_i^2\big)$, as a measure of fairness in the distribution of the loads, and the empirical distribution of the minimum number $T^*$ of iterations required for Algorithm 1 to converge. We compare the results obtained using our algorithm against minimum-cost routing, which is instantiated by setting $w = 0$ in Algorithm 1.
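For reference, the fairness metric can be computed as in the short sketch below, which uses the standard definition of Jain's index. The function name `jain_index` and the convention of returning 1.0 when all loads are zero are assumptions made for this illustration.

```python
def jain_index(loads):
    """Jain's fairness index: (sum y_i)^2 / (n * sum y_i^2), in (0, 1]."""
    total = sum(loads)
    squares = sum(y * y for y in loads)
    if squares == 0:
        return 1.0  # assumed convention: all-zero loads are perfectly fair
    return total * total / (len(loads) * squares)

fair = jain_index([1, 1, 1, 1])      # equal loads
skewed = jain_index([4, 0, 0, 0])    # one node carries everything
```

The index equals 1 when all loads are equal and drops to $1/n$ when a single node carries the entire load.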
The results in Fig. 1 are obtained by averaging over 200 independent trials. In Fig. 1a, we observe that balancing with $\alpha = 1.5$ reduces the maximum load by 20-25% compared to minimum-cost routing across all fractions of source nodes, while the total cost increases by less than 5%; increasing $\alpha$ to 2 brings a larger reduction of the maximum load, of about 30-40%, at a higher relative total cost increase of about 5-10%. Fig. 1b shows that the balanced routing scheme provides significantly fairer load distributions. As illustrated in Fig. 1c, the number of iterations required to find a balanced solution is higher than for min-cost routing and increases with $\alpha$. We also evaluated BP on the graph transformed by node splitting (see footnotes 1 and 2) and, while it outputs the same solutions, it requires a higher number of iterations than our method, as shown in Fig. 1c.

VI. CONCLUSION
We formulated balanced routing in large-scale networks (such as the Internet of Things) as optimization of an objective function that provides a tunable trade-off between total cost efficiency and fairness of the distribution of the node loads. In the proposed decentralized scheme, the nodes collectively find the globally optimal routing solution through low-complexity local computations and exchanges of messages with neighbours. The scheme provides significantly fairer solutions than minimum-cost routing at the expense of a slightly increased total cost and a higher number of required iterations.
There are several interesting directions to explore further, such as adapting the framework to specific models of energy consumption, including in the design the notions of reliability, trust among nodes and security, but also extending the framework to take into account the scheduling of the transmissions.

Fig. 1. Simulation results for $n = 50$, $m = 1$ and various fractions $k/n$ of source nodes: (a) improvement of the total cost and maximum load relative to minimum-cost routing; (b) Jain's fairness index for the node loads; (c) empirical cdf of the minimum number $T^*$ of iterations required for Algorithm 1 to converge when $k/n = 0.3$.