King’s Research

Wireless access networks continually change their setup, topologies and provisions of coverage. Forthcoming deployment trends include densification of cells and convergence of ‘native IP’ with cellular access networks that carry more than best-effort expectations. Greater meshing in networks, driven by practical deployment requirements, is inevitable and needed. This paper proposes a routing protocol to address the ‘‘randomness’’ of topology interconnections, routing paths and sizes in emerging IP access networks. Termed the Minimum Set Cover (MSC) approach, it is formulated as a generalization of the NP-complete minimum set cover mathematical problem. The offline component of MSC multiplies intra-domain routing installations, called Routing Planes (RPs) and modelled as graphs, to suitably cover the whole routing topology prior to the traffic injections in the online component. We introduce novel offline optimization features using a dynamic cost function that plans availability of capacities and correlation of routing paths via the chosen set of RPs. Our simulations verify effective path diversity using MSC, with modest protocol overhead and a heuristic used for RPs selection. For randomly constructed access network topologies with 7, 18 and 33 routers and various meshing levels, our results show convincing suitability and key performance gains compared with rival routing solutions.


Introduction
The Internet started life as a local platform with limited interactions. Since then, it has evolved into an open global platform with a widespread information infrastructure, enabling everyone, everywhere to share information, access wide-ranging content, and redefine geographic and cultural boundaries. Its growth has transformed it into a multi-faceted environment connecting users on a global scale and has significantly impacted our daily interactions. The Internet's routing fabric is currently confronted by a traffic management challenge with the emergence of highly demanding applications and new devices in recent years. There is also an increase in the density of Internet access availability and of the associated user distributions. All of these have put a significant burden on the Internet, making it increasingly difficult for IP Network Providers (INPs) to cope with the rising traffic demands by simply over-provisioning and leading them to resort to Traffic Engineering (TE) [3]. In parallel, in the case of cellular/mobile networks, the surges in traffic demands and increased mobility at network edges extend the focus from the core to new forms of access. The unpredictable structural layout of these wireless access networks, marked by the advent of the new backhaul structure for 5G mobile networks [4] and beyond [5] with heterogeneous and very dense cell deployments [5,6], opens up cost-effectiveness considerations in deployments of ''densified'' networks [7] as well as demands for new TE solutions. We generalize this problem space by jointly considering both the INP and cellular mobile networks.
Hence, the focus of this work is on intra-domain TE in all-IP access network structures with topologies that reflect these novel practical engineering deployment necessities: a dense layout of wireless access routers/points, and rather improvised installations of wired interconnections between routers inside the access network and towards the wireless access. As a practicality, we generalize and term these wireless IP access network structures random all-IP access networks, located at the edges of the constellation of routing in the Internet (e.g. today's campus, metropolitan and private networks).
Intra-domain TE is traditionally categorized into Multi-Protocol Label Switching (MPLS) and IP-based TE approaches [8,9]. The concept of traffic engineering was first introduced for MPLS-based environments [3] and is still applied in access networks where IP packets are encapsulated and tunnelled over Label Switched Paths (LSPs). Hence, MPLS conducts explicit routing and arbitrary splitting of traffic via LSPs. Individually building and maintaining LSPs induces complexity, delay and overhead, giving rise to scalability and robustness issues, in addition to MPLS being a sublayer to IP routing. IP-based TE relies on the inbuilt routing features of the TCP/IP suite, e.g. manipulation of link weights in Open Shortest Path First (OSPF), an intra-domain dynamic link-state routing protocol. IP TE does not intrinsically enable control via explicit routing and arbitrary splitting of traffic as done in MPLS TE. Equal-Cost Multi-Path (ECMP) is an add-on option of OSPF, initially adopted and evaluated in [10], that can facilitate equal multi-path traffic splitting by tuning link weights and enabling load balancing. Due to the weight tuning, OSPF and ECMP take considerable time to converge and are both highly intractable for large and random topologies, especially when sources and destinations are numerous [11]. Even the best setting of the weights can diverge considerably from the optimum, as discussed in [11,12].
https://doi.org/10.1016/j.comnet.2022.109418. Received 19 October 2021; Received in revised form 4 August 2022; Accepted 9 October 2022.
Numerous studies have attempted to enhance the optimality of IP intra-domain TE approaches and replace MPLS due to its drawbacks. The proposed approaches mostly aim to improve network load balancing through models for optimum path diversity. ECMP-based approaches resort to link weight tuning in order to achieve path diversity, which results in slow convergence and a decline in performance that diverges from optimal TE [13,14]. A notable proposal in [15,16] uses a Network Entropy Maximization (NEM) based protocol that enables arbitrary splitting of traffic in the case of ECMP. Nonetheless, the proposal demands calculation of the entire set of equal-cost paths by routers [15], rendering lengthy path re-configurations [16] in the order of minutes. While ECMP can be a good solution for specific scenarios, it is intractable and inefficient (sub-optimal) for large flows [17]. In addition, from its early consideration as an alternative to MPLS [18], Segment Routing (SR) has recently gained significant popularity in some industry use cases [19]. SR inherently uses IP source-based routing, where packets are routed through a list of routers carried in the packets' headers. Hence, path control can be executed with less protocol overhead than MPLS. However, SR still incurs overhead due to per-packet processing of packet headers, and limits the number of segments due to computational complexity.
Our work is positioned on improving the limited and slow-to-react use of the routing resources in all-IP random access networks when shortest-path or equal-cost-path IP-based TE is used, while at the same time maintaining a relatively simple and scalable solution compared with MPLS TE and SR.

Related work
Multi-Plane Routing (MPR) was initially proposed in [20] as a compromise for deficiencies associated with MPLS and IP-based TE, with further investigations in [1,21]. MPR is the conceptual foundation from which the work in this paper has evolved. We have analysed practical aspects of MPR in extensive simulations for elementary tree-like all-IP access network topologies [1]. These topological structures consist of gateway(s) interfacing with the outside Internet and a transit routing space connecting numerous wireless access routers that form a region of coverage. MPR is constructed from an existing IETF protocol specification, Multi-Topology OSPF (MT-OSPF), which facilitates multiple coinciding instances of OSPF (i.e. routing tables and sets of paths) in IP intra-domain networks. MT-OSPF was originally aimed at enabling fast re-routes in case of failures, but in MPR it is employed for a different purpose: comprehensive network-wide load balancing by maximizing diversity over multiple paths. By doing so, the overall throughput capacity of the network is increased, without explicit reservations, as its whole routing topology is constantly active [23] by utilizing multiple instances of OSPF, called Routing Planes (RPs), for efficiency purposes [24]. In simple words, shortest path routing becomes network-wide multipath routing via basic multiplication of the OSPF instantiations (i.e. RPs). MPR TE algorithms comprise: (a) an offline component that plans traffic distribution prior to its insertion by constructing multiple paths/RPs; (b) an online component that governs the insertion of traffic flows over these RPs in real time. The potential benefits of path diversity in access networks were studied in [25]. The study underlines the evolution of next-generation access networks to more meshed topologies by exploiting path diversity through multipath routing. With MPR, the benefits of MPLS TE, namely explicit routing and arbitrary splitting of traffic, are enabled through an IP-based TE that avoids the overhead and complexity of MPLS TE by a comparatively simple reuse of default OSPF functions. We embrace conclusions from a fundamental study on path diversity [28] in inter-domain networks, confirming that much of the congestion-related performance degradation occurs due to constant overuse of a small choice of routing paths. Using both short and longer paths can be a performance-enhancing tool, where the offline/online approach in intra-domains offers immediate shuffling of paths (with awareness of the network traffic dynamics or as a statistical educated guess) instead of applying slower recalculations.
A routing optimization applicable to random all-IP access network topologies with varying capacities and demands is required. These networks are imminent in emerging practical situations of network installations and configurations, e.g. HetNets [2,29] and 5G backhaul [2,7,27,30]. An approach combining offline and online TE mechanisms could represent a comprehensive solution. Studies and applications of offline algorithms frequently apply predictions of traffic distributions in the network via input of traffic matrices [3]. MPR's offline algorithm [21], built upon [31], differs here: it prepares and balances the access network for unplanned traffic occurrences, variable distributions and users' demands. Furthermore, its aim is to achieve full path diversity in elementary configurations of meshed tree topologies. However, it is ineffective in handling more randomly constructed topologies. This is because MPR's offline algorithm constructs RPs directly from link weight computations for each RP, thus only implicitly forming the routing paths. That makes it inflexible and inaccurate in cases of random graphs/topologies. In this work, we revisit MPR's offline TE aspect (the network planning phase) but apply unconventional, random physical topologies with diverse link capacities as the input. The construction method for RPs is accordingly redesigned for these novel network scenarios. The reconstruction is facilitated through graph-based network modelling in order to practically cover the randomly formed topologies and ensure maximum path diversity [1,20,32] by utilizing the entire routing resources.
The rest of this paper is organized as follows: after the nomenclature, the elaboration of novelty is laid out in Section 2. Section 3 sets out our analytical model comprising the problem formulation and the optimization problems. Algorithms and heuristic choices for constructing the finite set of RPs from candidate RPs are outlined in Section 4. Section 5 explains the MSC RPs-set selection approach. Performance evaluation is presented in Section 6. Finally, the last section concludes the paper.

Nomenclature
The main abbreviations and symbols are listed below for quick reference. Additional ones are introduced throughout the text.

Elaboration of novelty
In this section, we describe the underlying concepts of our solution and then lay out the paper's contributions.

Concept
Our MSC-based approach builds on MPR, which consists of a path-diverse offline TE method combined with an online TE mechanism, but with a fundamentally different offline algorithmic approach. The new MSC-based approach aims to be much more broadly applicable than MPR when the all-IP access network topologies are ''taken in the wild'', by incorporating the complex random structures (i.e. random graphs) of unpredictable future access networks. In such topologies, optimal distribution of multiple paths and use of the routing resources in the whole topology is a challenge, promoting the offline algorithm as the critical performance facilitator. We lay out a comprehensive novel offline RP construction method that suits this problem space. Specifically, the previous MPR solution is based on an OSPF-facilitated link weight penalization-based cost heuristic, whereas MSC explicitly compares how accumulated paths in RPs use edges/links in graphs. It does so by adapting a novel dynamic cost function with correlation of paths and path/link capacity elements. It then converts the connected graphs back into instances of OSPF/RPs by Dijkstra reverse engineering of the paths into link weights for every RP. Fig. 1 shows the basic concept in a simple topology with a single source (S) and destination (D) ingress/egress pair, illustrating how three RPs correspond to three sets of (shortest) paths that maximize path diversity. Correspondingly, the physical topology is decomposed into smaller logical topology instances named RPs. The RPs differ, and each RP is a unique instance of OSPF from the physical topology associated with a dedicated set of link weights. The RP construction problem relaxation allows paths between RPs to overlap and share subsets of the underlying network's routing resources rather than being strictly edge-disjoint. Path diversity is achieved through the offline MSC-based algorithm, which leads to full utilization of resources (i.e. links and their capacities) in access networks. Routes or paths are defined through Routing Information Bases (RIBs) and Forwarding Information Bases (FIBs), each representing one RP and stored in every router accordingly. MPR originally exploited three bits in the IP(v4) header allocated for the IETF DiffServ [33] in the Type of Service (ToS) field, and found them sufficient [20,21]. In case DiffServ is not used, or in IPv6 packets, more planes can be supported as more ToS bits are available. Therefore, routers recognize the RPs through the unused bits. Thus, MSC utilizes the IP header as it is without imposing extra overhead onto each packet. Compared with MPLS, this is a significant overhead reduction. MPLS imposes a 32-bit label stack onto each IP packet by encapsulating and installing each label in routing paths, causing overhead and inflation of router configuration. With increasing flows in the network, MSC offers a scalable overhead as the number of RPs need not increase proportionally with traffic surges (e.g. as shown in the study conducted for MPR in [21]). Furthermore, the offline algorithm is an infrequent or periodic network-wide planning decision, while the online algorithm is a relatively simple per-flow calculation at each Ingress node/router that continually chooses between the RPs to dynamically balance the traffic flows over short and longer paths.
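The idea that each RP is an OSPF instance with its own link weights, and therefore its own shortest paths, can be illustrated with a minimal sketch. The five-node topology, the weight values and the helper names (`shortest_path`, `plane_weights`) are hypothetical and chosen only for illustration; they are not taken from Fig. 1 or from the MSC implementation.

```python
import heapq

def shortest_path(weights, source, target):
    """Dijkstra over an undirected graph given as {(u, v): weight}."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist, prev, heap = {source: 0}, {}, [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical topology: S and D joined through three transit routers A, B, C.
links = [("S", "A"), ("A", "D"), ("S", "B"), ("B", "D"), ("S", "C"), ("C", "D")]

def plane_weights(cheap_via):
    """One weight set per plane: links touching `cheap_via` are cheap (assumed values)."""
    return {(u, v): (1 if cheap_via in (u, v) else 10) for u, v in links}

# Three planes, each with a dedicated set of link weights.
planes = {name: plane_weights(name) for name in ("A", "B", "C")}
paths = {name: shortest_path(w, "S", "D") for name, w in planes.items()}
for name, path in paths.items():
    print(name, path)
```

Each plane runs the same shortest-path computation on the same physical links; only the weights differ, which is what steers the S-D path through a different transit router per plane.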

Contributions
This paper contains the following contributions:
• The offline RPs construction problem is extended to suit the complex random IP intra-domain access network topologies. The problem is confirmed as NP-complete after proving it rigorously as a generalization of the MSC problem. Correspondingly, a novel, heuristic, path-diverse MSC-based routing solution is composed.
• MSC uses graph-based modelling of access networks to determine the optimal distributions of routing paths. Previously in MPR, link weights were set first and routing paths were then rendered, which lacked immediate control over their layouts. MSC adapts to randomly formed access networks with variable meshing/sparseness and sizes. To the best of our knowledge, such a TE mechanism that reflects upon heterogeneity and future ''network densification'' topological issues is absent in the literature.
• The MSC offline algorithm introduces two novelties: projections of capacities, and correlations of the routing paths between Ingress-Egress pairs in RPs. This maximizes both path diversity and routing resources prior to insertion of traffic in the online part. Thus, planned residual capacities are balanced and effective path correlations (bottlenecks) are minimized.
• The novel MSC-based offline algorithm finds a minimum set cover, i.e. a finite RPs-set from a larger set of candidate RPs, by using an MSC-based cost function. Firstly, we show two popular polynomial-time graph tree building algorithms for constructing candidate RPs. Ultimately, we resort to hill-climbing, a greedy heuristic algorithm, for handling computational complexity in large topologies, i.e. networks with more than 7 routers/nodes.
• A full performance analysis of the MSC online algorithm is conducted, extending the offline-only analysis in [32]. Varying meshing configurations with up to 33 routers/nodes are tested. MSC is compared with its QoS-enabled extension, QMSC (done in MPR as QMPR [1,21]), and with some rival protocols: MPLS, OSPF and OSPF InvCap (termed InvCap in the remainder). We broaden the previous upper limit on topology size of 25 routers [34] for tangible benefits from multipath routing.
• The validity and comprehensiveness of MSC is indicated via its operational viability in wide-ranging scenarios of random capacities and topologies [29]. More than one GW in a given access network is introduced.

Analytical model
We here revisit the problem of building multiple RPs (i.e. sets of paths) by explicitly considering path diversity alongside path correlations, and formulate the associated optimization problem.

Problem formulation
The rudimentary objectives of our RPs construction approach are initially explained as follows. Given an underlying physical network topology, a set of disjoint RPs is to be extracted such that each RP ends up with a path between every source-destination pair, while every link in the network appears at least once in the RPs-set. This disjointness requirement is specifically imposed to ensure maximum diversity across the RPs-set. In our network flow problem formulation, an RPs-set contains multiple paths by having a path in each RP between every pair of Ingress/source and Egress/destination nodes/routers in the network. In the problem formulation, commodities, as flow demands between Ingress-Egress pairs, correspond to real TCP/IP network layer sessions, which are specific finite flows of packets between Ingress-Egress pairs in the Internet identified by source/destination IP addresses, ports and protocol number.
Making sure that a link appears at least once in one of the RPs ensures that all the links in a network topology are utilized for routing. Alongside this, maximized path diversity through expedited packet routing capacity, balanced against constraints such as the cost of paths in networks, forms the basis for our approach and its comparisons. Our rudimentary objectives can be summarized as follows:
Problem 1. Given a network represented by an underlying arbitrary graph topology $G = (V, E)$, with the vertex and edge sets $V$ and $E$ respectively, retrieve a set $\mathcal{G} = \{G_i\}_{i=1}^{P}$ of $|\mathcal{G}| = P$ RPs, with $G_i = (V, E_i)$, such that the following properties hold:
(a) Each RP constructed contains a valid path for every source-destination pair in the underlying network $G$; in other words, for every pair $(s, d)$ there is a path between $s$ and $d$ in each $G_i$.
(b) Every link appears at least once in the RPs-set. This means that for each $e \in E$ there is some $i$ with $e \in E_i$, i.e. $\bigcup_{i=1}^{P} E_i = E$.
(c) The RPs are disjoint, i.e. $E_i \cap E_j = \emptyset$ for all $i \neq j$.
Finally, the cost, as prescribed by some function $f$ (e.g. of edge weights or their capacities), is optimal between every source-destination pair for each RP.
It is not difficult to see that the constraints, specifically (b) and (c) in Problem 1, are very restrictive and might even mean that no feasible solution exists to the problem in some cases. In Fig. 2 (representing a full-duplex example where all nodes are Ingress/Egress), a connected subgraph of $G$ is $G_1 = (V, \{e_1, e_2, e_3\})$; however, it is no longer possible to construct another disjoint subgraph of $G$ that simultaneously satisfies (b) and (c). In particular, the only unused links are $e_4$ and $e_5$, but the subgraph $G_2 = (V, \{e_4, e_5\})$ is not connected, hence not representing a valid RP as it does not encompass all the $s$-$d$ pairs. To avoid such scenarios, it makes sense to relax the restriction (c) such that the links can be reused. However, despite this relaxation, we would still aim for the subgraphs to be as different as possible; sticking with the same example, this means we may prefer to choose $G_2 = (V, \{e_1, e_4, e_5\})$ instead of $G_2 = (V, \{e_1, e_2, e_5\})$ given the choice of $G_1$, considering the latter reuses two links instead of one. In light of this, we need to define a suitable graph similarity metric to enable us to compare any pair of graphs and determine precisely how similar or dissimilar they are. With this similarity metric, a penalty for reusing links in our optimization problem can be properly imposed. To this end, we provide definitions in the following subsection.

Measuring the similarity of subgraphs (planes)
Definition 1 (Graph Correlation Metric). The graph correlation metric is a measure of similarity between a pair of graphs. This metric is defined herein to be the total number of distinct links present in both graphs, normalized by the number of edges in the graph containing the highest number of edges. Mathematically, let $G_i = (V_i, E_i)$ and $G_j = (V_j, E_j)$; then it follows that:
$$\rho(G_i, G_j) = \frac{|E_i \cap E_j|}{\max\{|E_i|, |E_j|\}}$$
It is easy to see that $\rho(G_i, G_j) \geq 0$; in fact $\rho(G_i, G_j) \in [0, 1]$, with $\rho(G_i, G_j) = 1$ if and only if $G_i = G_j$, whilst $\rho(G_i, G_j) = 0$ if $E_i \cap E_j = \emptyset$ (as a technicality, to avoid dividing by zero, we assume that at least one of $E_i$ or $E_j$ is nonempty). In addition, we can also deduce that if $\rho(G_i, G_j) = 0$, then for any two nodes in $G_i$ joined by a single edge, the same pair of nodes in $G_j$ will be joined by a path with at least two different edges.
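Furthermore, from Definition 1, we can also deduce an expression for $\rho(G_i, G_j)$ in terms of their corresponding adjacency matrices, i.e. $A_i, A_j \in \mathbb{R}^{N \times N}$ (where $N = |V|$ is the number of nodes in the network):
$$\rho(G_i, G_j) = \frac{\mathbf{1}^{\top} (A_i \odot A_j)\, \mathbf{1}}{\max\{\mathbf{1}^{\top} A_i \mathbf{1},\; \mathbf{1}^{\top} A_j \mathbf{1}\}}$$
where $\odot$ represents the element-wise (Hadamard) product between the two matrices and $\mathbf{1} \in \mathbb{R}^{N}$ is a vector with all its entries being equal to unity.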
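As a minimal sketch of the graph correlation metric, the snippet below assumes the metric is the number of shared links divided by the larger of the two edge counts, which is the form consistent with the properties stated for Definition 1 (value 1 for identical graphs, 0 for edge-disjoint graphs). The four-node example graphs and all function names are hypothetical.

```python
def graph_correlation(edges_i, edges_j):
    """Shared links divided by the larger edge count (assumed form of the metric)."""
    ei = {frozenset(e) for e in edges_i}  # undirected: (u, v) equals (v, u)
    ej = {frozenset(e) for e in edges_j}
    if not ei and not ej:
        raise ValueError("at least one graph must contain an edge")
    return len(ei & ej) / max(len(ei), len(ej))

def adjacency(edges, n):
    """Symmetric 0/1 adjacency matrix of an undirected graph on nodes 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a

def graph_correlation_matrix(a_i, a_j):
    """Same metric from adjacency matrices: sum of the element-wise product
    over the larger matrix sum (each edge is counted twice on both sides)."""
    shared = sum(x * y for ri, rj in zip(a_i, a_j) for x, y in zip(ri, rj))
    return shared / max(sum(map(sum, a_i)), sum(map(sum, a_j)))

# Hypothetical planes on four nodes.
g1 = [(0, 1), (1, 2), (2, 3)]
g2_a = [(0, 1), (0, 3), (0, 2)]  # reuses one link of g1
g2_b = [(0, 1), (1, 2), (0, 3)]  # reuses two links of g1

rho_a = graph_correlation(g1, g2_a)
rho_b = graph_correlation(g1, g2_b)
rho_b_matrix = graph_correlation_matrix(adjacency(g1, 4), adjacency(g2_b, 4))
print(rho_a, rho_b, rho_b_matrix)
```

The plane that reuses one link scores lower than the plane that reuses two, so a penalty proportional to this value favours the more diverse choice.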

Enforcing diversity by minimizing link presence
Since links are allowed to be reused across multiple RPs (as a result of our relaxation in Section 3.1), restrictions must be applied to this reuse in order to minimize the chance of a link being used in all RPs. The following definitions represent the measures of diversity:
Definition 2 (Link Presence). The link presence is a function with a binary output indicating whether or not a link $e \in E$ is used in the RP $G_i$. Denoting it as $\delta : E \times \mathcal{G} \to \{0, 1\}$, then:
$$\delta(e, G_i) = \begin{cases} 1, & e \in E_i \\ 0, & \text{otherwise} \end{cases}$$
Definition 3 (Full Link Presence). The full link presence indicates the appearance of an edge in all RPs; defined mathematically as:
$$\Delta(e) = \prod_{i=1}^{P} \delta(e, G_i)$$
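The two presence measures can be sketched as follows. The sketch assumes that full link presence is 1 exactly when a link appears in every RP of the set; the link labels and the example RPs-set are hypothetical.

```python
def link_presence(edge, plane_edges):
    """Return 1 if the link is used in the given RP, else 0 (Definition 2)."""
    return 1 if edge in plane_edges else 0

def full_link_presence(edge, planes):
    """Return 1 only if the link appears in every RP of a non-empty set (Definition 3)."""
    return int(bool(planes) and all(link_presence(edge, p) for p in planes))

def presence_count(edge, planes):
    """Number of RPs that use the link; a helper added for illustration."""
    return sum(link_presence(edge, p) for p in planes)

# Hypothetical RPs-set over five links.
links = ["e1", "e2", "e3", "e4", "e5"]
planes = [{"e1", "e2", "e3"}, {"e1", "e4", "e5"}, {"e1", "e2", "e5"}]

fully_present = [e for e in links if full_link_presence(e, planes)]
counts = {e: presence_count(e, planes) for e in links}
print(fully_present, counts)
```

A link flagged by `full_link_presence` is shared by every plane and is therefore a candidate bottleneck that the relaxed problem tries to avoid.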

The optimization problem
In this subsection, Problem 1 and its relaxed version are formally described in terms of the relevant optimization problems, i.e. (6) and (7). At the start, the objective in our problem is to maximize the projected capacity utilization in a network topology for each RP constructed (we could easily replace this with some other well-defined associated cost function that may include, in addition to the capacity, path length and so on). Let $C$ be the capacity matrix, which we assume to be normalized so that its entries only take on values between 0 and 1; also denote by $c_e \in [0, 1]$ the capacity of the edge $e$ and by $p^{m}_{s,d}$ the $m$th route/path (which is simply an ordered sequence of edges) between the Ingress-Egress pair $s$ and $d$, so that the collection of all possible routes between $s$ and $d$ is $\{p^{m}_{s,d}\}_{m=1}^{M_{s,d}}$. With these definitions, we can state our main objectives:
(1) In the path connecting the nodes $(s, d)$ (Ingress-Egress), the link with the smallest available capacity could be the source of a potential bottleneck when packets are transmitted in the online mode; hence we aim to alleviate such a projected bottleneck in a given path, where the bottleneck capacity of a path is $c(p^{m}_{s,d}) = \min_{e \in p^{m}_{s,d}} c_e$.
(2) We aim to maximize the minimum projected residual capacity (potential bottleneck) considering all pairs of nodes, i.e. for all $s, d = 1, 2, \dots, N$ and $s \neq d$, across all the candidate RPs (i.e. paths).
Note that the adjacency-matrix form of the graph correlation metric can be more useful when implementing MATLAB-based algorithms.
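The two objectives can be illustrated with a short max-min sketch. It assumes that the projected bottleneck of a path is the smallest normalized capacity among its links; the capacities, the candidate paths and the function names are hypothetical.

```python
def bottleneck_capacity(path, capacity):
    """Smallest link capacity along a path given as a sequence of link labels."""
    return min(capacity[e] for e in path)

def widest_path(candidate_paths, capacity):
    """Candidate path whose bottleneck capacity is largest (max-min choice)."""
    return max(candidate_paths, key=lambda p: bottleneck_capacity(p, capacity))

# Hypothetical normalized capacities in [0, 1].
capacity = {"e1": 0.9, "e2": 0.4, "e3": 0.7, "e4": 0.6, "e5": 0.8}

# Hypothetical candidate paths for two Ingress-Egress pairs.
candidates = {
    ("s1", "d1"): [["e1", "e2"], ["e3", "e4"], ["e5"]],
    ("s2", "d2"): [["e2", "e5"], ["e1", "e3"]],
}

best = {pair: widest_path(paths, capacity) for pair, paths in candidates.items()}
# Worst bottleneck over all pairs: the quantity objective (2) tries to raise.
network_bottleneck = min(bottleneck_capacity(p, capacity) for p in best.values())
print(best, network_bottleneck)
```

Selecting per pair the path with the largest bottleneck and then looking at the smallest of those values mirrors the "maximize the minimum" structure of the stated objectives.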
Note that here we assume the existence of a composed set of candidate RPs (the derivation of which is presented in Section 4) in our optimization problem formulations. Correspondingly, the objective of selecting the most fitting RPs-set (as laid out in Problem 1) can be stated in terms of the optimization problem (6), where $p_{s,d}(G_i)$ denotes the path between the pair of nodes $(s, d)$ in the subgraph $G_i$ and $P$ is the desired number of RPs. The objective function aims to select the RPs-set that contains paths which utilize the highest capacity between every $s$-$d$ pair. Furthermore, in accordance with Problem 1, the imposed constraints must ensure the following: first, each $G_i$ (representing an instance of OSPF) is a connected subgraph of $G$ ($G$ is assumed to be connected); second, any pair of planes is guaranteed to be edge-disjoint; third, no link is used in all the planes (which is automatically satisfied if the second constraint is); and finally, up to $P$ RPs are rendered and allowed, subject to the disjointness criterion specified in properties (b) and (c) of Problem 1.
As previously explained in Section 3.1, realistically there could exist several circumstances where Problem 1 is infeasible and the desired $P$ RPs cannot be rendered to reach the full utilization of all the links in the network (e.g. as illustrated in Fig. 2). Hence, we need to relax constraints (b) and (c) of Problem 1 accordingly and apply the diversity rules defined in Section 3.3. This relaxation requires us to consider minimizing the overlap between the chosen planes in a set in addition to capacity provisioning. Overlap, in this case, refers to the potential statistical reuse of links in paths belonging to different planes. Hence, a link could appear across multiple planes and carry traffic between multiple ingress-egress pairs, i.e. analogous to time-division multiplexing. To this end, relevant terms have been added to the objective to reflect the projected overlaps of the paths onto each link, rendering the new problem (7), where $\alpha \in (0, 1)$ and $\beta = 1 - \alpha$ are arbitrary tuning parameters, chosen a priori, which can be interpreted as a way of assigning more importance to the terms in the objective. For example, $\alpha \gg \beta$ indicates that more importance is assigned to finding planes with high capacity, which would not necessarily be diverse. This optimization problem is the basis of the construction problem discussed in the following subsection and is applied in the dynamic cost function definition presented in Sections 5.1 and 5.2.

Minimum set cover plane construction problem
The problem of building multiple RPs (i.e. path sets imposed via MT-OSPF) is associated with the minimum set cover problem. This is explained herein by showing that the minimum set cover problem, which is NP-complete, is reducible in polynomial time to our problem of composing multiple RPs. Therefore, we conclude that the RPs composition problem is NP-complete. We first summarize the minimum set cover problem and its counterpart, the minimum $k$-set cover problem [35], below:
Definition 4 (Minimum Set Cover Problem). Given a finite collection $\mathcal{S} = \{S_i\}_{i=1}^{m}$ of subsets of a universe $U$, a set cover $\mathcal{C} \subseteq \mathcal{S}$ is a subcollection of the subsets whose union is $U$, i.e. $\bigcup_{S \in \mathcal{C}} S = U$. Moreover, each $S_i \in \mathcal{S}$ has an associated non-negative cost $c_i$. The minimum set cover problem is to compute a subcollection $\mathcal{C} \subseteq \mathcal{S}$ such that it is a set cover for $U$ and its associated cost $\sum_{S_i \in \mathcal{C}} c_i$ is simultaneously minimized. Moreover, assuming that instances of the weighted set cover are such that each $S_i \in \mathcal{S}$ has at most $k$ elements, the problem extends to the $k$-set cover problem. Note that the unweighted set cover and unweighted $k$-set cover problems are special cases of the weighted set cover and weighted $k$-set cover problems, respectively. Furthermore, it is known that the MSC problem is NP-complete. To show that our problem is NP-complete, it suffices to show that it is in NP and that the weighted MSC problem is reducible to our problem in polynomial time.
Let the network topology being considered be defined by its connectivity graph $G = (V, E)$ and its associated link weight function $w : E \mapsto \mathbb{R}_{+}$, which assigns a non-negative weight $w(e)$ to each edge $e \in E$. It is understood that $V$ and $E$ are the sets of all vertices/nodes and edges/links, respectively, of the underlying network. Define $\mathcal{S}$ to be a collection of distinct subsets of $E$, that is to say $\mathcal{S} = \{S_i\}_{i=1}^{m}$ where $S_i \subset E$ for each $i = 1, 2, \dots, m$, so that for any $i \neq j$, $S_i \neq S_j$. Such a collection of subsets can be obtained, for example, by constructing all (spanning) trees of the underlying graph $G$ to include all ingress-egress pairs, see for example [36]. Each tree $S_i$ is simply a subset with at most $|V| - 1$ elements (i.e. edges) chosen from $E$, with an associated cost $c_{S_i} = \sum_{e \in S_i} w(e)$. It is easy to conclude that $\bigcup_{i} S_i = E$. Now, given $\mathcal{S}$, each member element $S_i$ has a path connecting all possible Ingress-Egress pairs, hence representing a valid RP. We aim to find the subcollection $\mathcal{C}$ of RPs of minimal dynamic cost (Section 5.1) that utilizes every link in the network. In other words, we desire a $\mathcal{C}$ such that $\bigcup_{S_i \in \mathcal{C}} S_i = E$ and $\sum_{S_i \in \mathcal{C}} c_{S_i}$ is minimized. This is clearly the minimum set cover problem, by definition. It is now clear that our RPs composition problem is a generalization of this MSC problem; therefore it is in NP and also NP-complete.
Problem 1 and its relaxation as formulated in Eq. (7) are NP-complete as shown herein; thus we need to resort to heuristic algorithms for computing useful solutions, presented in Section 4 followed by Section 5. For some network $G$, our novel approach is as follows: firstly, a highly redundant collection of connected network subgraphs (subsets) is constructed (this advances the previous MPR offline algorithm, which used link weights immediately to derive RPs); then, from this pool of connected subgraphs, a specific set of connected subgraphs is extracted, hence forming an RPs-set. Several approaches are presented in Section 4, followed by the RPs construction mechanism in Section 5.
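To make the set cover view concrete, the following sketch applies the classical greedy approximation for weighted set cover to a pool of candidate planes. This is a textbook heuristic swapped in for illustration; it is not the hill-climbing heuristic or the dynamic cost function of MSC, and the candidate planes and static costs are hypothetical.

```python
def greedy_set_cover(universe, subsets, costs):
    """Classical greedy weighted set cover: repeatedly pick the subset with the
    lowest cost per newly covered element. Returns indices of chosen subsets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = min(
            (i for i, s in enumerate(subsets) if uncovered & s),
            key=lambda i: costs[i] / len(uncovered & subsets[i]),
            default=None,
        )
        if best is None:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

# Hypothetical universe of links and candidate planes with static costs.
links = {"e1", "e2", "e3", "e4", "e5"}
candidate_planes = [
    {"e1", "e2", "e3"},
    {"e1", "e4", "e5"},
    {"e1", "e2", "e5"},
    {"e3", "e4"},
]
plane_costs = [3.0, 3.0, 3.0, 2.5]

rps_set = greedy_set_cover(links, candidate_planes, plane_costs)
covered = set().union(*(candidate_planes[i] for i in rps_set))
print(rps_set, covered == links)
```

The chosen planes together use every link at least once, which is the coverage requirement of the RPs-set; link reuse between planes (here `e1`) is tolerated, as in the relaxed problem.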

Choice of algorithms for constructing Routing Planes
We here elaborate on graph-based algorithms for constructing multiple valid candidate RPs in a network topology. In doing so, we explain how to construct a valid collection of candidate RPs from which an MSC plane-set (i.e. RPs-set) can be selected as the final routing installation in a network. We note that this collection would represent a cover for the edge set E. Every subset of edges E_i must contain a path between all Ingress-Egress pairs in G; in other words, each subgraph G_i = (V, E_i) of G must connect all Ingress-Egress pairs. We adopt the following approach (shown as the offline algorithm in Fig. 3):
(1) Find all, or as many as viable, routing paths between all Ingress-Egress pairs in the network topology G, subject to its size. In graph representation terms, these are the ''trees'' between each Ingress and Egress pair in a network. The nodes that are used only for routing, which can be part of a routing path/tree but are neither the Ingress/source nor the Egress/destination, are called transit nodes; together with their associated edges they form the transit space.
(2) Initially, for a given underlying network topology G, we use and shuffle the routing paths/trees rendered in step (1) for all Ingress-Egress pairs in the network (including the transit space) to derive edge subsets E_i, with the constraint that each E_i must render a connected network/graph. Thus, each E_i would represent a candidate RP and form a graph model of an OSPF instance in the network. The further goal is to construct a collection whose elements E_i are subsets of the set E of all edges in the underlying network.
(3) Once the candidate set of RPs has been obtained as a collection of all possible choices of RPs, each candidate RP is associated with a dynamic cost, given by a sum of costs over its edges. We then aim to derive the Minimum Set Cover, i.e. the RPs-set, based on this cost by applying any state-of-the-art algorithm such as [35], as explained in more detail in Section 5. This final RPs-set, shown as step 5 in Fig. 3, would be used as the MSC routing installation in the network.
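The validity constraint on a candidate RP and its additive cost can be sketched as follows. This is a minimal illustration only: the function names `connects_all_pairs` and `plane_cost`, the edge-list representation and the `edge_cost` dictionary are assumptions made for the example and are not taken from the Matlab implementation.

```python
from collections import deque


def connects_all_pairs(edges, pairs):
    """Return True if the edge subset links every (ingress, egress) pair."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def reachable(src):
        # Breadth-first search over the candidate plane only.
        seen, queue = {src}, deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    return all(dst in reachable(src) for src, dst in pairs)


def plane_cost(edges, edge_cost):
    """Sum the per-edge costs of a candidate RP (edge_cost is hypothetical)."""
    return sum(edge_cost[frozenset(e)] for e in edges)
```

A candidate that fails `connects_all_pairs` would simply be discarded before any cost is computed.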

Approaches for constructing multiple Routing Planes
Approaches for computing trees as candidate RPs are presented in the following text: (i) two of the well-known polynomial time algorithmic approaches in the literature, Edge Deletion and All Trees; (ii) a heuristic solution based on a hill-climbing local search iterative algorithm, as a compromise. The reason for discussing all the algorithms in detail is to explain and justify the options applied for the performance evaluation in our simulations. While the two well-known algorithms provide a sufficient methodology for choosing the trees and consequently constructing the candidate RPs, their computational costs become an issue as topology sizes increase. This motivates resorting to the hill-climbing heuristic for larger topologies as a compromise between computational cost and optimality.
Edge Deletion Algorithm: This approach aims to construct several subgraphs of G = (V, E), each of which represents a candidate RP and is associated with a dynamic cost. The algorithm computes the shortest (least cost) paths from a fixed node to the rest of the nodes in the network, removes the most used edge, and then repeats the process for the remaining nodes until all edges have been deleted. This reduces the number of times a link is reused and ensures that all paths for each Ingress-Egress pair have been identified and stored. From these paths, a subgraph corresponding to a candidate plane G_i connecting all Ingress-Egress pairs is formed by selecting the paths with the overall least cost. These subgraphs represent the collection of candidate planes, each of which is associated with a cost.
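The path-harvesting loop of Edge Deletion can be sketched as below. This is a simplified reading of the description above, not the authors' code: it assumes a single fixed source, symmetric link weights given as a dictionary keyed by node pairs, and it stops once the source becomes isolated. The names `shortest_path_tree` and `edge_deletion_paths` are hypothetical.

```python
import heapq
from collections import Counter


def shortest_path_tree(adj, source):
    """Dijkstra: return {node: predecessor} for nodes reachable from source."""
    dist, pred = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], pred[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return pred


def edge_deletion_paths(weights, source):
    """Record shortest paths from source, drop the most used edge, repeat."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    rounds = []
    while any(adj[n] for n in adj):
        pred = shortest_path_tree(adj, source)
        if not pred:
            break  # source is isolated, nothing left to harvest
        usage, paths = Counter(), {}
        for target in pred:
            path, node = [target], target
            while node != source:
                usage[frozenset((node, pred[node]))] += 1
                node = pred[node]
                path.append(node)
            paths[target] = path[::-1]
        rounds.append(paths)
        u, v = tuple(usage.most_common(1)[0][0])
        del adj[u][v], adj[v][u]  # remove the most reused link
    return rounds
```

Each entry of the returned list holds the paths found in one round; forming a least-cost candidate plane from those stored paths is a separate step that is not shown here.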

All-Trees Algorithm:
This approach builds the distinct trees of G that correspond to the candidate RPs. A corresponding result in graph theory is that all trees of G are recovered. A definition helps describe the approach: a tree here is a tree graph structure containing all the vertices in G that connect all the Ingress-Egress pairs; unlike a spanning tree, which would contain all vertices in a graph, it assumes the existence of a transit space. The problem of extracting all trees from a given G has a long history (e.g. [36,37] and references therein). The method in [37], which enumerates trees by swapping edges in a fundamental cycle, can be adapted; their algorithm finds and lists all trees of an unweighted and undirected graph in O(N + V + E), where N, V and E denote the number of trees, vertices and edges, respectively, in G. In fact, we can count the number of spanning trees a priori for the underlying network topology using Kirchhoff's Matrix-Tree Theorem (see texts [38,39] for details), stated below. It is notable that, while considering the variable combinations of Ingresses and Egresses across the random topologies (with the possibility of every node being an Ingress or an Egress), we also consider the existence of a transit space. Hence, the spanning tree problem can be reduced to our special case of finding the trees that ensure connectivity between every Ingress-Egress pair corresponding to a plane, without necessarily including all the transit nodes, i.e. routers.
Theorem 1 (Kirchhoff's Matrix-Tree Theorem). Let G = (V, E) be an undirected graph and L its associated graph Laplacian, where L = D − A, with D the diagonal matrix with D_ii = deg(v_i) and A the graph adjacency matrix. The number of spanning trees contained in G can be computed as follows: (1) select any vertex v_i and eliminate the i-th row and column from L to obtain a new matrix L_i; (2) the number of spanning trees in G is then det(L_i).
The trees constructed using the algorithm adopted from [37] (shown in the Appendix) now form the elements of the candidate collection. In practice, finding all trees in this way can be extremely expensive computationally for large topologies. We performed tests for topologies of up to 60 routers [32] and encountered this for networks larger than 7 routers. Furthermore, finding the trees in the case of complete graphs is in the order of O(n^(n−2)) for n nodes. In real network scenarios, the offline algorithm is not an extremely time-critical process and would be continually applied to a single, static real network topology. For the sake of our study of different topologies, and as a compromise or complementary solution to the complexities of the polynomial time algorithms, we resort to an approximation and a heuristic approach. An iterative greedy heuristic is therefore applied at each step of the NP-complete MSC problem to find a suitable tree at each iteration subject to a cost. The trees rendered in this way together represent the near-optimal MSC cover set solution, i.e. the RPs-set, subject to the dynamic cost function defined in the next Section, which incorporates the selection criteria. For a given graph corresponding to a real physical topology, our chosen heuristic is hill-climbing, a local search method rendering a near-optimal RPs-set.
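The spanning-tree count of Theorem 1 can be checked numerically before attempting a full enumeration. The sketch below is illustrative: it assumes nodes are labelled 0 to n−1, removes the last row and column of the Laplacian, and uses exact rational arithmetic so that the determinant is not affected by rounding. The function name `count_spanning_trees` is an assumption of this example.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Count spanning trees of an undirected graph on nodes 0..n-1."""
    # Build the Laplacian L = D - A.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    # Drop the last row and column, then take the determinant
    # by Gaussian elimination with exact fractions.
    m = [row[:-1] for row in lap[:-1]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)
```

For a complete graph on 7 nodes the function returns 7^5 = 16807, in line with the n^(n−2) growth noted above, which indicates why exhaustive enumeration quickly becomes impractical.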

Hill-Climbing heuristic (algorithm):
The first tree from the input graph is the minimal tree, constructed by considering capacity only as defined in Eqs. (6) and (9)(a). Thus, the first RP's cost equates to that of InvCap. As shown in Fig. 3, hill-climbing then changes every edge used in this initial tree to construct a new RP and to achieve near-optimal path diversity in the RPs-set being formed. This step of the heuristic search is repeated, and whenever an iteration yields a more suitable candidate tree (RP) subject to the cumulative cost, it replaces its predecessor tree. The criteria of capacity projection and path correlation are applied through the objective function in order to terminate the algorithm and deliver the near-optimum minimum set cover. The heuristic therefore needs no memory of earlier candidates, and its running time is limited to a finite number of iterations, where the worst case corresponds to the maximum possible number of candidate RPs in a topology.
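A minimal sketch of the edge-swapping local search is given below. It is not the authors' heuristic: for simplicity it assumes that every candidate is a spanning tree (the paper's trees need not include all transit nodes), that edges are stored as sorted node pairs, and that the objective is supplied as a generic `cost` callable, which could combine capacity and correlation terms. The names `is_spanning_tree` and `hill_climb` are hypothetical.

```python
def is_spanning_tree(nodes, edges):
    """True if edges form a tree touching every node (union-find check)."""
    if len(edges) != len(nodes) - 1:
        return False
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # adding this edge would close a cycle
        parent[ru] = rv
    return True


def hill_climb(nodes, all_edges, start_tree, cost):
    """Swap one tree edge at a time while the cost keeps decreasing."""
    current = frozenset(start_tree)
    improved = True
    while improved:
        improved = False
        for old in sorted(current):
            for new in sorted(set(all_edges) - current):
                candidate = (current - {old}) | {new}
                if (is_spanning_tree(nodes, candidate)
                        and cost(candidate) < cost(current)):
                    # The better tree replaces its predecessor.
                    current, improved = candidate, True
                    break
            if improved:
                break
    return current
```

Only the current tree is kept between iterations, and the loop ends as soon as no single swap lowers the cost, so termination is guaranteed on a finite topology.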

Finding a set of multiple minimum cost RPs: The minimum set cover
Our dynamic cost function used in composing the RPs-set is formulated in Section 5.1, followed by an outline of the steps of the RPs-set selection algorithm in Section 5.2. We define the algorithmic objective of compacting multiple RPs in a set subject to given constraints, i.e. the minimum set cover. Having constructed the set of candidate RPs using the methods described in Section 4, we aim to obtain the minimum set cover as defined in Section 3.5. To this end, several RPs from the candidate set should be selected such that their union coincides with the underlying network topology, rendering a minimum cost set cover solution (subject to a cost function value for each candidate RP). As explained previously, this is reformulated as the MSC problem.
Formally described, the MSC problem starts with no covered items and an empty collection of subsets. An essential algorithm is as follows: (i) for each subset, count its uncovered items and compute the current ratio between the subset's cost and that count, then pick the subset with the minimal ratio; (ii) add this subset to the collection of subsets of the solution; (iii) mark the items in this subset as covered and assign the corresponding cost to the items newly covered in this iteration. Many relevant approximate algorithms for the MSC problem can be found in [40]. We apply a modification to the essential algorithm, termed the greedy algorithm with withdrawals [35], to reduce the complexity and construct the minimum set cover from the candidate set. We propose a further modification to the simple greedy algorithm derived from [41], wherein the cost of each unused subset changes upon every greedy iteration. This aims to ensure that the selected RPs are as diverse as possible in terms of their associated links. The similarity metric (1) is therefore applied to calibrate the cost associated with an RP selection at each greedy iteration. The new dynamic cost function is accordingly indexed by the iteration, underlining that it changes from one iteration to the next.
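The essential greedy procedure (i) to (iii) corresponds to the textbook weighted set cover heuristic, sketched below with static costs. The withdrawal step of [35] and the dynamic cost are deliberately left out, and the function name `greedy_set_cover` and its dictionary arguments are assumptions of this example.

```python
def greedy_set_cover(universe, subsets, cost):
    """Pick subsets by lowest cost per newly covered item until all are covered."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for name, items in subsets.items():
            gain = len(uncovered & items)  # items this subset would newly cover
            if name in chosen or gain == 0:
                continue
            ratio = cost[name] / gain
            if ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen
```

In the MSC setting the universe would be the edge set of the topology and each subset the edge set of one candidate RP.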

The cost function
The dynamic cost function used for determining the minimum set cover combines a capacity term, labelled (a), with a correlation term. In the capacity term, each edge is costed as the inverse of its capacity. The correlation term is taken over the set of RPs already selected for the forming RPs-set by the MSC algorithm. At the start of the MSC algorithm this set is empty, i.e. it has cardinality zero. Thus, the first RP is chosen based on capacity (a) alone. As this starting term of the cost function is inversely proportional to capacity, only RPs with high capacity are favoured in the first selection. The summation further ensures that the planes contributing the highest capacity between all Ingress-Egress pairs are favoured. The other term in the cost function values the correlation, through the graph similarity measure, between the currently considered RP and all the RPs selected in the previous iterations of the MSC algorithm. Diversity is ensured as the cost sum is smaller when the new candidate RP differs the most from the already chosen RPs.
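The two-term structure of the cost can be illustrated with the sketch below. It is an assumption-laden stand-in rather than the paper's formula: the Jaccard index over edge sets replaces the similarity metric (1), and the weights `alpha` and `beta` are hypothetical placeholders for the tuning parameters. The names `jaccard` and `dynamic_cost` are likewise invented for this example.

```python
def jaccard(a, b):
    """Edge-set similarity in [0, 1]; a stand-in for the similarity metric (1)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dynamic_cost(candidate, selected, capacity, alpha=1.0, beta=1.0):
    """Inverse-capacity term plus similarity to the planes already chosen."""
    capacity_term = sum(1.0 / capacity[e] for e in candidate)
    correlation_term = sum(jaccard(candidate, plane) for plane in selected)
    return alpha * capacity_term + beta * correlation_term
```

With an empty `selected` list the cost reduces to the inverse-capacity sum, which matches the statement that the first RP is chosen on capacity alone; every later evaluation penalizes overlap with the planes already in the set.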

The MSC-based RPs-set construction algorithm
The following iterative procedure finds the near-optimal cover for the edge set of G.
(S.1) Set the collection of selected RPs to the empty set and the iteration counter to 0.
(S.2) If the union of the selected RPs equals the edge set of G, then stop and return the selected collection as the RPs-set, since it is a cover; this is in practice the termination point of the optimization and of the resulting MSC problem as defined in Sections 3.4 and 3.5. Otherwise, compute the ratio for each remaining candidate plane and select the plane that maximizes this ratio.

Performance evaluation
The proposed MSC-based framework is extensively simulated as outlined in Fig. 3. Firstly, we generate a random topology and then RPs in the offline mode.Then, the MSC online mode performance evaluations are conducted with other routing protocols when traffic is injected in the generated networks.

Experiment settings
The MSC offline algorithm builds sets of RPs in Matlab for each topology prior to the simulation of (online) injections of traffic flows into these networks using NS-2. Different sets of random IP access network topologies are used with 7, 18 and 33 nodes and varying degrees of sparseness/meshing and numbers of ARs, GWs and RPs. Link capacities are normalized to between 16 and 32 MBs, with random link distributions and descending magnitude from GW(s) to ARs. The same topologies are used for both the offline and online parts. The performances of MSC and QMSC are tested and compared with MPLS, and with both OSPF and InvCap. Legacy OSPF and InvCap are essentially used for baseline analysis, where link weights are respectively set to either 1 (hop-count based) or the inverse of link capacity. For the MPLS case, Dijkstra-based K-path routing is applied in order to construct the same multiples of LSPs as RPs in the MSC case, creating the same number of multiple path choices for every Ingress-Egress pair as there are RPs. In MPLS offline TE, i.e. the network planning phase, we ensure the existence of at least one, if not all, edge-disjoint paths with a hop-count threshold (metrics are detailed in [3]). Bandwidth reservations are not included, to ensure that solely routing comparisons are revealed.
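The difference between the two baseline weight settings can be seen on a toy example. The sketch below assumes a three-node topology with illustrative capacities chosen only to make the two baselines diverge (they are not taken from Table 1), and the helper name `shortest_path` is hypothetical. The graph is assumed to be connected.

```python
import heapq


def shortest_path(weights, src, dst):
    """Dijkstra over an undirected weighted graph; returns the node list."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist, pred, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path = [dst]
    while path[-1] != src:
        path.append(pred[path[-1]])
    return path[::-1]


# Illustrative capacities only: a thin direct link and a fat two-hop detour.
capacity = {("gw", "r1"): 48, ("r1", "ar"): 48, ("gw", "ar"): 16}
hop_weights = {e: 1.0 for e in capacity}                    # legacy OSPF
invcap_weights = {e: 1.0 / c for e, c in capacity.items()}  # InvCap
```

Under hop-count weights the direct link is chosen, whereas under inverse-capacity weights the two-hop route through `r1` is cheaper, which is the kind of divergence between OSPF and InvCap discussed in the results.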
Extensive online packet-level NS2-based simulations are interfaced with Matlab, where an MSC RPs-set is rendered for each topology. Using the two simulation tools builds on our preceding line of work on multipath routing for MPR, which was tested for smaller and less randomly configured all-IP access network topologies [1,20,21]. Traffic Ingresses/sources and Egresses/sinks, as well as the durations of traffic sessions, are all randomly selected throughout each simulation. The unpredictable user populations, attachments and distributions in novel heterogeneous access networks [2,4,29] are reflected in our dynamic and random traffic modelling, which differs from the typical injections of pre-planned Traffic Matrices (TMs) or slower time frames of traffic variations. This traffic model is also reused from the MPR investigations in IP access networks [1], where the traffic rate at each Ingress is increased by reducing the inter-arrival times between sessions. The effective time of each simulation corresponds to a real time of 12 s, creating a highly congested network between the 11th and 12th second, where most of the packet drops occur. This is the traffic saturation point beyond which the network becomes heavily congested and the transmission rate drops significantly. Upon a new session arrival at the Ingress (AR or GW), all potential paths towards Egresses are checked for bandwidth availability. This is conducted in all routing protocols tested. Ingresses are assumed to have a real-time view of the traffic distributions over links throughout the network.

Offline setup
The method of generating the random graphs that represent physical network topologies can be envisaged as a ''random walk''. Our Matlab algorithm consecutively adds a node from the node set and connects it to the previous one with an edge. Desired target levels of meshing/sparseness, i.e. average node degree, are achieved by randomly adding missing edge(s). Tested examples of random topology sets (Topology Setup, TxSy), with indications of the various numbers of GWs and source/sink nodes (i.e. ARs), are shown in Table 1. The other nodes constitute the transit space and are neither source/Ingress nor sink/Egress nor GW.
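The ''random walk'' construction can be sketched as follows. This is an illustrative Python rendering, not the Matlab generator itself: the function name `random_topology`, the integer node labels and the `seed` argument are assumptions, and the target number of edges is derived from the relation average degree = 2|E|/|V|.

```python
import random


def random_topology(num_nodes, avg_degree, seed=None):
    """Chain nodes one by one, then add random edges up to a target average degree."""
    rng = random.Random(seed)
    nodes = list(range(num_nodes))
    rng.shuffle(nodes)
    # Connect each newly added node to the previous one (guarantees connectivity).
    edges = {tuple(sorted(pair)) for pair in zip(nodes, nodes[1:])}
    max_edges = num_nodes * (num_nodes - 1) // 2
    target = min(max_edges, round(avg_degree * num_nodes / 2))
    # Randomly add missing edges until the desired meshing level is reached.
    while len(edges) < target:
        u, v = rng.sample(nodes, 2)
        edges.add((min(u, v), max(u, v)))
    return edges
```

For example, `random_topology(18, 4, seed=1)` yields a connected 18-node graph with 36 links; assigning capacities and the GW/AR roles would be separate steps.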
The greedy hill-climbing heuristic is applied as the MSC RPs construction approach for the chosen topologies to reduce computational complexity, as explained in Section 4.1. This is crucial for topologies with more than 7 nodes, that is, for the cases of 18 and 33 nodes shown (we tested even larger topologies, as mentioned further on). The hill-climbing method applies our cost function to each topology to find a near-optimal tree at each iteration, as generally set out in Eq. (10): the minimum tree is initially built by checking capacity (Eqs. (6) and (9)(a)), so that the first graph/RP is selected at a cost based on InvCap. Hill-climbing consequently attempts to find the near-optimum RPs-set with enough diversity by replacing the latest added RP with a successor RP if the latter is more suitable. The heuristic terminates at the best near-optimum Minimum Set Cover subject to maximized capacity overlaps and path correlations for a given topology, based on the parameters in Table 1. It ought to be mentioned that the earlier MPR offline algorithm falls short of generating a set of sufficiently diverse RPs for the investigated random topologies, which emphasizes the scalability of MSC. This is due to MPR's immediate application of link weights with no graph-based comparison.
Fig. 4 depicts graphs generated in Matlab, shown as such for authenticity. The images are derived from four T1 network topology specifications in Table 1 having four different levels of average node degree (random meshing). Each T1 network topology is termed the Original Graph and is located at the left of each of the four rows. In each row, each Resulting Tree corresponds to one RP derived from the Original Graph and represents a GW-to-AR configuration of a connected graph. All the Resulting Trees/RPs in each row in Fig. 4 constitute an RPs-set. Wired link capacities are allocated randomly and are shown next to graph edges as non-integer numbers. As mentioned, links closer to the GW have higher capacities [29] to reflect heterogeneity and deployment practicalities. Hill-climbing makes the Resulting Trees/RPs in each row differ, e.g. in some cases in Fig. 4 the paths do not branch out to each AR separately but include AR(s) in the path(s) to edge AR(s). Other topologies from Table 1 are not depicted, as the density of links and nodes is too high to be discernible, as revealed in Fig. 5, which is displayed for indicative purposes. Note: We ran the offline algorithm for all TxSy and even larger topologies with 60 routers/nodes in [32], but omitted these from the online evaluations as explained in Section 6.4.

Online NS2-based performance analysis
After the RPs-set is rendered for each topology in the offline part, reverse engineering associates each RP with an independent Dijkstra-based cost array to transform it into an instance of OSPF (as in MT-OSPF). It is then integrated into the packet-level network simulations of the NS2 online part. We adopt the same online TE algorithm in MSC and QMSC for inserting traffic over the chosen RPs as used in MPR and QMPR respectively [1,21], shown as steps 8 and 9 in Fig. 3. The RP selection policy is applied at the Ingresses (GW(s) and ARs) for each commodity (i.e. IP session) to regulate the traffic flows.
For the case of MSC, the suitability cost of an RP is associated with the unused bandwidths of its paths, to avoid potential bottleneck(s). If several RPs meet the criteria for bandwidth availability, one RP is picked randomly. In QMSC, RPs are ranked in terms of the highest bottleneck bandwidth available, as defined in [1]. Subsequently, based on a session's classification and its QoS/Service Level Requirements (SLR), namely jitter, latency and packet loss, the RP selection policy is enforced based on the lowest cost. For both MSC and QMSC, all of a session's packets keep traversing the same RP to avoid transport layer disruptions due to packet reordering and jitter.
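The two Ingress-side selection policies can be sketched as below. This is a simplified illustration under stated assumptions: each RP offers a single path per session, residual bandwidths are known per link, and the QoS/SLR filtering of QMSC is not modelled, only its ranking by bottleneck bandwidth. The names `bottleneck`, `select_plane_msc` and `rank_planes_qmsc` are hypothetical.

```python
import random


def bottleneck(path_edges, residual):
    """Smallest residual bandwidth along a path."""
    return min(residual[e] for e in path_edges)


def select_plane_msc(paths_by_plane, residual, demand, rng=random):
    """MSC-style choice: a random plane among those able to carry the demand."""
    feasible = [plane for plane, path in paths_by_plane.items()
                if bottleneck(path, residual) >= demand]
    # None means no plane qualifies, i.e. the session is blocked.
    return rng.choice(feasible) if feasible else None


def rank_planes_qmsc(paths_by_plane, residual):
    """QMSC-style ranking: highest bottleneck bandwidth first."""
    return sorted(paths_by_plane,
                  key=lambda plane: bottleneck(paths_by_plane[plane], residual),
                  reverse=True)
```

Once a plane has been selected for a session, all of that session's packets would stay on it, in line with the policy described above.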
We now proceed with analysing the performances in terms of various metrics that are averaged over many simulations conducted based on the traffic classes in Table 2 and relative to the range of normalized link capacities.
Throughput: Table 3 shows the throughput results for the protocols tested. The mean throughput is generally the highest in the case of MSC, showing sustained improvements in load balancing. This is due to the availability of a more diverse set of routes, which results in sessions being distributed over many available RPs. The smaller throughput of MPLS in comparison with MSC is caused by more traffic blocking. This is due to the lower path diversity and hence the relatively smaller utilization of routing resources in networks. QMSC blocks a higher number of sessions due to the enforcement of QoS requirements, causing a generally lower throughput compared with MSC and MPLS. The worst performance is observed, as expected, for the OSPF and InvCap methods due to the availability of only one path between Ingress-Egress pairs. The results of OSPF and InvCap diverge because different sets of shortest paths are rendered by the different link weights. Furthermore, it can be noticed that the throughput is generally higher as the topology size increases in terms of links and nodes. This is justified as a relatively higher traffic volume is injected into the network thanks to a higher number of traffic Ingresses and a higher overall network capacity.

Blocking rate:
We define the blocking rate percentage as the ratio between the number of sessions that were injected into the network and the number of sessions that were successfully delivered in their entirety during the simulation. In the case of MSC, as observable from Table 4, the blocking rate is generally the lowest. Higher path diversity in MSC facilitates the availability of a comparatively higher number of paths that fully meet the minimum required bandwidth. The exertion of QoS criteria for RP selection in QMSC often means a lack of an available qualified path. This leads to a generally higher blocking of sessions in QMSC in comparison with MSC and MPLS, especially when the network becomes congested to the point of over-utilization towards the end of the simulation time. Meanwhile, the blocking in the case of OSPF and InvCap is the highest due to far fewer paths being available.

Delay:
Table 5 displays the end-to-end packet delivery delays across the different topologies. With MSC, QMSC and MPLS, multiple routes are available between every Ingress-Egress pair, hence shortest-hop routes are not solely used. Therefore, sessions forwarded over many RPs/LSPs experience higher delays. The delay is relatively lower in the case of MSC for smaller topologies, as the routes are relatively similar and shorter in hop length, resulting in the delays being contained much better, particularly with higher throughput where shorter paths (and queues) get congested. As the network topology size grows, routes are relatively longer (in hops) and carry more traffic volume in the cases of MSC, MPLS and QMSC. This leads to a higher overall delay but increases the throughput while reducing the blocking rate. Increased traffic delivery through multipath routing using longer paths/routes comes with the trade-off of higher average delays, as already noted in [42,43]. QMSC generally experiences lower end-to-end delays than MSC or MPLS due to the QoS enforcement and higher blocking.

Packet loss rate:
We define the packet loss rate as the percentage ratio between the packets dropped and the packets successfully delivered during a simulation. Packet loss occurs when queue capacities become insufficient to meet the surge in traffic and the increasing congestion in the network. The packet loss in the case of the MSC method is generally higher, as can be observed from Table 6. This can be justified by a higher amount of traffic being transmitted in MSC compared with the other approaches, as the blocking rate for MSC was observed to be the lowest on average. Meanwhile, QMSC generally performs better than MSC and MPLS due to the QoS enforcement regulating the packet distribution into the network and resulting in lower losses. The loss rate is in general the lowest in the case of InvCap and OSPF, as a significantly smaller amount of traffic is accommodated within the network. Correspondingly, the loss rate is smaller for OSPF relative to InvCap as a smaller throughput is achieved. MPLS also performs better than MSC in most cases, as relatively less throughput is achieved. Moreover, the loss rate generally rises with the topology size, as more traffic load is accommodated in larger topologies. It should also be noted that, as we use a ''random walk'' to build very random topologies, it can be the case that in a given topology some links get heavily reused by many RPs. This causes increased packet loss for MSC. In the cases of OSPF and InvCap this is not a great problem since there is only one shortest path used at a time.

Jitter:
Higher jitter in the case of MSC occurs due to the traversal of longer routes and the higher traffic load admitted into the network, which correspondingly result in more delay variations. The possible existence of heavily loaded links for MSC, as explained in the case of packet loss, can also be a contributing factor to higher jitter. Jitter is lower for QMSC than for MSC and MPLS due to the QoS criteria being enforced at every Ingress. The rise in the supported traffic volume within the network in the case of the multipath approaches leads to a larger jitter, especially for the larger topologies, as compared with InvCap and OSPF.

Further analysis
Results are not shown for larger topologies with 60 routers nor for the extremely meshed TxS4 topologies [32]. The networks with 60 routers contain thousands of links and we were unable to draw differentiated conclusions from the performances. For the TxS4 topologies the results were similar to the TxS3 cases, which are very dense, as seen in Fig. 5, and on the verge of practicality. Our tested topologies are deemed sufficiently large and meshed, and surpass the relevant recommendation on the limit of up to 25-router networks [34] where multipath routing reaps benefits. For more random cases, the MSC offline algorithm criteria would need to be revisited. We believe that such large access networks are currently out of scope (as are the large data centre and interdomain networks discussed in the Introduction) and could be either partitioned or rearranged.
Even in the shown performances, slight discrepancies due to the randomness of topological configurations were noticed across all schemes, e.g. MPLS has a higher packet loss in the smaller network T3S2 than in T3S3, whereas MSC achieves better packet loss consistency while having nonlinear delay variations between the topologies. Since bandwidth availability is checked for every session, as reflected in the blocking rate, performance degradations are less severe for delays, packet losses and jitter for all protocols tested. Thus, multipath via MSC is a comprehensive solution and confirms our starting premise from Section 1.1 that utilizing both shorter and longer paths, subject to a topology, leads to performance enhancements. As the throughput increases, overuse of shorter path(s) causes performance degradation [28]. The trade-off between distributing higher traffic loads and delays is thus attenuated, especially if the delays are negligible relative to the end-to-end delays emanating to and from the Internet. This confirms a high suitability of MSC for the considered IP access network topologies in their entirety, which are located at the edges of the Internet's routing fabric. Finally, our experiments and comparisons are set up to demonstrate a balanced approach between capacity projections and path correlation in the offline algorithm, that is, between the arbitrary tuning parameters from Eqs. (7) and (9), respectively. This is due to the first RP chosen being the InvCap one and link capacities being set to decrease in magnitude nearer to the ARs. Essentially, the balance sought is between having more RPs (i.e. a greater value of the first parameter) and higher diversity between the RPs (i.e. a greater value of the second parameter) within a cover (Section 5.2).

Conclusion
This paper describes a novel TE approach for future IP access networks, termed MSC after the mathematical problem of the same name. The essence of the novelty is in the offline TE part, which is specifically suited to random and large topologies of access networks and which is based on the modelling of graphs. Sets of graphs, called RPs, are built in Matlab and represent multiple instances of intradomain OSPF. Our dynamic cost function applies optimality criteria

Fig. 1. Top figure: A simple topology with 3 RPs between a single ingress/egress pair (S,D) and transit nodes. Coloured arrows indicate paths for each RP, and numbers indicate link IDs. Bottom figure: Link weights for one RP.

Fig. 2. An underlying network topology G with four vertices and five edges.

A. Mihailovic et al.

Fig. 3. A schematic diagram of the key components of the MSC offline and online algorithms.

Fig. 5. Matlab generated images: the Original Graph is the T2S3 topology from Table 1 with 18 nodes, shown with some examples of its resulting trees/RPs with link capacities (lack of clarity and extreme link density shown for indicative purposes).

∼
(  ) represents the projected available capacity (i.e. the projected residual capacity 8 that could potentially cause bottleneck) on path    that is associated with plane  and commodity .The available bandwidth is calculated by taking into consideration the bottleneck on every path at various instances:

Table 1
Parameters of the tested random topologies sets.
Table 5 displays the end-to-end packet delivery delays across the different topologies.With MSC, QMSC and MPLS multiple routes are

Table 2
Traffic types a and associated QoS requirements.