Incentivizing Stable Path Selection in Future Internet Architectures

By delegating path control to end-hosts, future Internet architectures offer flexibility for path selection. However, there is a concern that the distributed routing decisions by end-hosts, in particular load-adaptive routing, can lead to oscillations if path selection is performed without coordination or accurate load information. Prior research has addressed this problem by devising path-selection policies that lead to stability. However, little is known about the viability of these policies in the Internet context, where selfish end-hosts can deviate from a prescribed policy if such a deviation is beneficial from their individual perspective. In order to achieve network stability in future Internet architectures, it is essential that end-hosts have an incentive to adopt a stability-oriented path-selection policy. In this work, we perform the first incentive analysis of the stability-inducing path-selection policies proposed in the literature. Building on a game-theoretic model of end-host path selection, we show that these policies are in fact incompatible with the self-interest of end-hosts, as these strategies make it worthwhile to pursue an oscillatory path-selection strategy. Therefore, stability in networks with selfish end-hosts must be enforced by incentive-compatible mechanisms. We present two such mechanisms and formally prove their incentive compatibility.


INTRODUCTION
The past 20 years of research on next-generation Internet architectures have shown the benefits of path awareness and path control for end-hosts, and multiple path-aware network architectures have been proposed. Many of these architectures, including RON [2], Platypus [29], MIRO [38], Pathlets [16], Segment Routing [10], and SCION [3], allow end-hosts to select the inter-domain paths over which their data packets are forwarded. One principal argument for such path control is that it enables load-adaptive routing, i.e., allows the end-hosts to avoid congested links, and should therefore lead to a relatively even traffic distribution. However, load-adaptive routing creates new challenges, in particular the introduction of instabilities under certain conditions. Instability due to load-adaptive routing typically appears in the form of oscillations, i.e., periodic upswings and downswings of link utilization, leading to a large variance of the traffic load in a short time span. According to the IETF, a central obstacle to deployment of path-aware network architectures is 'oscillations based on feedback loops, as hosts move from path to path' [6]. Indeed, such oscillations can be shown to occur if path-selection decisions are taken on the basis of outdated load information [14,35], which is the case in any real system. Such oscillations are undesirable for many reasons, both from the perspective of the end-hosts and the perspective of the network operator. If oscillation occurs when a link is near its capacity limit, there is a danger of queue build-up, jitter, and, as a result, unpredictable performance. Moreover, oscillation temporarily leads to a heavily skewed load distribution over paths, causing higher overall queuing latency than with a more equal traffic distribution. Due to the large variance of the load level over time, network operators have to perform substantial overprovisioning of link capacities, which is undesirable from a business perspective.
Moreover, oscillation of inter-domain traffic imposes additional overhead for intra-domain traffic engineering (e.g., MPLS circuit setup), as oscillating inter-domain flows may constantly switch between inter-AS interfaces. From the end-host perspective, oscillation causes packet loss and thus forces congestion-control algorithms into recurring restarts, negatively affecting throughput.
To avoid these damaging effects, researchers have devised numerous schemes that aim to guarantee stability of load-adaptive routing. However, to the best of our knowledge, no scheme so far has aimed at providing stability in Internet architectures with end-host path control. Many systems have been designed under the assumption of network-based path selection, i.e., hop-by-hop forwarding according to decisions taken by intermediate routers [11,17,23,25]. These systems achieve convergence by appropriately adjusting how much traffic is forwarded to each next hop towards a destination and cannot be used if packets must be sent along paths selected by end-hosts. Other systems allow end-point path selection, but are targeted to an intra-domain context where the end-points (typically ingress and egress routers) are under the control of a network operator [7,14,20,21,22,27]. In an intra-domain context, network operators are able to prescribe arbitrary path-selection procedures that generate stability. Conversely, in an inter-domain context, the end-points are not under control of network operators and can thus not be forced to adopt a non-oscillatory path-selection strategy. Instead, as end-hosts must be assumed to be selfish, they can only be expected to adopt path-selection strategies that optimize performance from their individual perspective.
By performing a game-theoretic analysis, we show in this paper that the non-oscillatory path-selection strategies traditionally proposed in the literature on stable source routing [7,14,20,21,22,27] are incompatible with the self-interest of end-hosts. Assuming that such non-oscillatory path-selection strategies are universally adopted, an end-host can increase its utility by deviating in favor of a strategy that is oscillatory. Therefore, stability of load-adaptive routing in an inter-domain context cannot be achieved by relying only on end-point path selection. Instead, network operators have to incentivize end-hosts to adopt one of the well-known convergent path-selection strategies with stabilization mechanisms. These mechanisms have to be incentive-compatible, i.e., the mechanisms must create an incentive structure such that it is in an end-host's self-interest to adopt a non-oscillatory path-selection strategy. In this work, we present two such stabilization mechanisms, FLOSS and CROSS, and formally prove their incentive compatibility. These mechanisms employ different techniques to disincentivize oscillatory switching between paths, namely limiting the migration rate between paths (FLOSS) and imposing a cost on switching between paths (CROSS). To complement our mainly theoretical work, we also discuss how our findings could be practically applied.

Contribution
This paper revisits the theoretical study of the dynamic effects of end-point path selection, for the first time focusing the analysis on inter-domain networks where the end-points are selfish and uncontrolled. We present a game-theoretic model that allows us to investigate which path-selection strategies will be adopted by selfish end-hosts. In particular, we introduce the notion of equilibria to path-selection strategies (PSS equilibria). Moreover, we formally show that the non-oscillatory path-selection strategies proposed in the existing literature do not form such PSS equilibria. Thus, we provide evidence towards the hypothesis that stability in load-adaptive routing over multiple domains cannot be achieved by exclusively relying on end-hosts' path-selection behavior. To remedy this problem, we leverage insights from mechanism design to devise two incentive-compatible stabilization mechanisms enforced by network operators. While these mechanisms build on existing insights from intra-domain traffic engineering, their methods of incentivization represent a novel approach to achieve stability in inter-domain networks with load-adaptive routing. We formally prove the incentive compatibility of both mechanisms and discuss their practical application.

OSCILLATION MODEL

Parallel-Path Systems
In order to study oscillation in network architectures with end-host path selection, we build on the well-established Wardrop model [37], which is the standard model for studying the interactions of selfish agents in computer networks [28,32,33]. In the Wardrop model, an infinite number of end-hosts, each controlling an infinitesimal traffic share, select one path $\pi$ among multiple paths $\Pi$ between two network nodes. Every path $\pi$ has a load-dependent cost, where the path-cost function $c_\pi$ is typically interpreted as latency. The end-hosts' path-selection decisions form a congestion game, where the path-selection decisions of end-hosts both determine and follow the load $f_\pi$ on every path $\pi$ [5,19,30].
In this work, we analyze congestion games with a temporal component, i.e., end-hosts take path-selection decisions over time based on currently available information. More precisely, an end-host performs an average of $r > 0$ re-evaluations per unit of time. The aggregate re-evaluation behavior is uniform over time, i.e., when dividing time into intervals of length $\epsilon \in (0, 1]$, $r\epsilon$ re-evaluations are performed in any interval. Whenever an end-host performs a re-evaluation, it chooses one path $\pi$ to its destination according to a freely chosen path-selection strategy $\sigma$. We thus formalize the environment of congestion games as parallel-path systems:
Definition 2.1. A parallel-path system $O := (\Pi, r, p, T, A_0, \mathcal{S})$ is a tuple, where a total demand normalized to 1 is distributed over parallel paths $\pi \in \Pi$ among which end-hosts can select; $r > 0$ is the average number of re-evaluations per end-host and unit of time; $p \geq 1$ is the steepness of the path cost as a function of the load (i.e., $c_\pi = (f_\pi)^p$); $T \geq 0$ is the average time that it takes for cost information to reach the agents; $A_0 \in [0, 1]^{|\Pi|}$ is the initial load matrix, where the entry $A_{0\pi} = f_\pi(0)$; and $\mathcal{S}$ is the strategy profile, defining for every available path-selection strategy $\sigma$ the share $\mathcal{S}(\sigma)$ of end-hosts that permanently apply strategy $\sigma$.
Every congestion game possesses at least one Wardrop equilibrium, consisting of a traffic distribution where no single agent can reduce its cost by selecting an alternative path [30]. If the agents take path-selection decisions based on up-to-date cost information of paths ($T = 0$), convergence to Wardrop equilibria is guaranteed and persistent oscillations can thus not arise [12,13,34]. However, in practice, the cost information possessed by agents is stale ($T > 0$), i.e., the information describes an older state of the network. If such stale information is present, undesirable oscillations can arise [14]. Therefore, parallel-path systems can be oscillation-prone:
Definition 2.2. A parallel-path system $O$ is oscillation-prone if and only if $T > 0$.
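The tuple of Definition 2.1 can be written down as a small data structure. The following is only a sketch: the class name, the field names, and the validation rule that strategy shares sum to 1 are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ParallelPathSystem:
    """Illustrative container for a parallel-path system (names are assumptions)."""
    paths: Tuple[str, ...]      # Pi: identifiers of the parallel paths
    r: float                    # average re-evaluations per end-host and unit of time
    p: float                    # steepness of the path cost, c_pi = f_pi ** p
    T: float                    # average staleness of the cost information
    A0: Tuple[float, ...]       # initial load per path, A0[i] = f_i(0)
    profile: Dict[str, float]   # strategy name -> share of end-hosts applying it

    def __post_init__(self) -> None:
        if self.r <= 0 or self.p < 1 or self.T < 0:
            raise ValueError("require r > 0, p >= 1, T >= 0")
        if len(self.A0) != len(self.paths):
            raise ValueError("A0 needs one entry per path")
        if abs(sum(self.A0) - 1.0) > 1e-9:
            raise ValueError("total demand is normalized to 1")
        # Assumption: the shares of all strategies add up to the whole population.
        if abs(sum(self.profile.values()) - 1.0) > 1e-9:
            raise ValueError("strategy shares must sum to 1")

    def path_cost(self, load: float) -> float:
        """Load-dependent path cost c_pi = (f_pi) ** p."""
        return load ** self.p

    @property
    def oscillation_prone(self) -> bool:
        """A system is oscillation-prone if and only if T > 0."""
        return self.T > 0


system = ParallelPathSystem(("alpha", "beta"), r=1.0, p=2.0, T=0.5,
                            A0=(0.8, 0.2), profile={"greedy": 1.0})
print(system.oscillation_prone, system.path_cost(0.5))
```

Keeping the dataclass frozen mirrors the fact that the system parameters are fixed while the loads evolve over time.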

Path-Selection Strategies
In a congestion game, end-hosts select paths according to freely adopted path-selection strategies. In order to enable a theoretical treatment, we follow Fischer and Vöcking [14] in assuming that path-selection strategies are memory-less, i.e., not dependent on anything else than currently observable information. Therefore, any path-selection strategy σ can be fully characterized by two elements, σ = (R, u), which we will describe in the following.
First, every strategy is characterized by the expected time $R$ between re-evaluations of an end-host. The expected re-evaluation period $R$ reflects the reallocation behavior of end-hosts that nondeterministically re-evaluate the costs of path options, decide for one option based on the perceived costs, and keep sending on the selected path until the next re-evaluation is due. The expected re-evaluation period $R$ has to be in accordance with the parameter $r$ of the parallel-path system, which describes the average number of re-evaluations per end-host and unit of time. Hence, $R = 1/r$.
Second, every strategy $\sigma$ is based on a path-selection function $u(\pi, t \mid \tilde{\pi})$, which gives the probability of selecting path $\pi$ at time $t$ if the currently used path is $\tilde{\pi}$. Given universal adoption of a strategy $\sigma$ and $r\epsilon$ re-evaluations per interval of length $\epsilon$, the number of end-hosts on path $\pi$ in a two-path system changes by the amount
$$\Delta_\epsilon f_\pi(t) = -r\epsilon \cdot u(\tilde{\pi}, t \mid \pi) \cdot f_\pi(t) + r\epsilon \cdot u(\pi, t \mid \tilde{\pi}) \cdot f_{\tilde{\pi}}(t)$$
within an interval starting at time $t$. If $\epsilon$ is chosen to be infinitesimal, we obtain the rate of change:
$$\frac{\partial f_\pi(t)}{\partial t} = -r \cdot u(\tilde{\pi}, t \mid \pi) \cdot f_\pi(t) + r \cdot u(\pi, t \mid \tilde{\pi}) \cdot f_{\tilde{\pi}}(t)$$
Throughout the rest of the paper, we describe oscillation dynamics by such differential equations.
An example of a path-selection strategy is the greedy path-selection strategy $\sigma_g$, which selects the path perceived as cheaper:
$$u_g(\tilde{\pi}, t \mid \pi) = \begin{cases} 1 & \text{if } c_{\tilde{\pi}}(t - T) < c_\pi(t - T), \\ 0 & \text{otherwise.} \end{cases}$$
Conversely, the probability of staying on a path is $u_g(\pi, t \mid \pi) = 1 - u_g(\tilde{\pi}, t \mid \pi)$. At time $t$, the number of end-hosts on a more expensive path $\pi$ thus changes with rate $-r \cdot f_\pi(t)$.
Whether an oscillation-prone system in fact experiences oscillation entirely depends on the path-selection strategies adopted by end-hosts. In the next section, we present the example of an oscillation-prone system that experiences oscillation for some path-selection strategy, but converges to stability for a different strategy.
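The load change described in this section can be illustrated numerically for a two-path system. The helper names `load_change` and `load_rate` and their argument names are hypothetical; the sketch assumes that the total demand is normalized to 1, so the load on the other path is one minus the load on the path under consideration.

```python
def load_change(f_pi: float, u_leave: float, u_join: float, r: float, eps: float) -> float:
    """Change of the load on path pi within an interval of length eps.

    u_leave: probability that a re-evaluating end-host on pi switches away.
    u_join:  probability that a re-evaluating end-host on the other path switches to pi.
    """
    f_other = 1.0 - f_pi  # two paths, total demand normalized to 1
    return -r * eps * u_leave * f_pi + r * eps * u_join * f_other


def load_rate(f_pi: float, u_leave: float, u_join: float, r: float) -> float:
    """Rate of change obtained for an infinitesimal interval length."""
    return -r * u_leave * f_pi + r * u_join * (1.0 - f_pi)


# Greedy strategy on the path perceived as more expensive:
# everyone who re-evaluates leaves, nobody joins, so the rate is -r * f_pi.
print(load_rate(f_pi=0.8, u_leave=1.0, u_join=0.0, r=2.0))
```

Dividing `load_change` by `eps` gives `load_rate`, which is the step from the finite-interval description to the differential equation.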

Example of Oscillation
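For every $T > 0$, oscillation occurs in a system in which all agents adopt the greedy path-selection strategy $\sigma_g$ presented in the previous section. The dynamics of a system with universal adoption of the greedy strategy are given by the partial differential equation
$$\frac{\partial f_\alpha(t)}{\partial t} = \begin{cases} -r \cdot f_\alpha(t) & \text{if } c_\alpha(t - T) > c_\beta(t - T), \\ r \cdot f_\beta(t) & \text{if } c_\alpha(t - T) < c_\beta(t - T), \\ 0 & \text{otherwise.} \end{cases}$$
An analogous equation holds for $f_\beta$.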
Figure 1: Oscillation structure for an oscillation-prone system. A and W are calculated according to Equation (6).
We henceforth refer to all points in time $t^+$ where $c_\alpha(t^+ - T) = c_\beta(t^+ - T)$ as turning points, as $f_\alpha(t)$ switches between increasing and decreasing at these moments, and write $t^+(t)$ for the most recent turning point $t^+ < t$.
Solving the differential equation piece-wise yields a recursive function for $f_\alpha(t)$ in terms of the most recent turning point $t^+(t)$. Since $T$ is constant, $f_\alpha(t)$ is periodic after the first turning point $t^+_1$ irrespective of the initial imbalance $A_0$. Therefore, the oscillation can be described by a non-recursive function with amplitude $A$ and turning-point spacing $W$ (Equation (6)), where $t^+(t) = t - (t \bmod W)$ is a multiple of $W$. Figure 1 shows an example of $f_\alpha(t)$ for the oscillation where $A_0$ has been chosen as $A$ in order to skip the irregular starting phase. Figure 1 also highlights the time interval during which path $\alpha$ is the cheaper path (in green, between $t^*_1$ and $t^*_2$) and the time interval during which path $\alpha$ is perceived to be the cheaper path (in red, between $t^+_1$ and $t^+_2$). Clearly, the discrepancy between reality and perception of path costs is the source of oscillation, as the discrepancy leads to increasing load on a path even when it is no longer the cheaper path (i.e., between $t^*_2$ and $t^+_2$). Due to the periodicity of this phenomenon, there exists no limit $\Delta^*$ of load difference and the oscillation-prone system experiences oscillation. An interesting observation is that both the amplitude ($A$) and the oscillation period ($2W$) increase with the staleness of the information ($T$); any $T > 0$ leads to oscillation, and only $T = 0$ ensures stability.
In contrast, if the strategy profile contains different path-selection strategies, an oscillation-prone system may experience stability (cf. example in Appendix A).
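The oscillation under universal greedy path selection can be reproduced with a simple forward-Euler simulation of a two-path system in which end-hosts compare costs that are $T$ time units old. This is a sketch under assumed parameters ($r = 1$, $p = 2$, $T = 0.5$, initial load 0.8 on path $\alpha$); the function name and the step size are illustrative. The closed-form peak load $1 - e^{-rT}/2$ used for comparison is our own short calculation for these dynamics (the load on the emptying path decays exponentially for another $T$ after the true crossing at 1/2) and is not claimed to be the paper's Equation (6).

```python
import math


def simulate_greedy(r=1.0, p=2.0, T=0.5, f_alpha0=0.8, dt=1e-3, t_end=40.0):
    """Forward-Euler simulation of the load on path alpha under the greedy strategy."""
    n_delay = int(round(T / dt))      # staleness expressed in simulation steps
    steps = int(round(t_end / dt))
    f = [f_alpha0]
    for k in range(steps):
        # Load as perceived by end-hosts: T time units old (initial load before t = 0).
        stale = f[max(0, k - n_delay)]
        cost_alpha, cost_beta = stale ** p, (1.0 - stale) ** p
        if cost_alpha > cost_beta:    # alpha perceived as more expensive: hosts leave
            df = -r * f[-1]
        elif cost_alpha < cost_beta:  # alpha perceived as cheaper: hosts on beta join
            df = r * (1.0 - f[-1])
        else:
            df = 0.0
        f.append(f[-1] + dt * df)
    return f


loads = simulate_greedy()
tail = loads[len(loads) // 2:]        # skip the irregular starting phase
print(max(tail), 1.0 - math.exp(-0.5) / 2.0)
```

With `T=0.0` the same function settles at equal load up to discretization noise, in line with the observation that only up-to-date information ensures stability.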

Equilibria on Path-Selection Strategies
In general, Nash equilibria refer to strategy profiles that do not allow for beneficial selfish strategy changes by individual agents. In the context of path-selection strategies, a Nash equilibrium is thus given if no end-host can improve its utility by switching to an alternative path-selection strategy. More formally, a Nash equilibrium on path-selection strategies can be defined as follows:
Definition 2.5. A strategy profile $\mathcal{S}^*$ is a Nash equilibrium on path-selection strategies (PSS equilibrium) in an oscillation-prone system $O = (\Pi, r, p, T, A_0, \mathcal{S}^*)$ if and only if all strategies $\sigma$ with $\mathcal{S}^*(\sigma) > 0$ have cost $C(\sigma \mid O) = C^*$ and all strategies $\sigma$ with $\mathcal{S}^*(\sigma) = 0$ have cost $C(\sigma \mid O) \geq C^*$.
It remains to formally define the cost $C(\sigma \mid O)$ of a strategy $\sigma$ in an oscillation-prone system $O$ with global strategy profile $\mathcal{S}$. First, we note that a global strategy profile $\mathcal{S}$, together with an initial strategy-adoption distribution for each path, uniquely defines the flow dynamics $f(t) = (f_\alpha(t), f_\beta(t))$ in oscillation-prone systems with two paths. As the flow share controlled by each agent is assumed to be negligible in the Wardrop model, the flow dynamics $f(t)$ are not affected by the choice of $\sigma$ when varying $\sigma$ for a single agent. The basic costs of the two path options $\alpha$ and $\beta$ at any moment $t$ are thus given by $c_\alpha(t)$ and $c_\beta(t)$, both uniquely defined by an oscillation-prone system $O = (\Pi, r, p, T, A_0, \mathcal{S})$.
Given expected re-evaluation periods of duration $R$, an end-host deciding for path $\pi$ at time $t$ incurs the usage cost $c_u(\pi, t)$ (Equation (7)). At time $t$, the cost of applying a strategy $\sigma$ is
$$c(\sigma, t \mid \tilde{\pi}) = \sum_{\pi \in \Pi} u(\pi, t \mid \tilde{\pi}) \cdot c_u(\pi, t),$$
where $\tilde{\pi}$ is the current path of the end-host before the decision at time $t$ and $u(\pi, t \mid \tilde{\pi})$ is the probability that path $\pi$ is selected at time $t$ given the current path $\tilde{\pi}$. Furthermore, the strategy also determines the probability distribution $P(\tilde{\pi} \mid t)$ that defines the probability of the current path being $\tilde{\pi}$ at time $t$. The expected cost for applying a strategy $\sigma$ at time $t$ is thus given as follows:
$$C(\sigma, t) = \sum_{\tilde{\pi} \in \Pi} P(\tilde{\pi} \mid t) \cdot c(\sigma, t \mid \tilde{\pi})$$
The expected cost of applying a strategy $\sigma$ in general can be derived as the average time-dependent strategy cost during a certain relevant time span $[t_0, t_1]$:
$$C(\sigma \mid O) = \frac{1}{t_1 - t_0} \int_{t_0}^{t_1} C(\sigma, t) \, dt$$
For systems that converge to stability at equal load, the relevant time span extends from $t_0 = 0$ until the time $t_\delta$ when the system has converged according to some criterion $\delta > 0$, i.e., $\Delta(t) < \delta$ for all $t > t_\delta$. The time after convergence does not have to be considered, as all strategies have the same cost in a system with equal path costs. For periodically oscillating systems, the relevant time span is defined as any interval that contains the periodically repeated sub-function. For an example of a PSS equilibrium analysis, see Appendix B.
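Once the strategy costs are known, the PSS-equilibrium condition reduces to a simple check. The sketch below assumes the usual reading of such an equilibrium: all adopted strategies have the same cost, and no unused strategy is cheaper. The function name, the dictionaries, and the cost values are hypothetical and only serve as an illustration.

```python
from typing import Dict


def is_pss_equilibrium(shares: Dict[str, float], costs: Dict[str, float],
                       tol: float = 1e-9) -> bool:
    """Check the equilibrium condition for a strategy profile.

    shares: strategy name -> share of end-hosts applying the strategy.
    costs:  strategy name -> cost of the strategy, given that profile.
    """
    adopted = [s for s, share in shares.items() if share > 0]
    unused = [s for s, share in shares.items() if share <= 0]
    if not adopted:
        return False
    c_star = costs[adopted[0]]
    # All adopted strategies must have the same cost C*.
    if any(abs(costs[s] - c_star) > tol for s in adopted):
        return False
    # No unused strategy may be cheaper than C*.
    return all(costs[s] >= c_star - tol for s in unused)


# Hypothetical costs: deviating to "greedy" is cheaper, so universal adoption
# of "convergent" is not an equilibrium.
print(is_pss_equilibrium({"convergent": 1.0, "greedy": 0.0},
                         {"convergent": 0.30, "greedy": 0.25}))
```

The check deliberately takes the costs as input, since computing them requires the flow dynamics of the whole system.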

LIMITS OF STABLE STRATEGIES
In this section, we investigate whether the stability-inducing path-selection strategies proposed in the literature form PSS equilibria. The question is whether an end-host can minimize its cost with a stability-oriented strategy if that strategy is universally adopted.
We perform this investigation by means of two case studies. In §3.1, we analyze the convergent rerouting policies designed by Fischer and Vöcking [14] and show that such rerouting policies are not compatible with the selfishness of end-hosts. In §3.2, we analyze the MATE algorithm [7] and show its equivalence to the rerouting policies discussed in §3.1.

Rerouting Policies by Fischer & Vöcking
A typical example of a convergent path-selection strategy has been proposed by Fischer and Vöcking [14]. The proposed path-selection strategy, which we henceforth refer to as the convergent strategy $\sigma_c$, works as follows: If an end-host discovers a path with lower cost according to stale information, the end-host switches to that path with a probability that is a linear function of the perceived latency difference. More formally, the probability $u_c(\pi, t \mid \tilde{\pi})$ to switch from path $\tilde{\pi}$ to path $\pi \neq \tilde{\pi}$ at time $t$ is:
$$u_c(\pi, t \mid \tilde{\pi}) = \begin{cases} \mu \cdot \dfrac{c_{\tilde{\pi}}(t - T) - c_\pi(t - T)}{\Delta_{\max}} & \text{if } c_\pi(t - T) < c_{\tilde{\pi}}(t - T), \\ 0 & \text{otherwise.} \end{cases}$$
Here, $\mu$ is a parameter in $[0, 1]$ and the latency difference is normalized by $\Delta_{\max}$, which is 1 in parallel-path systems as defined in §2.1. The dynamics of a two-path oscillation-prone system where strategy $\sigma_c$ is universally adopted can thus be described by a delay-differential equation (DDE), Equation (12). This DDE describes a damped oscillator with delayed feedback and does not have an explicit solution [4]. However, we can numerically compute a solution using the method of steps [8].
As Figure 2 shows, the choice of the parameter µ is critical for the strategy to actually lead to convergence. For high values of µ, such as 1, the strategy fails to produce convergence and yields undamped periodic oscillations. For low values of µ, such as 0.1, the system monotonically approaches the equilibrium without overshooting, i.e., it is overdamped (or, if nearly avoiding overshooting, critically damped). For values in-between, such as 0.5, the system eventually converges to stability at equal load, but only after overshooting, i.e., it is underdamped. However, for both the overdamped and the underdamped convergent strategies, we can make the following observation: Universal adoption of the convergent path-selection strategy σ c does not represent a PSS equilibrium, neither in its underdamped nor in its overdamped variant.
In the case of the overdamped strategy (e.g., σ c with µ = 0.1), the link loads monotonically approach each other and thus the greedy strategy allows an end-host to make use of a cheaper path sooner, making it the best-response strategy given universal adoption of σ c . In the case of the underdamped convergent strategy (e.g., σ c with µ = 0.5), the fact that the strategy is not a PSS equilibrium in general is not obvious. However, we can show that there exist alternative strategies to the underdamped rerouting policy that reduce a deviant agent's cost, see Appendix C.
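The three damping regimes can be explored with a small numerical experiment. The sketch below integrates a two-path system with forward Euler, assuming that an end-host on the path perceived as more expensive switches with probability $\mu$ times the perceived cost difference (normalized by $\Delta_{\max} = 1$). The parameters $r = 1$, $p = 2$, $T = 2$ and the initial load 0.8 are assumptions chosen for illustration and are not necessarily those underlying Figure 2; the function name is hypothetical.

```python
def simulate_convergent(mu, r=1.0, p=2.0, T=2.0, f_alpha0=0.8, dt=0.01, t_end=100.0):
    """Forward-Euler simulation of the load on path alpha under the convergent strategy."""
    n_delay = int(round(T / dt))      # staleness expressed in simulation steps
    steps = int(round(t_end / dt))
    f = [f_alpha0]
    for k in range(steps):
        stale = f[max(0, k - n_delay)]           # load as known T time units ago
        diff = stale ** p - (1.0 - stale) ** p   # perceived c_alpha - c_beta
        cur = f[-1]
        if diff > 0:
            # Hosts on alpha switch away with probability mu * diff.
            df = -r * mu * diff * cur
        else:
            # Hosts on beta switch to alpha with probability mu * (-diff).
            df = r * mu * (-diff) * (1.0 - cur)
        f.append(cur + dt * df)
    return f


for mu in (0.1, 0.5, 1.0):
    loads = simulate_convergent(mu)
    tail = loads[int(0.7 * len(loads)):]
    print(mu, round(min(loads), 3), round(max(tail) - min(tail), 3))
```

Under these assumed parameters, the run with `mu=0.1` approaches equal load without overshooting, `mu=0.5` overshoots before settling, and `mu=1.0` keeps oscillating.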

MATE Algorithm
The MATE algorithm [7] was designed for the intra-domain context, where an ingress router has to distribute its demand $d$ between multiple label-switched paths to a given egress router. As these ingress routers are under control of the domain operator, the MATE algorithm pursues convergence to the socially optimal traffic distribution, which minimizes latency from a global perspective, but is generally unstable given selfish end-hosts. In the context of inter-domain networks, the MATE algorithm is instantiated such that it converges to a Wardrop equilibrium, a type of equilibrium that is stable under the assumption of selfish agents.
We analyze whether applying the MATE algorithm is rational from an end-host's perspective. An end-host in an oscillation-prone two-path system would execute the MATE algorithm as follows.
In every re-evaluation, the end-host selfishly optimizes its traffic allocation $F_\alpha$, $F_\beta$, where $F_\alpha = d - F_\beta$. In order to conform to the Wardrop model, the demand $d$ is negligible from a global perspective. A MATE optimization step updates the allocation vector $F$ using a coefficient $\gamma$. In order to reach convergence despite stale information, the coefficient $\gamma$ has to conform to a certain upper bound [7]. Moreover, $[F]^+$ represents a projection of the allocation vector $F$ onto the feasible allocation set. As we show in Appendix D, the dynamics of an oscillation-prone system with universal adoption of the MATE algorithm are described by a differential equation that is equivalent to Equation (12) for a choice of $\mu = \gamma/2$. An oscillation-prone system with universal adoption of $\sigma_c$ and a system with universal adoption of the MATE algorithm thus exhibit the same flow dynamics, which allow for beneficial deviation: The path-selection strategy as prescribed by the MATE algorithm is equivalent to the path-selection strategy $\sigma_c$. Thus, universal adoption of the MATE algorithm does not constitute a PSS equilibrium either.

Conclusion
In summary, the kind of convergent path-selection strategies proposed in the literature cannot be assumed to be adopted by selfish end-hosts, as deviating from these strategies (e.g., by switching faster than prescribed by the strategy) is beneficial to an end-host.
Stability in a path-aware network architecture with selfish end-hosts can thus not be guaranteed by non-oscillatory path-selection strategies that prescribe a maximum rate of change to be respected by end-hosts. Instead, the network could employ mechanisms that incentivize end-hosts to follow non-oscillatory path-selection strategies. This finding reflects a similar result [1,15] in the context of congestion control, namely that socially desirable behavior of end-hosts can only be enforced with network support.

STABILIZATION MECHANISMS
As argued in the previous section, rational end-hosts in networks with unrestricted path choice are unlikely to adopt convergent path-selection strategies. Therefore, there is a need for mechanisms that allow network operators to incentivize the adoption of path-selection strategies that induce stability at equal load, i.e., incentive-compatible stabilization mechanisms. First, we integrate the concept of traffic-steering mechanisms into our game-theoretic model (§4.1). Second, we specify in §4.2 the conditions under which these mechanisms are incentive-compatible.

Traffic-Steering Mechanisms
In order to affect the path-selection decisions of end-hosts in an oscillation-prone system $O$, a traffic-steering mechanism $M$ needs to alter the strategy cost $C(\sigma \mid O)$ for at least one path-selection strategy $\sigma$. A mechanism $M$ thus defines a function $c_M(\pi, t)$ that quantifies the mechanism-imposed cost for using path $\pi$ at time $t$. This cost is imposed onto the user of a path $\pi$ in addition to the load-dependent path cost.
If a mechanism $M$ is active, the usage cost $c^M_u$ extends the standard usage cost $c_u$ from Equation (7) as follows:
$$c^M_u(\pi, t) = c_u(\pi, t) + c_M(\pi, t)$$
The cost formulas $c^M(\sigma, t \mid \tilde{\pi})$, $C^M(\sigma, t)$, and $C^M(\sigma \mid O)$ can be constructed from $c^M_u(\pi, t)$, analogously to §2.4.

Incentive Compatibility
In general, incentive-compatible mechanisms are mechanisms that incentivize a certain form of desirable behavior. In our context, we consider traffic-steering mechanisms to be incentive-compatible if these mechanisms incentivize the desirable behavior of adopting a non-oscillatory path-selection strategy. In other words, an incentive-compatible mechanism creates a PSS equilibrium, i.e., a situation where every end-host minimizes its cost by adopting a non-oscillatory path-selection strategy, given that all other end-hosts do so:
Definition 4.1. A traffic-steering mechanism $M$ is an incentive-compatible stabilization mechanism for an oscillation-prone system $O$ if there is a strategy profile $\mathcal{S}^*$ such that (i) $\mathcal{S}^*$ leads to stability at equal load and (ii) $\mathcal{S}^*$ represents a PSS equilibrium with respect to the cost function $C^M(\sigma \mid O)$.
In the following two sections, we present two instances of stabilization mechanisms, namely FLOSS and CROSS, and prove their incentive compatibility. The two mechanisms differ in the methods for achieving stability: Whereas FLOSS reduces the imbalance between two paths by regulating the migration rate between the paths, CROSS achieves stability by repetitive reshuffling of flows between paths and increasing the cost of path migration.

THE FLOSS MECHANISM
In this section, we present the FLOSS mechanism (Flow-Loyalty Oscillation-Suppression System).

Overview
As shown in §3, convergent path-selection strategies are characterized by careful path-switching behavior: An end-host only switches to a seemingly cheaper path with a modest probability that depends on the measured latency difference, translating into a relatively low migration rate between paths. It is well known that system stability can be achieved by limiting the rate of change (also known as the system gain [22]). However, the challenge is to develop methods that achieve this change-rate limitation in the face of selfish, uncontrolled end-hosts. Such a method is given by FLOSS.
As selfish end-hosts do not voluntarily conform to a modest path-migration rate, the path-migration rate has to be regulated by network operators. The FLOSS mechanism performs such regulation by rewarding end-hosts that are loyal to a certain path and by restricting arbitrary path migration by oscillating end-hosts.
In order to regulate path migration, the FLOSS mechanism makes use of registrations and proceeds in intervals. Figure 3, which shows a simulation of the FLOSS mechanism in a two-path system, illustrates the FLOSS approach. Initially, the FLOSS mechanism announces at time $t$ that all end-hosts are required to obtain a registration for one path $\pi$ of their choice. This registration allows an end-host to use path $\pi$ during a future time interval $[t_0, t_1)$. End-hosts that use path $\pi$ without a registration are punished in the interval (e.g., by dropping packets).
This call for registration produces a distribution of flows over the two paths, which is stable during the interval as no end-host can switch to the path for which it is not registered. However, this load distribution is unlikely to be perfectly equal. The FLOSS mechanism iteratively reduces this imbalance: In every following time interval, a small set of flows is allowed to migrate from the more expensive path to the cheaper path. This allowance is enforced by selectively granting registrations: Whereas end-hosts with a pre-existing registration for a path (loyal end-hosts) always obtain a registration for that path, end-hosts without a pre-existing registration are not always allowed to register. Once the imbalance is sufficiently small, the end-hosts no longer have an incentive to switch paths, at which point the enforcement of the mechanism can be suspended (e.g., at the end of interval $I_2$ in Figure 3).
Theorem 5.1. The FLOSS mechanism is an incentive-compatible stabilization mechanism.
As defined in §4.2, incentive compatibility implies the existence of a strategy profile that leads to stability at equal load and is a PSS equilibrium during mechanism enforcement. For FLOSS, such a strategy profile is given by universal adoption of the FLOSS-compliant path-selection strategy $\sigma_F$. The strategy $\sigma_F$ prescribes to use the path with the lowest expected cost among the paths that the end-host is entitled to use. Our incentive-compatibility proof thus builds on the following two concrete lemmas, which are proved in §5.2 and §5.3, respectively:
Lemma 5.2. Universal adoption of the FLOSS path-selection strategy $\sigma_F$ leads to stability at equal load.
Lemma 5.3. Universal adoption of the FLOSS path-selection strategy $\sigma_F$ represents a PSS equilibrium during enforcement of the FLOSS mechanism.

Stability Analysis
In order to prove Lemma 5.2, we assume universal adoption of path-selection strategy $\sigma_F$, i.e., an end-host always uses the path with the lower expected cost provided that the end-host is entitled to use that path.
When registering before the initial interval, all end-hosts simultaneously decide for one path to use during the upcoming interval $[t_0, t_1)$. Confronted with such a choice, each end-host aspires to commit to the path $\pi$ that will be selected by fewer other end-hosts, i.e., the path $\pi$ with $f_\pi(t_0) < f_{\tilde{\pi}}(t_0)$. In absence of inherent differences between the two choices, the only Nash equilibrium of such a speculative game is given if every end-host commits to each path $\pi$ with probability 1/2.
In expectation, the load on both paths $\alpha$ and $\beta$ is thus $f_\alpha(t_0) = f_\beta(t_0) = 1/2$. Since no migration occurs during the interval $[t_0, t_1)$, the load distribution is expected to remain equal during the interval. When mechanism enforcement ends at time $t_1$, the end-hosts are again free to arbitrarily select paths. However, since $t_0 + T < t_1$, any end-host performing a re-evaluation after $t_1$ perceives the Wardrop equilibrium $c_\alpha(t - T) = c_\beta(t - T)$ and will thus not switch paths. Therefore, the system is stable at equal load even when the mechanism is not enforced anymore.
In reality, however, variance makes it likely that the load on paths $\alpha$ and $\beta$ is not perfectly equalized at $t_0$. In that case, the FLOSS mechanism attempts to eliminate the remaining load difference: At a time $t'$ with $t_0 + T \leq t' < t_1$, the end-hosts can again register on paths for an upcoming interval $[t_1, t_2)$. At $t'$, all end-hosts correctly perceive the cost difference between a cheaper path $\pi$ and a more expensive path $\tilde{\pi}$, as the perceived cost of each path equals its current cost, i.e., $c_\pi(t' - T) = c_\pi(t_0) = c_\pi(t')$ and likewise for $\tilde{\pi}$, due to the constant load in $[t_0, t')$. The core idea of the FLOSS mechanism is to determine and enforce a migration allowance $\rho_\pi(t_1)$, which is an upper bound on the share of end-hosts on path $\tilde{\pi}$ that are allowed to migrate from path $\tilde{\pi}$ to path $\pi$ at time $t_1$.
Importantly, $\rho_\pi(t_1)$ is chosen such that path $\pi$ does not become the more loaded path through the migration, which implies $c_\pi(t_1) \leq c_{\tilde{\pi}}(t_1)$ (i.e., the cheaper path $\pi$ will remain the cheaper path in the next interval even if a share $\rho_\pi(t_1)$ of end-hosts on the more expensive path $\tilde{\pi}$ migrate to path $\pi$). This choice of $\rho_\pi(t_1)$ ensures the correct incentives for the end-hosts. Given such an assurance, end-hosts registered on the cheaper path $\pi$ during $[t_0, t_1)$ minimize their cost by remaining on path $\pi$. Since these end-hosts are considered loyal to path $\pi$, their registration at path $\pi$ will be renewed for the upcoming interval $[t_1, t_2)$. Conversely, all end-hosts registered on the more expensive path $\tilde{\pi}$ would minimize their cost by migrating to the cheaper path $\pi$. However, the FLOSS mechanism restricts this migration by only granting a registration for $\pi$ to a share $\rho_\pi(t_1)$ of end-hosts on $\tilde{\pi}$. The non-migrating end-hosts on path $\tilde{\pi}$ are considered loyal to path $\tilde{\pi}$ and are thus allowed to renew their registration at $\tilde{\pi}$. Therefore, a load of exactly $\rho_\pi(t_1) \cdot f_{\tilde{\pi}}(t_0)$ migrates from path $\tilde{\pi}$ to path $\pi$ at time $t_1$, which reduces the difference in load and cost between the paths $\pi$ and $\tilde{\pi}$. By repetitive mechanism application with appropriately chosen migration allowances, the FLOSS mechanism can make the cost differential between the paths $\pi$ and $\tilde{\pi}$ arbitrarily small. When the cost difference becomes so small that end-hosts perceive a Wardrop equilibrium, the mechanism has achieved stability at equal load that continues to hold even without mechanism enforcement.
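The interval-by-interval rebalancing can be sketched as a short loop. The choice of the migration allowance below, a fraction `eta` of the largest allowance that would exactly equalize the loads, is an assumption made for this illustration and is not the rule specified by the paper; the function name, `eta`, and the stopping threshold `delta` are likewise hypothetical.

```python
def floss_rebalance(f_alpha, p=2.0, eta=0.5, delta=1e-3, max_intervals=100):
    """Return the per-interval loads (f_alpha, f_beta) under repeated FLOSS application."""
    loads = [(f_alpha, 1.0 - f_alpha)]
    for _ in range(max_intervals):
        f_a, f_b = loads[-1]
        # Stop enforcing once the cost difference is below the threshold delta.
        if abs(f_a ** p - f_b ** p) < delta:
            break
        expensive, cheap = max(f_a, f_b), min(f_a, f_b)
        # Allowance as a share of the end-hosts on the more expensive path.
        # eta <= 1 guarantees that the cheaper path does not become the more loaded one.
        rho = eta * (expensive - cheap) / (2.0 * expensive)
        moved = rho * expensive
        if f_a > f_b:
            loads.append((f_a - moved, f_b + moved))
        else:
            loads.append((f_a + moved, f_b - moved))
    return loads


for interval, (f_a, f_b) in enumerate(floss_rebalance(0.8)):
    print(interval, round(f_a, 4), round(f_b, 4))
```

With `eta=0.5`, the load difference halves in every interval, so the initially cheaper path stays the cheaper one until the stopping criterion is met.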

PSS Equilibrium Analysis
We now prove Theorem 5.3, i.e., we show that path-selection strategy $\sigma_F$ is the optimal strategy for an end-host given that all other end-hosts have adopted $\sigma_F$. Concretely, we show that the FLOSS mechanism induces a PSS equilibrium in which $\sigma_F$ is the universally adopted path-selection strategy (adoption share 1), with the following path-selection function:

$$u_F(\pi, t \mid \tilde{\pi}) = \begin{cases} 1/2 & \text{if } t = t_0, \\ 1 & \text{if } t > t_0 \text{ and } E_e(\pi, t) \text{ and } c_\pi(t - T) < c_{\tilde{\pi}}(t - T), \\ 0 & \text{otherwise,} \end{cases} \tag{17}$$

where $E_e(\pi, t)$ is true if and only if end-host $e$ is entitled to use path $\pi$ at time $t$. We assume that an end-host always knows whether it is entitled to use a path. For the initial interval, every path is selected with equal probability 1/2. For all subsequent intervals, a path $\pi$ is selected if the path is perceived to be cheaper than the current path $\tilde{\pi}$ and end-host $e$ is entitled to use path $\pi$. For remaining on a path $\tilde{\pi}$, it holds that $u_F(\tilde{\pi}, t \mid \tilde{\pi}) = 1 - u_F(\pi, t \mid \tilde{\pi})$.
The FLOSS mechanism makes strategy $\sigma_F$ the equilibrium strategy by imposing the additional cost $c^M(\pi, t)$ for using path $\pi$ at time $t$. End-host $e$ incurs a cost $c_a$ for attempting to register and a penalty cost $c_p$ for using a path without a registration. We assume $c_p = \infty$, i.e., the penalty cost makes a path unusable. Let $A_e(\pi, t)$ be true if and only if end-host $e$ applies to register for using path $\pi$ at time $t$ and let $R_e(\pi, t)$ be true if and only if end-host $e$ obtained a registration for using path $\pi$ at time $t$, i.e., $R_e(\pi, t) = A_e(\pi, t) \wedge E_e(\pi, t)$. Using these predicates, the cost imposed by the FLOSS mechanism can be expressed as

$$c^M(\pi, t) = [A_e(\pi, t)] \cdot c_a + [\neg R_e(\pi, t)] \cdot c_p,$$

where $[P] = 1$ if the predicate $P$ is true and 0 otherwise. A selfish end-host $e$ chooses its actions such that its cost from the mechanism is minimized. Therefore, an end-host $e$ requests a registration if and only if the end-host is entitled to the registration, as there is no benefit in a registration request that will be refused. Thus the relevant mechanism-imposed cost for end-host $e$ is

$$c^M(\pi, t) = \begin{cases} c_a & \text{if } E_e(\pi, t), \\ c_p & \text{otherwise.} \end{cases}$$

Concerning the initial interval with start $t_0$, both paths $\alpha$ and $\beta$ have expected cost $c_\pi(t_0) = (1/2)^p$ if all other end-hosts choose each path with probability $u_F(\pi, t \mid \tilde{\pi}) = 1/2$. As both paths have the same cost and both paths require a registration, the usage cost of both paths is $c^M_u(\pi, t_0) = (1/2)^p + c_a$. Independent of the current path $\tilde{\pi}$, the cost of applying strategy $\sigma_F$ at time $t_0$ is thus $c^M(\sigma_F, t_0 \mid \tilde{\pi}) = (1/2)^p + c_a$ for any choice of $u(\pi, t_0 \mid \tilde{\pi})$. Therefore, end-host $e$ cannot reduce its cost by choosing a path-selection probability other than $u_F(\pi, t_0 \mid \tilde{\pi}) = 1/2$, which makes $\sigma_F$ an equilibrium strategy for the initial interval.
Concerning subsequent intervals with start $t_i > t_0$, we have to distinguish two cases for the current path of end-host $e$, namely whether end-host $e$ is on the cheaper path $\pi$ or on the more expensive path $\tilde{\pi}$. (1) If end-host $e$ is on the cheaper path $\pi$, the cost of remaining on $\pi$ is $c^M_u(\pi, t_i) = c_\pi(t_i) + c_a$, whereas the cost of switching to $\tilde{\pi}$ is $c^M_u(\tilde{\pi}, t_i) = c_{\tilde{\pi}}(t_i) + c_a$ if $E_e(\tilde{\pi}, t_i)$ and $c_{\tilde{\pi}}(t_i) + c_p$ otherwise. As always $c^M_u(\pi, t_i) < c^M_u(\tilde{\pi}, t_i)$, the current path $\pi$ must be selected with probability $u(\pi, t \mid \pi) = 1$ to minimize the end-host's cost. (2) If end-host $e$ is on the more expensive path $\tilde{\pi}$, the cost of remaining on $\tilde{\pi}$ is $c^M_u(\tilde{\pi}, t_i) = c_{\tilde{\pi}}(t_i) + c_a$, whereas the cost of switching to $\pi$ is $c^M_u(\pi, t_i) = c_\pi(t_i) + c_a$ if $E_e(\pi, t_i)$ and $c_\pi(t_i) + c_p$ otherwise. If end-host $e$ is entitled to use the cheaper path $\pi$, the cheaper path $\pi$ must thus be selected with probability $u(\pi, t \mid \tilde{\pi}) = 1$ to minimize the end-host's cost, and with probability 0 otherwise. In summary, for all intervals with start $t_i > t_0$, an end-host $e$ optimizes its cost by switching to an alternative path $\pi$ if and only if path $\pi$ is cheaper than the current path $\tilde{\pi}$ and end-host $e$ is entitled to use path $\pi$. This path-switching behavior is exactly captured by the path-selection function $u_F(\pi, t \mid \tilde{\pi})$. Therefore, path-selection strategy $\sigma_F$ is an equilibrium strategy for both the initial interval and the subsequent intervals of the mechanism, which proves Theorem 5.3.
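The case distinction above can be checked with a small best-response computation. The following is a minimal sketch, assuming an illustrative registration cost `C_A = 0.01` and the infinite penalty cost from the text; the function names `usage_cost` and `best_response` are hypothetical.

```python
import math

C_A = 0.01       # registration cost c_a (illustrative value, an assumption)
C_P = math.inf   # penalty cost c_p for using a path without registration


def usage_cost(path_cost: float, entitled: bool) -> float:
    """Path cost plus mechanism-imposed cost: c_a if entitled, c_p otherwise."""
    return path_cost + (C_A if entitled else C_P)


def best_response(cost_current: float, cost_other: float,
                  entitled_other: bool) -> str:
    """Return 'switch' if switching has strictly lower usage cost, else 'stay'.

    An end-host is always entitled to renew the registration on its current
    path (it is then considered loyal to that path).
    """
    stay = usage_cost(cost_current, True)
    switch = usage_cost(cost_other, entitled_other)
    return "switch" if switch < stay else "stay"
```

With these assumed values, an end-host on the cheaper path always stays, and an end-host on the more expensive path switches exactly when it is entitled to the cheaper path, which mirrors the path-selection function of $\sigma_F$.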

THE CROSS MECHANISM
In this section, we present a second stabilization mechanism called CROSS (Computation-Requiring Oscillation Suppression System).

Overview
While the FLOSS mechanism (cf. §5) deterministically achieves stability at equal load, its strict enforcement of the migration allowance is problematic in case of path failures. When a path fails, an end-host on that path is not allowed to switch to an alternative path immediately. Only when the mechanism detects the path failure after some time can enforcement be stopped and the end-hosts be allowed to use an alternative path. For highly critical transmissions, such inflexibility is undesirable.
The CROSS mechanism allows end-hosts to obtain insurance against such cases of path failure. Basically, the CROSS mechanism works similarly to the initial interval of the FLOSS mechanism: End-hosts are required to register for one path of their choice, which in general cannot be changed during the upcoming interval. Unlike FLOSS, however, the CROSS mechanism offers the possibility of registration for a second path that can be immediately used in case of a path failure, even if the path failure is not yet verified.
However, the question is how to prevent end-hosts from always registering for both paths and, if on the more expensive path, falsely claiming to be affected by a path failure and switching to the cheaper path. Such opportunistic behavior would cause oscillation. To solve this problem, the idea of the CROSS mechanism is that end-hosts must prove that they need the immediate-switching option for insurance against path failures, not simply for opportunistic cost reduction. End-hosts can prove their truthfulness by paying a price for the immediate-switch option. This price must be higher than any cost gain that can be achieved by switching to a cheaper path in a scenario without path failure. An end-host that paid this price thus only switches to the backup path if a path failure has occurred; if no path failure occurred, the end-host would not trade its insurance option against the cost gain, as the insurance option is more valuable to the end-host than any cost gain. Immediate switching during the interval can thus be allowed to the end-hosts with a backup-path registration. Moreover, immediate switching behavior by those end-hosts is an indication of path failure, which means that all other end-hosts must be allowed to migrate as well.
As a price for the backup-path registration, the CROSS mechanism requires the solution to a computationally hard puzzle. This puzzle is structured such that only end-hosts with a sufficiently high valuation of the backup path will obtain a solution. More precisely, each puzzle $E$ is associated with a cryptographic hash function $h : \{0, 1\}^* \to [0, 1]$ and a difficulty level $\delta \geq 0$. An end-host $e$ can solve a puzzle $E(\pi)$ for registering at a backup path $\pi$ by finding a value $s$ such that $h(\pi, t_i, e, s) \leq 2^{-\delta}$, where $t_i$ is the start of the next balancing trial. Given a cryptographic hash function, a puzzle $E(\pi)$ can only be solved by brute force, i.e., varying $s$ in a series of hash computations. By finding an appropriate $s$, an end-host can obtain a backup-path registration.
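The puzzle construction can be made concrete with a short example. The following is a minimal sketch, assuming SHA-256 as a stand-in for the hash function $h$ and an ad hoc string encoding of the tuple $(\pi, t_i, e, s)$; the function names and the encoding are assumptions for illustration only.

```python
import hashlib


def h(path_id: str, t_i: int, host_id: str, s: int) -> float:
    """Map (path, t_i, host, s) to [0, 1] via SHA-256 (assumed stand-in for h)."""
    data = f"{path_id}|{t_i}|{host_id}|{s}".encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") / 2**256


def solve(path_id: str, t_i: int, host_id: str, delta: float,
          max_tries: int = 1_000_000):
    """Brute-force search for s with h(...) <= 2^-delta; None if none is found."""
    target = 2.0 ** -delta
    for s in range(max_tries):
        if h(path_id, t_i, host_id, s) <= target:
            return s
    return None


def verify(path_id: str, t_i: int, host_id: str, s: int, delta: float) -> bool:
    """Check a claimed solution with a single hash evaluation."""
    return h(path_id, t_i, host_id, s) <= 2.0 ** -delta
```

Each attempt succeeds with probability $2^{-\delta}$, so the expected number of hash evaluations for solving is $2^{\delta}$, whereas verification always costs one evaluation.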
Also unlike FLOSS, the CROSS mechanism allows end-hosts to register at a path of their choice not only for the initial interval, but for every interval. Therefore, even if the path failure is not detected for some reason (e.g., because no end-host obtained a backup registration), the end-host can use the alternative path in the interval after a path failure. The CROSS mechanism thus has a non-deterministic approach for achieving stability: Intervals in CROSS serve as balancing trials and are repeated until the load imbalance is small enough that end-hosts do not switch paths anymore. Since the end-hosts select each path with probability 1/2 in any balancing trial, the probability that an approximately equal load distribution results after a few balancing trials is substantial. Still, the additional flexibility of CROSS results in a loss of convergence guarantees: Instead of convergence to an equal-load distribution, the CROSS mechanism only guarantees convergence to a traffic distribution with approximately equal load. A simulation of CROSS enforcement is visualized in Figure 4, which also shows the convergence produced by the CROSS approach.

Theorem 6.1. The CROSS mechanism is an incentive-compatible stabilization mechanism that achieves stability at approximately equal load, i.e., for every $\epsilon > 0$, $\lim_{t \to \infty} \Delta(t) < \epsilon$.
The CROSS mechanism achieves stability at approximately equal load by incentivizing the universal adoption of path-selection strategy $\sigma_C$, which prescribes that end-hosts only use a path if they have a corresponding registration and only use a backup in case of path failures. More formally, Theorem 6.1 directly follows from Lemmas 6.2 and 6.3:

Lemma 6.2. Universal adoption of the CROSS path-selection strategy $\sigma_C$ leads to stability at approximately equal load.

Lemma 6.3. Universal adoption of the CROSS path-selection strategy $\sigma_C$ represents a PSS equilibrium given enforcement of the CROSS mechanism.
Since the proof of Lemma 6.2 is intuitive, it is deferred to Appendix E; Lemma 6.3 is proven below.

PSS Equilibrium Analysis
In this section, we prove Lemma 6.3 by showing that universal adoption of path-selection strategy $\sigma_C$ is a PSS equilibrium, i.e., if all other end-hosts adopt $\sigma_C$, then $\sigma_C$ is the optimal strategy for a single end-host $e$. The path-selection strategy $\sigma_C$ is characterized by the following path-selection function for $\pi \neq \tilde{\pi}$:

$$u_C(\pi, t \mid \tilde{\pi}) = \begin{cases} 1/2 & \text{if } t = t_i \text{ and } \neg R_e(\pi, t), \\ 1 & \text{if } t \neq t_i \text{ and } R_e(\pi, t) \text{ and } c_{\tilde{\pi}}(t - T) = \infty, \\ 0 & \text{otherwise,} \end{cases}$$

where $t_i$ is the start time of any balancing trial, $c_{\tilde{\pi}}(t - T) = \infty$ designates a path failure, and $R_e(\pi, t)$ is true if and only if end-host $e$ has a backup registration for path $\pi$ at time $t$. Moreover, $u_C(\tilde{\pi}, t \mid \tilde{\pi}) = 1 - u_C(\pi, t \mid \tilde{\pi})$. As in FLOSS, registering has cost $c_a$, whereas using a path without registration imposes a penalty cost $c_p = \infty$. Additionally, an end-host incurs cost by solving puzzles, where each hashing operation has cost $c_h$. To an end-host with valuation $\omega$ of a backup path, a hash operation has the expected utility $E[U_h](\delta, \omega) = 2^{-\delta} \omega - c_h$. Given puzzle-difficulty level $\delta$, an end-host thus solves a puzzle if and only if it has a backup valuation $\omega$ such that $E[U_h](\delta, \omega) > 0$. If an end-host does not solve a puzzle, it simply obtains a regular registration for one path at cost $c_a$, where every path is selected with probability 1/2. Obtaining no registration and using any path would incur a much higher penalty cost $c_p \gg c_a$ and is thus not rational. Therefore, an end-host with a registration for one path uses this path from the start $t_i$ of the balancing trial. If an end-host solves a puzzle, the end-host obtains a backup registration for the path corresponding to the puzzle and obtains a regular registration for the other path at cost $c_a$. Since CROSS enforces that an end-host can only switch once to its backup path and never switch back during the balancing trial, every end-host with a backup-path registration starts by using the path with its regular registration at time $t_i$. In summary, the optimal path-selection function for all $t = t_i$ is $u_C(\pi, t \mid \tilde{\pi}) = 1/2$ if $\neg R_e(\pi, t)$.
During the balancing trial, no reallocation decisions are taken before $t_i + T$, as the expected path costs during $[t_i, t_i + T)$ are equal. At $t_i + T$, the cost difference between a more expensive path $\tilde{\pi}$ and a cheaper path $\pi$ becomes visible to the end-hosts. If the end-hosts on path $\tilde{\pi}$ with a backup registration for path $\pi$ switched at that point, they would save $\Delta C = \int_{t_i + T}^{t_{i+1}} (c_{\tilde{\pi}}(t) - c_\pi(t)) \, dt$, which is bounded above by $\Delta C_{\max} = t_{i+1} - t_i - T$. However, such a switch would erase the backup value $\omega$ of path $\pi$ for the end-host, which is why an end-host with backup registration for path $\pi$ only switches to path $\pi$ if $\omega < \Delta C$. In order to disincentivize such migration and keep the load distribution constant, the CROSS mechanism chooses the puzzle-difficulty level $\delta$ such that $E[U_h](\delta, \omega) > 0$ if and only if $\omega > \Delta C_{\max}$. This choice of $\delta$ leads to a situation where the end-hosts with a backup registration will only switch to the backup path in case of a path failure, as these end-hosts value the backup option higher than any cost reduction obtainable without path failure. In case of a path failure, however, trading the backup value $\omega$ of path $\pi$ against the infinite cost of the failed path $\tilde{\pi}$ is rational, and the end-hosts with a backup registration switch paths. In summary, the optimal path-selection function for end-host $e$ and for all $t \neq t_i$ is thus $u_C(\pi, t \mid \tilde{\pi}) = 1$ if $R_e(\pi, t)$ and $c_{\tilde{\pi}}(t - T) = \infty$, and $u_C(\pi, t \mid \tilde{\pi}) = 0$ otherwise. Thereby, path-selection strategy $\sigma_C$ has been established as the PSS equilibrium strategy.
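The choice of the puzzle-difficulty level can be worked out explicitly. Since $E[U_h](\delta, \omega) = 2^{-\delta} \omega - c_h > 0$ is equivalent to $\omega > 2^{\delta} c_h$, the condition on $\delta$ holds for $2^{\delta} c_h = \Delta C_{\max}$, i.e., $\delta = \log_2(\Delta C_{\max} / c_h)$. The following is a minimal sketch of this calculation, assuming $\Delta C_{\max} \geq c_h$ so that $\delta \geq 0$; the function names and the numeric values in the usage note are illustrative assumptions.

```python
import math


def expected_hash_utility(delta: float, omega: float, c_h: float) -> float:
    """E[U_h](delta, omega) = 2^-delta * omega - c_h."""
    return 2.0 ** -delta * omega - c_h


def difficulty(t_i: float, t_next: float, T: float, c_h: float) -> float:
    """Difficulty delta with 2^delta * c_h = dC_max = t_next - t_i - T."""
    dc_max = t_next - t_i - T
    return math.log2(dc_max / c_h)
```

For example, with a trial from `t_i = 0` to `t_next = 10`, delay `T = 2`, and hash cost `c_h = 0.001`, the bound is $\Delta C_{\max} = 8$: a hash attempt has positive expected utility for a valuation of 8.1 but negative expected utility for a valuation of 7.9.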

PRACTICAL APPLICATION
While the focus of this paper is on the theoretical exploration of selfish path selection and stabilization mechanisms, this section lays out a pathway toward practical application of our findings. First, we discuss practical requirements for inter-domain stabilization mechanisms in §7.1. In §7.2, we present a mechanism-enforcement architecture that conforms to these requirements. In §7.3 and §7.4, we outline how the FLOSS and CROSS mechanisms could be practically implemented.

Requirements
If a stabilization mechanism is to be practically applied by network operators in an inter-domain architecture, the mechanism must conform to the following requirements:

(1) Limited overhead: The stabilization mechanism must only induce a small overhead on the systems of network operators. In particular, the genuine function of AS border routers (forwarding traffic at line rate) must not be compromised by expensive mechanism-enforcement tasks. Note that both mechanisms only need to be enforced by routers in case of oscillation and until stabilization is achieved; however, the mechanisms should induce little overhead even during this short time span.

(2) No explicit inter-AS coordination (coordination-freeness): The stabilization mechanism must not rely on explicit inter-AS coordination. Such explicit coordination may not be feasible or scalable, as the domains that perceive the same oscillation pattern may be mutually unknown, mutually distrusted, or very distant from each other.

Mechanism-Enforcement Architecture
To enforce a stabilization mechanism, an AS operator needs the means to detect, inform, and punish the selfish entities that employ an oscillatory path-selection strategy. In this section, we describe a mechanism-enforcement architecture that provides these means to an AS operator while conforming to the requirements in §7.1.
From an inter-domain perspective, the most important architectural question is the question of coordination, i.e., how each AS perceiving an oscillation pattern contributes to oscillation suppression. As explicit inter-AS coordination is undesirable, an implicit method for responsibility assignment is necessary. We leverage a fundamental property of paths in inter-domain network graphs as a natural way to assign responsibility for inter-domain oscillation suppression. This fundamental property is based on the following insight: For every pair of paths connecting the same origin and destination ASes, there is at least one AS (henceforth: the splitting AS) in which the paths split, i.e., the paths contain different egress interfaces out of the AS. For every oscillation between two paths, there is thus at least one AS which perceives the oscillation as an oscillation of traffic between egress interfaces, not only as periodic upswings and downswings in the load at one egress interface. Such splitting ASes are the natural candidates for a leading role in inter-domain oscillation suppression, as these ASes are both best informed about the oscillation and in the best position to manage the oscillating traffic.

Figure 5: Oscillation patterns.
For illustration of the path-splitting property, Figure 5 shows different types of oscillation patterns for paths connecting an origin end-host $O$ and a destination end-host $D$. In the simplest cases, the oscillation may be perceived at the origin AS (AS $A_1$ in Figure 5a) or at one intermediate AS (AS $A_1$ in Figure 5b). However, the oscillation may be perceived at multiple splitting ASes. The different paths may pass through a different number of egress interfaces at which the mechanism is enforced. For example, path $\pi_3$ in Figure 5c only passes through one critical egress interface (at AS $A_0$), whereas paths $\pi_1$ and $\pi_2$ pass through two critical egress interfaces. Conversely, each path in Figure 5d passes through two egress interfaces at which a load-balancing mechanism is enforced. Any stabilization mechanism may thus be applied repeatedly and with different frequency to flows belonging to the same oscillation-prone system.
In the intra-domain context, the mechanism-enforcement architecture envisages a centralized oscillation-suppression service (OSS) in each AS. The OSS is capable of interacting with the border routers at the egress interfaces. For a splitting AS, this OSS functions as displayed in Figure 6. By collecting aggregate load statistics from the border routers, the OSS in the splitting AS can identify the egress interfaces between which oscillation occurs (through correlation). As the presence of such oscillation means that the AS is obliged to enforce a stabilization mechanism, the OSS equips every oscillation-perceiving border router $r_i$ with data $M_i$ that is necessary to enforce the mechanism (e.g., start time of the next interval). By further collecting load statistics from the egresses, the OSS monitors and continuously adapts the execution of the mechanism. The border routers communicate with the origins of the oscillating flows by appending mechanism-relevant information to passing packets.

FLOSS in Practice
In the following, we discuss how the FLOSS mechanism could be applied by the mechanism-enforcement architecture from §7.2, while conforming to the practicality requirements laid out in §7.1, namely limited overhead and coordination-freeness.

Limited Overhead.
Registration on routers. In order to signal that end-hosts must register for an upcoming time interval, a border router appends the start time $t_i$ of the next interval to passing packets. If an end-host witnesses such a call for registrations in its packets, it can send a packet with a registration request over the desired egress. A border router can keep track of registrations using a Bloom filter, which approximates a set of flow IDs. A Bloom filter offers constant complexity for both lookup and insertion, although it suffers from false positives. When checking for registrations, false positives result in unregistered flows being able to send over an egress and being rewarded like loyal flows. However, the enforced migration rate $\rho$ can simply be discounted by the false-positive rate of the Bloom filter such that the desired migration rate is enforced despite the presence of lucky unregistered flows.
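The registration bookkeeping described above can be sketched as follows. This is a minimal illustration, assuming a Bloom filter with `m_bits` bits and `k` hash positions derived from SHA-256 and assuming flow IDs are given as strings; the parameter values and the class layout are assumptions, not a description of a router implementation.

```python
import hashlib


class BloomFilter:
    """Approximate set of flow IDs with constant-time insert and lookup."""

    def __init__(self, m_bits: int = 1 << 16, k: int = 4):
        self.m, self.k = m_bits, k          # m_bits is assumed divisible by 8
        self.bits = bytearray(m_bits // 8)

    def _positions(self, flow_id: str):
        # Derive k bit positions by hashing the flow ID with k different prefixes.
        for i in range(self.k):
            d = hashlib.sha256(f"{i}:{flow_id}".encode()).digest()
            yield int.from_bytes(d[:8], "big") % self.m

    def add(self, flow_id: str) -> None:
        """Record a registration for the given flow ID."""
        for p in self._positions(flow_id):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, flow_id: str) -> bool:
        """True if the flow is (probably) registered; false positives possible."""
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(flow_id))
```

A border router would call `add` for every granted registration and test `flow_id in registrations` when checking a packet; a lookup never misses a registered flow, but it may occasionally accept an unregistered one.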
Enforcement of single registration. In order to prevent an end-host from registering on multiple egresses, a border router forwards all registrations to the OSS, which keeps track of egress-specific registration by flows and can therefore spot multiple registrations by the same flow. If multiple registrations are detected, the OSS pushes a blacklist update for the malicious flow ID to the border routers. In order to avoid introducing DoS attacks where a malicious actor provokes the blacklisting of an end-host by sending multiple registrations, we assume some form of lightweight source authentication, which is typically offered by path-aware Internet architectures [31].
Selective admission of migrating flows. Border routers need an efficient way to decide whether to grant registration applications to flows that are willing to switch paths, while preserving the property that a maximum share $\rho$ of flows migrates. Such selective admission can be implemented using a publicly known hash function $h$, which maps the flow ID $f$ to the interval $[0, 1]$. If $h(t_i \,\|\, f) < \rho$, the registration is granted, where $t_i$ is the beginning time of the next registration-enforcement interval. This construction has the advantage that an end-host can locally check whether it will be accepted on the alternative ingress, as $h$, $t_i$, and $f$ are known to the end-host. Therefore, the border router is not bothered by registration requests from end-hosts that would be rejected. Furthermore, it is important to choose the flow ID $f$ based on attributes that the source end-host cannot easily influence without compromising its communication, e.g., source and destination IP, but not source or destination port.
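The admission test can be sketched in a few lines. The following is a minimal example, assuming SHA-256 as the public hash function, a flow ID built from source and destination IP address, and an ad hoc string encoding of the hash input; these choices and the function names are illustrative assumptions.

```python
import hashlib


def flow_hash(t_i: int, src_ip: str, dst_ip: str) -> float:
    """Map the interval start t_i and the flow ID to [0, 1] (assumed encoding)."""
    data = f"{t_i}|{src_ip}|{dst_ip}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") / 2**64


def admitted(t_i: int, src_ip: str, dst_ip: str, rho: float) -> bool:
    """Grant the registration if and only if the flow's hash is below rho."""
    return flow_hash(t_i, src_ip, dst_ip) < rho
```

Because the function is deterministic and public, an end-host can evaluate `admitted` locally before sending a request, and including `t_i` in the hash input means that a different subset of flows is admitted in each interval.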
Small traffic allowance for unregistered flows. While unregistered end-hosts should not be able to properly use an egress, these end-hosts should be able to send a few packets over the egress to measure the latency of the corresponding path. Also, short flows, e.g., DNS requests, should not be required to obtain a registration. Such a limited traffic allowance can be efficiently achieved by applying the mechanism only to a subset of packets, e.g., by sub-sampling. If registrations are only checked for a subset of packets, even an unregistered flow has a high chance of getting a few packets through the egress, while still experiencing severe disruption when sending a large number of packets over the egress. Due to the structure of congestion-control algorithms, sub-sampling rates as low as 1% already cause enough packet drops to make a path completely unusable for unregistered flows [24]. Moreover, sub-sampling reduces the workload on border routers.
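The effect of sub-sampling on short and long unregistered flows can be quantified with a simple calculation. The following is a minimal sketch, assuming that each packet is checked independently with probability `sample_rate` and that every checked packet of an unregistered flow is dropped; both the independence assumption and the function name are illustrative.

```python
def pass_probability(sample_rate: float, n_packets: int) -> float:
    """Probability that n packets of an unregistered flow all avoid the check."""
    return (1.0 - sample_rate) ** n_packets
```

Under these assumptions and a sampling rate of 1%, a three-packet exchange passes unharmed with a probability of about 97%, whereas a flow of 1000 packets avoids every check with a probability below 0.01%.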
Addition of new flows. In reality, new flows appear during the execution of the mechanism. Clearly, these flows cannot register in advance for an enforcement interval, as these flows do not exist beforehand. Therefore, new flows are also allowed to register at one path of their choice during an enforcement interval. In order to distinguish new flows from flows that merely pretend to be new, the FLOSS mechanism samples the active flows at both egresses in every interval and inserts them into a Bloom filter. These previously active flows are supposed to have a registration in the subsequent interval. In contrast, truly new flows can be identified by a lookup failure in the mentioned Bloom filter. Due to false positives, a truly new flow might be mistaken for a previously active flow and thus be denied a retroactive registration. However, given a small false-positive probability, the probability that such a mistake appears at multiple egresses is negligible such that registration at one path should always be possible in practice. As all new flows (except the false-positive new flows) during an interval must be expected to flock to the cheaper path, the migration allowance must be discounted by the birth rate of flows.

Coordination-Freeness.
If there is one splitting AS for an oscillation-prone system, there are no unintended effects due to distributed application of the mechanism. However, as explained in §7.2, there may be multiple mechanism-enforcing ASes along a path. If $n_i$ is the number of splitting ASes along path $\pi_i$, the costs for obtaining a registration for $\pi_i$ and for using $\pi_i$ without a registration are $n_i \cdot c_a$ and $n_i \cdot c_p$, respectively. In cases where $n_i$ is the same for every path $\pi_i$ of an oscillation pattern (such as in Figure 5d), the incentives for the end-hosts thus do not change compared to a single-application scenario. However, if $n_i$ is different for the paths $\pi_i$ in the oscillation-prone system (such as in Figure 5c), the registration cost for different paths may be different. For example, the registration cost for obtaining a registration of path $\pi_3$ in Figure 5c is $c_a$, whereas the corresponding cost for paths $\pi_1$ and $\pi_2$ is $2c_a$. Since $c_p = \infty > n_i c_a$ for all finite $n_i$, registering for a path is still worthwhile. However, an equilibrium between the two egresses of AS $A_0$ is only reached if $(f_{\pi_1} + f_{\pi_2})^p + 2c_a = f_{\pi_3}^p + c_a$, which implies stability at unequal load. However, since the cost $c_a$ for obtaining a registration is modest (just a single packet as explained in §7.3.1), the resulting load imbalance between the ASes is also modest. Therefore, no explicit inter-AS coordination is needed.
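The size of the resulting load imbalance can be estimated numerically. The following is a minimal sketch, assuming that the total load is normalized to 1, so that $x = f_{\pi_1} + f_{\pi_2}$ and $f_{\pi_3} = 1 - x$, and solving $x^p + 2c_a = (1 - x)^p + c_a$ for $x$ by bisection; the normalization, the function name, and the example value of $c_a$ are assumptions.

```python
def equilibrium_share(p: float, c_a: float, tol: float = 1e-12) -> float:
    """Solve x^p + 2*c_a = (1 - x)^p + c_a for x in [0, 1] by bisection.

    x is the combined load on the paths that require two registrations.
    Assumes p > 0 and 0 <= c_a < 1, so the difference changes sign on [0, 1].
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The difference is increasing in x: negative means x is still too small.
        if mid ** p + c_a - (1 - mid) ** p < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For a linear cost function ($p = 1$), the condition reduces to $x = (1 - c_a)/2$, so an assumed registration cost of $c_a = 0.01$ shifts the equilibrium from an even split to 0.495 versus 0.505.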

CROSS in Practice
In this section, we discuss the CROSS mechanism with respect to the two practicality requirements.

Limited Overhead.
Compared to FLOSS, the only additional piece of functionality needed for CROSS is puzzle verification. Efficient puzzle-solution verification on border routers is performed by a hash function evaluation with the appropriate arguments, among which is the solution value provided by the data packet (cf. §6.2).

Coordination-Freeness.
Like FLOSS, CROSS suffers from the minor issue that some paths may require more registrations than other paths. Concerning backup registrations, multiple applications of the mechanism do not constitute a problem, as an end-host always has to solve only one puzzle to obtain a backup registration. For example, an end-host in the network of Figure 5c could insure against path failure as follows. At AS $A_0$, the end-host would obtain a normal registration for $\pi_3$ and a backup registration for $\pi_1$ and $\pi_2$. Such a combined backup registration is possible by including only the respective egress of AS $A_0$ in the puzzle solution, not the specific path. At AS $A_1$, the end-host can then obtain a normal registration for one of these paths, e.g., $\pi_1$. If the end-host desires an additional insurance against failure of path $\pi_1$, the end-host can solve a puzzle to obtain a backup registration for $\pi_2$ at AS $A_1$. Since only one puzzle per backup path is needed, no explicit inter-AS coordination is necessary to preserve the incentives of the CROSS mechanism.

RELATED WORK
Prior research has devised traffic-engineering tools to improve network stability. However, due to the traditional paradigm of network-controlled path selection, most tools assume that packet forwarding is performed by a series of decisions taken by the hops along a path. Systems such as AMP [17], ReplEx [11], Homeostasis [23], and HALO [25] thus prescribe how routers along a path should take forwarding decisions, mostly by adapting traffic-splitting ratios based on network information. If packets must be forwarded along a path chosen by the end-host, these schemes cannot be used.
An alternative line of work is generally compatible with the emerging paradigm of end-point path selection. Assuming source routing, this flavor of research prescribes path-selection strategies that lead to convergence. However, such convergent path-selection strategies are always designed for an intra-domain context, i.e., for path selection within a domain where end-points are under control of the network operator. Due to the selfishness of end-hosts in the inter-domain context, these schemes are thus impractical. For example, Proportional Sticky Routing [27] relies on self-restraint of end-points, which leads to persistent preference of shortest paths over alternative paths even when alternative paths are more attractive. The convergence of MATE [7] and the rerouting strategy designed by Kelly and Voice [22] is built on the assumption that the end-points restrain themselves to a maximum speed when reallocating traffic to cheaper paths, which cannot be expected from selfish end-hosts. In TeXCP [21], end-points are expected to comply with maximum traffic-reallocation allowances dynamically set by the network. Similarly, the rerouting policies designed by Fischer and Vöcking [14] require that end-hosts do not exceed a certain probability for switching to a cheaper path. Finally, OPS [20] also demands behavior from end-hosts that is irrational in a game-theoretic sense, in particular the probabilistic usage of sub-optimal paths.
Inter-domain traffic engineering by means of incentives has only been studied in the context of the BGP ecosystem, thus not accounting for path choice by end-hosts. Given rational ASes, there are different methods to achieve stability for inter-domain traffic: incentive-compatible yet oscillation-free BGP policies [9,39], egress-router selection under QoS constraints [18], cooperative traffic-engineering agreements between ASes reached by Nash bargaining [36], and the use of prices as traffic-steering incentives [26].

CONCLUSION
In this work, we have set up a game-theoretic framework that allows testing path-selection strategies for their viability for selfish end-hosts, i.e., showing whether it is rational for an end-host to adopt a path-selection strategy, given that all other end-hosts use said path-selection strategy. Only strategies that form such equilibria may be adopted in an Internet environment, where end-hosts are self-interested and uncontrolled.
Using this framework, we have shown that the non-oscillatory path-selection strategies traditionally proposed in the literature are not rational strategies and thus cannot be expected to be adopted by selfish, unrestricted end-hosts. This insight suggests that end-hosts must be incentivized to abstain from oscillatory path selection by means of stabilization mechanisms. We have designed two stabilization mechanisms and proved their incentive compatibility.
We understand our work as a first step and we believe that it opens several interesting avenues for future research. In particular, it would be interesting to quantify the cost of oscillation to a network and to investigate its relationship to the network type. Comparing the oscillation cost to the overhead of stabilization mechanisms would then make it possible to characterize the conditions under which the employment of stabilization mechanisms is appropriate.

A EXAMPLE OF STABILITY
The oscillation-prone system from Section 2.3 is stable if a sufficient number of end-hosts anticipate the greedy strategy $\sigma_g$ with an antagonist strategy $\sigma_a$. An end-host adopting the antagonist strategy always selects the path with the higher perceived cost, speculating that the seemingly cheaper path will soon be overloaded by greedy-strategy players:

$$u_a(\pi, t \mid \tilde{\pi}) = \begin{cases} 1 & \text{if } c_\pi(t - T) > c_{\tilde{\pi}}(t - T), \\ 0 & \text{otherwise.} \end{cases}$$

Conversely, $u_a(\tilde{\pi}, t \mid \tilde{\pi}) = 1 - u_a(\pi, t \mid \tilde{\pi})$.

B EXAMPLE OF PSS EQUILIBRIUM ANALYSIS
In this section, we illustrate the calculation of strategy costs of the form set out in §2.4 by investigating whether the strategies described in Appendix A form PSS equilibria. Proving that a strategy profile is not a PSS equilibrium amounts to finding a deviant strategy that reduces an end-host's cost. Indeed, there exist such deviant strategies for the strategy profile in which the greedy strategy $\sigma_g$ has adoption rate $q$ and the antagonist strategy $\sigma_a$ has adoption rate $1 - q$, for all $q \in [0, 1]$. For the case $q \leq 1/2$, there is no inversion of link costs and a deviant agent can always assume that $f_\pi(t) > f_{\tilde{\pi}}(t)$ if the agent perceives $f_\pi(t - T) > f_{\tilde{\pi}}(t - T)$. The best strategy given such a strategy profile thus consists of switching to the cheaper path $\tilde{\pi}$ in a deterministic and immediate fashion, as in the greedy strategy $\sigma_g$ presented in §2.3. Every delay of switching simply translates into more time needlessly spent on a strictly more expensive path. As the greedy strategy $\sigma_g$ allows an end-host to reduce its cost, the adoption rate of $\sigma_g$ would quickly rise from $q$ as more end-hosts adopt this strategy. Therefore, any strategy profile with $q \leq 1/2$ is not a PSS equilibrium.
For q > 1/2, the periodic dynamics are structured as where t = t − t + (t), To show that the antagonist strategy $\sigma_a$ allows an end-host to improve its cost if $q \in (1/2, 1]$, we construct a mixed strategy $\sigma_p(q')$. This strategy $\sigma_p(q')$ plays the greedy strategy $\sigma_g$ with probability $q'$ and the antagonist strategy $\sigma_a$ with probability $1 - q'$. We show that an end-host minimizes its cost by choosing $q' = 0$ given $q \in (1/2, 1]$, i.e., the antagonist strategy $\sigma_p(0) = \sigma_a$ is a better strategy than the greedy strategy $\sigma_p(1) = \sigma_g$.
As mentioned in §2.4, the cost of a strategy in periodic oscillating systems is computed over a single periodic interval. For the dynamics above, it is even sufficient to calculate the strategy cost between two turning points $t^+_0$ and $t^+_1$, as the costs of the paths $\alpha$ and $\beta$ would simply be reversed in the subsequent turning-point interval. Without loss of generality, we thus operate on a turning-point interval $[t^+_0, t^+_1]$ during which path $\alpha$ is perceived to be the cheaper path and $f_\alpha(t^+_0) < f_\beta(t^+_0)$. The time-dependent strategy cost $C(\sigma_p(q'), t)$ for the deviant agent is calculated based on a linear combination of the two path costs, weighted by $q'$. We further assume $R \leq W$, as any choice of higher $R$ forces an agent to select a path that is sub-optimal during at least time $R - W$. Using this limitation, it is possible to derive a formula for the strategy cost $C(\sigma_p(q') \mid O)$ that is a linear function of $q'$ with slope $m$ and an offset $\gamma$ that is constant w.r.t. $q'$; the slope $m$ is expressed using the abbreviation $a = A + q - 1$. The cost function steepness is assumed to be $p = 1$, as the integral in Equation (25) is not tractable otherwise.
The slope m can be shown to be positive for all R > 0, r ∈ [0, 1], and T ≥ T(R), where T(R) is such that W = R. This property can be established in a two-step proof, where we first show m(T) > 0 for T = T(R) and then ∂m(T)/∂T > 0 for all T > T(R). The positivity of m implies that the minimum of the strategy cost C(σ_p(q′)|O) is achieved for q′ = 0, i.e., by the antagonist strategy σ_a.
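The last implication can be spelled out in one line. The following is only a restatement of the argument, with q′ denoting the deviant agent's own probability of playing the greedy strategy and with γ and m standing for the intercept and slope of the linear cost function mentioned above (their explicit expressions are not repeated here):

```latex
\begin{align*}
C(\sigma_p(q') \mid O) &= \gamma + m\,q', \qquad q' \in [0, 1],\quad m > 0 \\
\Longrightarrow\quad C(\sigma_p(q') \mid O) - C(\sigma_p(0) \mid O) &= m\,q' \;\geq\; 0 \\
\Longrightarrow\quad \operatorname*{arg\,min}_{q' \in [0, 1]} C(\sigma_p(q') \mid O) &= 0 .
\end{align*}
```

Because a linear function on a closed interval attains its minimum at an endpoint, the sign of m alone decides between the pure greedy and the pure antagonist strategy.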
Given a strategy profile with q > 1/2, the adoption rate q of the greedy strategy would thus quickly decrease in favor of the antagonist strategy σ_a. Therefore, no strategy profile with q > 1/2 represents a PSS equilibrium.
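The intuition behind both cases can be illustrated with a toy calculation. The sketch below is not the periodic dynamics analyzed in this appendix: it assumes a hypothetical sinusoidal load oscillation, a path cost equal to the path load (p = 1), and a deviant agent that decides on load information that is `delay` time units old. All names and parameter values are illustrative assumptions.

```python
import math

def average_costs(delay, period=1.0, amplitude=0.2, steps=10_000):
    """Return (greedy, antagonist) costs averaged over one oscillation period.

    Toy assumption: the load on path alpha is 0.5 + amplitude*sin(2*pi*t/period),
    the load on path beta is the remainder, and path cost equals path load.
    """
    def load_alpha(t):
        return 0.5 + amplitude * math.sin(2 * math.pi * t / period)

    greedy = antagonist = 0.0
    for k in range(steps):
        t = (k + 0.5) * period / steps  # midpoint rule over one period
        cost_alpha = load_alpha(t)
        cost_beta = 1.0 - cost_alpha
        # Both strategies decide on the stale observation from time t - delay.
        alpha_looked_cheaper = load_alpha(t - delay) < 0.5
        if alpha_looked_cheaper:
            greedy += cost_alpha      # greedy follows the perceived cheaper path
            antagonist += cost_beta   # antagonist takes the opposite path
        else:
            greedy += cost_beta
            antagonist += cost_alpha
    return greedy / steps, antagonist / steps

if __name__ == "__main__":
    for delay in (0.05, 0.5):
        g, a = average_costs(delay)
        print(f"delay={delay}: greedy={g:.3f}, antagonist={a:.3f}")
```

In this toy setting, the greedy strategy is cheaper as long as the stale observation still points to the currently cheaper path, and the antagonist strategy is cheaper once the costs have inverted by the time the agent acts, which mirrors the distinction between q ≤ 1/2 and q > 1/2 made above.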

C PROOF OF OBSERVATION ??
We can numerically show that there exist oscillation-prone systems where the greedy strategy σ_g ensures a lower cost than an underdamped convergent strategy σ_c. In fact, the oscillation-prone system O assumed in Figure 2 is such a system: applying the strategy σ_c in an underdamped fashion does not yield the optimal cost. Using the definition of strategy cost introduced in §2.4, we calculate both C(σ_c|O) and C(σ_g|O).
In the calculation of C(σ_c|O), we choose u(π, t | π_t) as defined in Equation (11). Furthermore, we can assume that the probability of the agent being on path π_t at time t equals f_{π_t}(t), because an agent applying strategy σ_c allocates its traffic in accordance with all other agents, so its probability distribution of being on a certain path is equivalent to the general traffic distribution over the paths. As for the calculation of C(σ_g|O), we know that u(π, t | π′) = 1 if c_π(t − T) < c_π′(t − T), and 0 otherwise. Figure 7 compares the strategy costs of σ_c and σ_g for all R ∈ [0, 1] and the mentioned oscillation-prone system O. Clearly, given the oscillation-prone system O where agents universally apply an underdamped convergent strategy σ_c, any single agent would have an incentive to switch to the greedy strategy σ_g. The underdamped convergent strategy σ_c is thus not a PSS equilibrium.

D PROOF OF OBSERVATION ??
The flow-allocation vector F^∼ before projection is given by (using the abbreviation f_π for f_π(t − T)) The projection onto the feasible allocation set is the intersection of the line describing the feasible set, F_β = d − F_α, and the line through F^∼ that is orthogonal to the feasibility line: This intersection is at F_α = 1/2 · (d − F_β + F_α + γ(c_β − c_α)). The change in an end-host's flow on path α is thus If path α appears to be the more expensive path, this change is performed by the re-evaluating end-hosts on path α, and otherwise by the re-evaluating end-hosts on path β. Multiplying by the number of re-evaluating end-hosts thus yields the aggregate dynamics where ∆(t − T) = c_β(t − T) − c_α(t − T).
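The projection step can be checked numerically. The sketch below projects an arbitrary point orthogonally onto the line F_α + F_β = d and compares the result with the closed form of the intersection. The pre-projection vector used in the example, a step of size γ against the path costs, is an assumption made here for illustration because it reproduces the intersection formula stated above; the function names and numeric values are likewise hypothetical.

```python
def project_onto_demand_line(x0, y0, d):
    """Orthogonally project the point (x0, y0) onto the line x + y = d."""
    # The line x + y = d has normal direction (1, 1), so the projection
    # shifts both coordinates by the same amount until they sum to d.
    shift = (d - x0 - y0) / 2.0
    return x0 + shift, y0 + shift

def closed_form_alpha(x0, y0, d):
    """Closed form for the alpha-coordinate of the projected point."""
    return 0.5 * (d - y0 + x0)

if __name__ == "__main__":
    d, gamma = 1.0, 0.1
    flow_alpha, flow_beta = 0.7, 0.3   # current feasible allocation
    cost_alpha, cost_beta = 0.7, 0.3   # perceived (delayed) path costs
    # Assumed pre-projection vector: step of size gamma against the costs.
    pre_alpha = flow_alpha - gamma * cost_alpha
    pre_beta = flow_beta - gamma * cost_beta
    new_alpha, new_beta = project_onto_demand_line(pre_alpha, pre_beta, d)
    print(new_alpha, new_beta, new_alpha - flow_alpha)
```

Under this assumption, the flow on path α changes by γ/2 · (c_β − c_α): flow moves away from α exactly when α is perceived as the more expensive path.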

E CROSS STABILITY ANALYSIS
To prove Theorem 6.2, we show that stability at approximately equal load arises given universal adoption of the path-selection strategy σ_C, i.e., end-hosts use a path if they have a registration for that path and only use a backup path in case of a path failure.
For stability at approximately equal load with parameter ϵ, we assume that an end-host does not reallocate traffic at time t if the imbalance between paths ∆(t − T) = |f_α(t − T) − f_β(t − T)| is less than ϵ, as the perceived cost difference is then too small to justify path migration. If the imbalance can be kept below ϵ for a period of length T, i.e., ∆(t′) < ϵ for all t′ ∈ [t, t + T), there will be no reallocation during the following interval [t + T, t + 2T) and, by extension, none in any subsequent interval.
Any balancing trial with start t_i results in a traffic imbalance ∆(t_i) = |f_α(t_i) − f_β(t_i)|. This imbalance remains constant during [t_i, t_i + T), as the end-hosts only perceive the imbalance at time t_i + T. Thus, if ∆(t_i) < ϵ, stability at approximately equal load is reached and enforcement of the mechanism can be suspended. However, if ∆(t_i) ≥ ϵ, stability is not achieved and the balancing trials are repeated until ∆(t_i) < ϵ.
Since an end-host selects each path with probability 1/2, the distribution of f_α(t_i) on [0, 1] can be approximated with a normal distribution N possessing mean µ = 1/2 and variance σ² that depends on the number of end-hosts. If Φ(f_α) is the CDF of N, then the probability that ∆(t_i) < ϵ is p_{<ϵ} = Φ((1 + ϵ)/2) − Φ((1 − ϵ)/2) > 0. Since every balancing trial succeeds with this positive probability, the probability that some trial has produced ∆(t_i) < ϵ goes to 1 as the number of balancing trials grows with t → ∞. Therefore, for t → ∞, it also holds that ∆(t) < ϵ, which is stability at approximately equal load. Theorem 6.2 thus holds.
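The success probability of a balancing trial can be evaluated numerically. The following sketch makes assumptions beyond the text: N independent end-hosts with equal demand, so that the variance of f_α is 1/(4N), and statistically independent trials, so that the probability of at least one success in k trials is 1 − (1 − p)^k. The function names and parameter values are illustrative.

```python
import math

def normal_cdf(x, mean, std):
    """CDF of a normal distribution, expressed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def trial_success_probability(epsilon, num_hosts):
    """Probability that one balancing trial yields an imbalance below epsilon."""
    # Assumption: equal-demand hosts choosing independently with probability
    # 1/2, so the share on path alpha has mean 1/2 and variance 1/(4N).
    std = math.sqrt(0.25 / num_hosts)
    upper = normal_cdf((1.0 + epsilon) / 2.0, 0.5, std)
    lower = normal_cdf((1.0 - epsilon) / 2.0, 0.5, std)
    return upper - lower

def success_within(trials, epsilon, num_hosts):
    """Probability that at least one of `trials` independent trials succeeds."""
    p = trial_success_probability(epsilon, num_hosts)
    return 1.0 - (1.0 - p) ** trials

if __name__ == "__main__":
    for k in (1, 5, 20, 50):
        print(k, round(success_within(k, epsilon=0.05, num_hosts=100), 6))
```

Because the failure probability shrinks geometrically in the number of trials, a moderate number of trials suffices in this setting, and a larger number of end-hosts narrows the distribution of f_α and raises the per-trial success probability.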
Indeed, the CROSS mechanism eventually achieves stability at approximately equal load even without relying on the computational puzzles mentioned in §6.1. However, it is desirable that oscillation is already avoided during the execution of the mechanism. In particular, if a balancing trial fails and ∆(t_i) ≥ ϵ, no oscillation should take place until the start of the next balancing trial, i.e., during [t_i + T, t_{i+1}). When the imbalance ∆(t_i) becomes visible to end-hosts at time t_i + T, the end-hosts on path π′ with a backup registration for path π could migrate. However, since CROSS ensures that an end-host with a backup registration only uses its backup path in case of a path failure (see next section), no migration takes place at all during [t_i + T, t_{i+1}). Therefore, in the absence of a path failure, the load distribution remains constant during the whole duration [t_i, t_{i+1}) of a balancing trial.