A Holistic Approach for Locating Traffic Differentiation in the Internet

The worldwide debate over Network Neutrality (NN) has been raging for nearly two decades. According to NN principles, all traffic in the Internet must be treated with impartiality. In particular, unfair Traffic Differentiation (TD) is not allowed. Several strategies have been proposed for detecting TD, but locating the source of TD is still an under-explored topic. In this work, we present a holistic approach for unifying TD detection solutions into a single framework with the purpose of locating the source of TD. We propose an algorithm for combining measurements from multiple vantage points, and a strategy for selecting good vantage points. Our proposals leverage Internet peering properties to infer the behavior of individual Autonomous Systems (ASes), without requiring knowledge of the exact routes traversed by measurement probes. To evaluate our proposals, we first ran several experiments to confirm that Internet routes do present the required properties. Then, several simulations were performed to assess the efficiency of our proposals. Results show that our approach is capable of locating TD under several different conditions. Another finding is that issuing measurements from a few end-hosts of core Internet ASes achieves results similar to those obtained from a much larger number of end-hosts at the edge.


Introduction
Network Neutrality (NN) has been the focus of hot debates around the world since Tim Wu coined the term in 2002 [1]. NN states that all traffic in the Internet must be treated with impartiality, regardless of its origin, destination and/or content. This effectively means that unfair Traffic Differentiation (TD), such as prioritizing or degrading specific traffic flows, is prohibited [2]. Arguments in favor of the implementation of this principle claim that TD may threaten the open nature of the Internet as an environment that fosters innovation, fair competition, and consumers' freedom of choice [3,4]. Arguments against NN claim that giving Internet Service Providers (ISPs) more freedom to manage their own networks fosters competition and innovation [5,6].
Several countries around the world have established NN regulations for preventing unfair TD [7]. However, ISP compliance cannot be ensured by the regulations alone. Furthermore, even in a nonregulated environment, it is important to ensure the transparency of traffic management practices adopted by ISPs. In this context, there are several proposals in the literature for monitoring NN violations on the Internet [8]. These proposals focus mostly on detecting the presence of TD between end-hosts, employing a myriad of different measurement techniques and statistical methods. However, those solutions only detect TD; they are not capable of locating where exactly in the network discrimination was introduced. Locating TD is important to enforce regulations and to empower consumers by revealing potentially discriminatory behaviors from certain ISPs. To the best of our knowledge, there has been very little work on locating TD [9][10][11][12], all of which relies on unrealistic assumptions, such as prior knowledge of the complete network topology (at the host level), and knowledge about the precise paths traversed by measurement traffic.
In this work, we address the problem of locating TD with more realistic assumptions than those of previous works. In our proposal we recognize that it may not be possible to know the exact host-level path between end-hosts: there may even be multiple paths, and the path actually traversed can change over time [13]. We take a holistic approach that aims at unifying the several existing TD detection solutions. We propose an algorithm that locates TD by combining TD detection measurements issued from multiple vantage points. By taking advantage of Internet peering properties instead of assuming complete knowledge about the paths between end-hosts, the proposed algorithm is able to infer which Autonomous System (AS) was responsible for discriminating traffic. We also propose a strategy for selecting vantage points that will effectively contribute to locating TD. Using both the proposed algorithm and the proposed strategy as building blocks, we finally describe a complete solution for locating TD in the Internet. An earlier version of the proposals and some of the results presented in this work were published as a conference paper in [14]. The proposed algorithm combines measurements taken by any existing TD detection solution; thus it can be seen as a "meta" strategy for aggregating measurements from multiple sources. By taking advantage of inter-AS routing properties, we list the possible AS-level paths between the measurement points [15]. Inference is made by checking which ASes are present in all possible paths between each pair of measurement hosts. For instance, if a given AS is in all paths between two hosts, and no TD was detected between them (as indicated by some TD detection tool), then it is possible to conclude that this AS is not employing TD, since the measurement traffic surely went through it and yet no TD was detected.
The strategy for selecting measurement points searches for pairs of measurement points in such a way that all possible paths between the selected points traverse certain given ASes. The idea is that measurements issued from the selected points, when combined using the proposed algorithm, will effectively contribute to infer the behavior of the given ASes.
Finally, we present the complete solution for locating TD. The solution identifies which AS is discriminating traffic between a given pair of end-hosts. The main idea is to filter out the ASes that are not employing TD until only one AS remains: the one responsible for discriminating traffic. This is done by selecting measurement points using the strategy proposed, and combining the measurements issued from them using the proposed algorithm.
We performed two different sets of experiments to evaluate our proposals. We first executed experiments on the PlanetLab global testbed [16] to check whether our assumptions regarding the properties of AS-level paths are valid in the wild. Next, several simulations for assessing the efficiency of the solution for locating TD under different scenarios were executed. Results from the first set show that the peering properties we assume are valid for the majority of paths observed. Furthermore, results also reveal that path discovery techniques, employed by related work, may not return reliable results. The second set of experiments shows that the proposed solution is capable of locating TD under different scenarios. Moreover, similar results were observed when combining measurements obtained with a large number of ASes at the edge of the Internet and those from a small number of ASes in the core. The simulations also show which metrics should be employed to find more effective measurement points.
The main contributions of this paper are thus:
• This work advances the state of the art regarding the under-explored problem of locating TD in the Internet
• We propose an algorithm for inferring the behavior of ASes by combining measurements from multiple vantage points without requiring complete knowledge of the exact path between end-hosts
• A strategy for selecting measurement points that can effectively help locate TD is proposed
• A complete solution for locating TD that uses the proposed algorithm for combining measurements and the strategy for selecting measurement points is described and evaluated through simulation
• Metrics for selecting good measurement points are proposed and evaluated through simulation
• We show through simulation results the relation between the assumed Internet peering properties and the efficiency of our solution under different scenarios
• We make an innovative use of Internet peering properties
• We report experiments that show the limitations of path discovery techniques in the Internet, which are employed by related work
• We report experiments that show to which extent Internet peering properties are valid in the wild
The rest of this work is organized as follows. Section 2 presents related work. Then, an overview of key background concepts follows in Section 3. Next, we present the system model in Section 4. Then, the proposed algorithm for combining measurements is described in Section 5. Section 6 describes the proposed strategy for selecting measurement points. The complete solution for locating TD in the Internet is finally presented in Section 7. In Section 8, we describe the PlanetLab experiments for validating the assumed routing properties. Simulations for evaluating the complete solution for locating TD are presented in Section 9. Finally, we draw conclusions in Section 10.

Related work
Detecting TD in the Internet is a research topic widely explored in the literature [17][18][19][20][21][22][23][24]. A comprehensive survey [8] describes multiple TD detection solutions that rely on different types of network measurements and statistical methods. Measurement probes may be issued from one or several end-hosts. Other strategies rely on passive monitoring. The type of traffic employed by the probes may also differ. In general, TD detection is performed by comparing the measurements obtained to identify whether any set of measurements was statistically different from other sets, which may characterize a discriminatory treatment of network traffic. The solutions proposed in the present work make use of any of the existing TD detection solutions, rather than presenting yet another alternative.
Few strategies have been proposed to locate where in the network discrimination was introduced. Most solutions [9][10][11] rely on path discovery techniques, in particular the traceroute tool [25]. Traceroute is supposed to discover the exact path traversed by measurement traffic between end-hosts. The idea is to try to detect the presence of TD on different subsets of the paths between end-hosts, in order to identify which portion of the path (or specific point) was responsible for introducing the discrimination. The major shortcoming of these techniques is that traceroute is not reliable, since it is not always able to obtain the complete path traversed. Even when the path is discovered with precision, there is no guarantee that the measurement traffic traversed the same path as the application traffic under investigation. The experiments presented in Section 8 confirm some of the limitations of traceroute-like techniques. Our proposals for locating TD presented in this work take into account all the possible paths traffic may traverse between end-hosts, avoiding the shortcomings described above.
In [12] the authors propose an algorithm based on network tomography to detect TD and also locate in which host or link the discrimination occurred. The idea is to combine end-to-end measurements between several pairs of end-hosts. A system of equations is built with measurements as sums of intermediate values, each corresponding to a link traversed. Inconsistencies in the resolution of this system may indicate the presence of TD. The strategy relies on two strong assumptions: the exact host-level topology of the network is known, as are the exact paths traffic traverses between all pairs of end-hosts. These assumptions may represent a problem in practice, as the host-level topology of the Internet is dynamic and inferences can be misleading. Furthermore, the path between a pair of end-hosts may change at any moment due to different reasons, e.g. load balancing, traffic exchange policies, router faults. In comparison, the present work only assumes knowledge of the AS-level topology, which can be easily obtained [26], using Border Gateway Protocol (BGP) routing tables, for example. Moreover, our proposals do not assume that traffic always traverses the same path.
In another related work [27], we proposed an architecture for collecting and combining TD-related measurements from a plethora of sources. These include for instance, receiving inferences from a TD detection service running on the Cloud, or collecting NN-related measurements from an Internet of Things gateway. The architecture follows a hybrid active/passive approach, in which measurements are passively collected and combined, but active measurements can be requested on demand in order to investigate suspicious cases detected by aggregating the passive measurements. We argue that the proposals we introduce in the present work are possible directions for implementing that architecture.

Background
This section presents an overview of AS-level Internet routing properties, and describes a specific solution as an example of an effective TD detection solution that can be employed in our proposals.

AS-level routing properties
ASes are independent networks, owned by different organizations, each with a different set of assigned IP prefixes. The Internet consists of the interconnection of these independent networks. Several ASes may be traversed by any given traffic flow in the Internet. An AS-level path is defined as the sequence of ASes traversed by a data packet from the source end-host to the destination end-host. The path that is traversed by each packet depends both on its final destination and on the traffic exchange agreements each AS has with its neighbors. As a packet arrives, the AS decides to which neighboring AS the packet will be forwarded. This decision is made by consulting a BGP table, which contains the neighboring ASes that can reach the final destination. One of these ASes is chosen according to the set of policies adopted.
ASes connect to other ASes in order to gain access to parts of the Internet which are not reachable directly from the local AS itself or through customer ASes. Traffic exchange agreements between ASes are not publicly available. However, it is possible to abstract the relationship between ASes into four categories [26]: peer-to-peer (p2p), sibling-to-sibling (s2s), customer-to-provider (c2p), and provider-to-customer (p2c). A p2p relationship means that the two ASes exchange traffic between them and their customers without payments. When the two ASes are owned by the same company, they may exchange traffic freely in an s2s relationship. In the case of c2p, a customer AS purchases transit services from a provider AS. Similarly, in a p2c relationship an AS provides transit services to a customer AS, i.e. access to other parts of the network.
A widely accepted model for characterizing Internet paths is the Gao-Rexford model [15]. According to this model, AS-level paths in the Internet follow the valley-free property. In an AS-level path, ASes are customers of other ASes that provide transit services, and a transit provider AS is paid by the customer AS. An AS providing transit services without being paid by anyone constitutes a valley in the path, hence the name of the property. In a valley-free path, for each AS providing transit services there is a customer AS neighboring it, i.e. paying for the service. Thus the valley-free property states that valid paths in the Internet comply with the following pattern: any number of c2p links, followed by up to one p2p link, followed by any number of p2c links. There may be any number of s2s links anywhere along a path. Fig. 1 shows an example of a real AS-level topology with the corresponding relationships, inferred by CAIDA [28]. In the figure, the path Copel → RNP → UFPR is valley-free, since the transit provider (RNP) is being paid by its customer (UFPR). However, Copel → Sercomtel → Level 3 is a valley path, since no one is paying the transit provider Sercomtel.
Between any pair of end-hosts in the Internet, there may be several valley-free paths. Furthermore, each packet may traverse a different path, even packets of the same flow. This depends on the traffic exchange agreements in place and the routing policies of each AS [29]. The actual path traversed may also change over time [13]. For instance, in the topology shown in Fig. 1, paths Sercomtel → Copel → Level 3 and Sercomtel → ALGAR → Level 3 are both valley-free. In this particular case, the Sercomtel AS may prefer to exchange traffic with ALGAR through the 2 link, since it would be cheaper than using the 2 link with Copel.
We present experiments for evaluating to what extent the valley-free property is valid in the Internet in Section 8.
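As an illustration of the valley-free pattern described in this section, the following minimal Python sketch checks whether a sequence of link relationships is valley-free. The function name is_valley_free and the string labels "c2p", "p2p", "p2c" and "s2s" are assumptions made for this example only; they are not part of any tool or dataset format.

```python
def is_valley_free(relationships):
    """Check the pattern: any number of c2p links, at most one p2p link,
    then any number of p2c links; s2s links may appear anywhere.

    `relationships` is the sequence of link labels along a path
    (hypothetical string labels chosen for this sketch).
    """
    going_up = True  # True until the path takes a p2p or p2c link
    for rel in relationships:
        if rel == "s2s":
            continue  # sibling links are allowed anywhere
        if rel == "c2p":
            if not going_up:
                return False  # climbing again after the peak is a valley
        elif rel == "p2p":
            if not going_up:
                return False  # second peer link, or peer link after p2c
            going_up = False
        elif rel == "p2c":
            going_up = False
        else:
            raise ValueError(f"unknown relationship: {rel}")
    return True


print(is_valley_free(["c2p", "p2p", "p2c"]))  # uphill, one peer link, downhill
print(is_valley_free(["p2c", "c2p"]))         # downhill then uphill
```

The check keeps a single flag because the pattern has only two phases: once a path has used a p2p or p2c link, only p2c (and s2s) links may follow.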

A representative solution for detecting TD
As described in Section 2, a large number of solutions for detecting TD have been proposed. In this work, TD is located by combining measurements issued by one or more of these solutions. As a confirmation that there are effective TD detection strategies, we give a brief overview of a representative solution next.
Wehe [23,24] is effectively able to detect TD by comparing measurements from two traffic flows transmitted between the same pair of end-hosts: a real traffic flow from an application under investigation, and that same traffic flow encrypted through a VPN tunnel. The authors report results based on a measurement campaign that lasted a whole year conducted by end-users with the solution running on their mobile devices. 1,045,413 measurements were obtained, from 126,249 users connected to 2735 different ISPs in 183 countries/regions. From the obtained data, the authors were able to perform a large-scale investigation of TD practices in the Internet. TD was detected in 7 different countries, and the majority of the TD cases detected affected video streaming services.
In the remainder of this work, we call TD detectors these solutions for detecting the presence of TD between end-hosts in the Internet. We argue that the inferences made by different TD detectors, and/or by multiple instances of the same TD detector, may complement each other to solve the problem of locating TD. Note that identifying whether the located discrimination is legal, beneficial, or not, is out of the scope of this work. We refer the reader to [30] for a detailed discussion about applications that require Quality of Service guarantees and NN regulations.

System model
In this section, we define the system model for locating TD. We first describe our assumptions in Section 4.1, followed by the model in Section 4.2. Finally, Section 4.3 describes how valley-free paths between ASes, which are a fundamental part of our proposals, are obtained. Table 1 summarizes the main notations presented in this section and in the remainder of this work.

Assumptions
The proposed strategies for locating TD have requirements in terms of the AS-level Internet Topology assumed, routing properties of the network in which measurements are taken, the type of discrimination that can be located, and the availability of both TD detectors and measurement ASes. These assumptions are described next.
AS-level Internet Topology: The AS-level Internet topology is assumed to be known. Datasets that infer the Internet AS-level topology are publicly available, including the AS Rank project [28] by CAIDA, which is employed in this work. In particular, the relationships between ASes were inferred from AS Rank data and employed to build an AS-level topology graph, as described in Section 8.

Valley-free Property: Another assumption is that the valley-free property is valid. This property is accepted as essential to guarantee the convergence of the BGP routing protocol [31]. In Section 8 we present experiments that assess to which extent this property is actually valid in the Internet.
Discrimination Types: In this work we assume TD based on packet content, such as application protocol, destination port, or even payload (through Deep Packet Inspection, DPI). TD based on the origin or destination of packets, for example, is out of the scope of this work. Recent work [24,32,33] reports results that indicate that TD based on content is common in the wild.
In this context, we assume that ASes always discriminate the same types of traffic, regardless of its origin or destination, or which ingress/egress points the traffic has traversed. As a consequence of this assumption, if traffic between a pair of end-hosts corresponding to a certain application is discriminated by an AS, traffic from the same application between any other pair of end-hosts will be discriminated by that AS as well.
TD Detectors: We assume the availability of at least one TD detector that is able to recognize the presence of TD between two given ASes. For instance, such a TD detector can consist of an application running on end-hosts connected to the ASes, or a Virtualized Network Function (VNF) deployed within their networks. Note however that TD detectors are not able to locate TD (only to detect it). Detection alone leaves questions unanswered, as the paths between a given AS pair often consist of multiple ASes, any of which can be responsible for TD. Locating TD is about pinpointing the exact AS discriminating traffic.
Measurement ASes: We assume that a set of ASes are accessible for running TD detectors. We call them measurement ASes. In a real deployment, the set of available measurement ASes may change over time, for example if the set of connected end-hosts varies dynamically. However, in this work, we only focus on locating TD using the measurement ASes available during a specified period of time. Furthermore, we do not take into account specific characteristics of the end-hosts connected to measurement ASes (e.g. mobility, energy consumption).

System model
The AS-level topology of the Internet is represented as a directed graph $G = (V, E, r)$, where $V$ is the set of ASes in the network and $E$ is the set of connections between ASes. Let $R = \{\mathrm{p2p}, \mathrm{s2s}, \mathrm{c2p}, \mathrm{p2c}\}$ be the set of possible relationships between ASes: $r : E \to R$ is the function that maps a link $e \in E$ to the corresponding relationship $r(e) \in R$.
The set of measurement ASes $M$ is a subset of $V$, i.e. $M \subseteq V$. A path $p = \{a, \ldots, b\}$ is a sequence of ASes connecting ASes $a$ and $b$. Let $P_{a,b}$ be the set of all paths between ASes $a$ and $b$. Furthermore, $E_p$ is the sequence of links of a path $p \in P_{a,b}$, and $R_p = \{ r(e) \mid e \in E_p \}$ is the sequence of relationships between the corresponding ASes. A path $p \in P_{a,b}$ is valley-free if $R_p$ follows the valley-free property as described in Section 3. The set of all valley-free paths between two ASes $a$ and $b$ is denoted by $VF_{a,b} \subseteq P_{a,b}$.
We classify a given AS $a$ with respect to TD as either discriminatory, neutral, or unknown. An AS is classified as discriminatory if it has been found to employ TD. Otherwise, it is classified as neutral. Finally, an AS is classified as unknown if after running all available solutions for detecting and locating TD it was not possible to infer whether it was employing TD or not. Let $\beta_a$ be the inferred behavior of AS $a \in V$. Similarly, a pair of ASes $(a, b)$, $a, b \in M$, can be classified as either neutral or discriminatory. For a neutral pair of ASes, no end-to-end TD was detected between them, but if TD has been detected, then the pair is classified as discriminatory. Let $\beta_{a,b}$ be the inferred behavior of the pair of ASes $(a, b)$, i.e. the output of a TD detector.

Searching for valley-free paths
A modified Depth-First Search (DFS) is employed in order to find the valley-free paths between ASes on the graph representing the topology. This search works as a traditional DFS, but discards all paths that do not follow the valley-free property. For this DFS, we employ a parameter ($\delta$) that sets a maximum limit to the length of the paths. When searching for valley-free paths between two ASes, parameter $\delta$ corresponds to the number of additional links that can be added to the length of the shortest valley-free path between those ASes. The authors in [34] show that the complexity for listing all paths with bounded length between two vertices in a weighted directed graph is $O(nm + n^2 \log n)$, in which $n$ is the number of vertices ($|V|$) and $m$ is the number of edges ($|E|$).
For instance, with $\delta = 1$, the DFS will find all shortest valley-free paths, as well as all valley-free paths one link longer than the shortest path. It is common for AS-level paths in the Internet to be longer than the shortest possible path. The experiments presented in Section 8 show to what extent this happens. This upper bound is important to keep the search computationally feasible. Let $p^{s} \in VF_{a,b}$ be the shortest valley-free path between ASes $a$ and $b$. We define $VF^{\delta}_{a,b} = \{ p' \mid p' \in VF_{a,b}, |p'| \le |p^{s}| + \delta \}$ as the set of valley-free paths between ASes $a$ and $b$ with length not larger than the length of the shortest path plus $\delta$. Let $A_{i_1,i_2} = \{ a \mid a \in p, p \in VF^{\delta}_{i_1,i_2} \}$ be the set of all ASes in all possible valley-free paths between $i_1$ and $i_2$. Fig. 2 shows all the possible valley-free paths between the ASes $i_1$ and $i_2$, i.e. $VF^{\delta}_{i_1,i_2}$, for $\delta = 0$. There are two possible paths between the pair: $\{i_1, a_1, a_3, i_2\}$, and $\{i_1, a_2, a_3, i_2\}$. Since we do not know which path would be effectively traversed by traffic between $i_1$ and $i_2$, we consider all possible paths. Moreover, in this example $A_{i_1,i_2} = \{i_1, a_1, a_2, a_3, i_2\}$.
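The bounded search described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this work: the function names, the dictionary-based link representation, and the relationship labels attached to the example links are all hypothetical. A breadth-first search first finds the length of the shortest valley-free path, and a depth-first search then lists every simple valley-free path whose length does not exceed that length plus the slack parameter.

```python
from collections import deque


def _next_phase(going_up, rel):
    """Return the new phase after taking a link, or None if it creates a valley."""
    if rel == "s2s":
        return going_up
    if rel == "c2p":
        return True if going_up else None
    if rel == "p2p":
        return False if going_up else None
    if rel == "p2c":
        return False
    raise ValueError(f"unknown relationship: {rel}")


def valley_free_paths(links, src, dst, delta=0):
    """List simple valley-free paths from src to dst whose length (in links)
    is at most the shortest valley-free length plus `delta`.

    `links` maps a directed link (u, v) to its relationship label.
    """
    adj = {}
    for (u, v), rel in links.items():
        adj.setdefault(u, []).append((v, rel))

    # Breadth-first search over (AS, phase) states for the shortest length.
    shortest = None
    seen = {(src, True)}
    queue = deque([(src, True, 0)])
    while queue:
        node, going_up, dist = queue.popleft()
        if node == dst:
            shortest = dist
            break
        for nxt, rel in adj.get(node, []):
            phase = _next_phase(going_up, rel)
            if phase is not None and (nxt, phase) not in seen:
                seen.add((nxt, phase))
                queue.append((nxt, phase, dist + 1))
    if shortest is None:
        return []

    limit = shortest + delta
    found = []

    def dfs(node, going_up, path):
        if node == dst:
            found.append(list(path))
            return
        if len(path) - 1 >= limit:
            return  # adding another link would exceed the length bound
        for nxt, rel in adj.get(node, []):
            phase = _next_phase(going_up, rel)
            if phase is None or nxt in path:
                continue  # valley, or AS already on the path
            path.append(nxt)
            dfs(nxt, phase, path)
            path.pop()

    dfs(src, True, [src])
    return found


# Topology shaped like the two-path example above; the relationship
# labels are hypothetical and chosen only to make both paths valley-free.
links = {
    ("i1", "a1"): "c2p", ("i1", "a2"): "c2p",
    ("a1", "a3"): "c2p", ("a2", "a3"): "c2p",
    ("a3", "i2"): "p2c",
}
for path in valley_free_paths(links, "i1", "i2", delta=0):
    print(path)
```

Pruning on the valley-free phase during the search, rather than filtering complete paths afterwards, avoids exploring prefixes that can never lead to a valid path.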

Algorithm for combining measurements
In this section, we present the algorithm proposed for combining inferences made by TD detectors. The goal of this algorithm is to identify which ASes are neutral, discriminatory, or unknown. The algorithm aggregates measurements from TD detectors running on multiple end-hosts; it is thus a holistic approach that relies on the outcome from any TD detector, including those not yet proposed. The main idea is to first identify neutral ASes, and then identify discriminatory ASes through a process of elimination.
The algorithm receives as input a set of measurements $S$, in which each measurement $(a, b, \beta_{a,b}) \in S$ is a tuple containing two ASes ($a$ and $b$) and the corresponding inferred behavior $\beta_{a,b}$. The algorithm outputs a tuple $(N, D, U)$, in which $N$ is the set of neutral ASes, $D$ the set of discriminatory ASes, and $U$ the set of unknown ASes. The algorithm consists of two steps. First, all the neutral AS pairs in $S$ are evaluated to identify neutral ASes. The second step consists of a search for discriminatory ASes in $S$. A pseudo-code of the proposed algorithm is presented in Algorithm 1. We further describe the algorithm next.
In the first step, AS pairs for which no TD was detected are evaluated. For each such pair $(a, b)$, let $C_{a,b}$ be the set of ASes that are present in all possible valley-free paths between $a$ and $b$. The ASes in $C_{a,b}$ are classified as neutral, and added to the output set $N$ (line 9). Note that $C_{a,b}$ contains at least ASes $a$ and $b$ themselves. If any of these ASes had employed TD, $\beta_{a,b}$ would be discriminatory since the traffic from the TD detector would have surely traversed them. The rationale is that it is not possible to know which path the TD detector traffic actually took, thus the algorithm looks for the ASes that are present in all paths. For instance, in the example shown in Fig. 2, if TD was not detected for the pair $(i_1, i_2)$ ($\beta_{i_1,i_2} = \text{neutral}$), then the ASes $a_3$, $i_1$, and $i_2$ would be inferred as neutral, since they are in all possible paths. However, nothing can be inferred for ASes $a_1$ and $a_2$, thus their classification remains unknown.
In the second step (lines 12-24), AS pairs for which TD was detected are evaluated. Let $S'' = \{(a, b, \beta_{a,b}) \mid (a, b, \beta_{a,b}) \in S, \beta_{a,b} = \text{discriminatory}\}$ be the set of measurements that classified AS pairs as discriminatory. For each measurement $(a, b, \beta_{a,b}) \in S''$, the algorithm removes from each possible path $p \in VF^{\delta}_{a,b}$ the ASes in $N$ (line 14). Let $P' = \{ p \setminus N \mid p \in VF^{\delta}_{a,b} \}$ be the set of valley-free paths that remain after removing the neutral ASes. If all non-empty paths $p' \in P'$ contain only a single AS $a'$, then $a'$ is classified as discriminatory and added to $D$ (lines 19-20). However, if there is more than one AS in the non-empty paths, they remain classified as unknown: it is not possible to know which of these ASes were traversed by traffic from the TD detector. Fig. 3 shows two examples using the same pair of ASes as in Fig. 2. Suppose that TD was detected between $i_1$ and $i_2$. In Fig. 3(a), suppose that ASes $i_1$, $i_2$ and $a_1$ are already known to be neutral. Since there are two other ASes, $a_2$ and $a_3$, which measurement traffic may have traversed, it is not possible to know which one of them is responsible for TD. Therefore, both $a_2$ and $a_3$ remain unknown. However, let us suppose then that $a_3$ is also known to be neutral. As shown in Fig. 3(b), the only remaining unknown AS would be $a_2$, thus it becomes possible to infer $\beta_{a_2} = \text{discriminatory}$.
The complexity of Algorithm 1 is derived as follows. The search for all valley-free paths is performed once for every measurement in $S$. Considering the complexity of listing all valley-free paths between two ASes with bounded length, as described in Section 4, the resulting complexity is $O(k(nm + n^2 \log n))$, in which $n$ is the number of vertices ($|V|$), $m$ is the number of edges ($|E|$), and $k$ is the number of measurements being combined by the algorithm ($|S|$).
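The two steps described in this section can be sketched in Python as follows. This is a simplified illustration, not a transcription of Algorithm 1: the function name combine_measurements, the string labels "neutral" and "discriminatory", and the precomputed paths dictionary are assumptions of this example. The sketch also interprets the second step as requiring that every non-empty remaining path reduces to the same single AS.

```python
def combine_measurements(measurements, paths):
    """Classify ASes as neutral, discriminatory, or unknown.

    measurements: iterable of (a, b, behavior) tuples, with behavior in
                  {"neutral", "discriminatory"} (labels assumed here).
    paths: dict mapping (a, b) to the list of bounded valley-free paths
           between a and b, each path being a list of ASes.
    """
    measurements = list(measurements)
    all_ases = set()
    for a, b, _ in measurements:
        for path in paths[(a, b)]:
            all_ases.update(path)

    # Step 1: ASes present in every path of a neutral pair are neutral.
    neutral = set()
    for a, b, behavior in measurements:
        if behavior == "neutral" and paths[(a, b)]:
            neutral |= set.intersection(*(set(p) for p in paths[(a, b)]))

    # Step 2: for discriminatory pairs, remove neutral ASes from each path;
    # if every non-empty remainder is the same single AS, it is discriminatory.
    discriminatory = set()
    for a, b, behavior in measurements:
        if behavior != "discriminatory":
            continue
        remaining = [set(p) - neutral for p in paths[(a, b)]]
        remaining = [r for r in remaining if r]
        if remaining and all(len(r) == 1 for r in remaining):
            candidates = set.union(*remaining)
            if len(candidates) == 1:
                discriminatory |= candidates

    unknown = all_ases - neutral - discriminatory
    return neutral, discriminatory, unknown


# Scenario in the spirit of the second example above. The two extra
# neutral pairs and their single-link paths are hypothetical.
paths = {
    ("i1", "i2"): [["i1", "a1", "a3", "i2"], ["i1", "a2", "a3", "i2"]],
    ("i1", "a1"): [["i1", "a1"]],
    ("a3", "i2"): [["a3", "i2"]],
}
measurements = [
    ("i1", "i2", "discriminatory"),
    ("i1", "a1", "neutral"),
    ("a3", "i2", "neutral"),
]
print(combine_measurements(measurements, paths))
```

Running the neutral step over all measurements before the discriminatory step matters: the elimination in the second step is only as strong as the set of neutral ASes already established.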

Strategy for selecting measurement points
In this section, we propose a strategy for selecting pairs of measurement ASes that can effectively help infer the behavior of a single given AS when TD measurements are combined. We call this single given AS a suspect AS. In order to be able to classify the behavior of the suspect AS, a measurement campaign is carried out from the selected measurement ASes. This strategy is a building block for the complete TD location solution proposed next in Section 7.
Let $x$ be the suspect AS. The proposed strategy searches for a pair of measurement ASes $\mu = (a, b)$, $a, b \in M$, that has not been previously selected, and for which all valley-free paths $p \in VF^{\delta}_{a,b}$ traverse $x$, i.e., $\forall p \in VF^{\delta}_{a,b}, x \in p$. The rationale is that if the suspect AS is in all possible paths between the selected measurement ASes, then it is guaranteed that the traffic from TD detectors will traverse the suspect AS, which may contribute to eliminating or confirming the suspicions about that AS when measurements are combined by Algorithm 1.
The search for AS pair $\mu$ is limited by a parameter $d$, which sets the maximum valley-free distance from $x$ up to which measurement ASes are to be checked. Therefore, the proposed strategy tries to form an AS pair starting from the measurement ASes closer to $x$, up to the measurement ASes that are at distance $d$ from $x$. The valley-free property improves the efficiency of this search, since it reduces the number of paths to check. For instance, if $x$ itself is available for measurement ($x \in M$), then measurement pairs $(x, a)$ are employed, such that the distance from $x$ to $a$ varies from 1 to $d$. But if $x \notin M$, then measurement pairs are formed with ASes that are at distance 1 from $x$. In case those are also not available, pairs of ASes that are at a distance 2 from $x$ are tried, and so on, up to distance $d$. Fig. 4 shows an example using the same portion of the graph as in Fig. 2. In this example, $d = 2$. There are four measurement ASes within distance 2 of $x$ (the suspect AS to be investigated): $m_1, m_2, m_3, m_4 \in M$. The pair $(m_1, m_2)$ follows the criteria described above and could be selected for investigating $x$, since all possible paths between $m_1$ and $m_2$ traverse $x$. However, pair $(m_3, m_4)$ would not be selected, since there is a possible path between them that does not traverse $x$, which is $\{m_3, a_4, a_2, m_4\}$.
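A minimal Python sketch of this selection strategy is given below. The function name select_pair, the distance dictionary, and the precomputed paths dictionary (keyed by pairs in sorted order) are assumptions of this example; the distances and paths in the usage data are hypothetical values loosely shaped after the four-AS example above.

```python
from itertools import combinations


def select_pair(suspect, distance, paths, used, max_distance):
    """Return an unused pair of measurement ASes whose valley-free paths
    all traverse `suspect`, trying the closest ASes first.

    distance: dict mapping each measurement AS to its valley-free distance
              from the suspect (0 if the suspect itself can measure).
    paths: dict mapping a sorted pair (a, b) to its list of valley-free paths.
    used: set of pairs that were already selected.
    """
    for limit in range(1, max_distance + 1):
        nearby = sorted(m for m, dist in distance.items() if dist <= limit)
        for pair in combinations(nearby, 2):
            if pair in used:
                continue
            candidate_paths = paths.get(pair, [])
            if candidate_paths and all(suspect in p for p in candidate_paths):
                return pair
    return None  # no suitable pair within max_distance


# Hypothetical data: every path between m1 and m2 crosses x, while one
# path between m3 and m4 avoids it.
distance = {"m1": 1, "m2": 1, "m3": 2, "m4": 2}
paths = {
    ("m1", "m2"): [["m1", "x", "m2"]],
    ("m3", "m4"): [["m3", "x", "m4"], ["m3", "a4", "a2", "m4"]],
}
print(select_pair("x", distance, paths, used=set(), max_distance=2))
```

Widening the distance limit one step at a time means pairs closer to the suspect are always preferred, and larger neighborhoods are only explored when nearer pairs are exhausted.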

Complete solution for locating TD
In this section, we describe the complete solution proposed for locating TD in the Internet. This solution uses both the algorithm proposed in Section 5 and the strategy proposed in Section 6 to identify which AS is discriminating traffic between a given pair of end-hosts. The main idea is to filter out the neutral ASes from a list of suspects until only the discriminatory AS remains. The proposed solution receives as input a pair of ASes $i = (i_1, i_2)$, $i_1, i_2 \in V$, which we call the initial pair. The goal is to locate which AS in the paths between $i_1$ and $i_2$ is discriminatory. The output of the solution is a tuple $(N, D, U)$, in which $N$ is the set of neutral ASes, $D$ the set of discriminatory ASes, and $U$ the set of unknown ASes.
The proposed solution investigates each AS in the possible paths between the initial pair, filtering out the neutral ASes until only the discriminatory AS remains, by a process of elimination. The solution is divided into five steps, shown in Fig. 5. The Initialization step builds a set of suspect ASes (the ASes to be investigated). Then, in the AS Pair Selection step, two ASes are selected from the set of measurement ASes. Measurements are executed between the selected pair in the TD Detection step. The outcomes of these measurements, together with all previous measurements, are combined in the Inference step. The AS Pair Selection, TD Detection, and Inference steps are repeated until a halting condition is met. The solution finishes in the Completion step, returning the output. We describe each step in more detail next.

Initialization
In this step, the set of suspect ASes is built. It consists initially of all the ASes in the valley-free paths between the two ASes of the initial pair. The behavior of all ASes in this set is initialized as unknown. Fig. 2 shows an example of an initial pair and all the possible valley-free paths between its two ASes.

AS pair selection
This step consists of choosing a suspect AS and using the strategy proposed in Section 6 to select a pair of measurement ASes to investigate the chosen suspect. The first time this step is executed, if both ASes of the initial pair are available for measurement (i.e. they belong to the set of measurement ASes), then the initial pair itself is selected. To choose a suspect, we take all pairs of measurement ASes for which measurements have already been taken in the TD Detection step and whose outcome was discriminatory. Then, we count how many times each suspect is present in all possible paths between the ASes of these discriminatory pairs. The suspect that appears the fewest times is selected to be investigated. This heuristic relies on the assumption that such an AS is less likely to be discriminatory, and thus might be filtered out earlier. If no discriminatory pair has been found yet, the first suspect in the set is selected.
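The suspect-selection heuristic can be illustrated with a short sketch. The function name and data layout below are hypothetical: suspects are kept in an ordered list, and each discriminatory pair is represented only by the list of its possible paths.

```python
from collections import Counter


def choose_suspect(suspects, discriminatory_pair_paths):
    """Pick the suspect that appears least often in the paths of discriminatory pairs.

    suspects: ordered list of suspect ASes.
    discriminatory_pair_paths: one list of paths per discriminatory pair,
    where each path is a list of ASes.
    """
    if not discriminatory_pair_paths:
        # No discriminatory pair found yet: fall back to the first suspect.
        return suspects[0]
    counts = Counter()
    for paths in discriminatory_pair_paths:
        for path in paths:
            for asn in path:
                if asn in suspects:
                    counts[asn] += 1
    # Ties are broken by the order of the suspects list.
    return min(suspects, key=lambda s: counts[s])
```

For example, with suspects A, B, and C and a single discriminatory pair whose paths are m1-A-B-m2 and m1-A-C-m2, suspect A appears twice while B and C appear once, so B is chosen first.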

TD detection
In this step, the presence of TD between the pair of measurement ASes selected in the previous step is assessed. A TD detector is executed on the two ASes of the pair, and returns the behavior inferred for the traffic between them.

Inference
This step consists of combining all measurements made by TD detectors using Algorithm 1. The input set consists of all measurements obtained so far. The set of suspects is updated using the output of the algorithm: neutral ASes are removed, and unknown ASes are added.
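To make the elimination idea concrete, the sketch below shows a simplified stand-in for the inference step. It is not Algorithm 1, which is specified in Section 5; it only assumes that (a) a neutral outcome clears every AS in the possible paths of the measured pair, and (b) a discriminatory outcome with exactly one AS left uncleared identifies that AS. The function name and data layout are hypothetical.

```python
def infer(measurements):
    """Combine measurement outcomes into (neutral, discriminatory, unknown) sets.

    measurements: list of (paths, outcome) tuples, where paths is a list of
    AS lists (endpoints included) and outcome is "neutral" or "discriminatory".
    """
    neutral, discriminatory, unknown = set(), set(), set()

    # First pass: every AS on the paths of a neutral measurement is neutral.
    for paths, outcome in measurements:
        if outcome == "neutral":
            neutral |= {asn for path in paths for asn in path}

    # Second pass: a discriminatory measurement with a single AS that has
    # not been cleared pins the differentiation on that AS.
    for paths, outcome in measurements:
        if outcome == "discriminatory":
            remaining = {asn for path in paths for asn in path} - neutral
            if len(remaining) == 1:
                discriminatory |= remaining
            else:
                unknown |= remaining

    unknown -= discriminatory
    return neutral, discriminatory, unknown
```

Assumption (a) only holds for a detector that reports TD whenever any possible path contains a discriminatory AS; with a detector that observes a single path, ASes on the other paths would have to stay unknown.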

Completion
The halting conditions of the proposed solution for locating TD are the following: (i) an AS in the paths between the ASes of the initial pair is classified as discriminatory, thus TD has been located; (ii) all ASes in the paths between the ASes of the initial pair are classified as neutral, thus no TD was found; and (iii) all measurement AS pairs have already been used to investigate the suspects. In the last case, TD could not be located, and one or more suspect ASes remain classified as unknown. If any of these conditions is met, the TD location solution finishes. The final output is the tuple of neutral, discriminatory, and unknown ASes, i.e. the output of Algorithm 1 executed as part of the last Inference step.
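The three halting conditions can be expressed as a small check that runs after every Inference step. The function name, argument names, and return labels below are hypothetical and chosen only for illustration.

```python
def halting_condition(path_ases, neutral, discriminatory, pairs_left):
    """Return the halting condition that is met, or None to keep iterating.

    path_ases: set of ASes in the paths between the initial pair.
    neutral, discriminatory: sets produced by the last Inference step.
    pairs_left: True while unused measurement AS pairs remain.
    """
    if path_ases & discriminatory:
        return "td_located"          # condition (i)
    if path_ases <= neutral:
        return "no_td_found"         # condition (ii)
    if not pairs_left:
        return "td_not_located"      # condition (iii)
    return None
```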

Evaluation: AS-level graph and paths
In this section we describe experiments for checking our assumptions related to Internet routing. The results shown in this section are based on an AS-level topology graph built using a dataset from the CAIDA research group, within their AS Rank project [28]. The same topology graph is also employed by the simulations presented in Section 9. Therefore, in this section we first describe the graph and the dataset from which it was built in Section 8.1. Next, the experiments for validating our assumptions are described in Section 8.2.

AS-level topology graph
The CAIDA dataset used to build the AS-level topology graph we employ in this work contains relationships between ASes in the Internet, inferred from BGP data [26]. The dataset we used includes 86,622 unique ASes. Of these, 24,815 have no relationship with other ASes, and were thus ignored in our evaluations. Therefore, we built a topology graph containing 61,807 different ASes. The graph was built by creating a vertex for each AS, and an edge between each pair of ASes with a relationship in the dataset. The type of relationship (c2p, p2c, p2p, or s2s) is indicated by a label on each edge.
In this work, we employ two centrality metrics extracted from the topology graph: the betweenness and the valley-free betweenness. The betweenness centrality measures to which extent a vertex is present in the shortest paths between all other pairs of vertices. To be precise, the betweenness of a vertex is the sum of the fractions of shortest paths between all other pairs of vertices in which the vertex is present [35]. We call valley-free betweenness centrality a variation of this metric that takes into account only the shortest valley-free paths. The strategy we propose for selecting measurement ASes relies on finding paths that traverse specific ASes, the suspects. Therefore, these metrics may be a good indication of the ability of ASes to be employed as measurement points. For instance, ASes with higher betweenness are present in more paths, which may make them more likely to be selected for measurement.
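For reference, the standard definition of betweenness centrality can be written as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper's notation: \sigma_{st} is the number of shortest paths between vertices s and t, and \sigma_{st}(v) is the number of those paths that pass through v.

```latex
C_B(v) = \sum_{\substack{s \neq v \neq t \\ s \neq t}} \frac{\sigma_{st}(v)}{\sigma_{st}}
```

Under the same assumed notation, the valley-free variant would be obtained by counting only shortest valley-free paths in both \sigma_{st} and \sigma_{st}(v).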

AS-level paths in the internet
We conducted an experiment on the PlanetLab global testbed with three goals: (i) to assess to which extent the valley-free property is valid in the Internet; (ii) to determine the length of AS-level paths in the Internet; and (iii) to assess to which extent traceroute is a reliable tool for obtaining paths in the Internet.
In this experiment, we obtained the paths between numerous Internet IP prefixes and 29 PlanetLab hosts. The list of Internet prefixes employed, along with the corresponding ASes, was produced by CAIDA [36]. We chose a single prefix for the ASes with multiple prefixes. There were also a few ASes in the list of prefixes that do not appear in the AS-level topology graph we built. Such ASes were discarded. In the end we employed 60,578 prefix/AS pairs in the experiment.
The paths to all prefixes/ASes were continuously measured from each PlanetLab host using the traceroute tool, from January 10, 2019 to February 1, 2019, for a total of 22 days of measurements. Each measurement resulted in a list of IP addresses, from a PlanetLab host to an Internet prefix, i.e. a host-level path. We converted all host-level paths to AS-level paths by mapping the IP addresses to the corresponding ASes. This mapping was performed using the same list of prefixes from CAIDA found in [36]. However, a common issue with traceroute measurements is that some hosts along the path do not reply to the probes, or reply with an invalid address. In these cases, we may not know that the corresponding AS is in the path, unless another host in the same AS replies to the probe. We describe how we addressed this issue below.
The AS-level paths obtained were then classified as valley, valley-free, or unknown. Valley-free paths present the valley-free property in the topology graph, as described in Section 3, while valley paths do not present the property. For the paths that presented measurement errors, such as those described above, we first checked if they presented the valley-free property when ignoring the errors. In these cases we classified the paths as valley-free, assuming that another host in the same AS replied to the traceroute probes. Otherwise, we classified the paths as unknown, since we cannot know if the measured path is complete (there might be ASes missing in the obtained AS-level path) and thus cannot know the actual classification. There were also a few paths containing links not present in the graph, which were excluded from our results.
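The classification above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: it assumes the usual valley-free definition (zero or more c2p links, at most one p2p link, then zero or more p2c links), represents an unanswered hop as `None`, and uses a hypothetical `relationship` dictionary keyed by directed AS pairs.

```python
def classify_path(path, relationship):
    """Classify an AS-level path as valley-free, valley, unknown, or excluded.

    path: list of ASes; None marks a hop that could not be mapped to an AS.
    relationship: dict mapping (a, b) to "c2p", "p2c", or "p2p".
    """
    has_gap = None in path
    hops = [asn for asn in path if asn is not None]  # ignore measurement errors

    descending = False  # True once a p2p or p2c link has been crossed
    for a, b in zip(hops, hops[1:]):
        rel = relationship.get((a, b))
        if rel is None:
            return "excluded"  # link not present in the topology graph
        if descending and rel in ("c2p", "p2p"):
            # Going up (or sideways) after the peak violates the property.
            return "unknown" if has_gap else "valley"
        if rel in ("p2p", "p2c"):
            descending = True
    return "valley-free"


# Hypothetical relationships for a four-AS example.
rels = {
    ("a", "b"): "c2p", ("b", "a"): "p2c",
    ("b", "c"): "p2p", ("c", "b"): "p2p",
    ("c", "d"): "p2c", ("d", "c"): "c2p",
}
```

Treating a path with a gap as excluded when the remaining hops are not adjacent in the graph is a simplification of this sketch; the text does not state how such cases were handled.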
A total of 75,597,104 traceroute measurements were collected, out of which 1,801,089 (2.38%) had links not present in the graph and were thus excluded, resulting in 73,796,015 AS-level paths. A total of 40,837,151 unknown paths were observed (55.34%). The remaining paths consisted of 32,703,036 (44.31%) valley-free paths, and 255,828 (0.35%) valley paths. A total of 48,283 unique ASes were reached through the valley-free paths (79.7% of all prefixes measured).
We also investigated the parameter that bounds how much longer than the shortest path a path may be, to discover how frequently and by how much AS-level paths are longer than the shortest paths. For each of the 32,703,036 valley-free paths measured, we compared its length with the length of the shortest valley-free path between the same pair of ASes in the graph. Results show that 55.78% of the paths measured had the same length as the shortest path in the graph, 31.87% were one link longer, and 10.34% were two links longer.
Results show that the vast majority of paths that were successfully measured (i.e. they are not unknown) followed the valley-free property, which is a key assumption of the present work. On the other hand, more than half of the measurements resulted in unknown paths, which shows the limitations of traceroute-like techniques, which are employed by other existing proposals for locating TD as described in Section 2. Finally, the majority of the observed valley-free AS-level paths in the Internet (87.65%) are at most a single link longer than the corresponding shortest path.

Evaluation: Locating TD
In this section, we present the results of simulations executed to evaluate the complete solution for locating TD proposed in Section 7. The main goals of the experiments are: (i) to evaluate whether the proposed solution is capable of locating TD under different scenarios; (ii) to evaluate how measurement points with different characteristics impact the efficiency of the proposed solution; and (iii) to identify between which pairs of ASes it is more efficient to locate TD with the proposed solution.
The rest of this section is organized as follows. We first describe the methodology in Section 9.1. Then we give details about the implementations in Section 9.2. Next, the simulation scenarios are presented in Section 9.3. We then describe the parameters employed in the simulations, and how we chose their values, in Section 9.4. Section 9.5 presents results comparing several different sets of measurement ASes, while Section 9.6 compares different sets of initial pairs. We then present results based on different assumptions: in Section 9.7 we do not assume that the initial pairs are available for measurement, and in Section 9.8 we consider paths longer than the shortest paths. Finally, we discuss the results and limitations of our evaluation in Section 9.9.

Simulation roadmap
We evaluated the complete solution for locating TD described in Section 7 under several different scenarios, varying the initial pair of end-hosts between which TD is to be located, the set of measurement ASes, as well as the parameter that bounds how much longer than the shortest path a path may be. Results are evaluated according to three criteria: (i) the success rate, i.e. the percentage of simulations that located TD successfully in each scenario; (ii) the average number of measurement AS pairs selected by the solution across all simulations for each scenario, i.e. the average number of measurements; and (iii) the number of measurement ASes available that can be selected in each scenario, i.e. the size of the set of measurement ASes.
The optimal set of measurement ASes is the one that achieves the highest success rate, issuing the smallest number of measurements, and containing the smallest number of ASes available for measurement. The rationale is that the number of ASes available for measurement may be limited. Furthermore, issuing a large number of measurement campaigns imposes an overhead on the network.
We do not compare our solution with related work since, as described in Section 2, our solution relies on different assumptions and thus addresses a different problem. We do not claim that our solution achieves better success rates or requires fewer measurements than other existing solutions. We do claim that our proposals rely on more realistic assumptions with respect to path knowledge.

Implementation
We implemented the proposed solution in C++, using the Boost Graph Library. 1 The implementation followed a modular design, allowing any TD detector to be used as a module of the software. For the purpose of evaluation, in addition to the two parameters of our proposals, we added another two parameters to the implementation. The first is an upper bound on the number of AS pairs that are checked to investigate a suspect. For instance, when searching for an AS pair to investigate a suspect AS, if for that many different AS pairs the paths between them do not all traverse the suspect, the strategy no longer tries to investigate it. The second is an upper bound on the number of AS pairs that may be selected to investigate a given suspect. Thus, after that many AS pairs have been selected to investigate a suspect, no more pairs will be selected for that suspect. The goal of these parameters is to limit the search space with respect to measurement AS pairs, in order to make it feasible to run a large number of simulations. We describe how the values of these parameters were set in Section 9.4.
The simulator itself was also implemented in C++. It executes the solution for locating TD under different scenarios, and uses an ''oracle'' as the TD detector. The oracle receives as input two measurement ASes, and returns the inferred behavior for the traffic between them. The oracle has perfect knowledge about which AS is discriminatory. The oracle checks if the discriminatory AS is in any of the possible paths between the two input ASes, in which case the inferred behavior is discriminatory. Otherwise, the inferred behavior is neutral. The oracle considers that if the discriminatory AS is in at least one path between the two input ASes, then that would be the path traversed by traffic, i.e. it always assumes the worst case. The rationale for not using a real TD detector is that our goal is to evaluate the proposals for locating TD, which can rely on any type of TD detector.
1 https://www.boost.org/doc/libs/1_76_0/libs/graph/doc/index.html
All simulations presented in this section were executed on a server based on an Intel Xeon E5-2690 v2 processor with 200 GB of RAM, running Linux Mint 19.1.
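The worst-case behavior of the oracle can be illustrated with a few lines of code. This is a sketch in Python rather than the C++ of the actual simulator, and the function name and the "neutral"/"discriminatory" labels are hypothetical.

```python
def oracle(paths, discriminatory_as):
    """Worst-case TD detector with perfect knowledge.

    paths: all possible paths (lists of ASes) between the two measured ASes.
    discriminatory_as: the AS employing TD, or None when no AS does.
    """
    if discriminatory_as is not None and any(
        discriminatory_as in path for path in paths
    ):
        # At least one possible path contains the discriminatory AS, so the
        # oracle assumes traffic follows that path.
        return "discriminatory"
    return "neutral"
```

For instance, with possible paths m1-a-m2 and m1-b-m2 and b as the discriminatory AS, the oracle reports discriminatory even though traffic might in practice follow the path through a.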

Simulation scenarios
The simulator executes the solution for locating TD under multiple scenarios. Each simulation scenario receives as input a set of initial pairs and a set of measurement ASes. In each experiment and for each scenario several simulations are executed. For each initial pair, we take each AS in the paths between the two ASes of the pair and execute a simulation in which that AS is the one responsible for TD. The simulation is considered to be successful if that AS is classified as discriminatory. Furthermore, we also execute a simulation with no AS employing TD, in which case the simulation is successful if all ASes in the paths between the two ASes of the pair are classified as neutral. Therefore, each scenario results in $\sum_{p \in I} (|S_p| + 1)$ simulation runs, where $I$ denotes the set of initial pairs and $S_p$ the set of ASes in the paths between the two ASes of pair $p$. All simulations were executed on the same AS-level topology graph, built from the CAIDA dataset, as described in Section 8. We assume at first that in each simulation the ASes in the initial pair are also available for measurement, in addition to the ASes in the set of measurement ASes. We also present results without this assumption in Section 9.7.
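As a worked example of the number of simulation runs per scenario, the sketch below counts one run per AS in the paths of each initial pair, plus one run with no AS employing TD. The function name and the toy input are hypothetical.

```python
def count_simulations(path_ases_per_pair):
    """Number of simulation runs in a scenario.

    path_ases_per_pair: dict mapping each initial pair to the set of ASes
    in the paths between its two ASes.
    """
    # One run per candidate discriminatory AS, plus one run without TD.
    return sum(len(ases) + 1 for ases in path_ases_per_pair.values())


# Two hypothetical initial pairs with 3 and 2 ASes in their paths.
example = {
    ("a", "b"): {"a", "x", "b"},
    ("c", "d"): {"c", "d"},
}
```

Here the first pair contributes 3 + 1 runs and the second 2 + 1 runs, for a total of 7.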
The sets of initial pairs and of measurement ASes were built based on metrics extracted from the graph, as well as on the classification of ASes available on the PeeringDB website [37]. PeeringDB is an online database in which operators share information regarding their networks. According to [38], the number of ASes registered on the website as transit, access and content providers is representative of the corresponding sets in the Internet. We obtained the list of ASes of these types from PeeringDB on June 20, 2019. Furthermore, we ordered the ASes based on degree, betweenness centrality, and betweenness centrality taking into account only valley-free paths. Table 2 shows the sets of ASes employed. The columns of the table indicate for each set: name, description, and number of ASes. The first three sets were taken from the PeeringDB website. The last three sets consist of the ASes with the highest values for the corresponding metrics. The numbers of top-ranked ASes we employed for these sets were: 10, 50, 100, 500, and 1000.
We created six sets of initial pairs, shown in Table 3, using the sets of ASes described above. Each of these sets contains 1000 different pairs of ASes. The table also shows the total number of simulations executed on scenarios employing each set. Set pdb-a2a contains 1000 pairs randomly selected from the ASes in the pdb-access set, i.e., from all possible pairs between access providers (from PeeringDB), we randomly picked 1000 pairs. This set represents a common situation in the Internet: two end-hosts, connected to access providers, communicating with each other, such as in a P2P application. Analogously, sets pdb-c2c and pdb-t2t are composed of ASes from sets pdb-content and pdb-transit, respectively. Moreover, the pdb-a2c set contains 1000 pairs randomly selected in such a way that one of the ASes in each pair is from the pdb-access set and the other from the pdb-content set. This represents another common situation: an end-user accessing a content provider, such as a video streaming service. Similarly, sets pdb-a2t and pdb-c2t are composed of access/transit providers and content/transit providers, respectively.

Parameters
Our proposals employ two parameters: the extra path length allowed beyond the shortest path, and the maximum distance between measurement ASes and suspects. We set the extra path length to 0 in most experiments presented in this section, thus we examine only the shortest valley-free paths between ASes. We do, however, present results for an extra path length of 1 in Section 9.8, since paths one link longer than the shortest path are common in the Internet, according to the experiments described in Section 8. As for the maximum distance, we set it to 2 in all simulations, thus only measurement ASes up to 2 hops away from the suspects are considered. Higher values would significantly increase the search space, since a large portion of the graph would be at a distance of 3 or more hops from the suspects. Furthermore, as we observed in the results presented later in this section, measurement ASes farther from the suspects are rarely selected. Fig. 6 shows the Cumulative Distribution Function (CDF) of the valley-free distances for all pairs of ASes in the graph. The figure shows, for each distance value, the fraction of AS pairs that are at most that distance apart. For instance, about 5% of all pairs of ASes are up to 2 hops away from each other. On the other hand, for a distance of up to 3 hops, the fraction rises to about 35% of the AS pairs.
In order to choose values for the two implementation parameters described in Section 9.2, we ran several simulations employing different values. In these simulations, we employed a set of initial pairs containing 1000 pairs selected randomly from all ASes in the graph. We employed two different sets of ASes as the measurement ASes: degree-le-2 and vfbet-top-1000. These two sets presented the best results overall, as described later in this section.
First, we ran several sets of simulations employing a fixed large value for the upper bound on selected AS pairs, and several different values for the upper bound on checked AS pairs. We fixed the former at 100, while the latter ranged from 10 to 100, in increments of 10. For each value, 8479 simulations were executed. Fig. 7 shows the results obtained from these simulations. The success rate achieved by each set of measurement ASes for each value of the bound is shown in Fig. 7(a), while Fig. 7(b) shows the average number of probes issued when using each set and for each value of the bound. It is possible to see that both the success rate and the average number of probes did not vary much as the value of the bound increased. We chose a bound of 20 checked pairs for our simulations, which is the value for which the success rate had the largest increment for both sets of measurement ASes. Therefore, we discard a measurement AS after 20 attempts for each suspect.
Next, we ran simulations with the upper bound on checked AS pairs set to 100, and the upper bound on selected AS pairs ranging from 10 to 100. Fig. 8 shows the results obtained from these simulations. The success rate achieved by each set of measurement ASes for each value of the bound is shown in Fig. 8(a), while Fig. 8(b) shows the average number of probes issued when using each set and for each value of the bound. The success rate and average number of probes for the vfbet-top-1000 set did not increase much as the value of the bound increased. However, for the degree-le-2 set, both the success rate and the average number of probes increased significantly. We chose a bound of 40 selected pairs, since larger values would significantly increase the search space, and thus also the execution times, without achieving significantly better results. Therefore, in our simulations, up to 40 AS pairs are selected for each suspect.

Results: Comparing measurement ASes
We first present results comparing the following metrics: degree, betweenness and valley-free betweenness. The sets of measurement ASes degree-top-n, bet-top-n, and vfbet-top-n, for n ∈ {10, 50, 100, 500, 1000}, were built based on these metrics, respectively. The success rate achieved by each of these sets is shown in Fig. 9, on scenarios employing pdb-a2a as the set of initial pairs. For all values of n, the highest success rates were achieved by sets vfbet-top-n (from 29% for vfbet-top-10 to 93% for vfbet-top-1000), and the lowest by sets degree-top-n. Given that the best success rates were achieved by vfbet-top-1000, we will not show results for the other sets in the remainder of this work.
The distance between ASes in vfbet-top-n and suspects is generally smaller, in comparison with ASes in degree-top-n and bet-top-n. For instance, the average valley-free distance from ASes in vfbet-top-1000 to suspects was 0.79, while it was 0.87 for degree-top-1000 and 0.85 for bet-top-1000. Being closer, there are fewer paths and fewer ASes between pairs of ASes from vfbet-top-1000, and thus it is less likely that the discriminatory AS is present in the measurements between them, which results in suspects being filtered out earlier (inferred as neutral).
We evaluate next the sets of measurement ASes degree-eq-1, degree-le-2, pdb-access, pdb-transit, and vfbet-top-1000. Results for these sets are shown in Fig. 10, with pdb-a2a as the set of initial pairs. The success rates of each set across (i) all simulations, (ii) simulations in which the discriminatory AS was in the initial pair, and (iii) simulations in which the discriminatory AS was not in the initial pair are shown in Fig. 10(a). The average number of probes (i.e. requests to the oracle) across (i) all simulations, (ii) successful simulations, and (iii) unsuccessful simulations are shown in Fig. 10(b). Beside each bar there are two values, one indicating the total number of unique ASes that were selected for measurements across all simulations for each set, and another indicating the total number of ASes available for measurement (the size of the set).
Sets degree-le-2 and vfbet-top-1000 achieved the highest success rates, 94% and 93%, respectively. However, simulations employing degree-le-2 issued significantly more probes on average. The distance between ASes in degree-le-2 is usually larger, which makes the discriminatory AS more likely to be within the paths between those ASes, causing more pairs to be selected in order to find neutral ASes. The average valley-free distance between ASes from degree-le-2 was 2.01, and 1.48 between ASes from vfbet-top-1000. Similarly, the average distances to the suspects were 1.8 and 0.79, respectively. Regarding the simulations in which no TD was present, the success rates achieved by degree-le-2 and vfbet-top-1000 were 94% and 91%, while the average numbers of probes were 5.27 and 5.15, respectively. In these simulations, there was no discriminatory AS, thus the AS pairs selected always resulted in a suspect being filtered out. From the 41,247 available ASes in degree-le-2, 8269 were selected for measurement across all simulations, while 615 (from a total of 1000) were selected from vfbet-top-1000. This shows that the vfbet-top-1000 set achieved a similar success rate using far fewer distinct ASes (615 vs 8269). As discussed later in this section, ASes in the core of the Internet are better positioned than those in the edge. Therefore, having access to a much smaller number of core ASes is enough (1000 vs 41,247). Note that the graph employed in our simulations has 61,807 ASes in total.
A slightly lower success rate, 88%, was observed for set pdb-transit, relative to vfbet-top-1000. The average number of probes was also similar, but a larger number of unique ASes were selected for measurement (811 from a total of 2293) for pdb-transit. The lowest success rates observed correspond to sets degree-eq-1 (77%) and pdb-access (71%). From degree-eq-1, 6271 ASes were selected out of 21,220 available, while 2177 out of 5263 available were selected from pdb-access. Furthermore, set degree-eq-1 issued significantly more probes than pdb-access on average, for the same reasons described above for set degree-le-2. It is also possible to observe that the average number of probes in successful simulations is significantly lower than in unsuccessful simulations, for all sets of measurement ASes employed. This happens due to the halting conditions adopted by our strategy. In unsuccessful simulations, all possible pairs of measurement ASes are selected before the strategy finishes its execution.

Results: Comparing initial pairs
We now present results comparing different sets of initial pairs. We present results for the sets of measurement ASes degree-le-2 and vfbet-top-1000, which presented the highest success rates and contain ASes at different parts of the Internet: the edge (degree-le-2) and the core (vfbet-top-1000). Fig. 11 shows the success rates (for all simulations, for simulations in which the discriminatory AS was in the initial pair, and for simulations in which it was not) for the sets of initial pairs pdb-a2a, pdb-c2c, pdb-t2t, pdb-a2c, pdb-a2t, and pdb-c2t. Fig. 11(a) shows the results for the vfbet-top-1000 measurement set, while Fig. 11(b) shows those for degree-le-2. Furthermore, Fig. 12 shows the average number of probes for (i) all simulations, (ii) simulations that were successful, and (iii) unsuccessful simulations, for each set of initial pairs. Fig. 12(a) shows the results for vfbet-top-1000, while Fig. 12(b) shows those for degree-le-2.
It is possible to conclude that both measurement sets had similar success rates for all sets of initial pairs. The success rates for vfbet-top-1000 ranged from 89% to 93%, while the success rates for scenarios with degree-le-2 ranged from 94% to 96%. The main difference between the two sets was that scenarios with degree-le-2 employed significantly more probes on average, ranging from 73.12 to 102.48. The number of different ASes selected for measurement from degree-le-2 ranged from 6756 to 9084 (from a total of 41,247). For scenarios with vfbet-top-1000, the average number of probes ranged from 9.16 to 10.28, and the number of ASes selected for measurement ranged from 544 to 666 (from a total of 1000).

Results: Initial pairs not available as measurement ASes
We also simulated scenarios considering that the initial pair is not available for sending probes. The goal of this set of simulations is to check if it is possible to detect TD between two ASes that we do not have access to (in order to run TD detectors on). In these scenarios, only the ASes in the set of measurement ASes are available to be selected for measurement. In case the ASes in the initial pair are present in that set, we remove them from the set for that simulation scenario to ensure they are not available. Fig. 13 shows the success rates for the sets of measurement ASes vfbet-top-1000 (Fig. 13(a)) and degree-le-2 (Fig. 13(b)), on scenarios with different sets of initial pairs. Similarly, Fig. 14 shows the average number of probes for the sets of measurement ASes vfbet-top-1000 (Fig. 14(a)) and degree-le-2 (Fig. 14(b)). Results show that the success rates for both sets of measurement ASes were similar. The success rates for all simulations on scenarios with vfbet-top-1000 ranged from 49% to 56%, while the success rates for degree-le-2 ranged from 50% to 57%. As expected, the success rates for both sets were significantly lower than those of the scenarios previously presented (in which the initial pair was available for measurement). However, for both sets, the success rates for the simulations in which the discriminatory AS was not in the initial pair were significantly higher than for simulations in which it was. For vfbet-top-1000, the success rates when the discriminatory AS was not in the initial pair ranged from 83% to 90%, and for degree-le-2 they ranged from 72% to 81%. When the discriminatory AS was in the initial pair, the success rates ranged from 0% to 1% for vfbet-top-1000, and from 8% to 37% for degree-le-2. We explain these results below.
Due to the valley-free property, there may be no paths between ASes of the Internet core that traverse ASes in the edge of the Internet (or closer to the edge). ASes in the core, such as the ASes in vfbet-top-1000, are mostly connected to other ASes through p2c or p2p relationships, since they are at the top of the Internet hierarchy (Tiers 1 and 2). For instance, only 1.8% of the relationships from ASes in vfbet-top-1000 to other ASes are c2p. Therefore, the paths between these ASes usually consist of other ASes with the same characteristics. If a path between two such core ASes traversed an AS in the edge, it would violate the valley-free property, since at some point there would be a p2c link to the AS in the edge, followed by a c2p link going back to an AS in the core, i.e., a ''valley''. In this set of simulations, since the initial pair of ASes is not available for measurement, our strategy needs at least one measurement pair for which the paths traverse the discriminatory AS: new suspects are then found, potentially better positioned, so that it is possible to measure and find them. However, it is often not possible to find such a measurement pair when the discriminatory AS is in the initial pair. The ASes in degree-le-2 are in the edge of the Internet, so it was possible to find paths traversing some of the ASes in the initial pairs, hence the higher success rates.
Furthermore, the average number of probes in unsuccessful simulations was significantly lower when the initial pair was not available for measurement, compared to the results presented previously in this section. However, the average number of probes in successful simulations is similar. For instance, let us take vfbet-top-1000 as the set of measurement ASes and pdb-a2a as the set of initial pairs. The average number of probes in successful simulations for this configuration with the initial pair available was 7.39, as can be observed in Fig. 12, while in unsuccessful simulations the average was 49.22. With the initial pair not available (Fig. 14), the average in successful simulations was 5.36, while the average in unsuccessful simulations was 13.72. The reason for this behavior is the same as described above: in the unsuccessful simulations, our strategy was able to find a much lower number of suitable AS pairs for issuing probes when the initial pair was not available. In the successful cases, a similar number of AS pairs was necessary.

Results: Considering longer paths
In the experiments previously described in Section 8, 55.78% of the measured valley-free paths had the same length as the corresponding shortest valley-free path in the graph, while 31.87% of the measured valley-free paths were one link longer than the shortest path. These represent 87.65% of all valley-free paths observed in the experiments. Therefore, we executed several simulations allowing paths one link longer than the shortest path (an extra path length of 1), to check if our proposal is capable of locating TD with a larger number of possible paths between end-hosts. Fig. 15 shows the success rates for the sets of measurement ASes vfbet-top-1000 (Fig. 15(a)) and degree-le-2 (Fig. 15(b)), on scenarios with different sets of initial pairs. Similarly, Fig. 16 shows the average number of probes for the sets of measurement ASes vfbet-top-1000 (Fig. 16(a)) and degree-le-2 (Fig. 16(b)).
For the vfbet-top-1000 set, the success rates ranged from 88% to 90% for all simulations. These values were similar to the success rates with only shortest paths (Fig. 11), which ranged from 89% to 93%. For the degree-le-2 set, the success rates with an extra path length of 1 ranged from 84% to 87%. With only shortest paths, the success rates ranged from 94% to 96% (Fig. 11). It is also possible to observe that the average number of probes increased significantly for both sets in comparison with the results presented previously in Fig. 12.
In all results presented in previous subsections, the success rates were always slightly higher for degree-le-2. However, with an extra path length of 1, the success rates are slightly higher for the vfbet-top-1000 set. Since ASes in vfbet-top-1000 are generally closer to each other, the number of possible paths between measurement ASes increases much more for degree-le-2 than for the vfbet-top-1000 set when longer paths are allowed. We also present results for an extra path length of 1 without considering that the initial pair is available for measurement. Fig. 17 shows the success rates for the sets of measurement ASes vfbet-top-1000 (Fig. 17(a)) and degree-le-2 (Fig. 17(b)), on scenarios with different sets of initial pairs in which the initial pair is not available for measurement. Surprisingly, the success rates for both sets of measurement ASes were higher than those of the results presented previously in Fig. 13 (with only shortest paths). The success rates of all simulations for vfbet-top-1000 ranged from 73% to 75% (49% to 56% in previous results). For the degree-le-2 set, the success rates ranged from 62% to 68% (50% to 57% in previous results). The reason for this behavior is that there are more possible paths between measurement ASes when longer paths are allowed, thus it is easier to find a measurement pair for which the paths contain the discriminatory AS. When such a pair is found, new suspects start to be investigated, which are better positioned (relative to the valley-free property) than the initial suspects, as we explained previously in Section 9.7: with only shortest paths, it is less likely that the discriminatory AS will be in the paths between the measurement ASes.

Discussion
The results presented in this section show that the proposed solution is capable of inferring the behavior of ASes by combining inferences from multiple TD detectors running on different ASes. Table 4 shows a summary of the different simulation scenarios and the main conclusions that can be drawn from the results. Selecting measurement points based on the valley-free betweenness centrality presented good results in our simulations. Another finding was that having a large number of ASes available for measurement in the edge of the Internet (41,247 from the degree-le-2 set) achieved success rates similar to those obtained with access to only a few ASes in the core (1000 from the vfbet-top-1000 set). Furthermore, far fewer probes were issued when employing core ASes, since they are usually closer to a larger portion of the network than ASes in the edge. Thus, in order to achieve higher success rates, a much larger number of edge ASes may be necessary at multiple portions of the network, covering several vantage points. Finally, results show that it is possible to locate TD between any two ASes, even if we do not have access to them for issuing probes. However, locating TD that is happening in the core of the Internet is easier than locating TD in the edge.
We highlight that our simulations never produced false negatives (a discriminatory AS was never inferred as neutral) or false positives (neutral ASes were never inferred as discriminatory). However, some of our assumptions may not always be true in the wild. Traffic may traverse valley paths, ASes may differentiate traffic based on its origin or destination, and the real AS-level topology may be slightly different. In such cases, the proposed solution for locating TD may result in false positives or false negatives. It is also worth noting that in our simulations, the oracle always assumed the worst case, i.e., it considered that traffic would always follow the path containing the discriminatory AS . However, in a real situation the actual path may not traverse , in which case fewer probes might be necessary to locate TD, since suspects might be filtered earlier. In the case of unsuccessful simulations, we did not evaluate how close the solution was to locating TD. Even when the solution is not able to find exactly which AS was discriminatory, the set of suspects may have been filtered to just a few, which can be helpful.

Table 4. Summary of the simulation scenarios and main conclusions.
Scenario | Section | Figures | Main conclusion
 | | | Valley-free betweenness centrality presented the best results among the metrics extracted from the graph
Comparing sets of measurement ASes | 9.5 | 10 | Sets degree-le-2 and vfbet-top-1000 achieved the highest success rates, but degree-le-2 issued significantly more probes on average
Comparing sets of initial pairs | 9.6 | 11, 12 | Similar success rates were observed for all sets of initial pairs
Initial pairs not available as measurement ASes ( ⊄ ) | 9.7 | 13, 14 | Success rates were significantly higher when the discriminatory AS is not in the initial pair
Considering longer paths ( = 1) | 9.8 | 15, 16 | Success rates were similar to previous results, but the number of probes was significantly higher
Considering longer paths and that initial pairs are not available as measurement ASes | 9.8 | 17 | Surprisingly, success rates were higher than in Fig. 13
Another limitation of our evaluations is that in each simulation we considered that only a single AS was discriminatory. Although in real conditions this may not be true, a more controlled environment was employed in order to (i) evaluate whether the proposed solution was actually capable of locating TD in the first place, (ii) identify the trade-offs between success rate and number of measurements, and (iii) identify the relations between our proposals, the valley-free property, and different types of measurement ASes and initial pairs. Finally, parameters and limited the number of measurements that could be issued during the simulations. Without these parameters, higher success rates could be achieved, but at the expense of more measurements.
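The way a set of suspects can be narrowed down under the single-discriminatory-AS assumption can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the algorithm proposed in this work: it assumes that exactly one AS differentiates traffic, that each measurement follows one of the candidate valley-free paths (which one is unknown), and that a TD detector reports differentiation exactly when the traversed path contains the discriminatory AS. The function name update_suspects and the measurement data are hypothetical.

```python
# Illustrative sketch only; function name and data are assumptions.

def update_suspects(suspects, candidate_paths, td_detected):
    """Narrow the suspect set using one measurement with unknown exact path."""
    if not candidate_paths:
        return set(suspects)
    path_sets = [set(path) for path in candidate_paths]
    on_any_path = set.union(*path_sets)
    on_every_path = set.intersection(*path_sets)
    if td_detected:
        # The traversed path contains the discriminatory AS, so that AS
        # must appear on at least one candidate path.
        return set(suspects) & on_any_path
    # No TD observed: every AS on the traversed path is neutral. Only ASes
    # present on all candidate paths are guaranteed to have been traversed.
    return set(suspects) - on_every_path


# Hypothetical measurements: (candidate AS-level paths, TD detected?)
measurements = [
    ([["E1", "M1", "M2", "E2"], ["E1", "M1", "T2", "M2", "E2"]], True),
    ([["E1", "M1", "T1"]], False),
    ([["E2", "M2", "M1"]], False),
]

suspects = {"E1", "E2", "M1", "M2", "T1", "T2"}
for paths, detected in measurements:
    suspects = update_suspects(suspects, paths, detected)
print(sorted(suspects))
```

In this hypothetical run, the first measurement restricts the suspects to the ASes on either candidate path, and the two neutral measurements then clear the ASes they are certain to have traversed, leaving T2 as the only suspect. With more candidate paths per pair, fewer ASes lie on every path, so a neutral result clears fewer suspects.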

Conclusion
In this work, we addressed the problem of locating TD in the Internet under more realistic assumptions than existing strategies. We proposed a solution for locating the exact AS that is discriminating traffic. The solution considers all possible AS-level valley-free paths, instead of relying on the exact knowledge of host-level paths (as other existing strategies do). It consists of an algorithm for combining measurements from TD detectors running on different ASes and a strategy for selecting the best ASes to run measurements.
To evaluate our proposals, we first conducted a series of experiments on PlanetLab, in which we executed traceroute a large number of times to determine the paths from a set of end-hosts to several Internet prefixes. Results show that traceroute-like techniques employed by other solutions for locating TD may not be reliable, and that the vast majority of the paths that were successfully measured do follow the valley-free property. We then executed simulations to evaluate the proposed solution for locating TD. We defined several scenarios, varying the location of TD and the measurement points employed. We draw four main conclusions from the results obtained in these simulations: (i) a few measurement ASes in the core of the network achieve results similar to those of a much larger number of measurement ASes in the edge; (ii) it is possible to locate TD between any two ASes in the Internet, even if they are not accessible for issuing probes; (iii) due to the valley-free property, it is easier to locate TD in the core than in the edge; and (iv) the valley-free betweenness centrality is a good metric for selecting measurement ASes.
Future work includes using our proposals to implement a system capable of continuously monitoring TD in the Internet. This could be, for instance, a crowdsourcing system in which participating users report measurements and rely on the system to monitor whether they are victims of TD; those users would also allow their devices to be used as measurement points for other users. Another direction is the development of a hybrid version of our solution using traceroute-like techniques: if the exact path between ASes can be obtained, it may not be necessary to consider all possible paths. Furthermore, our proposals consider only TD based on application. Detecting and locating TD based on the origin or destination of traffic is still an under-explored topic. Evaluating our proposals on scenarios where multiple ASes may be discriminatory is also left for future work. Finally, another research direction is to design a system that, after locating which AS is discriminating traffic, reroutes traffic through a path known to be fully neutral, circumventing the discriminatory AS.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.